frequency strings 2 files

by philbertcheese86 (Initiate)
on Jul 05, 2012 at 22:48 UTC ( #980175=perlquestion )
philbertcheese86 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm new to Perl (2 weeks). I understand everything in my textbook just fine, but the projects our teacher gives us are way harder than anything we've covered. He wants us to do the following.

Write a Perl program that will read from two files specified on the command line. Both files will be a list of positive integers (one entry per line) and terminated with a zero. Your program should (ignoring the zeros) output, in the given order, each of the numbers in the second file followed by a single space and then the number of times that value appears in the first file.

If testfile1 has (5 3 2 7 5 4 3 2 5 0) in it, and testfile2 has (5 2 3 1 0) running the program will look like

$ ./frequency testfile1 testfile2
5 3
2 2
3 2
1 0

Here's my attempt at a solution

#!/usr/bin/perl -w
#use strict;
open(FILE2, "<testfile2");   #giving testfile2 data a filehandle
@array2 = <FILE2>;           #saving filehandle to array
pop @array2;                 #getting rid of 0 at end
%hash2 = @hash2{@array2};    #storing contents of testfile2 without 0 to
                             #a hash
open(FILE1, "<testfile1");   #giving testfile1 data a filehandle
@array1 = <FILE1>;           #saving filehandle to array
pop @array1;                 #getting rid of 0 at end
foreach $number (keys %hash2) { #go through each of the hashs' keys
    #and assign them to $number for duration of loop some keys might be
    #used more than once
    if ($number eq search(@array1)) { #check to see if string $number
        #equals each iteration of subroutine search being passed values of
        #@array1. Would have used another foreach loop here, but parentheses
        #and curly brace placement won't allow it.
        $hash2{number} += 1;
    }
}
sub search { #subroutine search gets one element of @array1 at a
    #time and returns it. Then moves on to the next value in that array.
    foreach (@_) {}
    return $_;
}
#Ideally here would like to print the array followed by the hash
#elements
print (@array, %hash2);

I get no error messages. I really can't emphasize enough that the 13 pages on hashes in the llama book do not address anything nearly as complex as searching through every key of a hash AND comparing those keys to each element of an array. More specifically several things are not done to the same hash. There is an example that does this:

foreach $person (sort keys %books) {
    if ($books{$person}) {
        print "$person has $books{$person} items\n";
    }
}

so I have a similar method checking to see if the keys of my hash match each value in an array. I appreciate the solutions provided by others, but what is wrong with my solution?

Replies are listed 'Best First'.
Re: frequency strings 2 files
by choroba (Bishop) on Jul 05, 2012 at 23:20 UTC
    Open the second file, read the numbers into an array. Then, initialize a hash with keys from the array:
    undef @hash{@array};
    Then open the first file. Read it line by line, for each number, if it exists as a key in the hash, then increment the hash value.

    Finally, go over the array and print the corresponding hash value for each member (you might skip the print if the value is not defined as in your example).
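The steps above can be sketched roughly as follows. This is only one possible reading of the advice, not choroba's own code: the helper names `read_numbers` and `frequency_lines` are made up for illustration, the sample data is fed in through in-memory filehandles instead of real files, and the slice is initialised with `@count{@wanted} = ()`, which likewise creates the keys with undefined values.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read one number per line up to the terminating 0.
sub read_numbers {
    my ($fh) = @_;
    my @numbers;
    while ( my $line = <$fh> ) {
        chomp $line;
        last if $line == 0;    # the 0 marks the end of the list
        push @numbers, $line;
    }
    return @numbers;
}

# Hypothetical helper following the steps described above.
sub frequency_lines {
    my ( $fh1, $fh2 ) = @_;

    my @wanted = read_numbers($fh2);    # second file first, order preserved
    my %count;
    @count{@wanted} = ();               # one key per wanted number, value undef

    for my $n ( read_numbers($fh1) ) {
        $count{$n}++ if exists $count{$n};    # count only the wanted numbers
    }

    my @lines;
    for my $n (@wanted) {
        # an undefined value means the number never appeared in file 1
        my $times = defined $count{$n} ? $count{$n} : 0;
        push @lines, "$n $times";
    }
    return @lines;
}

# Demo with in-memory "files" holding the sample data from the question.
my $file1 = "5\n3\n2\n7\n5\n4\n3\n2\n5\n0\n";
my $file2 = "5\n2\n3\n1\n0\n";
open my $fh1, '<', \$file1 or die "can't open in-memory file 1: $!";
open my $fh2, '<', \$file2 or die "can't open in-memory file 2: $!";
print "$_\n" for frequency_lines( $fh1, $fh2 );
```

In a real program the two `open` calls would take the file names from `@ARGV` instead of references to strings.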

    What part do you have problems with?

      #!/usr/bin/perl
      open(FILE2, "<testfile2");
      @lines2 = <File2>;
      %hash2 = @hash{@lines2};
      open(FILE1, "<testfile1");
      @lines1 = <FILE1>;
      foreach (@lines1)
      if ($_ == %hash2);
      %hash2 +=1

      That's as far into your advice as I could get before getting lost. Our book hasn't shown us the @hash{@array} anywhere in examples yet.

        In @hash{@lines2} you are referring to the hash %hash which you haven't initialized, so %hash2 would end up empty if your program ran - but it doesn't because of other errors, as you may know.
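For readers who have not met the notation yet, here is a small sketch of what a hash slice does. The values in `@lines2` are hard-coded for illustration, and `%same` is a made-up name used only to show the equivalent loop.

```perl
use strict;
use warnings;

my @lines2 = ( 5, 2, 3, 1 );

# A hash slice names several elements of ONE hash at once.
# To create keys in %hash2, the slice has to be written @hash2{...};
# @hash{...} would refer to a different hash called %hash.
my %hash2;
@hash2{@lines2} = ();    # creates keys 5, 2, 3, 1 with undefined values

# The slice assignment above is shorthand for this loop:
my %same;
for my $key (@lines2) {
    $same{$key} = undef;
}

print join( ' ', sort { $a <=> $b } keys %hash2 ), "\n";    # prints "1 2 3 5"
```

Note that the slice is on the left of the assignment. Writing `%hash2 = @hash{@lines2}` instead reads values out of `%hash`, which is why the result ends up without the intended keys.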

        If your book hasn't shown you @hash{@array} and you are not sure what it means or how to use it, you might be better off solving your problem some other way. There are many ways to solve the problem. I don't see a need for %hash2 myself, nor even for @lines2. Rather than reading all the lines from file 2 into an array and then putting the values into a hash, you can read the lines in a while or foreach loop and deal with them one at a time: "in the given order".
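Reading one line at a time might look something like this. It is a minimal sketch, assuming the file ends with a line containing 0 as the assignment states; the in-memory file and the `$handled` counter are stand-ins added for the demo.

```perl
use strict;
use warnings;

# Stand-in for file 2: an in-memory file with the sample data (an assumption
# for the demo; a real program would open the name given on the command line).
my $data = "5\n2\n3\n1\n0\n";
open my $fh, '<', \$data or die "can't open in-memory file: $!";

my $handled = 0;
while ( my $line = <$fh> ) {    # one line per iteration, in file order
    chomp $line;                # strip the trailing newline
    last if $line == 0;         # the terminating zero ends the list
    print "got $line\n";        # placeholder for "deal with this number"
    $handled++;
}
close $fh;
```

Because each number is handled as soon as it is read, nothing needs to be stored just to remember the order.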

        That's not a mess... THIS is a mess:

        open$x,pop;@x=map{chomp;$_}<$x>;pop@x;map{chomp;$x{$_}++}<>;print map{ +$_,$",$x{$_}||0,$/x2}@x

        when run with your test files yields:

        >perl testfile1 testfile2
        5 3
        2 2
        3 2
        1 0


Re: frequency strings 2 files
by Cristoforo (Curate) on Jul 05, 2012 at 23:56 UTC
    It will be no good for someone to give a solution to you without you working through the solution on your own.

    You should see your teacher if you are having problems. That is the only way he will know if he is covering the material well enough for the students.

    I panicked(?) when I got a programming assignment at school and I went to see the teacher. He didn't give a solution, but re-assured me that the assignment wasn't that difficult. He might have also given some idea where to head for a solution.

    I would keep count of the numbers (and their repetitions) in a hash for file 1, then open the second file and print out the counts from file 1 for each number in file 2.
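A rough sketch of that count-first approach, for comparison. The helper names `count_numbers` and `report` are invented here, and the sample data from the question is supplied through in-memory filehandles instead of real files.

```perl
use strict;
use warnings;

# Hypothetical helper: tally how often each number occurs before the final 0.
sub count_numbers {
    my ($fh) = @_;
    my %count;
    while ( my $line = <$fh> ) {
        chomp $line;
        last if $line == 0;
        $count{$line}++;
    }
    return \%count;
}

# Hypothetical helper: one "number count" string per number in the second file.
sub report {
    my ( $count, $fh ) = @_;
    my @lines;
    while ( my $line = <$fh> ) {
        chomp $line;
        last if $line == 0;
        # numbers that never appeared in file 1 get a count of 0
        my $times = exists $count->{$line} ? $count->{$line} : 0;
        push @lines, "$line $times";
    }
    return @lines;
}

# Demo with the sample data from the question.
my $file1 = "5\n3\n2\n7\n5\n4\n3\n2\n5\n0\n";
my $file2 = "5\n2\n3\n1\n0\n";
open my $fh1, '<', \$file1 or die "can't open in-memory file 1: $!";
open my $fh2, '<', \$file2 or die "can't open in-memory file 2: $!";
print "$_\n" for report( count_numbers($fh1), $fh2 );
```

Here file 1 is read first and every number in it is counted, so the second file can be processed line by line without keeping its contents around.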

Re: frequency strings 2 files
by ww (Archbishop) on Jul 06, 2012 at 00:05 UTC

    Perhaps your teacher has more faith in your ability to do your homework than you do -- in which case, the best cure will be doing even better than the teacher expects (but that does NOT mean asking the Monks to do the work).

    Or, perhaps you've given the homework more consideration than you've shown us. That's a bad mistake, because we expect you to show some effort -- code, warnings and/or error messages (verbatim) or failure modes.

    And describing a "mess" does not satisfy that criterion. Show us.

    And, you have a perfectly good approach already described by choroba, above and an alternate, if the teacher is expecting you to use material that hasn't been explicitly lectured or covered (perhaps implicitly) in assigned reading.

Re: frequency strings 2 files
by xyzzy (Pilgrim) on Jul 06, 2012 at 04:21 UTC

    Zerost: until you are comfortable enough to code blindfolded, every program you write should start with the following two lines:

    #!/usr/bin/perl -w
    use strict;

    First: go over the problem, separate it into individual components. what do you need to do? what values do you need to keep track of? what are the steps that need to be taken? this is basic programming 101 stuff. before you write a single line of code, you need to be able to explain the exact procedure step by step (in pseudocode, English, pictures, whatever).

    Second: look at the specific features of the problem and the language and how they work together. The files are given in a certain order, maybe that is telling you something. Perl has a handy mechanism to read input line by line, maybe you should use that.

    Third: you have an incredibly useful resource at your disposal if you run into problems. It is called perldoc, and you can run it from the shell (it even works with windoze implementations of perl) or browse it online. It has sections for every topic and very detailed explanations for every built-in function, every operator, every special variable. READ IT!

    Finally: do not ask the monks to do your assignment. Even if they give you a solution, it will most likely be obfuscated or use an overly-elegant, roundabout way to do it, and if your prof looks at it he will instantly know that you have no idea how or why it works. However, if you read the docs, used strict and warnings, tried an approach that seems to be valid and your code still does things you do not expect or understand, post it as a specific question, such as "Errors reading from filehandle" or "Hash keys aren't matching", provide your code, your expected output, your actual output, and your reasoning as best as you can understand the issue. Then you actually will get an answer explaining how that specific function or operator works, and you will learn how to use the language.

    At the risk of being ostracized from the community, I will give you two huge hints.

    1. You can solve the problem with just two variables: a filehandle and a hash
    2. If you try to modify the value of a key that doesn't exist, it is autovivified (automatically created)
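The second hint can be seen in a few lines. This is only an illustration of the behaviour the hint describes; the key 7 is arbitrary.

```perl
use strict;
use warnings;

my %count;

# 'exists' only checks for the key; it does not create it.
print "before: ", ( exists $count{7} ? "exists" : "missing" ), "\n";

# Incrementing a missing key creates it. The undefined value is treated
# as 0, and ++ does not trigger an "uninitialized" warning.
$count{7}++;
$count{7}++;

print "after: $count{7}\n";    # prints "after: 2"
```

This is why a counting loop does not need to set each key to 0 before incrementing it.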

    $,=qq.\n.;print q.\/\/____\/.,q./\ \ / / \\.,q.    /_/__.,q..
    Happy, sober, smart: pick two.
Re: frequency strings 2 files
by aaron_baugher (Curate) on Jul 06, 2012 at 02:24 UTC

    Your attempt does sound overly complex, since I can't see a need for more than one hash, and (maybe) one array. I have no idea what you're doing with 4 hashes, and "reversing a hash" really doesn't make sense, since hashes aren't ordered.

    Anyway, without seeing your attempt, the best I can do is say that my simplest solution would be to make a hash from the first file, with the numbers as keys and the number of times seen as values. Then I'd read lines from the second file, and print each number along with its value from the hash. That could introduce some inefficiency, since numbers from the first file that aren't in the second file (like 7 in your example) will still be put in the hash. choroba's solution avoids that by going through the second file first and creating an array to store the order, but that means extra storage for the array. Which is better would depend on how many values in the first file aren't found in the second file, but either way should work.

    Aaron B.
    Available for small or large Perl jobs; see my home node.

Re: frequency strings 2 files
by 2teez (Vicar) on Jul 06, 2012 at 06:46 UTC

    Assignments are given to 'rub in' what you have been taught, or to provoke you to read more, research further and ask questions.
    I will give a working code (one of the several ways), but with some lines replaced with comments. Find, understand and fill in the blank spaces and you have your assignment answered.

    #!/usr/bin/perl
    use warnings;
    use strict;
    use Data::Dumper;

    my ( ____, ____ ) = @ARGV;    ## get the filenames from the CLI
    my %number_freq;              ## declare an hash to use

    open my $fh, '<', $file2 or die "can't open cos:$!";
    while (<$fh>) {
        chomp;
        next if $_ == 0;
        ## Using each value read from file handler as an hash key, initialize the hash value for
        ## each key to 0
        __________________________________________;
    }
    close $fh or die "can't close: $!";

    open $fh, '<', $file1 or die "can't open cos:$!";
    while (<___>) {    ## fill in this
        chomp;
        next if $_ == 0;
        ## If the hash key exists increase it's value by 1.
        __________________________________________;
    }
    close $fh or die "can't close: $!";

    print Dumper \%number_freq;    ## used to display the result
    In addition to this, in place of using Data::Dumper module with the print function to output your answer, you will have to write out your own print output.
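One thing to keep in mind when writing that output yourself: a hash does not remember the order in which its keys were added. The sketch below shows one way to print in a fixed order, assuming the order is kept separately. The `@order` array is my addition, not part of the skeleton above, and the counts are hard-coded sample values.

```perl
use strict;
use warnings;

# Sample counts, hard-coded for the demo (values match the example in the
# question). In the real program the loops above would fill this hash.
my %number_freq = ( 5 => 3, 2 => 2, 3 => 2, 1 => 0 );

# Assumption: the order of the second file was saved while reading it.
my @order = ( 5, 2, 3, 1 );

my @output;
for my $number (@order) {
    push @output, "$number $number_freq{$number}";
}
print "$_\n" for @output;
```

Looping over `keys %number_freq` instead would print the same pairs, but in no guaranteed order.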
    Please, do not attempt to run this without filling in the blank spaces.
    Hope this helps

Re: frequency strings 2 files
by cavac (Deacon) on Jul 07, 2012 at 17:18 UTC

    First of all, you might invest a few minutes in formatting your code, which will make it much easier to follow (even for you). You might want to try out perltidy.

    For a step by step guide on doing some simple operations on multiple files, take a look at 976888.

    If you are not sure what your program is doing, there are many ways to debug it. Print statements (also take a look at Data::Dumper), the perl debugger (use SuperSearch), ...
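As a small illustration of debugging with Data::Dumper, the sketch below dumps a hash whose keys were taken from unchomped lines. The data is made up; `$Data::Dumper::Sortkeys` and `$Data::Dumper::Useqq` are standard configuration variables of the module.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order makes dumps easier to compare
$Data::Dumper::Useqq    = 1;    # show "\n" and other invisible characters

# Lines read from a file still carry their newline until chomp is called.
my @raw = ( "5\n", "2\n" );
my %seen;
$seen{$_}++ for @raw;

# The dump shows the keys as "5\n" and "2\n", not 5 and 2.
print STDERR Dumper( \%seen );
```

A dump like this makes it obvious when keys contain a trailing newline, which is easy to miss with a plain print.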

    Sorry for any bad spelling, broken formatting and missing code examples. During a slight disagreement with my bicycle (which i lost), i broke my left forearm near the elbow. I'm doing the best i can here...
