PerlMonks  

Count and List Items in a List

by Tech77 (Novice)
on Nov 01, 2005 at 14:53 UTC (#504596=perlquestion)
Tech77 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I have just started using Perl at work to help out with tasks here and there so I am new to this. Recently, I created this script to grab a list of records from a database. It actually works! I'm pretty thrilled.
open (FILE, "userstab.txt") || die "Cannot open file.\n";
open (NEWFILE, ">results.txt") || die "Cannot find or open file for editing.\n";
while (<FILE>) {
    @zapschool = split (/\t/);
    if (m/zaps/i) {
        print NEWFILE "$zapschool[4]\n";
    }
}
close FILE;
close NEWFILE;

This gives me a list of all the entries in a column, but now I need to figure out how to use Perl to go through my new list, results.txt, and create a new list that holds the names and a count of how many times a particular name appears on the list.

For example, my list contains college and university names. Many of them appear multiple times. I'd like to create a list where each unique record appears only once but with a frequency count of how many times it appears.

Can you offer some guidance on how to proceed? I'm not necessarily looking for finished code, but more a point in the right direction so I can do it myself. Thank you.

Re: Count and List Items in a List
by japhy (Canon) on Nov 01, 2005 at 15:01 UTC
    When you think "unique" and "frequency", think of a hash. Its keys are always unique, and you can use its values as a place to store the number of times each key shows up:
    for my $word (@list) { $frequency{$word}++; }
    Now the %frequency hash holds each word (but only once!) and $frequency{$some_word} is the number of times $some_word appeared in @list.
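    To see the idea end to end, here is a minimal sketch. The school names are made-up sample data standing in for the lines of results.txt, and the sort order (most frequent first, ties broken alphabetically) is just one reasonable choice, not something the question asked for.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the lines of results.txt.
my @list = (
    'State University',
    'City College',
    'State University',
    'Tech Institute',
    'City College',
    'State University',
);

# Each name becomes a hash key; the value counts how often it was seen.
my %frequency;
for my $word (@list) {
    $frequency{$word}++;
}

# Report each unique name once, most frequent first, then alphabetically.
for my $name (sort { $frequency{$b} <=> $frequency{$a} || $a cmp $b } keys %frequency) {
    print "$name\t$frequency{$name}\n";
}
```

    With the sample data above, "State University" is reported with a count of 3, "City College" with 2, and "Tech Institute" with 1.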

    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
Re: Count and List Items in a List
by Perl Mouse (Chaplain) on Nov 01, 2005 at 15:24 UTC
    use strict;
    use warnings;

    my %count;
    open my $data, "<", "userstab.txt" or die "open: $!";
    open my $result, ">", "results.txt" or die "open: $!";
    while (<$data>) {
        chomp;
        my $zapschool = (split /\t/)[4];
        $count{$zapschool}++;
        print $result "$zapschool\n" if /zaps/;
    }
    close $data or die "close: $!";
    close $result or die "close: $!";

    while (my ($school, $count) = each %count) {
        printf "%s appears %d times\n", $school, $count;
    }
    Perl --((8:>*
Re: Count and List Items in a List
by perlfan (Curate) on Nov 01, 2005 at 15:36 UTC
    Assuming you get "results.txt" into an array, "@list":
    my %counts = ();
    map {$counts{$_}++} @list;
    And the hash "%counts" will contain the item as a key and its count as the key's value. This is really a short version of the first post, but map is a neat function :). pF

      Except map is painfully slow in void context:

      This is perl, v5.6.1 built for MSWin32-x86-multi-thread
                Rate     map foreach
      map      953/s      --    -25%
      foreach 1278/s     34%      --

      This is perl, v5.8.0 built for MSWin32-x86-multi-thread
                Rate     map foreach
      map     1705/s      --    -25%
      foreach 2288/s     34%      --

      This is perl, v5.8.0 built for i386-freebsd
                Rate     map foreach
      map      847/s      --    -29%
      foreach 1190/s     40%      --
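      The script behind these numbers is not shown in the thread. Tables in this format are what cmpthese from the core Benchmark module prints, so a comparison along these lines could be set up roughly as follows. The test data, the sub names, and the one-CPU-second run time are all assumptions made for illustration, and the rates will differ by machine and perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up test data: 1,000 items drawn from 26 distinct single-letter keys.
my @list = map { chr(ord('a') + $_ % 26) } 1 .. 1000;

# Counting with map in void context (the return list is thrown away).
sub count_with_map {
    my %counts;
    map { $counts{$_}++ } @list;
    return \%counts;
}

# Counting with a plain foreach loop.
sub count_with_foreach {
    my %counts;
    foreach (@list) { $counts{$_}++ }
    return \%counts;
}

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, {
    map     => \&count_with_map,
    foreach => \&count_with_foreach,
});
```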

      This may have been fixed since.

        This is perl, v5.8.6 built for i686-linux
                   Rate foreach     map
        foreach 47168/s      --     -2%
        map     48260/s      2%      --

        That said, I always prefer code that does what it says and says what it does. To me, for/foreach evaluates code foreach element in a list, while map maps (or transforms) one list into another. If you want to iterate over a list, use for or foreach. If you want to transform from one list to another, use map.
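        A small sketch of that division of labour, using made-up names: foreach does the counting, which is iteration for a side effect, and map builds the report lines, which is a transformation of one list into another.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @list = ('Alpha College', 'Beta University', 'Alpha College');

# Iterating for a side effect (updating %count): foreach says what it does.
my %count;
foreach my $name (@list) {
    $count{$name}++;
}

# Transforming one list (the sorted keys) into another (report lines): map.
my @report = map { "$_: $count{$_}" } sort keys %count;

print "$_\n" for @report;
```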

Re: Count and List Items in a List
by ikegami (Pope) on Nov 01, 2005 at 16:09 UTC
    while (<FILE>) {
        @zapschool = split (/\t/);
        if (m/zaps/i) {
            print NEWFILE "$zapschool[4]\n";
        }
    }
    is slower than
    while (<FILE>) {
        if (m/zaps/i) {
            @zapschool = split(/\t/);
            print NEWFILE "$zapschool[4]\n";
        }
    }
    which can be made more readable as
    while (<FILE>) {
        if (m/zaps/i) {
            my $zapschool = (split(/\t/))[4];
            print NEWFILE "$zapschool\n";
        }
    }
    And the solution is
    my %count;
    while (<FILE>) {
        if (m/zaps/i) {
            my $zapschool = (split(/\t/))[4];
            ++$count{$zapschool};
        }
    }
    foreach (keys(%count)) {
        print NEWFILE "$_: $count{$_}\n";
    }
      Hey, This is great! Thank you, and thanks to all the other folks who responded. I did this based on the code:
      open (FILE, "userstab.txt") || die "Cannot open file.\n";
      open (NEWFILE, ">results.txt") || die "Cannot find or open file for editing.\n";
      my %count;
      while (<FILE>) {
          if (m/zaps/i) {
              my $zapschool = (split(/\t/))[4];
              ++$count{$zapschool};
          }
      }
      foreach (keys(%count)) {
          print NEWFILE "$_: $count{$_}\n";
      }
      close NEWFILE;
      close FILE;
      print "Done!\n";
      WooHoo!
Re: Count and List Items in a List
by holli (Monsignor) on Nov 01, 2005 at 16:13 UTC
    You'd better move that array splitting into your conditional, like so:
    if (m/zaps/i) {
        @zapschool = split (/\t/);
        print NEWFILE "$zapschool[4]\n";
    }
    That will give you a performance boost, because your code splits every line, while this splits only the necessary lines.

    Please note (good to impress your boss :) that you can do this in a one-liner:
    C:\>perl -naF/\t/ -e "print qq($F[4]\n) if /zaps/" infile>outfile
    or the counting:
    C:\>perl -naF/\t/ -e "$hash{$F[4]}++ if /zaps/;END{print qq($_\t$hash{$_}\n) for sort keys %hash}" infile>outfile
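    For anyone new to the switches: -n wraps the code in a loop over the input lines, -a splits each line into @F, and -F sets the split pattern. Spelled out as a script, the counting one-liner does roughly the following. This is a sketch only: the input rows are invented, an in-memory string stands in for "infile" so the example is self-contained, and lexical variables replace the one-liner's globals so it runs under strict.

```perl
use strict;
use warnings;

# Hypothetical tab-separated input standing in for "infile";
# the school name is the fifth column.
my $input = join '',
    "1\tann\tzaps\t2005\tAlpha College\tok\n",
    "2\tbob\tother\t2005\tBeta University\tok\n",
    "3\tcat\tzaps\t2005\tAlpha College\tok\n",
    "4\tdan\tzaps\t2005\tGamma Institute\tok\n";

# Read from the string as if it were a file.
open my $fh, '<', \$input or die "Cannot open in-memory input: $!";

my %hash;
while (<$fh>) {                    # -n: loop over every input line
    my @F = split /\t/;            # -a with -F/\t/: split the line on tabs
    $hash{ $F[4] }++ if /zaps/;    # count the fifth field of matching lines
}
close $fh;

# The END block of the one-liner runs once, after the loop has finished.
print "$_\t$hash{$_}\n" for sort keys %hash;
```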


    holli, /regexed monk/
Re: Count and List Items in a List
by Tech77 (Novice) on Nov 02, 2005 at 19:15 UTC
    Wow! Thanks everyone for all the responses. I'm going to try all these variations and I need to get more familiar with hashes.
