PerlMonks  

Counting incidents of names in a file

by Bishma (Beadle)
on Feb 13, 2002 at 04:47 UTC ( [id://145072]=perlquestion )

Bishma has asked for the wisdom of the Perl Monks concerning the following question:

This is probably pretty simple, but my brain can't seem to come up with a solution and my O'Reilly books are inaccessible at the moment.

I have a pipe-delimited list of the form
    name_x|score|date
    name_y|score|date
    name_y|score|date
    name_z|score|date
    etc...
and I need to count how many times each name appears in the file, so that my output would look something like
    name_x = 1
    name_y = 2
    name_z = 1



I can't even come up with any good search keywords for this (so I can look for past answers and not have to bother anyone). Any help is appreciated.

Replies are listed 'Best First'.
Re: Counting incidents of names in a file
by Anonymous Monk on Feb 13, 2002 at 04:54 UTC
    Ask yourself this:

    How can I keep track of unique "keys" using a native perl data structure?

    %IthinkYouKnowTheAnswer;

      Yeah, but I really don't like hashes. I like to keep my data in the order I want it to be in. It's a completely irrational and unfounded prejudice, I know, but it's still there.
        That's like saying "I like carpentry but I don't like drills." Then you spend your time trying to bore a hole with your screwdriver, the cabinet takes forever to build, and it's not all that sturdy.
        Or, I'm going to write a novel, but I'm not going to use adjectives.
        Hashes are one of the basic tools of the language. You wouldn't code a large C project without pointers, would you?
        Problems that would be inefficient with arrays, like existence checks and counting occurrences, are quick and painless with hashes. And order is as simple as:
        foreach (sort keys %hash) { my $item = $hash{$_}; ... }
        Not much worse than:
        foreach my $item (@array) { ... }
        Plus there is no effort involved in inserting, deleting, and maintaining order.
        I usually judge the progress of junior perl programmers by their use of hashes. When they stop trying to use arrays to do the job of a hash, they've leveled up in perl. (BTW, regexps are the second tier, then map/grep.)
        Of course this is all just my opinion,

        -pete
        Entropy is not what it used to be.

        If your concern is just in keeping the names in the same order in which they're seen, there are two approaches. The easiest is to look into Tie::IxHash. This is a variant of the hash that preserves the order of keys as they are inserted.

        The second way, which doesn't require installing a new module, is to have your loop also push each newly discovered name onto an array, then iterate over that array rather than over the list returned by keys.

        Don't be so quick to dismiss the basic constructs that Perl provides. They are here for a reason, and when you ask a very basic question you have to expect that your initial answers are going to be pointers to these basic elements. At the very least, if you are going to ask such a basic question then you should state up front why you don't want to use the basic solution.

        --rjray
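
A minimal sketch of the second approach rjray describes: an array records each name the first time it is seen, and the output loop walks that array instead of the hash keys. The helper name count_in_order and the sample records are made up for illustration; they are not from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count names while remembering the order in which each was first seen.
# count_in_order() is a hypothetical helper name, not from the thread.
sub count_in_order {
    my @lines = @_;
    my (%count, @order);
    for my $line (@lines) {
        my ($name) = split /\|/, $line;
        # First sighting: record its position before counting it.
        push @order, $name unless exists $count{$name};
        $count{$name}++;
    }
    return (\@order, \%count);
}

# Illustrative records in the question's name|score|date layout.
my @records = (
    'name_x|score|date',
    'name_y|score|date',
    'name_y|score|date',
    'name_z|score|date',
);

my ($order, $count) = count_in_order(@records);

# Iterate the array, not keys %count, so output follows first-seen order.
print "$_ = $count->{$_}\n" for @$order;
```

The hash still does the counting; the array only carries the ordering, so the cost is one extra push per distinct name.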

Re: Counting incidents of names in a file
by dvergin (Monsignor) on Feb 13, 2002 at 05:17 UTC
    The solution of the Unnamed One is correct but a little terse if you are still learning. Here's a more spelled-out version of the same general idea:
    #!/usr/bin/perl -w
    use strict;

    my %hash;

    # Do it
    for (<DATA>) {
        my ($name, $score, $date) = split /\|/;
        $hash{$name}++;
    }

    # Show it
    for (keys %hash) {
        print "$_ = $hash{$_}\n";
    }

    __DATA__
    name_x|score|date
    name_y|score|date
    name_z|score|date
    name_x|score|date
    name_z|score|date
    name_z|score|date
    And just to explain what is happening with AM's solution:
        $name{(split/\|/)[0]} += 1;
    (split/\|/) returns a list, which is then subscripted to get the zeroth element, which is then used as the key for the %name hash. The value associated with that key (which may be magically created if it didn't exist before) is increased by one using the += assignment operator.
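
To see those two steps separately, here is a small sketch that takes the list slice first and then increments a key that does not exist yet. The sample record is invented for illustration.

```perl
use strict;
use warnings;

my %name;
my $line = 'name_y|42|2002-02-13';   # made-up sample record

# split returns a list; (...)[0] is a list slice picking the first field.
my $first = (split /\|/, $line)[0];
print "first field: $first\n";

# The key does not exist yet, so there is no value to add to...
print "before: ", (exists $name{$first} ? "present" : "absent"), "\n";

# ...but += creates the entry and treats the missing value as 0,
# without an "uninitialized value" warning.
$name{$first} += 1;
$name{$first} += 1;

print "after: $name{$first}\n";
```

This is why the counting loop needs no "if the key exists" check before incrementing.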

    Just a word of encouragement about hashes. They are a wonderful tool for many purposes. And it is commonly said that you are not really programming in Perl until you can think in terms of hashes. Your point about their lacking a fixed order is well taken, but once you begin using hashes, you may be pleasantly surprised to discover how often that doesn't matter.

      Ok, the hash makes sense. Thanks for your help. Now (since I know next to nothing about hashes and I don't have my books) I need to ask another question. I kept my question simple for ease of understanding, but now I need to go more in depth.

      My data set also contains a "class" element, like so:
          name_x|score|date|class1
          name_y|score|date|class2
          name_y|score|date|class2
          name_a|score|date|class2
          name_b|score|date|class3
          name_z|score|date|class1
          name_b|score|date|class3
          name_x|score|date|class1
          name_b|score|date|class3
          name_c|score|date|class2
          name_c|score|date|class3
          name_c|score|date|class3
          ...and so on
      I need 3 separate lists (actually html tables) based on the class (3 possible classes), and I need the lists in descending order by number of incidents of the names. So we get:
          _class1_
          name_x = 2
          name_z = 1
          _class2_
          name_y = 2
          name_c = 1
          _class3_
          name_b = 3
          name_c = 2
      I know this is getting a little complex, but I'm lost.
      Thanks again.
        This looks like feature creep to me, or should we say, wanting others to do all the work. Please post some code to show that you have attempted to solve this problem on your own. The other monks, it appears, were very generous with your first question, but coming back with no attempt to solve it on your own is not a good idea. That said, I am sure someone will post a solution because most of us just can't help ourselves. :)
        Simply modify my last code to include the classes.
        use strict;
        use Data::Dumper;

        my @all;
        my %occurances;

        while (<DATA>) {
            chomp $_;
            push @all, [ split(/\|/,$_) ];
            # I should explain what is going on here.
            # The -1 is going to get the last element
            # in a list. In this case the first [-1]
            # is the last list of elements added
            # from the DATA section below with the push
            # function. The second -1
            # is the last element of that list, which
            # is the class. The next key in the
            # occurances hash is the name_? value,
            # which is extracted from the last array
            # (-1) pushed onto @all and the first
            # element (0) of the anonymous array
            # in that (-1) location of @all.
            $occurances{$all[-1][-1]}{$all[-1][0]}++;
            #           class         name_?
        }

        print Dumper(\@all);
        print Dumper(\%occurances);

        __DATA__
        name_x|score|date|class1
        name_y|score|date|class2
        name_y|score|date|class2
        name_a|score|date|class2
        name_b|score|date|class3
        name_z|score|date|class1
        name_b|score|date|class3
        name_x|score|date|class1
        name_b|score|date|class3
        name_c|score|date|class2
        name_c|score|date|class3
        name_c|score|date|class3
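
The code above builds the class-to-name counts but does not yet produce the descending order the follow-up question asks for. One possible way to finish the job is sketched below, assuming the four-field layout shown in the question. The helper name ranked_by_class and the alphabetical tie-break are additions for illustration, not something the thread specifies.

```perl
use strict;
use warnings;

# Build class => { name => count }, then return, for each class, a list
# of [name, count] pairs sorted by count, highest first.
# ranked_by_class() is a hypothetical helper name.
sub ranked_by_class {
    my @lines = @_;
    my %occurrences;
    for my $line (@lines) {
        chomp $line;
        # Fields 0 and 3 are the name and the class in the sample layout.
        my ($name, $class) = (split /\|/, $line)[0, 3];
        $occurrences{$class}{$name}++;
    }
    my %ranked;
    for my $class (keys %occurrences) {
        my $counts = $occurrences{$class};
        # Highest count first; ties broken alphabetically so output is stable.
        my @names = sort { $counts->{$b} <=> $counts->{$a} or $a cmp $b }
                    keys %$counts;
        $ranked{$class} = [ map { [ $_, $counts->{$_} ] } @names ];
    }
    return \%ranked;
}

# The sample records from the question.
my @records = qw(
    name_x|score|date|class1 name_y|score|date|class2
    name_y|score|date|class2 name_a|score|date|class2
    name_b|score|date|class3 name_z|score|date|class1
    name_b|score|date|class3 name_x|score|date|class1
    name_b|score|date|class3 name_c|score|date|class2
    name_c|score|date|class3 name_c|score|date|class3
);

my $ranked = ranked_by_class(@records);
for my $class (sort keys %$ranked) {
    print "_${class}_\n";
    print "$_->[0] = $_->[1]\n" for @{ $ranked->{$class} };
}
```

With the sample data, the class2 list also includes name_a = 1, which the hand-written expected output in the question leaves out.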
Re: Counting incidents of names in a file
by trs80 (Priest) on Feb 13, 2002 at 06:46 UTC
    Here is a solution that doesn't create any temp values, and it keeps all your information in order in an array. A hash is, in my opinion, the easier way to count the occurrences.
        use Data::Dumper;

        my @all;
        my %occurance;

        while (<DATA>) {
            chomp;
            push @all, [ split(/\|/,$_) ];
            $occurance{$all[-1][0]}++;
        }

        print Dumper(\@all);
        print Dumper(\%occurance);

        __DATA__
        name_x|score|date
        name_y|score|date
        name_y|score|date
        name_z|score|date
        name_z|score|date


    I included the Data::Dumper part for monks that haven't used it yet so they can see how it can be used to confirm content without doing a foreach or similar operation to see content of a hash or array.
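
For anyone trying Data::Dumper for the first time, a small sketch of two of its configuration variables that make dumps easier to read and compare. The hash contents mirror the counts the sample data above would produce; setting these variables is a suggestion, not part of the original reply.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sort hash keys in the dump so repeated runs are comparable,
# and use a more compact indentation style.
$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Indent   = 1;

# Counts matching the five-line sample data above.
my %occurrence = ( name_x => 1, name_y => 2, name_z => 2 );

my $dump = Dumper(\%occurrence);
print $dump;
```

Without Sortkeys, the keys appear in whatever order the hash happens to store them, which can differ between runs.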
Re: Counting incidents of names in a file
by Anonymous Monk on Feb 13, 2002 at 05:03 UTC
    while( <> ){
        $name{(split/\|/)[0]} += 1;
    }
    foreach( keys %name ){
        print "$_ = $name{$_}\n";
    }
Re: Counting incidents of names in a file
by Bishma (Beadle) on Feb 13, 2002 at 22:46 UTC
    Ok, with your help I think I managed to find a solution. It's a little sloppy, but I'll clean it up later.
    @classes = qw/ class1 class2 class3/;
    foreach (@classes) {
        my %kboard;
        print "_ $_ _<BR>";
        for ($i = 0; $i <= $#scoredata; $i++) {
            @logdata = split /\|/, $scoredata[$i];
            # class field; note that the four-field sample above
            # would put the class at index 3
            if ($logdata[4] eq $_) {
                $kboard{$logdata[0]}++;
            }
        }
        for (keys %kboard) {
            print "$_ = $kboard{$_}<BR>\n";
        }
        print "<BR><BR>";
    }

    @scoredata is what I read my data file into. Thanks again everyone.
