PerlMonks  

count sort & output

by mkent (Acolyte)
on Dec 19, 2002 at 02:45 UTC ( [id://221023]=perlquestion )

mkent has asked for the wisdom of the Perl Monks concerning the following question:

Can anyone tell me how to take a hash and sort it such that

1) matching values will be counted and displayed once with a total count; for example, if www.yahoo.com appears 20 times, one line is output like "www.yahoo.com - 20"

2) before output, the hash is sorted by count, with the highest count first?

3) the output is 10 lines at a time such that a user on a web page can click on "next 10" or "previous 10" in order to scroll forward and backward by pages?

Replies are listed 'Best First'.
Re: count sort & output
by sauoq (Abbot) on Dec 19, 2002 at 04:57 UTC
    my @patterns = qw( a b c d e f g h i j k l m n o p q r s t u v w x y z + );
    my %hash;
    my $data = do { local $/; <DATA> };
    for (@patterns) {
        $hash{$_}++ for $data =~ /\Q$_\E/g;
    }
    my $counter = 0;
    for (sort {$hash{$b} <=> $hash{$a}} keys %hash) {
        print "$_ - $hash{$_}\n";
        unless (++$counter % 10) {
            print "Press Enter";
            <STDIN>;
        }
    }
    __END__
    Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal.
    -sauoq
    "My two cents aren't worth a dime.";
    

      A less verbose way to do something similar:

      my $data = do { local $/; <DATA> };
      my $patt = join '|' => map quotemeta, 'a' .. 'z';
      my %freq;
      $freq{$1}++ while $data =~ /($patt)/go;
      # etc...

      Of course, counting occurrences of mere letters should normally be done using tr. ;-)

      Also, mkent may want to save intermediate (sorted) results to avoid duplicating work.

        Of course I agree that counting letters should be done with tr. But assuming that the strings could be longer and are not required to be of the same length, your code has a potential subtle bug.

        If both "yahoo.com" and "yahoo.commies.org" are to be searched for, its chances of finding the second are dependent on its position in the regular expression relative to the first. You would have to sort your strings by length and build your regular expression with the longest ones first.

        My code has its own sort of bug(?) in that both "www.yahoo.com" and "yahoo.com" might be found in the string "www.yahoo.commies.org"... however, it works correctly given the poorly stated requirements.
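A minimal sketch of the longest-first ordering described above. The search strings and the sample text are made up for illustration; only the ordering of the alternation matters.

```perl
use strict;
use warnings;

# Hypothetical search strings and input text.
my @strings = ('yahoo.com', 'yahoo.commies.org');
my $data    = 'see yahoo.commies.org and yahoo.com';

# Put the longest strings first so that a short string cannot shadow
# a longer one that begins with it.
my $patt = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } @strings;

my %freq;
$freq{$1}++ while $data =~ /($patt)/g;

print "$_ - $freq{$_}\n" for sort keys %freq;
```

With the alternation in the opposite order, "yahoo.com" would match at the start of "yahoo.commies.org" and the longer string would never be counted.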

        -sauoq
        "My two cents aren't worth a dime.";
        
Re: count sort & output
by Enlil (Parson) on Dec 19, 2002 at 03:30 UTC
    1. By nature, keys in hashes are unique, so if your key is www.yahoo.com it can only appear once in your hash. Just increment $hash{$key} each time the key appears (i.e. $hash{$key}++;).

    2. You can use sort.

    @descending_values = sort {$hash{$b} <=> $hash{$a} } keys %hash;

    3. Show us what you have already, and we can help you from there.

    -enlil
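Points 1 and 2 above can be combined into a short sketch. The input list is invented for illustration, and the alphabetical tie-break is an addition that the original post does not mention.

```perl
use strict;
use warnings;

# Hypothetical input: one host name per list element.
my @hosts = qw(
    www.yahoo.com www.google.com www.yahoo.com
    www.perlmonks.org www.yahoo.com www.google.com
);

# Step 1: count occurrences, one hash key per distinct host.
my %count;
$count{$_}++ for @hosts;

# Step 2: sort the keys by count, highest first.
# Ties are broken alphabetically (an assumption, for stable output).
my @sorted = sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;

print "$_ - $count{$_}\n" for @sorted;
```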

Re: count sort & output
by pg (Canon) on Dec 19, 2002 at 03:10 UTC
    An OO-style solution:
    test.pl:

    use strict;
    use hot_web;

    my $hot = hot_web->new;
    $hot->count("www.yahoo.com", "200");    # set to 200
    $hot->count("www.yahoo.com", "+20");    # add 20
    $hot->count("www.yahoo.com");           # increase by 1
    $hot->count("www.google.com", "3000");  # set to 3000
    $hot->count("www.google.com", "+300");  # add 300
    $hot->display;

    hot_web.pm:

    package hot_web;
    use strict;

    sub new {
        my $self = {};
        bless $self;
        return $self;
    }

    sub count {
        my $self = shift;
        my $url = shift;
        if (@_) {
            my $count = shift;
            if ($count =~ /^\+/) {
                $self->{$url} += $count;
            } else {
                $self->{$url} = $count;
            }
        } else {
            $self->{$url}++;
        }
    }

    sub display {
        my $self = shift;
        print "Hot Webs:\n";
        # <=> with $b first sorts numerically, highest count first
        foreach (sort { $self->{$b} <=> $self->{$a} } keys %{$self}) {
            print "$_ appears: $self->{$_} time(s)\n";
        }
    }

    1;
      if (defined($hash{$key})) {
          $hash{$key}++;
      } else {
          $hash{$key} = 1;
      }
      That's actually more work than necessary.
      All you need is     $hash{$key}++; because if the element doesn't yet exist, it will be created and initialized ("autovivified") to undef, then the undef will be numified to 0, then the 0 will be incremented to 1.
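A small sketch of that behaviour, run under strict and warnings; the key name is just an example. Incrementing a missing element does not trigger an "uninitialized value" warning.

```perl
use strict;
use warnings;

my %hash;
my $key = 'www.yahoo.com';

# The key is not in the hash yet.
print exists $hash{$key} ? "exists\n" : "missing\n";

$hash{$key}++;    # creates the entry and increments it to 1
$hash{$key}++;    # ordinary increment to 2

print "$key - $hash{$key}\n";
```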

      %hash = ("www.yahoo.com", 10, "www.google.com", 2000);
      foreach (keys %hash) {
          $rev_hash{$hash{$_}} = $_;
      }
      foreach (sort keys %rev_hash) {
          print "$rev_hash{$_} appears: $_ time(s)\n";
      }
      No, that's bad. What if both yahoo and google have a value of 100?
      All you're really trying to do is print out the entries sorted by value, numerically.
      (And, btw, you were sorting lexically, which is also wrong.)
      # keys containing dots must be quoted; => only auto-quotes simple identifiers
      %hash = (
          'www.yahoo.com'  => 10,
          'www.google.com' => 2000,
      );
      for ( sort { $hash{$a} <=> $hash{$b} } keys %hash ) {
          print "$_ appears: $hash{$_} time(s)\n";
      }
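Both objections can be seen in a short sketch. The counts here are invented so that two sites share a value.

```perl
use strict;
use warnings;

# Hypothetical data: two sites share the count 100.
my %hash = (
    'www.yahoo.com'     => 100,
    'www.google.com'    => 100,
    'www.perlmonks.org' => 9,
);

# Inverting the hash keeps only one site per count, so an entry is lost.
my %rev_hash = reverse %hash;
print scalar(keys %rev_hash), " of ", scalar(keys %hash), " entries survive\n";

# A plain sort compares strings, so "100" sorts before "9".
my @lexical = sort values %hash;
my @numeric = sort { $a <=> $b } values %hash;
print "lexical: @lexical\n";
print "numeric: @numeric\n";
```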

      jdporter
      ...porque es dificil estar guapo y blanco.

      I don't want to put down the effort you put into this, but using OO for this task seems like overkill. How is
      $hot->count("www.yahoo.com","200");  # set to 200
      $hot->count("www.yahoo.com","+20");  # add 20
      $hot->count("www.yahoo.com");        # increase by 1
      any better than a hash?
      $hot{"www.yahoo.com"} = 200;
      $hot{"www.yahoo.com"} += 20;
      $hot{"www.yahoo.com"}++;

      Note also how the latter makes it immediately obvious what's going on - you don't have to look up a class's behaviour to know what it does.

      Lastly, if I did take the OO route for some reason - maybe because I need to pass this around a lot of places in my program, and maybe it does a lot more on-the-fly calculation than just counting mentions, or something - I'd prefer an interface like this:

      $hot->count("www.yahoo.com")->set(200);
      $hot->count("www.yahoo.com")->add(20);
      $hot->count("www.yahoo.com")->inc;
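One way such an interface might be built is sketched below. The class names Counter and HotWeb, and the value accessor, are hypothetical; this is not the hot_web module from earlier in the thread.

```perl
use strict;
use warnings;

{
    # Hypothetical per-URL counter: a blessed scalar holding the number.
    package Counter;
    sub new   { my ($class) = @_; my $n = 0; return bless \$n, $class }
    sub set   { my ($self, $v) = @_; $$self  = $v; return $self }
    sub add   { my ($self, $v) = @_; $$self += $v; return $self }
    sub inc   { my ($self)     = @_; $$self++;     return $self }
    sub value { my ($self)     = @_; return $$self }
}

{
    # Hypothetical container that hands out one Counter per URL.
    package HotWeb;
    sub new { my ($class) = @_; return bless {}, $class }

    # Return the counter for $url, creating it on first use.
    sub count {
        my ($self, $url) = @_;
        return $self->{$url} ||= Counter->new;
    }
}

my $hot = HotWeb->new;
$hot->count("www.yahoo.com")->set(200);
$hot->count("www.yahoo.com")->add(20);
$hot->count("www.yahoo.com")->inc;

print $hot->count("www.yahoo.com")->value, "\n";
```

Each method returns the counter itself, so calls can also be chained, at the cost of one extra object per URL.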

      Makeshifts last the longest.

Re: count sort & output
by VSarkiss (Monsignor) on Dec 19, 2002 at 03:30 UTC

    Your questions are very vague, on the order of "How do I make my froboz do a wingle without linking the thingamabob?" It's impossible to do everything you ask for with just a hash, so please fill in the picture. Adding some details will go a long way -- look in the Tutorials section for How to ask a question.

    As to the specific question you asked, hashes are not sorted, by nature, unless you impose a sort operation on the keys or values. You can use Tie::IxHash, for example, to do something like you're implying in your item 2. The other two items require code besides just the hash.
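For item 3, the extra code could be as small as an array slice over the already sorted keys. The sketch below assumes a 1-based page number, which in a web script would presumably arrive as a request parameter; page_of and the sample data are hypothetical names.

```perl
use strict;
use warnings;

# Return the elements of @$items that belong to 1-based page $page.
sub page_of {
    my ($items, $page, $per_page) = @_;
    my $first = ($page - 1) * $per_page;
    return () if $page < 1 or $first > $#$items;   # page out of range
    my $last = $first + $per_page - 1;
    $last = $#$items if $last > $#$items;          # short final page
    return @{$items}[$first .. $last];
}

# Stand-in for the keys already sorted by count.
my @sorted = map { "site$_.example" } 1 .. 25;

my @page3 = page_of(\@sorted, 3, 10);
print "$_\n" for @page3;
```

"Next 10" and "previous 10" links then only need to carry the page number plus or minus one.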
