PerlMonks  

summarization of list

by gman (Friar)
on Mar 14, 2006 at 21:22 UTC ( [id://536755]=perlquestion )

gman has asked for the wisdom of the Perl Monks concerning the following question:

Hello all,

I am stumped on where to start with this.
I have a huge list of numbers that correlate to group names.
Please read the comments in the code.
Thanks in advance for any help you have to offer.

#!/usr/local/bin/perl -w
use strict;

my %numbers = (
    '210' => 'rg1000',
    '212' => 'rg1000',
    '214' => 'rg1000',
    '215' => 'rg1000',
    '218' => 'rg1003',
    '221' => 'rg1000',
    '222' => 'rg1003',
    '223' => 'rg1003',
    '224' => 'rg1003'
);

# if the first 2 digits match and point to the same rg then print only
# the first two digits with the rg
foreach my $item (keys %numbers) {
    print "$item\n";
}

# desired output
# 21 => rg1000
# 218 => rg1003
# 22 => rg1003
# 221 => rg1000

UPDATE:
In looking back, it looks like ikegami (thanks!) has it correct. Thanks to all that replied.

Replies are listed 'Best First'.
Re: summarization of list
by ikegami (Patriarch) on Mar 14, 2006 at 22:03 UTC

    I'm not sure how you pick the 'rg' for a given two digit prefix. (e.g. Why is it 22 => rg1003 and not 22 => rg1000.) Do you pick the most popular 'rg' for the given prefix? If so, the following will do the trick:

    my %pop;
    ++$pop{substr($_, 0, 2)}{$numbers{$_}} foreach keys %numbers;

    my %most_pop;
    foreach my $short (keys %pop) {
        my $short_pop = $pop{$short};
        my ($most_pop) = sort { $short_pop->{$b} <=> $short_pop->{$a} } keys %$short_pop;
        $most_pop{$short} = $most_pop;
    }

    my %filtered;
    foreach (keys %numbers) {
        my $key = $_;
        my $rg = $numbers{$_};
        my $short = substr($_, 0, 2);
        my $most_pop = $most_pop{$short};
        if ($rg eq $most_pop) {
            $filtered{$short} = $most_pop;
        } else {
            $filtered{$key} = $rg;
        }
    }

    print("$_ => $filtered{$_}\n") foreach sort keys %filtered;

Re: summarization of list
by sgifford (Prior) on Mar 14, 2006 at 22:09 UTC
    Sorting the keys will make this easier. Then keep track of the last number and the last group. While you're iterating through the keys, for each 2-digit group append the output to a string instead of printing it, and keep track of whether all of the groups so far match. When you get to the next group, decide if you can print one line for the group, and otherwise print the accumulated lines.
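
    A minimal sketch of that approach, read literally. The name summarize_sorted is hypothetical, not from the thread. It collapses a prefix to one line only when every number under it shares one group, so a mixed prefix (such as 21 in the original data) still lists each number in full:

```perl
use strict;
use warnings;

# Hypothetical helper: walk the sorted keys, accumulate lines per 2-digit
# prefix, and collapse a prefix only if all of its groups matched.
sub summarize_sorted {
    my (%numbers) = @_;
    my @out;
    my ($last_prefix, $last_group, $all_match, @pending);

    # Emit one summary line, or the accumulated per-number lines.
    my $flush = sub {
        return unless defined $last_prefix;
        push @out, $all_match ? "$last_prefix => $last_group" : @pending;
    };

    for my $num (sort keys %numbers) {
        my $prefix = substr($num, 0, 2);
        my $group  = $numbers{$num};
        if (!defined $last_prefix or $prefix ne $last_prefix) {
            # New prefix: decide what to print for the previous one.
            $flush->();
            ($last_prefix, $all_match, @pending) = ($prefix, 1);
        }
        elsif ($group ne $last_group) {
            $all_match = 0;
        }
        $last_group = $group;
        push @pending, "$num => $group";
    }
    $flush->();
    return @out;
}

print "$_\n" for summarize_sorted(
    210 => 'rg1000', 212 => 'rg1000',
    220 => 'rg1000', 221 => 'rg1003',
);
# prints "21 => rg1000", "220 => rg1000", "221 => rg1003", one per line
```

    Reproducing the desired output from the question would additionally need a rule for picking one group per mixed prefix, such as the most popular one.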
Re: summarization of list
by Zaxo (Archbishop) on Mar 14, 2006 at 22:13 UTC

    You can simplify the problem by inverting the hash:

    my %rg;
    for (sort { $a <=> $b } keys %numbers) {
        # key the inverted hash on the rg value, not on the number itself
        push @{$rg{$numbers{$_}}}, $_;
    }
    That takes care of finding all the keys that match an rgnnnn value. Now all you need to do is loop over the numbers for each rg key, deciding what to print.
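
    One possible next step, as a sketch only: bucket each group's numbers by two-digit prefix so the per-group loop has something to decide on. The helper name bucket_by_group is hypothetical, and the rule for what to print is still left open, as in the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns { group => { two-digit prefix => [numbers] } }.
sub bucket_by_group {
    my (%numbers) = @_;
    my %rg;
    for my $num (sort { $a <=> $b } keys %numbers) {
        push @{ $rg{ $numbers{$num} }{ substr($num, 0, 2) } }, $num;
    }
    return \%rg;
}

my $rg = bucket_by_group(
    210 => 'rg1000', 212 => 'rg1000', 218 => 'rg1003',
    221 => 'rg1000', 222 => 'rg1003', 223 => 'rg1003',
);

# Show which numbers each group owns under each prefix.
for my $group (sort keys %$rg) {
    for my $prefix (sort keys %{ $rg->{$group} }) {
        print "$group $prefix: @{ $rg->{$group}{$prefix} }\n";
    }
}
```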

    After Compline,
    Zaxo

Re: summarization of list
by ayrnieu (Beadle) on Mar 14, 2006 at 22:16 UTC
    #! /usr/bin/env perl
    use strict;
    use warnings;

    my %numbers = qw/210 rg1000 212 rg1000 214 rg1000 215 rg1000 218 rg1003
                     221 rg1000 222 rg1003 223 rg1003 224 rg1003/;
    my %rgs;
    push @{$rgs{substr($_,0,2) . ':' . $numbers{$_}}}, $_ for keys %numbers;

    # The first <=> will warn.
    for (sort { $a <=> $b or @{$rgs{$b}} <=> @{$rgs{$a}} } keys %rgs) {
        /^(\d{2}):(.+)/;
        printf "%d => $2\n", @{$rgs{$_}} == 1 ? $rgs{$_}->[0] : $1;
    }
    __END__
    $ ./mad
    21 => rg1000
    218 => rg1003
    22 => rg1003
    221 => rg1000

      @{$rgs{$_}} == 1? What if I added 225 => rg1000? I get some strange results.

        Updated: .. actually, looking at it again, my original program complies with the spec. But you have probably found a problem with the spec.

Re: summarization of list
by GrandFather (Saint) on Mar 14, 2006 at 22:24 UTC

    The following works for the sample data given.

    use warnings;
    use strict;

    my %numbers = (
        '210' => 'rg1000',
        '212' => 'rg1000',
        '214' => 'rg1000',
        '215' => 'rg1000',
        '218' => 'rg1003',
        '221' => 'rg1000',
        '222' => 'rg1003',
        '223' => 'rg1003',
        '224' => 'rg1003'
    );

    my %groups;
    my %hits;

    for (sort keys %numbers) {
        my $part = substr $_, 0, 2;
        my $group = $numbers{$_};

        if (exists $hits{"$part,$group"}) {
            # We've seen this prefix with this group before
            delete $groups {$hits{"$part,$group"}};
            delete $groups {$hits{"$part,$group"}.",$group"};
            $groups{"$part,$group"} = "$part => $group";
        } elsif (exists $hits{$part}) {
            # new group for existing prefix
            $groups{"$_,$group"} = "$_ => $group";
            $hits{"$part,$group"} = $_;
        } else {
            # new prefix
            $hits{$part} = $_;
            $hits{"$part,$group"} = "$_,$group";
            $groups{"$_,$group"} = "$_ => $group";
        }
    }

    foreach my $item (sort keys %groups) {
        print "$groups{$item}\n";
    }

    Prints:

    21 => rg1000
    218 => rg1003
    22 => rg1003
    221 => rg1000

    DWIM is Perl's answer to Gödel
Re: summarization of list
by gman (Friar) on Mar 15, 2006 at 07:24 UTC
    Thanks to all that have responded.
    Here is some more helpful information that I probably
    should have stated from the start.
    The list is given to me in an Excel spreadsheet. Using Spreadsheet::ParseExcel::Simple I pull out the
    data and put it in a hash of hash of hash.
    $VAR1 = 'rg0005'; $VAR2 = { '718' => { '559' => 'rg0005', '554' => 'rg0005', '942' => 'rg0005',
    The inner 3 digits are what I have to summarize on.
    So the list is large, but I think I have a starting point.
    Sorting it correctly seems to be the key.
    Any more suggestions are welcome!
    If I get it I will post the code.
    Thanks again
      You still haven't explained what you're trying to do. It's only because some of the other replies belong to near psychics that you have something to work with.

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of

        Ok, let me see if I can explain what I am trying to do.
        I am taking in a list that has a format of:
        npa,nxx,trunkname
        Think phone numbers in the North American dial plan.
        The system I am feeding can take the whole 6 digits, but
        will match on fewer digits. So instead of always
        inputting all 6 digits and pointing them to a trunk group,
        I can say 33021 points to trunk 1000
        and 330215 points to trunk 10002,
        meaning that all other nxx combinations except 330215
        will match trunk 1000.
        Hope this makes sense.
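
        The matching behaviour described above amounts to a longest-prefix lookup. A minimal sketch of how the target system might resolve a number, assuming it picks the longest configured prefix; the table layout and the name lookup_trunk are hypothetical, and the trunk values are copied from the example as written:

```perl
use strict;
use warnings;

# Hypothetical routing table: digit prefix => trunk.
my %route = (
    '33021'  => '1000',
    '330215' => '10002',
);

# Return the trunk for the longest prefix of $digits present in the table,
# or nothing when no prefix matches.
sub lookup_trunk {
    my ($table, $digits) = @_;
    for my $len (reverse 1 .. length($digits)) {
        my $prefix = substr($digits, 0, $len);
        return $table->{$prefix} if exists $table->{$prefix};
    }
    return;
}

print lookup_trunk(\%route, '330215'), "\n";   # 10002 (exact 6-digit entry)
print lookup_trunk(\%route, '330214'), "\n";   # 1000  (falls back to 33021)
```

        Under this assumption the summarization only has to emit a short prefix for the common trunk plus full 6-digit entries for the exceptions.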
