http://www.perlmonks.org?node_id=950581

mlebel has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I apologize ahead of time if this seems a bit confusing, since it is confusing to me. Feel free to ask for clarification if I didn't explain myself properly!

So basically, I have a list of numbers (called "NumbersList") which looks like this:

This is a numbers List
10 23432
20 23424
60 45567
20 56756
30 91857
50 29349
10 93729
80 82374
20 82757
30 92785
50 71674
70 81747
20 83758
30 89275
10 19594
60 09214
20 09347
50 83725
90 91845
20 76402
30 90184
The numbers should appear as random

Out of this list of numbers I need to pick out the lines that match the numbers that are inside the next list (called "LookingForTheseNumbersList"). This list looks like this:

20
30
50

The end result of what I am trying to achieve would be a list that would be like this: (printed output)

20 23424
20 56756
20 82757
20 83758
20 09347
20 76402
30 91857
30 92785
30 89275
30 90184
50 29349
50 71674
50 83725

Note that the printed output is simply what I would see on the screen. On each pass of the loop, I need to run calculations on each "section" of numbers. For example, for the "20" section I need to add up all the numbers to the right of the 20s and print the result, then do the same for the "30" section, and so on.

I know I am close, but for the life of me I can't figure out how to make it work. This is the code that does all the work:

#!/usr/bin/perl -w
use strict;

my $FALSE = 0;
my $TRUE = 1;
my $Flag = $TRUE;
my $NumbersList = "<" . "NumbersList";
my $LookingForTheseNumbersList = "<" . "LookingForTheseNumbersList";

open NUMBERSLIST, "NumbersList" or die $!;
open LOOKINGFORTHESENUMBERSLIST, "LookingForTheseNumbersList" or die $!;

if ($Flag == $TRUE) {
    foreach my $Line (<NUMBERSLIST>) {
        #print "Numbers list number = : $Line";
        &FlagEqualsTrue($Line);
    }
}

sub FlagEqualsTrue {
    (my $LookingForNumber) = @_;
    foreach my $Line (<LOOKINGFORTHESENUMBERSLIST>) {
        chomp $Line;
        next if ($Line =~ m/^These/);
        next if ($Line =~ m/^This/);
        next if !($Line =~ m/^$LookingForNumber/);
        print "$Line\n";
    }
}

exit;

Maybe I am approaching this wrong, but I am not advanced enough in Perl yet to be able to tell. Any thoughts?

Thanks, M

Replies are listed 'Best First'.
Re: Generating a list of numbers from other lists
by Marshall (Canon) on Jan 29, 2012 at 15:46 UTC
    I made a hash out of the list of numbers to search for. Then I ran through the main list, kept the relevant matching lines, and sorted that list by the first number. While doing that I kept a running sum of each section in %sums. I wasn't sure what the output should look like.

    Update: added the sum calculation. I now see that there is some text that you want to skip; use a regex for that. Anyway, you have another solution.

    #!/usr/bin/perl -w
    use strict;
    $|=1;

    my @numberList = qw (20 30 50);  #you get this from a file
    my %numberHash = map{$_ => 1}@numberList;

    my @result;
    my %sums;
    while (<DATA>)
    {
        chomp;
        my ($first, $second) = split;
        if ($numberHash{$first})
        {
            push @result, $_;
            $sums{$first}+= $second;
        }
    }
    @result = sort{ my($firstA) = split(' ',$a);
                    my($firstB) = split(' ',$b);
                    $firstA <=> $firstB
                  }@result;

    foreach (@result)
    {
        my ($first) = split;
        print "$_ $sums{$first}\n";
    }
    #not sure what format you want output in

    =prints
    20 23424 332444
    20 56756 332444
    20 82757 332444
    20 83758 332444
    20 09347 332444
    20 76402 332444
    30 91857 364101
    30 92785 364101
    30 89275 364101
    30 90184 364101
    50 29349 184748
    50 71674 184748
    50 83725 184748
    =cut

    __DATA__
    10 23432
    20 23424
    60 45567
    20 56756
    30 91857
    50 29349
    10 93729
    80 82374
    20 82757
    30 92785
    50 71674
    70 81747
    20 83758
    30 89275
    10 19594
    60 09214
    20 09347
    50 83725
    90 91845
    20 76402
    30 90184
      To avoid the expensive split in a sort block, you could separate the results in lists of their own using a second hash:
      my (%result, %sums);
      while (<DATA>)
      {
          chomp;
          my ($first, $second) = split;
          if ($numberHash{$first})
          {
              push @{$result{$first}}, $second;
              $sums{$first}+= $second;
          }
      }

      foreach my $first (sort { $a <=> $b } keys %result)
      {
          foreach my $second (@{$result{$first}})
          {
              print "$first $second $sums{$first}\n";
          }
      }
      Just in case the OP ever wants to process tens of megabytes that way :)
        I certainly appreciate your point! And it is well taken.

        I try to modulate my response to the skill level reflected by the original question (with an imperfect heuristic). Sometimes I am more effective at this than at other times. The HoA (Hash of Arrays) syntax can often be a bit confusing for beginners. I just avoided the "what the heck does push @{$result{$first}} mean?" question by not doing that. Anyway, I think the collective "Monk Wisdom" has done well for this OP - there are several alternatives, all of which will probably work out just fine.
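
For readers who do stumble over push @{$result{$first}}, here is a minimal sketch of what the hash-of-arrays idiom does. The three sample lines are illustrative only; the variable names follow the reply above.

```perl
use strict;
use warnings;

# Illustrative data only: a few "key value" pairs in the thread's format.
my @lines = ('20 23424', '30 91857', '20 56756');

my %result;    # hash of arrays: key => [ values seen for that key ]
for my $line (@lines) {
    my ($first, $second) = split ' ', $line;

    # $result{$first} does not exist the first time a key is seen.
    # Dereferencing it as an array with @{ ... } makes Perl create an
    # empty array reference on the spot (autovivification), so the push
    # needs no explicit "does this key exist yet?" check.
    push @{ $result{$first} }, $second;
}

# Print each key followed by all the values collected for it.
for my $first (sort { $a <=> $b } keys %result) {
    print "$first: @{ $result{$first} }\n";
}
```

Each hash value is simply a reference to an ordinary array, so @{ $result{20} } can be used anywhere a normal array could.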

Re: Generating a list of numbers from other lists
by choroba (Cardinal) on Jan 29, 2012 at 15:31 UTC
    #!/usr/bin/perl
    use warnings;
    use strict;

    my $number_list = '950581.NumberList';
    my $looking_for = '950581.LookingFor';

    my %remember;

    open my $NUMBER_LIST, '<', $number_list or die "$number_list: $!";
    while (<$NUMBER_LIST>) {
        next if /Th|^$/;                # Skip text and empty lines
        my ($key, $value) = split;
        push @{ $remember{$key} }, $value;
    }
    close $NUMBER_LIST;

    open my $LOOKING_FOR, '<', $looking_for or die "$looking_for: $!";
    while (<$LOOKING_FOR>) {
        chomp;
        for my $value (@{ $remember{$_} }) {
            print "$_ $value\n";
            # Do your calculations here...
        }
    }
    close $LOOKING_FOR;
Re: Generating a list of numbers from other lists
by Not_a_Number (Prior) on Jan 29, 2012 at 19:43 UTC

    Somewhat late, but putting it all together, this is one solution:

    use Modern::Perl;
    use List::Util qw/ max sum /;
    use autodie;

    open my $fh1, '<', 'LookingForTheseNumbersList';
    chomp ( my @wanted = <$fh1> );
    close $fh1;

    open my $fh2, '<', 'NumbersList';

    my %found;

    while ( <$fh2> ) {
        if ( /(\d+)\s+(\d+)/ and $1 ~~ @wanted ) {
            push @{ $found{$1} }, $2;
        }
    }
    close $fh2;

    for my $k ( sort { $a <=> $b } keys %found ) {
        my @nums = @{ $found{$k} };
        say "All($k): " . join ' ', @nums;
        # Do some calculations, eg:
        say "Max($k): " . max @nums;
        say "Sum($k): " . sum @nums;
        say "Avg($k): " . ( sum @nums ) / @nums;
        say '';  # Print newline
    }

    Update: removed '.tmp' suffix from input file names.
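
A note on the $1 ~~ @wanted test above: smartmatch has triggered "experimental" warnings since Perl 5.18, and it rescans the array for every input line. One common alternative is a hash lookup. The sketch below assumes in-memory arrays standing in for the two files; %is_wanted is a name made up for this example.

```perl
use strict;
use warnings;
use List::Util qw(sum max);

# Assumption: in-memory stand-ins for the two input files.
my @wanted = qw(20 30 50);
my @lines  = ('10 23432', '20 23424', '30 91857', '20 56756', '50 29349');

# Build the lookup table once; each membership test is then a hash lookup.
my %is_wanted = map { $_ => 1 } @wanted;

my %found;
for my $line (@lines) {
    next unless $line =~ /^(\d+)\s+(\d+)/;
    next unless $is_wanted{$1};          # replaces "$1 ~~ @wanted"
    push @{ $found{$1} }, $2;
}

# Report a sum and a maximum per wanted key, in numeric key order.
for my $k (sort { $a <=> $b } keys %found) {
    my @nums = @{ $found{$k} };
    printf "%s: sum=%d max=%d\n", $k, sum(@nums), max(@nums);
}
```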

Re: Generating a list of numbers from other lists
by jwkrahn (Abbot) on Jan 29, 2012 at 21:47 UTC
    #!/usr/bin/perl
    use warnings;
    use strict;

    my $NumbersList                = 'NumbersList';
    my $LookingForTheseNumbersList = 'LookingForTheseNumbersList';

    open LOOKINGFORTHESENUMBERSLIST, '<', $LookingForTheseNumbersList or die "Cannot open '$LookingForTheseNumbersList' because: $!";

    my %data;
    while ( <LOOKINGFORTHESENUMBERSLIST> ) {
        if ( /^(\d+)/ ) {
            $data{ $1 } = [];
        }
    }
    close LOOKINGFORTHESENUMBERSLIST;

    open NUMBERSLIST, '<', $NumbersList or die "Cannot open '$NumbersList' because: $!";

    while ( <NUMBERSLIST> ) {
        if ( /^(\d+)\s+(\d+)/ && exists $data{ $1 } ) {
            push @{ $data{ $1 } }, $2;
        }
    }
    close NUMBERSLIST;

    for my $number ( sort { $a <=> $b } keys %data ) {
        print "$number $_\n" for @{ $data{ $number } };
    }

    exit 0;
Re: Generating a list of numbers from other lists
by johngg (Canon) on Jan 29, 2012 at 23:33 UTC

    If you read the "looking for" file into an array you can use it to construct a regular expression that selects only the lines you want and captures the two numeric elements. It also means you don't have to do any sorting because the array is already in the order you desire. I use a single HoH (hash of hashes) to hold the lines and also the sum of the second terms.

    use strict;
    use warnings;
    use 5.010;

    use Data::Dumper;

    open my $lookFH, q{<}, \ <<EOD or die qq{open: << HEREDOC: $!\n};
    20
    30
    50
    EOD

    chomp( my @lookFor = <$lookFH> );
    my $lookForRE = do{ local $" = q{|}; qr{^(@lookFor)\s+(\d+)} };

    my %accumulate;

    open my $nosFH, q{<}, \ <<EOD or die qq{open: << HEREDOC: $!\n};
    10 23432
    20 23424
    60 45567
    20 56756
    30 91857
    50 29349
    10 93729
    80 82374
    20 82757
    30 92785
    50 71674
    70 81747
    20 83758
    30 89275
    10 19594
    60 09214
    20 09347
    50 83725
    90 91845
    20 76402
    30 90184
    EOD

    while ( <$nosFH> )
    {
        next unless m{$lookForRE};
        chomp;
        push @{ $accumulate{ $1 }->{ lines } }, $_;
        $accumulate{ $1 }->{ sum } += $2;
    }

    foreach my $group ( @lookFor )
    {
        say join q{ }, $_, $accumulate{ $group }->{ sum }
            for @{ $accumulate{ $group }->{ lines } }
    }

    print Data::Dumper->Dumpxs( [ \ %accumulate ], [ qw{ *accumulate } ] );

    Here is the output. I have included the Data::Dumper->Dumpxs() representation of the hash to show the resulting data structure.

    I hope this is helpful.

    Cheers,

    JohnGG
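
One caveat when building a regular expression from a list like this: if the "looking for" entries contain regex metacharacters (dotted IP addresses, say), an unescaped "." matches any character. A hedged sketch of the same idea with quotemeta applied follows. The four-column sample lines and the names $alternation and %sum are assumptions made for illustration.

```perl
use strict;
use warnings;

# Assumption: sample entries chosen to show the pitfall, not read from a file.
my @lookFor = ('10.2.9.2', '10.2.9.3');

# Escape each entry so "." matches only a literal dot, then join the
# entries into a single alternation.
my $alternation = join '|', map { quotemeta($_) } @lookFor;
my $lookForRE   = qr{^(\S+)\s+($alternation)\s+(\d+)\s+(\d+)\s*$};

my @lines = (
    '15.254.32.120 10.2.9.2 5 504',
    '79.15.122.235 10.2.9.25 1 100',    # must not match: different address
    '17.149.36.162 10.2.9.3 14 3416',
);

my %sum;
for my $line (@lines) {
    next unless $line =~ $lookForRE;
    my ($dest, $bytes) = ($2, $4);      # second and fourth capture groups
    $sum{$dest} += $bytes;
}

# Report in the order of the "looking for" list, so no sorting is needed.
print "$_ $sum{$_}\n" for @lookFor;
```

Requiring whitespace after the alternation is what stops 10.2.9.2 from matching as a prefix of 10.2.9.25.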

Re: Generating a list of numbers from other lists
by mlebel (Hermit) on Jan 30, 2012 at 01:51 UTC

    Wow, thanks guys, an amazing number of posts trying to help me...

    Sorry, jwkrahn and johngg, I only saw your posts after trying the answers above and posting my findings.

    OK, so it seems I might not have explained myself properly or provided an adequate example. Still, you guys did a great job with the answers.

    I decided to go with choroba's example since it was the simplest for me to understand. As a test script it worked. However, when I incorporated it into my real script, it didn't work. After some troubleshooting, it appears that the "my ($key, $value) = split;" line might have something to do with it.

    I thought I did a good job at providing "real" code, but I apparently failed :-) So here is my "real code". This code fits inside a larger script, which is why I had the "if ($Flag == $TRUE)" in there. (I mention it in case it makes a difference in the end.) This code fits between the if and the closing }.

    So here is a sample of the real "TmpIPFile":

    Source          Destination      Packets  Bytes
    15.254.32.120   10.2.9.2         5        504
    79.15.122.235   208.43.3.154     21       2092
    79.15.122.235   63.245.217.113   21       2232
    79.15.122.235   209.15.236.80    10       1310
    79.15.122.235   46.37.179.218    34       4065
    63.97.127.34    10.2.9.2         4        471
    79.15.122.235   63.141.200.24    19       1811
    79.15.122.235   72.251.219.10    437      56713
    79.15.122.235   96.7.122.206     215      23318
    79.15.122.235   209.200.154.225  77       6257
    79.15.122.235   64.94.107.23     13       3436
    79.15.122.235   64.74.126.22     23       1527
    17.149.36.162   10.2.9.3         14       3416
    79.15.122.235   184.25.187.120   49       5772
    79.15.122.235   205.251.242.166  21963    32615009
    79.15.122.235   12.239.198.71    26       2946
    79.15.122.235   184.85.247.120   145      18458
    79.15.122.235   184.235.49.15    10       2001
    79.15.122.235   207.171.163.162  19       1393
    79.31.21.75     10.2.9.2         11       3993
    209.68.19.130   10.2.9.2         33       15941
    79.15.122.235   64.94.107.16     4        1375
    79.15.122.235   207.67.0.233     29       3742
    72.247.242.235  10.2.9.2         7        3750
    79.15.122.235   64.145.92.232    9        2364
    79.15.122.235   208.88.180.89    28       4490
    79.15.122.235   94.100.188.227   10       1979
    17.149.36.15    10.2.9.3         14       3404
    79.15.122.235   128.175.60.118   280      15120
    65.54.81.34     10.2.9.2         42       23068
    79.15.122.235   209.236.72.16    102      9765
    79.15.122.235   65.55.33.50      18       5479
    79.15.122.235   17.149.36.197    54       7279
    67.148.147.64   10.2.9.2         553      274036
    79.15.122.235   204.245.63.99    42       9826
    79.15.122.235   207.46.206.74    104      10498
    67.148.147.65   10.2.9.3         57       39207
    17.172.232.80   10.2.9.3         27       6768
    79.15.122.235   217.212.238.134  320      43460
    8.8.8.8         10.2.9.6         84       8006
    79.15.122.235   74.125.226.176   214      46874
    79.15.122.235   23.12.158.224    299      30331
    79.15.122.235   68.67.159.207    48       13422
    79.15.122.235   208.122.28.12    33       2857
    Accounting data age is 0w1d
    Box#

    Here is the "TmpLookingForIPFile" file:

    10.2.9.2
    10.2.9.3
    10.2.9.4
    10.2.9.5

    And now the script that I used from choroba...:

    #!/usr/bin/perl
    use warnings;
    use strict;

    my $FALSE = 0;
    my $TRUE = 1;
    my $Flag = $TRUE;
    my $number_list = "TmpIPFile";
    my $looking_for = "TmpLookingForIPFile";
    my $DestDevice = "Box";
    my %remember;

    if ($Flag == $TRUE){
        open my $NUMBER_LIST, '<', $number_list or die "$number_list: $!";
        while (<$NUMBER_LIST>) {
            next if /^sh|^\s*Source|^$DestDevice|^Accounting|^|^$|/; # Skip text and empty lines
            my ($key, $value,) = split;
            push @{ $remember{$key} }, $value;
        }
        close $NUMBER_LIST;

        open my $LOOKING_FOR, '<', $looking_for or die "$looking_for: $!";
        while (<$LOOKING_FOR>) {
            chomp;
            for my $value (@{ $remember{$_} }) {
                print "$_ $value\n";
                # Do your calculations here...
            }
        }
        close $LOOKING_FOR;
    }

    So in the end, I should be getting the list, but instead I get nothing. Here is the list I would want as output. (P.S. I need the "# Do your calculations here..." part, since that is where the calculations I need will be done within the loop.) The calculations will be done on the Bytes field of the appropriate IPs:

    15.254.32.120   10.2.9.2         5        504
    63.97.127.34    10.2.9.2         4        471
    79.31.21.75     10.2.9.2         11       3993
    72.247.242.235  10.2.9.2         7        3750
    65.54.81.34     10.2.9.2         42       23068
    67.148.147.64   10.2.9.2         553      274036
    17.149.36.162   10.2.9.3         14       3416
    17.149.36.15    10.2.9.3         14       3404
    67.148.147.65   10.2.9.3         57       39207
    17.172.232.80   10.2.9.3         27       6768
    8.8.8.8         10.2.9.6         84       8006

    I hope that only a slight modification is required to achieve my goal.

    Any takers?

      next if /^sh|^\s*Source|^$DestDevice|^Accounting|^|^$|/; # X here
      Do you really expect a line without a beginning?
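
To spell out why that pattern skips everything: an alternative consisting of just ^ (or an empty alternative after a trailing |) succeeds at the start of any string, so the whole alternation always matches. A small sketch follows. The shortened patterns and the names $too_broad and $intended are illustrative, not the OP's exact regex.

```perl
use strict;
use warnings;

my @lines = (
    'Source Destination Packets Bytes',
    '15.254.32.120 10.2.9.2 5 504',
    '',
);

# The bare ^ alternative and the empty alternative after the final |
# both match any string, so "next if" with this pattern skips every line.
my $too_broad = qr/^Source|^Accounting|^|^$|/;

# With the stray alternatives removed, only header and empty lines match.
my $intended = qr/^\s*Source|^Accounting|^\s*$/;

my @kept_broad    = grep { $_ !~ $too_broad } @lines;
my @kept_intended = grep { $_ !~ $intended } @lines;

print "kept with too-broad pattern: ", scalar(@kept_broad), "\n";
print "kept with intended pattern:  ", scalar(@kept_intended), "\n";
print "  $_\n" for @kept_intended;
```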

        Ha, good catch.

        That was supposed to be ^^M

        It never crossed my mind that it wouldn't paste properly in here from the script, but it makes sense that it wouldn't. I guess I could always add chomp; on the line right above it and remove the ^^M from there.
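
Two further points may be worth checking here. By default chomp removes only the trailing newline, so a carriage return (^M) would survive it. Also, the "looking for" file holds Destination addresses, whereas my ($key, $value) = split keys the hash on the Source column. The following is a minimal sketch under those assumptions, using a few rows from the sample above in memory rather than read from the real files; the names @ip_lines, %rows and %bytes are made up for this example.

```perl
use strict;
use warnings;

# Assumption: a few rows copied from the sample, with CR/LF line endings
# added to mimic the stray ^M characters.
my @ip_lines = (
    "Source Destination Packets Bytes\r\n",
    "15.254.32.120 10.2.9.2 5 504\r\n",
    "79.15.122.235 208.43.3.154 21 2092\r\n",
    "63.97.127.34 10.2.9.2 4 471\r\n",
    "17.149.36.162 10.2.9.3 14 3416\r\n",
    "Accounting data age is 0w1d\r\n",
    "Box#\r\n",
);
my @looking_for = ('10.2.9.2', '10.2.9.3', '10.2.9.4');

my (%rows, %bytes);
for my $line (@ip_lines) {
    $line =~ s/\r?\n\z//;               # drop the LF and any CR before it
    my @fields = split ' ', $line;

    # Keep only data rows: exactly four fields, the last two numeric.
    next unless @fields == 4
        && $fields[2] =~ /^\d+$/
        && $fields[3] =~ /^\d+$/;

    my ($source, $dest, $packets, $byte_count) = @fields;
    push @{ $rows{$dest} }, $line;      # key on the Destination column
    $bytes{$dest} += $byte_count;       # running Bytes total per destination
}

for my $dest (@looking_for) {
    next unless $rows{$dest};           # e.g. 10.2.9.4 has no rows here
    print "$_\n" for @{ $rows{$dest} };
    print "Total bytes for $dest: $bytes{$dest}\n";
}
```

Filtering on the shape of a data row, rather than listing every header and footer to skip, avoids the long alternation altogether.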
