PerlMonks  

Combining Duplicate entries in an Array

by mmartin (Monk)
on Feb 22, 2012 at 18:43 UTC (#955598=perlquestion)
mmartin has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

Ok, so this is driving me NUTS!!! I have this array (below), which is a simple 2D array. The element at position [x][0] contains an identifier like "94.22.10.141.72", and the element at [x][1] in the same row contains a single integer value, i.e. any number 0, 1, 2 ... 100, etc.

Here is what the array looks like after running the following:
    for (my $x = 0; $x <= $#AP_numClients; $x++) {
        print "$AP_numClients[$x][0] \t $AP_numClients[$x][1]\n";
    }

    __________ OUTPUT ____________
    19.25.55.11.144.0      5
    19.25.55.14.16.0       12
    19.25.59.200.208.0     8
    19.25.59.204.160.0     7
    19.25.60.5.176.0       4
    19.25.60.15.48.0       0
    19.25.60.17.240.0      3
    19.25.60.18.96.0       5
    19.25.115.138.224.0    30
    19.25.115.141.32.0     4
    26.109.108.64.144.0    1
    38.153.162.89.0.0      1
    38.153.162.89.0.1      0
    38.153.162.89.96.0     0
    38.153.162.89.96.1     0
    38.153.162.95.64.0     0
    38.153.162.95.64.1     0
    58.152.64.24.192.0     1
    58.152.64.24.192.1     0
    58.152.64.46.48.0      3
    58.152.64.46.48.1      0
    58.152.94.71.0.0       1
    58.152.94.71.0.1       0

Now if you notice in the printout, some of the IDs come in near-identical pairs: the first one ends in ".0" and the second one ends in ".1".
What I've been trying to do is loop through the array and, if the next element has the same ID as the current one (ignoring that last digit), add the integer values from the 2nd column together and store the result as one element in a new array.

So some pseudo code would be (might be easier to understand this way):

    my @newArray;
    for (my $x = 0; $x <= $#AP_numClients; $x++)
    {
        $AP_numClients[$x][0] =~ s/\.[0-1]$//;   #--> remove the last digit and "." if it ends in 0 or 1

        # If current element matches the next element, then add their values together...
        if ($AP_numClients[$x][0] =~ /$AP_numClients[$x+1][0]/)
        {
            $newArray[$x][0] = $AP_numClients[$x][0];
            $newArray[$x][1] = $AP_numClients[$x][1] + $AP_numClients[$x+1][1];
            $x++;
        }
        else
        {
            $newArray[$x][0] = $AP_numClients[$x][0];
            $newArray[$x][1] = $AP_numClients[$x][1];
        }
    }

I'm sure you get the gist of what I'm trying to say, but one more time, to take you through the 'code':
-- Loop
-- remove last "." and digit "0" or "1"
-- if this element matches the next element then add their values and set to new array
-- else just set the current element to the new array element
-- next


I hope I'm being clear enough...

If you need me to explain more let me know...
Any suggestions would be great.


Thanks in Advance,
Matt

Re: Combining Duplicate entries in an Array
by BrowserUk (Pope) on Feb 22, 2012 at 19:04 UTC

    If you step through the array backward, you can delete (splice) out the redundant elements safely.

    This will do it, but the code doesn't format nicely:

    #! perl -slw
    use strict;
    use Data::Dump qw[ pp ];

    my @a = map[ split ], <DATA>;

    substr( $a[$_-1][0],0,-1 ) eq substr( $a[$_][0],0,-1 ) and do{
        $a[$_-1][1] += $a[$_][1];
        splice @a, $_, 1;
    } for reverse 1 .. $#a;

    pp \@a;

    __DATA__
    19.25.55.11.144.0 5
    19.25.55.14.16.0 12
    19.25.59.200.208.0 8
    19.25.59.204.160.0 7
    19.25.60.5.176.0 4
    19.25.60.15.48.0 0
    19.25.60.17.240.0 3
    19.25.60.18.96.0 5
    19.25.115.138.224.0 30
    19.25.115.141.32.0 4
    26.109.108.64.144.0 1
    38.153.162.89.0.0 1
    38.153.162.89.0.1 0
    38.153.162.89.96.0 0
    38.153.162.89.96.1 0
    38.153.162.95.64.0 0
    38.153.162.95.64.1 0
    58.152.64.24.192.0 1
    58.152.64.24.192.1 0
    58.152.64.46.48.0 3
    58.152.64.46.48.1 0
    58.152.94.71.0.0 1
    58.152.94.71.0.1 0

    Output:

    C:\test>junk
    [
      ["19.25.55.11.144.0", 5],
      ["19.25.55.14.16.0", 12],
      ["19.25.59.200.208.0", 8],
      ["19.25.59.204.160.0", 7],
      ["19.25.60.5.176.0", 4],
      ["19.25.60.15.48.0", 0],
      ["19.25.60.17.240.0", 3],
      ["19.25.60.18.96.0", 5],
      ["19.25.115.138.224.0", 30],
      ["19.25.115.141.32.0", 4],
      ["26.109.108.64.144.0", 1],
      ["38.153.162.89.0.0", 1],
      ["38.153.162.89.96.0", 0],
      ["38.153.162.95.64.0", 0],
      ["58.152.64.24.192.0", 1],
      ["58.152.64.46.48.0", 3],
      ["58.152.94.71.0.0", 1],
    ]

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      Hey BrowserUk, thanks for the reply.

      Cool, I'll give that a try, thanks! As I was writing my original post I began thinking it might be a better idea to loop in reverse. So thanks for that...


      Thanks Again,
      Matt


      Ok, so I modified your example a little bit. Didn't change anything except the way it's structured. The way you did it makes sense but is a little hard for me to read, and possibly to maintain later if I ever have to change something.

      Here is what I modified it to:

      # Stop at index 1: at $x == 0, $temp[$x-1] would be $temp[-1], the last element.
      for (my $x = $#temp; $x > 0; $x--) {
          if ( substr($temp[$x-1][0], 0, -1) eq substr($temp[$x][0], 0, -1) ) {
              $temp[$x-1][1] += $temp[$x][1];
              splice @temp, $x, 1;
          }
      }

      Does the same thing just a little more long winded and simpler for a noob..


      Thanks Again for Your Help,
      Matt

Re: Combining Duplicate entries in an Array
by aaron_baugher (Deacon) on Feb 22, 2012 at 19:27 UTC

    This may be easier done with a hash, and that way it also doesn't matter if the "duplicates" come right after each other. Pseudo-code:

    create a hash
    foreach key and value pair in 2d array
        drop .\d from key
        if key exists in hash
            increase the key's value by the current value from array
        else
            add current key/value to hash
    loop through hash, printing keys and values
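
The pseudo-code above could be sketched in Perl roughly as follows. This is only an illustration under a few assumptions: the sub name combine_by_id is hypothetical, the sample rows are a small subset of the data from the question, and the resulting keys have the trailing ".0"/".1" removed. Because += treats a missing key as 0, the separate "if key exists / else" branches collapse into one statement.

```perl
use strict;
use warnings;

# Hypothetical helper: sum the counts of rows whose IDs differ only in
# the final ".digit". Takes a reference to a list of [id, count] pairs
# and returns a hash reference mapping the trimmed ID to the total.
sub combine_by_id {
    my ($rows) = @_;
    my %total;
    for my $row (@$rows) {
        my ($id, $count) = @$row;
        (my $key = $id) =~ s/\.\d$//;   # drop the final ".digit" from the key
        $total{$key} += $count;         # a key not seen before starts from 0
    }
    return \%total;
}

# A few rows taken from the question's printout.
my @AP_numClients = (
    [ '19.25.55.11.144.0', 5 ],
    [ '38.153.162.89.0.0', 1 ],
    [ '38.153.162.89.0.1', 0 ],
    [ '58.152.64.46.48.0', 3 ],
    [ '58.152.64.46.48.1', 0 ],
);

my $total = combine_by_id(\@AP_numClients);

# Hashes are unordered, so sort the keys for stable output.
print "$_\t$total->{$_}\n" for sort keys %$total;
```

Note that a hash does not preserve the original row order, so if the order matters the keys need to be sorted or tracked separately.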

    Aaron B.
    My Woefully Neglected Blog, where I occasionally mention Perl.

      Hey Aaron, thanks for the reply.

      My experience with hashes is limited, that's why I decided to go with an array to do this stuff.
      But that array is one of about 5 total arrays that contain info like that. But my end result will be a hash with all 5 arrays combined using the "ID" as a key.

      I've already been able to combine those arrays into a hash, I just had to get rid of those "Duplicates" first cause it can get pretty confusing with a hash and all the brackets..

      But thanks again,
      Matt

Re: Combining Duplicate entries in an Array
by Eliya (Vicar) on Feb 22, 2012 at 19:34 UTC

    BrowserUk's approach is fine (of course!), but just to elaborate what was wrong with yours:

    • Your match ($AP_numClients[$x][0] =~ /$AP_numClients[$x+1][0]/) is the wrong way round.  The shorter (sub)string should be on the pattern side — i.e. swap $x with $x+1 index.
    • You should generally quotemeta search strings in a pattern, if they contain meta-characters ("." here).
    • The @newArray would better be populated using push instead of with the same index ($x) that you use for the original array (as you have it, you'd get "undef" holes in the new array).
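
For illustration, here is one way the original forward loop might look with the three points above applied. It is a sketch, not the poster's code: the sub name merge_adjacent is hypothetical, the sample rows are a subset of the question's data, and the ^ ... \.[01]$ anchoring plus the bounds check on $x are extra assumptions added here so that only a true ".0"/".1" sibling can match.

```perl
use strict;
use warnings;

# Hypothetical helper: merge each row with the row after it when the
# next ID, minus its trailing ".0"/".1", equals the current trimmed ID.
sub merge_adjacent {
    my @rows = @_;
    my @merged;
    for (my $x = 0; $x <= $#rows; $x++) {
        (my $id = $rows[$x][0]) =~ s/\.[01]$//;   # trim a copy, leave the input alone
        my $count = $rows[$x][1];

        # The trimmed (shorter) ID goes on the pattern side, and \Q...\E
        # (the inline form of quotemeta) keeps its "." characters literal.
        if ($x < $#rows && $rows[$x + 1][0] =~ /^\Q$id\E\.[01]$/) {
            $count += $rows[$x + 1][1];
            $x++;                                 # skip the row just merged
        }
        push @merged, [ $id, $count ];            # push leaves no undef holes
    }
    return @merged;
}

my @newArray = merge_adjacent(
    [ '19.25.60.18.96.0',   5 ],
    [ '38.153.162.89.0.0',  1 ],
    [ '38.153.162.89.0.1',  0 ],
    [ '38.153.162.89.96.0', 0 ],
);
print "$_->[0]\t$_->[1]\n" for @newArray;
```

Without \Q...\E, each "." in the ID would match any character, so an ID such as "a.b" could wrongly match "axb".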
      Hey Eliya, thanks for your reply.

      Oooohh that's it... I don't know why, but for some reason "push" didn't come to me as I was doing this. Maybe I shoulda got more sleep last night haha... As for the "." in a pattern match: in one of my many, many, many different attempts I did have a line in my script that did a regex search and replace, looking for "." and replacing it with "\.", but I got a strange result, I think. But oh well, it seems to be working correctly with BrowserUk's example.

      Great, well thanks for the clarification, I really appreciate it!


      Thanks Again,
      Matt


Re: Combining Duplicate entries in an Array
by JavaFan (Canon) on Feb 22, 2012 at 21:17 UTC
    Untested code:
    my @copy = $AP_numClients[0];
    for (my $i = 1; $i < @AP_numClients; $i++) {
        my $id = $AP_numClients[$i][0];
        # If this ID ends in ".1" and its ".0" form equals the previous kept ID, merge the counts.
        if ($id =~ s/\.1$/.0/ && $id eq $copy[-1][0]) {
            $copy[-1][1] += $AP_numClients[$i][1]
        }
        else {
            push @copy, $AP_numClients[$i];
        }
    }
    @AP_numClients = @copy;
