PerlMonks  

IP Address consolidation

by yasysad (Novice)
on Aug 20, 2001 at 14:12 UTC ( [id://106172]=perlquestion )

yasysad has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks ..
I am a SysAd who has the task of consolidating a large file with IP Addresses... and what better than PERL for the same ..
I have been through the FAQs and got a few pointers with file operations, etc. What really kills me is the algorithm to do so as IP numbers have their own maths.
Well, here's hoping for enlightenment ..
Input File format : From address, To Address, Name
202.1.2.0,202.2.255.255,a
202.3.0.0,202.3.255.255,a
202.4.0.0,202.4.0.255,b

The example above shows that line 2 is the continuation of line 1 and, as the name is the same, the output needs to be:

Output File format : From address, To Address, Name
202.1.2.0,202.3.255.255,a
202.4.0.0,202.4.0.255,b

I have tried using a lot of if's and ands, but I'm not able to pick 2 lines and check throughout the file. Thanks a lot.

Replies are listed 'Best First'.
Re: IP Address consolidation
by tadman (Prior) on Aug 20, 2001 at 14:48 UTC
    If you have the capacity, which you should, it would be fairly straightforward to load all the files into memory, and then write them out. Since these are grouped by name, why not use a Hash of Arrays (HoA):
    # @file_list is assumed to hold the names of the input files
    my %data;
    foreach my $file (@file_list) {
        # Skip files that cannot be opened instead of reading a dead handle
        open (INPUT, $file) or do {
            warn "Could not read $file\n";
            next;
        };
        while (<INPUT>) {
            chomp;
            my ($start,$end,$name) = split (/,/);
            push (@{$data{$name}}, "$start,$end");
        }
        close (INPUT);
    }
    # Print one "start,end,name" line per stored range
    foreach my $name (sort keys %data) {
        print "$_,$name\n" foreach @{$data{$name}};
    }
    If you have overlapping entries in the different files, then you will have to check on insert. This could be done with a Hash of Hashes (HoH):
    use Socket;
    # @file_list is assumed to hold the names of the input files
    my %data;
    foreach my $file (@file_list) {
        open (INPUT, $file) or do {
            warn "Could not read $file\n";
            next;
        };
        while (<INPUT>) {
            chomp;
            my ($start,$end,$name) = split (/,/);
            # Pack both addresses into their 4-byte form
            $start = inet_aton($start);
            $end = inet_aton($end);
            if (defined $data{$name}{$start}) {
                # Resolve conflict?
            } else {
                $data{$name}{$start} = $end;
            }
        }
        close (INPUT);
    }
    foreach my $name (sort keys %data) {
        foreach my $start (sort keys %{$data{$name}}) {
            # The end of each block is the value stored under its start
            print join (',', inet_ntoa($start), inet_ntoa($data{$name}{$start}), $name), "\n";
        }
    }
    The reason for using inet_aton (ASCII to Number) from the Socket module is to simplify comparisons. "202.1.2.0" and "202.01.002.0" are equivalent, and removing redundant zeros is a lot more complicated than just "packing" them into their native format (4 bytes). They are easily unpacked with the complementary inet_ntoa (Number to ASCII), and should always come out clean with no extraneous zeros.

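    Additionally, if you want to sort them, which I'm doing here with the regular sort operator, they will sort ASCII-betically, which puts the packed addresses in numeric order because each one is four bytes with the most significant byte first. Sorting the dotted-decimal strings numerically is more complicated, since each one has several dot-separated parts.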
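
As a small illustration of the sorting point, the sketch below compares a plain string sort of dotted-decimal addresses with a sort of their packed forms. The helper name sort_ips and the sample addresses are made up for this example; only inet_aton and inet_ntoa come from the Socket module.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# sort_ips is a hypothetical helper, not part of Socket.
# It packs each address, sorts the 4-byte strings, and unpacks them again.
sub sort_ips {
    my @ips = @_;
    return map  { inet_ntoa($_) }
           sort { $a cmp $b }
           map  { inet_aton($_) } @ips;
}

my @ips = ('202.10.0.0', '202.9.0.0', '202.100.0.0');

# Plain string sort puts 202.9.0.0 last, because "9" sorts after "1"
print join(' ', sort @ips), "\n";

# Sorting the packed form gives 202.9.0.0 202.10.0.0 202.100.0.0
print join(' ', sort_ips(@ips)), "\n";
```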

    Update:
    • Fixed inet_ntoa calls in first loop.
      I tried your code and got this error at this line
      $start = inet_ntoa($start);
      the error output is :
      Bad arg length for Socket::inet_ntoa, length is 10, should be 4 at ipconsnew.pl line 16, <INPUT> line 1.

      I am working on ActivePerl on Windows NT .. Am I doing anything wrong ??
      Thanks tadman .. the functions are a revelation .. I may be able to work on them for the resolve conflict ..
      actually, it's the algorithm I was looking for ..
Re: IP Address consolidation
by tadman (Prior) on Aug 20, 2001 at 15:03 UTC
    Your update helped clarify one particular thing here. A little post-processing can help set things straight. Since all the data is stored in organized structures, it can be cleaned up before being printed out. In this case, joining adjacent blocks is a no-brainer.

    The idea is that since IP addresses are just numbers, you can do math on them to add and subtract. In this case, what you want to do is add one to the "end" to see if it matches the next "start". You could write your own add function, but this is a little tedious, with up to three possible carries. Instead, it is much more efficient to render the IP address as a simple 32-bit number and work with it that way. This can be done with unpack which will extract the "raw" 32-bit value of an inet_aton operation. A "N"-type pack is a network-order 32-bit number, a standard way of transporting numbers across the Internet.
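
A minimal sketch of that arithmetic, assuming the Socket module is available. The helper names ip_to_int, int_to_ip and next_ip are invented for this example. The number is repacked with pack and the same "N" template that unpack used.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Hypothetical helpers: dotted quad <-> plain 32-bit integer
sub ip_to_int { return unpack('N', inet_aton($_[0])); }
sub int_to_ip { return inet_ntoa(pack('N', $_[0])); }

# The address that follows the given one; the carries happen
# for free because the addition is done on a single integer.
sub next_ip { return int_to_ip(ip_to_int($_[0]) + 1); }

print ip_to_int('202.1.2.0'), "\n";      # prints 3389063680
print next_ip('202.2.255.255'), "\n";    # prints 202.3.0.0
```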

    So, once unpacked, you add one, and feed the result back into pack with the same "N" template, which will give you a new repacked address. This can be converted, if you like, into the pretty human-readable version we've come to know, using inet_ntoa.

    This code merely compares the data in the hash for any adjacent matches, and when it finds them, puts the end from the second as the end of the first, and deletes the second. It repeats this for the same block until nothing adjacent is left, so a chain of three or more blocks collapses into one.
    # This function returns the 32-bit value
    # of the IP address for numeric comparisons.
    sub addr_value {
        return unpack("N", $_[0]);
    }

    # %data is the Hash of Hashes from my earlier reply:
    # $data{$name}{$packed_start} = $packed_end
    foreach my $name (sort keys %data) {
        foreach my $start (sort keys %{$data{$name}}) {
            # Skip keys deleted after keys was calculated
            next unless defined $data{$name}{$start};
            # Keep absorbing blocks for as long as one is adjacent
            while (1) {
                my $end_value = addr_value($data{$name}{$start});
                # Nothing can follow 255.255.255.255
                last if $end_value == 0xFFFFFFFF;
                # Add one to the end to determine the next start
                my $next_start = pack("N", $end_value + 1);
                # Stop when no block starts right after this one ends
                last unless defined $data{$name}{$next_start};
                # ...end this block where that block ended,
                # and delete that block.
                $data{$name}{$start} = delete $data{$name}{$next_start};
            }
        }
    }
    This is just off the top, so your mileage may vary.
Re: IP Address consolidation
by dga (Hermit) on Aug 20, 2001 at 20:27 UTC

    This little program has all the elements needed. It will accumulate IP ranges by the id name and expand the range to include the min and max IP addresses. It does not check for holes between ranges in the same id; if the input data is sane, that check is not needed.

    #!/usr/bin/perl
    use strict;

    my($start, $end, $id, %range);
    while(<>) {
        chomp;    # chomp only strips a trailing newline; chop would eat any last character
        ($start, $end, $id)=split(',');
        # Pack each dotted quad into a 4-byte string
        $start=pack("C4", (split('\.',$start)));
        $end=pack("C4", (split('\.',$end)));
        if($range{$id}) {
            # Widen the stored range if this line extends it
            my($os, $oe)=@{$range{$id}};
            $os=$start if($start lt $os);
            $oe=$end if($end gt $oe);
            @{$range{$id}}=($os, $oe);
        } else {
            @{$range{$id}}=($start, $end);
        }
    }
    foreach my $r ( sort keys %range ) {
        printf "%vd,%vd,%s\n", @{$range{$r}},$r;
    }

    The addresses, and apparently v-strings in general, are strings, so you have to use string comparison operators on them. But, due to the conversion, 10.20.30.40 is smaller than 10.20.30.100, which as normal strings would not be true. The %vd format prints out dotted decimal, %vb prints out each byte in binary, and %*vX with ":" as the join string is the form the sprintf documentation shows for an IPv6 address.
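
To make the comparison point concrete, here is a short sketch using the same pack("C4", ...) conversion. The helper name pack_ip is made up for this example; the two addresses are the ones mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper: pack a dotted quad into a 4-byte string
sub pack_ip { return pack('C4', split /\./, $_[0]); }

my ($x, $y) = ('10.20.30.40', '10.20.30.100');

# As plain strings, "40" sorts after "100" because "4" is greater than "1"
print "plain:  ", ($x lt $y ? "$x first" : "$y first"), "\n";

# As packed strings, byte 40 is less than byte 100
print "packed: ", (pack_ip($x) lt pack_ip($y) ? "$x first" : "$y first"), "\n";

# %vd joins the byte values with dots
printf "%vd\n", pack_ip($y);          # prints 10.20.30.100

# %*vX takes the join string as an extra argument
printf "%*vX\n", ':', pack_ip($y);    # prints A:14:1E:64
```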

Re: IP Address consolidation
by claree0 (Hermit) on Aug 20, 2001 at 14:24 UTC

    I think you need to give us some more detail on what you are trying to do - e.g. what makes line 2 a follow-on from line 1?

    What comparisons are you doing?

    Clare

      My apologies ..
      The clarifications are thus :
      1. IP Address file has the range of IP Addresses held by the name that appears in the same line.
      2. I need to join adjacent ranges belonging to the same name; thus reducing the number of lines in the file.
      3. In the first example, Line 2 follows Line 1 because 202.3.0.0 is the next IP Address after 202.2.255.255.
      Similarly, if the "From" of one line is the next IP of the "TO" of the previous line, and the name is the same, the record needs to be replaced by the "From" of the first line, the "To" of the following line, and the common "Name"

      Hope this helps
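
The rule described in these clarifications can be sketched end to end as below. This is only one possible approach (sort each name's ranges by start, then sweep once), not the code from any reply in this thread. The name merge_ranges is made up, the input is assumed to be well-formed "from,to,name" lines, and ranges that overlap rather than touch are left unmerged. Output comes out grouped by name, not in input order.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# merge_ranges is a hypothetical helper: it takes "from,to,name" lines
# and returns the consolidated lines.
sub merge_ranges {
    my @lines = @_;
    my %by_name;
    for my $line (@lines) {
        chomp $line;
        my ($from, $to, $name) = split /,/, $line;
        next unless defined $name;    # skip blank or short lines
        # Store each range as a pair of 32-bit integers
        push @{ $by_name{$name} },
            [ unpack('N', inet_aton($from)), unpack('N', inet_aton($to)) ];
    }

    my @out;
    for my $name (sort keys %by_name) {
        my @ranges = sort { $a->[0] <=> $b->[0] } @{ $by_name{$name} };
        my ($from, $to) = @{ shift @ranges };
        for my $r (@ranges) {
            if ($r->[0] == $to + 1) {
                $to = $r->[1];                      # adjacent: extend the current range
            } else {
                push @out, [ $from, $to, $name ];   # gap: emit and start a new range
                ($from, $to) = @$r;
            }
        }
        push @out, [ $from, $to, $name ];
    }

    # Convert the integers back to dotted quads
    return map {
        join(',', inet_ntoa(pack('N', $_->[0])), inet_ntoa(pack('N', $_->[1])), $_->[2])
    } @out;
}

# The sample from the question
print "$_\n" for merge_ranges(
    '202.1.2.0,202.2.255.255,a',
    '202.3.0.0,202.3.255.255,a',
    '202.4.0.0,202.4.0.255,b',
);
```

For the sample this prints the two lines shown in the question's expected output. To run it over a file, the lines read from a filehandle could be passed in instead of the literal list.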
