Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Challenge Problem: Merging Network Addresses

by Dominus (Parson)
on Oct 12, 2001 at 01:09 UTC ( [id://118346]=perlquestion: print w/replies, xml ) Need Help??

Dominus has asked for the wisdom of the Perl Monks concerning the following question:

I'm posing this problem because it's reasonably (someone I knew really needs to do it) and because I couldn't think of a good way to do it in Perl. I wrote a solution in C in about a hundred lines, but the algorithm depended on using a pair of linked lists, and there didn't seem to be any efficient translation to Perl or to Perl's data structures.


An ISP has a file that lists all their networks, in CIDR format. (If you don't know what this is, see the explanation below.) However, some of these blocks are parts of larger networks. Where all the subnetworks of a larger network appear in the input, they should be removed and replaced with a line representing the large network.

For example, if these lines appear in the input:

209.152.214.112/30
209.152.214.116/31
209.152.214.118/31
then they should be replaced with this line:
209.152.214.112/29
because the output 209.152.214.112/29 is exactly equal to the union of the three inputs.

If those three lines had been these instead:

209.152.214.112/30
209.152.214.116/32
209.152.214.118/31
then the output would be the same as the input, because there are no complete larger networks that are unions of these. 209.152.214.112/29 is ruled out because the original input did not include the address 209.152.214.117.

If the input contained these lines instead:

209.152.214.112/31
209.152.214.116/31
209.152.214.118/31
then the output would be
209.152.214.112/31
209.152.214.116/30
These two networks cannot be merged into 209.152.214.112/29 because the original input does not include the address 209.152.214.114.

See below for more information about CIDR format network addresses.

You may assume that the input is in order, sorted by ascending IP address. You may assume that if a.b.c.d/n appears in the input, then the low-order (32-n) bits of a.b.c.d will all be 0. For example, 1.2.3.252/29 will never appear; if this network is in the input, it will be represented as 1.2.3.248/29. The input has one network per line, and the output should be in the same format as the input. The output should be minimal, so that if the output is fed back into your program, it is edmitted unchanged because there is nothing more to do.

The ISP has a file with thousands of networks, so the program should be reasonably efficient.

Sample inputs and outputs are available from my web site.


Explanation of CIDR format network addresses

A network number represents a contiguous sequence of IP addresses. It has the form a.b.c.d/m where a, b, c, and d are between 0 and 255, and m is between 2 and 32. a.b.c.d indicates a 32-bit number, as is usual with IP addresses. The netblock a.b.c.d/n represents the network comprising all the IP addresses whose first m bits match the first m bits of the number a.b.c.d, and whose remaining bits have any values at all. For example:

130.91.6.0/32
represents the IP address 130.91.6.0 only.

130.91.6.0/31
represents the two addresses 130.91.6.0 and 130.91.6.1.

130.91.6.0/30
represents the four addresses 130.91.6.0, 130.91.6.1, 130.91.6.2, and 130.91.6.3.

130.91.6.12/30
represents the four addresses 130.91.6.12, 130.91.6.13, 130.91.6.14, and 130.91.6.15.

Sometimes two netblocks can be combined into a single one. For example, the two netblocks

192.241.36.8/30
192.241.36.12/30
happen to represent the same set of addresses as
192.241.36.8/29
If the program input contains the first two netblocks, the program should recognize that and combine them into a single netblock for output.

See Subnetting and CIDR for a more detailed explanation.

--
Mark Dominus
Perl Paraphernalia

Replies are listed 'Best First'.
Re: Challenge Problem: Merging Network Addresses
by merlyn (Sage) on Oct 12, 2001 at 01:14 UTC
    Use Net::CIDR's cidr2range, sort the ranges, combine the ones that are overlapping or adjacent, then use range2cidr to go back.

    -- Randal L. Schwartz, Perl hacker


    update Shoot, I totally missed cidradd in the same module, which does precisely that!
    $ perl use Net::CIDR qw(cidradd); print join (", ", cidradd(qw{ 209.152.214.112/30 209.152.214.116/31 209.152.214.118/31 })), "\n" ^D 209.152.214.112/29 $ perl use Net::CIDR qw(cidradd); print join (", ", cidradd(qw{ 209.152.214.112/31 209.152.214.116/31 209.152.214.118/31 })), "\n" ^D 209.152.214.112/31, 209.152.214.116/30
      That's a great answer. Thanks a lot.

      I can believe that Net::CIDR is the fastest way to do this in Perl. Unfortunately, it may be too slow. My test program took about 2.5 minutes to process test file nm5.in. The C program is about 120-150 times faster.

      That's a bummer. I was hoping that there would be a Perl solution that would be fast enough to be useful.

      --
      Mark Dominus
      Perl Paraphernalia

        I was hoping that there would be a Perl solution that would be fast enough to be useful

        Of course if you patched the module with XS and your C code there would be :-) Just a random thought to increase your workload for the benefit of everyone (except perhaps you).

        cheers

        tachyon

        s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Challenge Problem: Merging Network Addresses
by fjonckers (Novice) on Oct 12, 2001 at 01:39 UTC
    maybe you could translate the addresses first to their binary value.. you can then do calculations with them (bitwise) finally translate back to IP's just a thought.. Filip
      I'd thought about doing it similarly, but in a bit more detail:

      o Convert the addresses to binary strings, and remove the bits that represent the network (so a /32 becomes a 32 char long string, a /24 24 bits)
      o For each line, see if there is a matching line, except with opposite LSB - if there is, delete them and replace it with a string one digit shorter.
      o Repeat until you run out of matches.

      the hatter

Re: Challenge Problem: Merging Network Addresses
by gbarr (Monk) on Oct 13, 2001 at 03:17 UTC
    After some discussion with merlyn on IRC, I came up with the following.
    use Socket; my $str = ""; while(<>) { chomp; my ($ip,$bits) = m,^(\d+(?:\.\d+){3})/(\d+)$, or next; $str .= ";" . unpack("B*",inet_aton($ip)); substr($str,-$bits) = "#" x $bits if $bits = 32 - $bits; } $str .= ";"; 1 while $str =~ s/;([01]+)0(#*);\1.\2;/;${1}#${2};/g; my @blocks = $str =~ /[10#]+/g; foreach my $block (@blocks) { my $bits = 32 - ($block =~ tr/#/0/); print inet_ntoa(pack("B*", $block)),"/$bits\n"; }

    Which actually, I think, shows some errors in 3 .out files on your site

    --- nm3b.out Thu Oct 11 21:54:06 2001 +++ nm3b.xx Sat Oct 13 00:08:51 2001 @@ -1,4 +1,3 @@ -130.91.7.0/31 -130.91.7.2/31 +130.91.7.0/30 255.255.255.254/31 255.255.255.255/32 --- nm7.out Thu Oct 11 21:54:07 2001 +++ nm7.xx Sat Oct 13 00:09:10 2001 @@ -1,4 +1,3 @@ -1.2.3.0/28 -1.2.3.16/28 +1.2.3.0/27 6.6.6.0/27 6.6.7.0/26 --- nm8.out Thu Oct 11 21:54:07 2001 +++ nm8.xx Sat Oct 13 00:09:10 2001 @@ -1,5 +1,4 @@ -1.2.3.0/28 -1.2.3.16/28 -5.4.3.1/31 +1.2.3.0/27 +5.4.3.0/31 6.6.6.0/27 6.6.7.0/26

    Update:

    That code does not handle the following case

    101.33.45.27/30 101.33.34.0/24

    use Socket; my @blocks; while(<DATA>) { chomp; my ($ip,$bits) = m,^(\d+(?:\.\d+){3})/(\d+)$, or next; push @blocks, unpack("B*",inet_aton($ip)); substr($blocks[-1],-$bits) = "#" x $bits if $bits = 32 - $bits; } my $str = ";" . join(";", sort @blocks) . ";"; 1 while $str =~ s/;([01]+)(#*);\1[01#]*;/;${1}${2};/g; 1 while $str =~ s/;([01]+)0(#*);\1.\2;/;${1}#${2};/g; my @blocks = $str =~ /[10#]+/g; foreach my $block (@blocks) { my $bits = 32 - ($block =~ tr/#/0/); print inet_ntoa(pack("B*", $block)),"/$bits\n"; }
Re: Challenge Problem: Merging Network Addresses
by runrig (Abbot) on Oct 16, 2001 at 03:22 UTC
    Better late than never, eh? This uses a hash to store all the ranges, has to do an explicit sort and so doesn't depend on the input being sorted. It runs in about 10 seconds on your nm5.in sample file (which is a good deal better than the 3 1/2 minutes it took me to run Net::CIDR and end up with non-optimal results). I haven't tried real hard to break it, and it doesn't check for valid data, so buyer beware.

    There was a similar (generic) range merging script posted awhile ago here on PM that used a hash and an accumulating count, I can't find it at the moment or remember who posted it, but credit to them, whoever they may be :)

    Updated with tye's suggestion from this node.

    use strict; use warnings; use Socket qw(inet_ntoa inet_aton); my @masks = map { pack("B*", substr("1" x $_ . "0" x 32, 0, 32)) } 0..32; my @bits2rng = map { 2**(32 - $_) } 0..32; my %rng2bits = map { $bits2rng[$_] => $_ } 0..32; my %ips; while (<>) { my ($ip, $mask) = split "/"; my $start = inet_aton($ip) & $masks[$mask]; my $end = pack("N", unpack("N", $start) + $bits2rng[$mask]); $ips{$start}++; $ips{$end}--; } my ($start, $total); for my $ip (sort keys %ips) { $start = $ip unless $total; unless ($total+=$ips{$ip}) { my $diff = unpack("N", $ip) - unpack("N", $start); while ($diff) { (my $zeros = unpack("B*", $start)) =~ s/^.*1//; my $range; for my $i (32-length($zeros)..32) { $range=$bits2rng[$i], last if $bits2rng[$i]<=$diff; } print inet_ntoa($start), "/", $rng2bits{$range}, "\n"; $start = pack("N", unpack("N", $start)+$range); $diff -= $range; } } }
    Update: See also Net::CIDR::Lite ?? (Merge CIDR addresses).
Re: Challenge Problem: Merging Network Addresses
by merlyn (Sage) on Oct 13, 2001 at 03:28 UTC
(tye)Re2: Challenge Problem: Merging Network Addresses
by tye (Sage) on Oct 23, 2001 at 00:24 UTC
Re: Challenge Problem: Merging Network Addresses
by mattr (Curate) on Oct 12, 2001 at 13:17 UTC
    Or you could post your code and somebody could try it with Inline::C that would be neat..

    Move SIG!

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://118346]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others wandering the Monastery: (2)
As of 2024-03-19 06:31 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found