<?xml version="1.0" encoding="windows-1252"?>
<node id="118346" title="Challenge Problem: Merging Network Addresses" created="2001-10-11 21:09:17" updated="2005-08-14 18:55:13">
<type id="115">
perlquestion</type>
<author id="3737">
Dominus</author>
<data>
<field name="doctext">
&lt;p&gt;I'm posing this problem because it's reasonably (someone I knew
really needs to do it) and because I couldn't think of a good way to
do it in Perl.  I wrote a solution in C in about a hundred lines, but
the algorithm depended on using a pair of linked lists, and there
didn't seem to be any efficient translation to Perl or to Perl's data
structures.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;An ISP has a file that lists all their networks, in CIDR format.  (If
you don't know what this is, see the explanation below.)  However,
some of these blocks are parts of larger networks.  Where all the
subnetworks of a larger network appear in the input, they should be
removed and replaced with a line representing the large network.&lt;/p&gt;

&lt;p&gt;For example, if these lines appear in the input:

&lt;blockquote&gt;
        209.152.214.112/30&lt;br&gt;
        209.152.214.116/31&lt;br&gt;
        209.152.214.118/31&lt;br&gt;
&lt;/blockquote&gt;

then they should be replaced with this line:

&lt;blockquote&gt;
        209.152.214.112/29
&lt;/blockquote&gt;

because the output 209.152.214.112/29 is exactly equal to the union of
the three inputs.&lt;/p&gt;

&lt;p&gt; If those three lines had been these instead:

&lt;blockquote&gt;
        209.152.214.112/30&lt;br&gt;
        209.152.214.116/32&lt;br&gt;
        209.152.214.118/31&lt;br&gt;
&lt;/blockquote&gt;

then the output would be the same as the input, because there are no
complete larger networks that are unions of these.  
209.152.214.112/29 is ruled out because the original input did not
include the address 209.152.214.117.&lt;/p&gt;



&lt;p&gt;If the input contained these lines instead: 

&lt;blockquote&gt;
        209.152.214.112/31&lt;br&gt;
        209.152.214.116/31&lt;br&gt;
        209.152.214.118/31&lt;br&gt;
&lt;/blockquote&gt;

then the output would be 

&lt;blockquote&gt;
        209.152.214.112/31&lt;br&gt;
        209.152.214.116/30&lt;br&gt;
&lt;/blockquote&gt;

These two networks cannot be merged into 209.152.214.112/29 because
the original input does not include the address 209.152.214.114.&lt;/p&gt;

&lt;p&gt;See below for more information about CIDR format network addresses.&lt;/p&gt;

&lt;p&gt;You may assume that the input is in order, sorted by ascending IP
address.  You may assume that if
&lt;i&gt;a&lt;/i&gt;.&lt;i&gt;b&lt;/i&gt;.&lt;i&gt;c&lt;/i&gt;.&lt;i&gt;d&lt;/i&gt;/&lt;i&gt;n&lt;/i&gt; appears in the input,
then the low-order (32-&lt;i&gt;n&lt;/i&gt;) bits of
&lt;i&gt;a&lt;/i&gt;.&lt;i&gt;b&lt;/i&gt;.&lt;i&gt;c&lt;/i&gt;.&lt;i&gt;d&lt;/i&gt; will all be 0.  For example,
1.2.3.252/29 will never appear; if this network is in the input, it
will be represented as 1.2.3.248/29.  The input has one network per
line, and the output should be in the same format as the input.  The
output should be minimal, so that if the output is fed back into your
program, it is edmitted unchanged because there is nothing more to
do.&lt;/p&gt;

&lt;p&gt;The ISP has a file with thousands of networks, so the program should
be reasonably efficient.&lt;/p&gt;

&lt;p&gt;Sample inputs and outputs
[http://www.plover.com/~mjd/misc/merge-networks/|are available from my
web site].&lt;/p&gt;

&lt;hr&gt;



&lt;h2&gt;Explanation of CIDR format network addresses&lt;/h2&gt;

&lt;p&gt;A network number represents a contiguous sequence of IP addresses.
It has the form &lt;i&gt;a&lt;/i&gt;.&lt;i&gt;b&lt;/i&gt;.&lt;i&gt;c&lt;/i&gt;.&lt;i&gt;d&lt;/i&gt;/&lt;i&gt;m&lt;/i&gt; where
&lt;i&gt;a&lt;/i&gt;, &lt;i&gt;b&lt;/i&gt;, &lt;i&gt;c&lt;/i&gt;, and &lt;i&gt;d&lt;/i&gt; are between 0 and 255, and
m is between 2 and 32.  &lt;i&gt;a&lt;/i&gt;.&lt;i&gt;b&lt;/i&gt;.&lt;i&gt;c&lt;/i&gt;.&lt;i&gt;d&lt;/i&gt; indicates
a 32-bit number, as is usual with IP addresses.  The netblock
a.b.c.d/n represents the network comprising all the IP addresses whose
first &lt;i&gt;m&lt;/i&gt; bits match the first &lt;i&gt;m&lt;/i&gt; bits of the number
&lt;i&gt;a&lt;/i&gt;.&lt;i&gt;b&lt;/i&gt;.&lt;i&gt;c&lt;/i&gt;.&lt;i&gt;d&lt;/i&gt;, and whose remaining bits have any
values at all.  For example:&lt;/p&gt;

&lt;p&gt;
&lt;blockquote&gt;
        130.91.6.0/32&lt;br&gt;
&lt;/blockquote&gt;

represents the IP address 130.91.6.0 only.
&lt;/p&gt;

&lt;p&gt;&lt;blockquote&gt;
        130.91.6.0/31&lt;br&gt;
&lt;/blockquote&gt;

represents the two addresses 130.91.6.0 and 130.91.6.1.  &lt;/p&gt;

&lt;p&gt;&lt;blockquote&gt;
        130.91.6.0/30&lt;br&gt;
&lt;/blockquote&gt;

represents the four addresses 130.91.6.0, 130.91.6.1, 130.91.6.2, and
130.91.6.3.&lt;/p&gt;

&lt;p&gt;&lt;blockquote&gt;
        130.91.6.12/30&lt;br&gt;
&lt;/blockquote&gt;

represents the four addresses 130.91.6.12, 130.91.6.13, 130.91.6.14,
and 130.91.6.15.&lt;/p&gt;

&lt;p&gt;Sometimes two netblocks can be combined into a single one.  For
example, the two netblocks

&lt;blockquote&gt;
        192.241.36.8/30&lt;br&gt;
        192.241.36.12/30&lt;br&gt;
&lt;/blockquote&gt;

happen to represent the same set of addresses as 

&lt;blockquote&gt;
        192.241.36.8/29&lt;br&gt;
&lt;/blockquote&gt;

If the program input contains the first two netblocks, the program
should recognize that and combine them into a single netblock for output.&lt;/p&gt;

&lt;p&gt;See [http://www.freesoft.org/CIE/Course/Subnet/|Subnetting and CIDR]
for a more detailed explanation.&lt;/p&gt;



&lt;p&gt;
--&lt;br&gt;&lt;font size="-2"&gt;
&lt;a href="mailto:mjd-www-perlmonks+@plover.com"&gt;Mark Dominus&lt;/a&gt;&lt;br&gt;
&lt;a href="http://perl.plover.com"&gt;Perl Paraphernalia&lt;/a&gt;&lt;br&gt;&lt;/font&gt;</field>
</data>
</node>
