Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

IP prefixes: match IP against "best" network

by shmem (Chancellor)
on Jun 29, 2007 at 11:17 UTC ( #624076=CUFP: print w/replies, xml ) Need Help??

This is a fast storage and lookup method to match IP-addresses against IP-prefixes (or networks).

In a recent SoPW, Fangorn asked about caching of Net::Patricia objects, to make that cache available to multiple processes.

I had a brief look at Net::Patricia and what all that was about and thought "hmm, tree lookup? don't we have that close to the perl core?"

Turns out that DB_File, using a database of type DB_BTREE, fits that purpose very well. Memory requirements are minimal, and lookup time is no more than 5 micromiliseconds worst case (tested with 1.000.000 randomly generated prefixes).

DB_BTREE provides partial match on keys of an ordered binary tree. To use that feature effectively, network addresses are converted into bit strings, zipped (as in fly zip) with their prefix netmask bit string. The resulting string is used as key to the DB_BTREE table.

Example: having the CIDR notation 10.223.2.0/23, we get

net: 0 0 0 0 1 0 1 0 1 1 0 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 key: 1010101011101110111110111111111110101010101011000000000000000000

We get thus a 64 bit key (well, 64 char string :-). Upon search, the searched IP adress is zipped with the netmask 0xffffffff in the same manner, and starting with the leftmost 2 bits, a partial match against the tree is done. DB_BTREE returns the next key which is equal or greater than the key at hand. The resulting key is splitted back into network and netmask, and it is checked whether the ip matches the resulting range. If so, the value is pushed to a list. While the search key matches the returned key, further bits (2 at a time) are added. The search is repeated until there's a mismatch or no new key is returned, which means we're done.

The last element in the resulting list is the network prefix matching the given IP, with the most bits set in its mask.

#!/usr/bin/perl use DB_File; use Fcntl; use Getopt::Std; # 120000 2.217429, avg. 16 lookups, 360.434319750957/sec my $maxprefix = 1000000; # ip prefix limit my %o; getopts('f:ast',\%o); unless (@ARGV || grep { $_ } @o{qw(t s)}) { die <<EOH; usage: $0 -s [-f file] | [-a ] [ -t | ipaddress ] -s : setup -f : use values from file for setup ("xxx.xxx.xxx.xxx/nn nnnnn") -t : average timing -a : show all matching prefixes EOH } # unless @ARGV || grep {defined $_ } values %o; my $filename = 'astable.db'; my $x; # DB_File handle if ($o{s}) { if(! -e $filename) { $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_B +TREE or die "cannot open $filename: $!\n"; } else { die "file '$filename' exists, won't overwrite!\n"; } } else { $x = tie %h, "DB_File", $filename, O_RDONLY, 0666, $DB_BTREE or die "cannot open $filename: $!\n"; } # set up AS table. To perform a binary search, # we build a bit string as a sequence of alternating bits # of mask and network address, i.e. having 10.223.2.0/23, we get # # net: 0 0 0 0 1 0 1 0 1 1 0 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 + 0 # mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 +0 # key: 10101010111011101111101111111111101010101010110000000000000000 +00 # # for ip searches, we set the "mask part bits" of an ip all to 1, and +do # a incremental partial match against the keys in the table. if ($o{s}) { # setup # generate some random IP addresses and masks # set all non-mask bits to 0 and store the generated # IP address (minus 1, if last byte is non-zero) # in the files 'ips' for further use in lookup timing unless (-f 'ips') { open IP,'>', 'ips'; } else { die "file 'ips' exist, won't overwrite!\n"; } my $have_file; if ($o{f}) { $have_file = open I,'<', $o{f} or die "Can't read '$o{f}': $!\ +n"; } my $c; while ($. <= $maxprefix) { my (@bytes,$mask,$value); if($have_file) { $_ = <I> or last; chomp; (@bytes[0..3],$mask,$value) = m!(\d+).(\d+).(\d+).(\d+)/(\ +d+)\s+(.*)!; } else { @bytes = map { int rand 255 } 0..3; $mask = int rand (30) + 1; # no null mask :) } my @nbits = map {split//,unpack "B*",pack "C",$_ } @bytes; my @mbits = ((1) x $mask, (0) x (32 - $mask)); my $key = ''; my @b = (); for (0..31) { my $mbit = shift @mbits; my $n = shift @nbits; my $nbit = $mbit ? $n : 0; $key .= $mbit . $nbit; push @b, $nbit; } redo if $h{$key}; my $prefix = join('.',unpack "C4",pack"B32",join'',@b).'/'.$ma +sk; $bytes[3] += 1 if $bytes[3] == 0; # XXX should check network # boundaries here... my $ip = join('.',@bytes); print IP $ip,"\n"; $h{$key} = $value || $prefix; $.++ unless $have_file; } untie %h; exit 0; } #setup my $c; # lookup counter if($o{t}) { # average open I, '<', 'ips'; eval { require Time::HiRes }; $@ and die "Sorry, no average here - Time::HiRes missing"; Time::HiRes->import(qw(gettimeofday tv_interval)); my $t0 = [gettimeofday()]; my $t1 = [gettimeofday()]; my $l; while(<I>) { chomp; my $net = match($_); $l += $c; unless($. % 1000) { my $e = tv_interval ( $t1, [gettimeofday()]); my $g = tv_interval ( $t0, [gettimeofday()]); print "$. $e, avg. ",int($l/$.); print ' lookups, ',$./$g,"/sec\n"; $t1 = [gettimeofday()]; } } my $elapsed = tv_interval ( $t0, [gettimeofday()]); print "elapsed: $elapsed\n"; print "avg. lookup time: ",$elapsed / $.,"\n"; } else { /^\d+\.\d+\.\d+\.\d+$/ and print "$_ => ",match($_),"\n" for @ARGV +; } sub match { my $ip = shift; my $packedip = pack "C4",split/\./,$ip; my $numericip = unpack "N", $packedip; my @bits = split //,unpack "B*",$packedip; # reset cursor. $x->seq(my $foo, my $bar, R_FIRST); # now search until mismatch. my ($key, $ok, $v, @net); $c = 1; # lookup counter for(@bits) { my $lk = $key .= 1 . $_; #print "$key\n"; next if $ok and $ok =~ /^$key/; # shortcut $x->seq($lk,$v,R_CURSOR); # check if this key is a candidate push @net, $v if $v and in_range($numericip, $lk); unless ($lk =~ /^$key/ && $v && length ($lk) == 64) { if ($o{a}) { print " $_\n" for @net; } print "$c lookups - " if wantarray; return pop @net; # return net with longest mask } $ok = $lk; $c++; die "more than 32 lookups" if $c > 32; } # only one or no prefix found print "$c lookups - " if wantarray; pop @net; } sub in_range { my ($ip, $bits) = @_; my $net; $bits =~ s/(.)(.)/$net .= $2; $1/ge; my ($addr, $mask) = map { pack "B32",$_ } $net, $bits; my ($n,$m,$b) = map { unpack"N",$_} $addr,$mask,$addr|~$mask; $ip >= $n && $ip <= $b; }

The big advantages of this method over Net::Patricia:

  • no additional modules needed
  • tweakable
  • file based caching, so no memory hog
  • data structure available to multiple processes

Disadvantage:

  • slower by an order of magnitude

update: tweaked so you can feed it a real AS-table (e.g. ...ip_to_as/bgp.txt) to setup the database. Running this script against the aformentioned file, I get an average of 0.0021 secs / lookup.

--shmem

_($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                              /\_¯/(q    /
----------------------------  \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}

Replies are listed 'Best First'.
Re: IP prefixes: match IP against "best" network
by Fangorn (Initiate) on Jul 03, 2007 at 12:41 UTC

    Great one, I'm happy to have inspired You :).

    I used (some of) the code in my project, I hope to publish it soon.

    Here's my small script that's using aggregate to, well, aggregate prefixes. That's not perl, I hope it's not a blasphemy here :) . And it actually uses zsh syntax, but it should not be a problem to bashify it.

    #!/usr/local/bin/zsh INDEX=1 while [ $INDEX -lt 70000 ] do PREFIXES=`grep "[[:space:]]$INDEX$" example|awk '{print $1}'` if [ $PREFIXES ] then a=`echo $PREFIXES|aggregate` echo $a|awk '{print $1, "\t'$INDEX'"}' let INDEX=$INDEX+1 else let INDEX=$INDEX+1 fi done

    For me it crunched ~220000 prefixes to ~120000 so I think it's worth to use it. Not only will the data be smaller but also there will be less lookups (it only aggregates data prefixes of a single AS, so if there are overlapping prefixes with different AS numbers they will be preserved).

      HI, What Do I need to try this script? I am new to Perl and I find it fascinating.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: CUFP [id://624076]
Approved by Corion
Front-paged by liverpole
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (2)
As of 2021-03-01 01:57 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?