Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Quickly determine which IP range an IP belongs to.

by punch_card_don (Curate)
on Nov 30, 2010 at 01:48 UTC ( #874413=perlquestion: print w/ replies, xml ) Need Help??
punch_card_don has asked for the wisdom of the Perl Monks concerning the following question:

Misty Monks,

I have a long list of IPs and IP ranges that correspond to various users. Then I'm given a single IP address and need to determine which user it is. For example, my user IPs might look like:

User 1 = abc.def.ghi.001 abc.def.ghi.002 abc.def.ghi.003
User 2 = jkl.mno.pqr.0/255.255.192.0
User 2 = stu.vwx.0.0/255.255.192.0 (yes, another range for User 2)

and so on for dozens or hundreds of users.

Then, given a single IP adress, determine which user it is. Prefereably without having to do something brute force like looping through every range with a subnet mask module to test against my IP.

All good ideas will be greatly appreciated.

Thanks




Time flies like an arrow. Fruit flies like a banana.

Comment on Quickly determine which IP range an IP belongs to.
Re: Quickly determine which IP range an IP belongs to.
by ikegami (Pope) on Nov 30, 2010 at 01:59 UTC
    1. Convert the addresses and masks into 32-bit numbers or 4-byte strings.
    2. For each range,
      1. "And" ("&") the specified IP with the range's mask
      2. If the result is the start IP of the masks's range, the address is in that range.

    If you have lots of ranges, you could group them by mask.

    $user_lookup{inet_aton($mask)}{inet_aton($start_addr)} = $user_num;
Re: Quickly determine which IP range an IP belongs to. (two keys)
by tye (Cardinal) on Nov 30, 2010 at 05:40 UTC

    The route I recently went that worked well for the mix of IPv4 addresses and ranges that I had was to, for IPv4 addresses, just use the IP address as a hash key. For IP ranges, I used the first 3 octets of the range as the hash key and the value was a list of ranges that I looped through (it was rare that I had more than one range having the same first three octets). I didn't have any ranges that covered more than one three-octet block, but if I had, I would split such ranges up.

    my %hash= ( '127.0.0.1' => 'localhost', '10.11.12.1' => 'router', '10.11.12.10' => 'desktop', '10.11.12.' => [ { min => 101, max => 120, desc => 'printers' }, { min => 250, max => 254, desc => 'tanks' }, ], ); sub findIP { my( $ip )= @_; my $desc= $hash{$ip}; return( $ip, $desc ) if $desc; my( $last )= $ip =~ s/(\d+)$//; my $ranges= $hash{$ip}; for my $range ( @{ $ranges || [] } ) { my( $min, $max )= @$range{'min','max'}; return( "$ip.$min-$max", $range->{desc} ) if $min <= $last && $last <= $max; } return; }

    - tye        

Re: Quickly determine which IP range an IP belongs to.
by Anonymous Monk on Nov 30, 2010 at 05:57 UTC
    As long as you don't mean blindingly fast and huge. You want Net::Netmask.
    #!/usr/bin/env perl use strict; use warnings; use Net::Netmask; # our owners and their ranges my @blocks = ( [ 'bob', '192.168.1.0/24' ], [ 'nancy', '192.168.1.12' ], [ 'ralph', '192.168.2.0/255.255.255.0' ], [ 'bob', '192.168.3.0/24' ], [ 'NOBODY', '0.0.0.0/0' ], ); # populate the table with tagged blocks for my $block (@blocks) { my ($owner, $range) = @$block; my $nb = Net::Netmask->new2($range); $nb->tag('owner', $owner); $nb->storeNetblock(); }; # do the lookup. while (my $ip = <DATA>) { chomp $ip; my $block = findNetblock($ip); print "$ip belongs to ", $block->tag('owner'), "\n"; } __DATA__ 192.168.1.45 192.168.1.12 192.168.2.4 192.168.3.33 192.168.4.12
    Output:
    $ ./net_netmask.pl 192.168.1.45 belongs to bob 192.168.1.12 belongs to nancy 192.168.2.4 belongs to ralph 192.168.3.33 belongs to bob 192.168.4.12 belongs to NOBODY
    This works rather well unless you're trying to grok the full internet route tables or such. (Somewhere there's a XS module that implements the routing table from BSD if you need really fast/big).
Re: Quickly determine which IP range an IP belongs to.
by salva (Monsignor) on Nov 30, 2010 at 07:05 UTC
    sort + binary search
Re: Quickly determine which IP range an IP belongs to.
by BrowserUk (Pope) on Nov 30, 2010 at 10:07 UTC

    I concur with salva. Once you expand your ranges and store them sorted in a packed binary file, lookup via binary search are very fast.

    I generated a file of 125,000 ips (an average of 125 allocated to each of 1000 users).

    Single lookups with a cold cache take about 1/100th of a second, which is probably quicker than you could load the file of ranges and populate a hash:

    C:\test>echo 92.103.2.204 | 874413.pl 92.103.2.204 allocated to user;552 1 lookups (1 found) in 0.011 seconds (0.011339/sec) C:\test>echo 63.203.110.110 | 874413.pl 63.203.110.110 allocated to user;532 1 lookups (1 found) in 0.012 seconds (0.012047/sec) C:\test>echo 109.224.107.185 | 874413.pl 109.224.107.185 allocated to user;139 1 lookups (1 found) in 0.012 seconds (0.012018/sec) C:\test>echo 122.228.116.36 | 874413.pl 122.228.116.36 allocated to user;485 1 lookups (1 found) in 0.013 seconds (0.012670/sec) C:\test>echo 0.0.0.0 | 874413.pl 1 lookups (0 found) in 0.007 seconds (0.007276/sec) C:\test>echo 127.0.0.1 | 874413.pl 1 lookups (0 found) in 0.007 seconds (0.007116/sec) C:\test>echo 2.103.2.204 | 874413.pl 1 lookups (0 found) in 0.008 seconds (0.008237/sec)

    Once the cache is hot--as would be the case if doing a large number of lookups--that speeds up considerably to well under 1 millisecond per:

    >perl -MList::Util=shuffle -nE"chomp(@i=shuffle<>); say +(split':')[0] for @i[0..1e3];last" ips.t +xt |874413.pl >nul 1001 lookups (1001 found) in 0.456 seconds (0.000455/sec)

    My crude test code:

    #! perl -slw use strict; use Time::HiRes qw[ time ]; open IPS, '<raw', 'ips.bin' or die $!; my $size = -s( *IPS ); $size /= 6; sub find { local $/ = \6; my $toFind = unpack 'N', pack 'C4', split '\.', shift; my( $lo, $hi ) = ( 0, $size ); my( $ip, $u ); while( $lo <= $hi ) { my $pos = ( $lo + $hi ) >> 1; seek IPS, $pos*6, 0; ( $ip, $u ) = unpack 'Nn', <IPS>; $hi = $pos - 1, next if $ip > $toFind; $lo = $pos + 1, next if $ip < $toFind; return join('.',unpack'C4',pack'N',$ip), $u if $ip == $toFind; } return; } my $start = time; my $found = 0; while( <> ) { chomp; my( $ip, $user ) = find( $_ ); #warn "$_ unassigned\n" and next unless $user; ++$found; printf "%s allocated to user;%u\n", $ip, $user; } my $taken = time() - $start; warn sprintf "%d lookups ($found found) in %.3f seconds (%.6f/sec)\n", + $., $taken, $taken / $.;

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Quickly determine which IP range an IP belongs to.
by punch_card_don (Curate) on Nov 30, 2010 at 12:55 UTC
    Wow - all excellent suggestions. Rather than roll my own, I'm leaning toward using Net::Netmask. I had seen it, but didn't realize it could do quite exactly that.

    Thanks, all.

Re: Quickly determine which IP range an IP belongs to.
by valdez (Monsignor) on Nov 30, 2010 at 14:10 UTC

    I would use Net::Patricia, from the documentation:

    This module uses a Patricia Trie data structure to quickly perform IP address prefix matching for applications such as IP subnet, network or routing table lookups. The data structure is based on a radix tree using a radix of two, so sometimes you see patricia implementations called "radix" as well. The term "Trie" is derived from the word "retrieval" but is pronounced like "try". Patricia stands for "Practical Algorithm to Retrieve Information Coded as Alphanumeric", and was first suggested for routing table lookups by Van Jacobsen. Patricia Trie performance characteristics are well-known as it has been employed for routing table lookups within the BSD kernel since the 4.3 Reno release.

    Ciao, Valerio

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://874413]
Approved by ikegami
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (20)
As of 2014-07-22 14:50 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (116 votes), past polls