PerlMonks  

Re: Working with large amount of data

by massa (Hermit)
on Sep 21, 2009 at 10:56 UTC ([id://796518])


in reply to Working with large amount of data

This:
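#!/usr/bin/perl
use strict;
use Socket;

# Start clean: remove bucket files left over from a previous run.
unlink for </tmp/prefix.*.ips>;

# Pass 1: append each address to a bucket file named after its first octet.
my %prefixes;
while( <> ) {
    while( /((\d+)\.\d+\.\d+\.\d+)/g ) {
        my $packed = inet_aton($1) or next;   # skip anything that is not a valid address
        open my $f, '>>', "/tmp/prefix.$2.ips" or die "prefix.$2.ips: $!";
        $prefixes{$2}++;
        # Store a decimal integer: the raw packed bytes may contain "\n".
        print $f unpack('N', $packed), "\n";
    }
}

# Pass 2: count one bucket at a time, so only that bucket is held in memory.
for ( sort {$a <=> $b} keys %prefixes ) {
    my %addresses;
    open my $f, '<', "/tmp/prefix.$_.ips" or die "prefix.$_.ips: $!";
    while( <$f> ) { chomp; $addresses{$_}++ }
    printf "%-20.20s => %d\n", inet_ntoa(pack('N', $_)), $addresses{$_}
        for sort {$a <=> $b} keys %addresses;
}
unlink "/tmp/prefix.$_.ips" for keys %prefixes;
__END__
64.233.169.17        => 1
64.233.169.18        => 1
75.101.152.211       => 2
85.17.189.130        => 4
127.0.0.1            => 8
174.129.112.136      => 1
174.129.233.130      => 1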
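should work OK (tested here with far fewer than a billion IPs). Only one bucket's addresses are in memory at a time. If you think it's still hogging memory, you can bucket on the first two octets instead of one, which gives more, smaller files (up to 65,536 instead of 256 for valid addresses). Just replace
while( /((\d+)\.\d+\.\d+\.\d+)/g ) {
with
while( /((\d+\.\d+)\.\d+\.\d+)/g ) {
and be done with it.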
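To see what the longer prefix buys you, here is a minimal sketch that buckets a few sample addresses in memory rather than in /tmp files. The helpers `bucket_key` and `bucket_sizes` and the sample addresses are made up for illustration; they are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the bucket key is the first $octets octets
# of a dotted-quad address, e.g. ("85.17.189.130", 2) gives "85.17".
sub bucket_key {
    my ($ip, $octets) = @_;
    my @parts = split /\./, $ip;
    return join '.', @parts[0 .. $octets - 1];
}

# Hypothetical helper: count how many addresses land in each bucket.
sub bucket_sizes {
    my ($octets, @ips) = @_;
    my %size;
    $size{ bucket_key($_, $octets) }++ for @ips;
    return \%size;
}

# Made-up sample input.
my @sample = qw(85.17.189.130 85.18.0.1 85.17.189.131 127.0.0.1);

for my $octets (1, 2) {
    my $size = bucket_sizes($octets, @sample);
    printf "%d octet(s): %s\n", $octets,
        join ', ', map { "$_ => $size->{$_}" } sort keys %$size;
}
```

With one octet the sample falls into two buckets, the largest holding three addresses. With two octets it falls into three buckets, the largest holding two. The trade-off is the same on real data: more files on disk, but a smaller hash per file.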
[]s, HTH, Massa (κς,πμ,πλ)
