

by claree0 (Hermit)
on Aug 22, 2001 at 17:00 UTC (#106929=note)

in reply to Halve the difference
in thread Removing old records from log files

Well, I've made some mods to your sample code, and taken the trim time on my sample file from 2m26s to 0.4 seconds. Wow!

In the code below, I haven't included the subroutine to calculate the epoch-second date of each line 'cos it's longer than the rest of the file!

Thank you, Tachyon!

#!/usr/local/perl -w
use strict;

my $file         = 'current.txt';
my $daystokeep   = $ARGV[0];
my $secs_to_keep = $daystokeep * 3600 * 24;
my $now          = time();
my $earliest     = $now - $secs_to_keep;
my $file_size    = -s $file;
my $top          = 0;
my $bottom       = $file_size;
my $count        = 0;
my $max_tries    = 100;

open (OLD, "$file") or die $!;
open (NEW, ">new.txt") or die $!;

# linesecs() (omitted here) returns the epoch-second date of a log line.
# Give up after $max_tries probes rather than looping forever.
while (++$count <= $max_tries) {
    my $middle = int (($top + $bottom) / 2);
    seek OLD, $middle, 0;
    my $partial = <OLD>;    # we probably landed mid-line, so discard this one
    my $full    = <OLD>;
    my $next    = <OLD>;
    # Found the boundary: $full is too old and $next is recent enough.
    if (((linesecs($full)) < $earliest) && ((linesecs($next)) > $earliest)) {
        print NEW $next;
        print NEW while <OLD>;
        exit;
    }
    if ((linesecs($full)) < $earliest) {
        $top = $middle;
    }
    else {
        $bottom = $middle;
    }
}
close OLD;
close NEW;
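For anyone who wants to try the seek-and-halve idea without the real log file, here is a minimal self-contained sketch. It is not the code from the post: the helper names next_line_start and first_offset_to_keep are made up for illustration, and it assumes every line starts with an epoch-seconds timestamp and that the file ends with a newline, since the real linesecs() is not shown. It also steps to the next line start after each seek instead of reading three lines per probe.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# ASSUMPTION: each log line begins with an epoch-seconds timestamp.
# This is a hypothetical stand-in for the linesecs() omitted from the post.
sub linesecs {
    my ($line) = @_;
    my ($secs) = $line =~ /^(\d+)/;
    return $secs;
}

# Byte offset of the first line that starts at or after $offset.
sub next_line_start {
    my ($fh, $offset) = @_;
    return 0 if $offset <= 0;
    seek $fh, $offset - 1, 0;
    my $discard = <$fh>;    # read through the newline that ends the current line
    return tell $fh;
}

# Byte offset of the first line whose timestamp is >= $earliest,
# or the file size if every line is older than that.
sub first_offset_to_keep {
    my ($fh, $size, $earliest) = @_;
    my ($low, $high) = (0, $size);
    while ($low < $high) {
        my $middle = int(($low + $high) / 2);
        my $start  = next_line_start($fh, $middle);
        seek $fh, $start, 0;
        my $line = <$fh>;
        if (!defined $line || linesecs($line) >= $earliest) {
            $high = $middle;        # boundary is at or before this probe
        }
        else {
            $low = $middle + 1;     # this line is too old, look further down
        }
    }
    return next_line_start($fh, $low);
}

# Build a tiny sorted "log" to demonstrate on.
my ($tmp, $filename) = tempfile(UNLINK => 1);
print {$tmp} "100 a\n", "200 b\n", "300 c\n", "400 d\n";
close $tmp or die $!;

my $size = -s $filename;
open my $in, '<', $filename or die $!;

my $offset = first_offset_to_keep($in, $size, 250);
seek $in, $offset, 0;
print while <$in>;    # prints the "300 c" and "400 d" lines
```

Because each probe halves the byte range, the number of seeks grows with the logarithm of the file size, which is where the speed-up over reading every line comes from.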

Re: Yeehah!
by tachyon (Chancellor) on Aug 22, 2001 at 19:17 UTC

    Wow, 36500% faster. That's a worthwhile saving. Glad it helped. It's always good to use a geometric search rather than a linear one when you have any form of sorted data that you can use the split-the-difference algorithm on. The number of tries to find the desired position is given by:

    print "Num items Geom avg Lin avg Lin:Geom\n";
    for ( my $num_items = 2; $num_items < 2<<20; $num_items <<= 1 ) {
        # geometric
        my $geom_max = int(log($num_items)/log(2))+1;
        my $geom_avg = int(log(($num_items/2))/log(2))+1;
        # linear
        my $lin_max = $num_items;
        my $lin_avg = $num_items/2;
        printf "%8d %8d %8d %8d\n", $num_items, $geom_avg, $lin_avg, $lin_avg/$geom_avg;
    }
    Should wrap it in a module one day :-)
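The worst-case figure int(log(n)/log(2))+1 can be checked empirically. The sketch below is an illustration only: the sub names binary_probes, probe_bound and worst_case_probes are hypothetical. It runs a plain split-the-difference search over a sorted array, counts the probes, and compares the worst case with the bound, which is computed by integer shifting to sidestep floating-point rounding in log().

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split-the-difference search over a sorted array ref.
# Returns the number of probes made while looking for $target.
sub binary_probes {
    my ($sorted, $target) = @_;
    my ($low, $high, $probes) = (0, $#$sorted, 0);
    while ($low <= $high) {
        my $middle = int(($low + $high) / 2);
        $probes++;
        if    ($sorted->[$middle] < $target) { $low  = $middle + 1 }
        elsif ($sorted->[$middle] > $target) { $high = $middle - 1 }
        else                                 { return $probes }
    }
    return $probes;
}

# floor(log2(n)) + 1, computed as the number of bits in $n.
sub probe_bound {
    my ($n) = @_;
    my $bits = 0;
    while ($n > 0) {
        $bits++;
        $n >>= 1;
    }
    return $bits;
}

# Largest probe count over every item present in a sorted list of $n items.
sub worst_case_probes {
    my ($n) = @_;
    my @sorted = (1 .. $n);
    my $worst  = 0;
    for my $target (@sorted) {
        my $probes = binary_probes(\@sorted, $target);
        $worst = $probes if $probes > $worst;
    }
    return $worst;
}

for my $num_items (16, 1000, 1024, 65536) {
    printf "%6d items: worst case %2d probes, bound %2d, linear worst case %d\n",
        $num_items, worst_case_probes($num_items), probe_bound($num_items), $num_items;
}
```

The measured worst case matches the bound for each size, while the linear worst case is simply the number of items.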



