Beefy Boxes and Bandwidth Generously Provided by pair Networks Cowboy Neal with Hat
No such thing as a small change
 
PerlMonks  

Yeehah!

by claree0 (Hermit)
on Aug 22, 2001 at 13:00 UTC ( [id://106929]=note: print w/replies, xml ) Need Help??

This is an archived low-energy page for bots and other anonmyous visitors. Please sign up if you are a human and want to interact.


in reply to Halve the difference
in thread Removing old records from log files

Well, I've made some mods to your sample code, and taken the trim time on my sample file from 2m26s to 0.4 seconds. Wow!

In the code below, I haven't included the subroutine to calculate the epoch-second date of each line 'cos it's longer htan the rest of the file!

Thank you, Tachyon!

#!/usr/local/perl -w use strict; my $file ='current.txt'; my $daystokeep = $ARGV[0]; my $secs_to_keep = $daystokeep * 3600 * 24; my $now = time(); my $earliest = $now - $secs_to_keep; my $file_size = -s $file; my $top = 0; my $bottom = $file_size; my $count = 0; my $max_tries = 100; open (OLD, "$file") or die $!; open (NEW, ">new.txt") or die $!; while (++$count) { my $middle = int (($top + $bottom) / 2); seek OLD, $middle, 0; my $partial = <OLD>; my $full = <OLD>; my $next = <OLD>; if (((linesecs($full)) < $earliest) && ((linesecs($next)) > $e +arliest)) { print NEW $next; print NEW while <OLD>; exit; } if ((linesecs($full)) < $earliest) { $top = $middle; } else { $bottom = $middle; } } close OLD; close NEW;

Replies are listed 'Best First'.
Re: Yeehah!
by tachyon (Chancellor) on Aug 22, 2001 at 15:17 UTC

    Wow, 36500% faster. That's a worthwhile saving. Glad it helped. It's always good to use a geometric search rather than a linear one when you have any form of sorted data that you can use the split the dif algoritm on. The number of tries to find the desired position is given by:

    print "Num items Geom avg Lin avg Lin:Geom\n"; for ( my $num_items = 2; $num_items < 2<<20; $num_items <<= 1 ) { # geometric my $geom_max = int(log($num_items)/log(2))+1; my $geom_avg = int(log(($num_items/2))/log(2))+1; # linear my $lin_max = $num_items; my $lin_avg = $num_items/2; printf "%8d %8d %8d %8d\n", $num_items, $geom_avg, $lin_avg, $lin_avg/$geom_avg; }
    Should wrap it in a module one day :-)

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://106929]
help
Sections?
Information?
Find Nodes?
Leftovers?
    Notices?
    hippoepoptai's answer Re: how do I set a cookie and redirect was blessed by hippo!
    erzuuliAnonymous Monks are no longer allowed to use Super Search, due to an excessive use of this resource by robots.