PerlMonks  

Re^4: How to optimize a regex on a large file read line by line ?

by poj (Abbot)
on Apr 16, 2016 at 16:28 UTC ( [id://1160653] )


in reply to Re^3: How to optimize a regex on a large file read line by line ?
in thread How to optimize a regex on a large file read line by line ?

Please give me your execution times with the same code

Using my own 200-million-record, 2 GB file, it takes 25 seconds to get a count of lines only and 50 seconds with the regex included (Windows 10, i5 3.3 GHz / 8 GB, ActiveState Perl v5.16.1).

#!perl
use strict;

my $testfile = '200-million-combos.txt';

# create the test file on first run
unless (-e $testfile){
  open OUT,'>',$testfile or die "$!";
  my $record = '890123456';
  for (1..200_000_000){
    print OUT $record."\n";
  }
  close OUT;
}

my $counter1 = 0;  # lines read
my $counter2 = 0;  # lines matching the regex

my $t0 = time;
open FH, '<', $testfile or die "$!";
while (<FH>) {
  ++$counter1;
  if (/123456$/){
    ++$counter2;
  }
}
close FH;

my $dur = time-$t0;
print "$counter1 read in $dur secs\n";
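The two figures above (count only versus count plus regex) can be reproduced in a single run. What follows is a minimal sketch, not the script that produced the timings quoted in this post: the temporary 100_000-line file, the `timed_pass` helper and the `substr` comparison are assumptions added for illustration. The `substr` test also assumes every line ends in a newline, and calling each test through a code reference adds overhead of its own, so only the relative differences are meaningful.

```perl
#!perl
use strict;
use warnings;
use Time::HiRes qw(time);      # sub-second resolution for short runs
use File::Temp qw(tempfile);

# Hypothetical small test file, not the 200-million-line file used above.
my ($out, $testfile) = tempfile(UNLINK => 1);
print {$out} "890123456\n" for 1 .. 100_000;
close $out or die "$!";

# One pass over the file, applying $test to each line.
# Returns (lines read, lines matched, elapsed seconds).
sub timed_pass {
    my ($test) = @_;
    my ($lines, $hits) = (0, 0);
    my $t0 = time;
    open my $fh, '<', $testfile or die "$!";
    while (my $line = <$fh>) {
        ++$lines;
        ++$hits if $test->($line);
    }
    close $fh;
    return ($lines, $hits, time - $t0);
}

my @tests = (
    [ 'count only' => sub { 0 } ],                                 # read cost alone
    [ 'regex'      => sub { $_[0] =~ /123456$/ } ],                # test from the post
    [ 'substr'     => sub { substr($_[0], -7) eq "123456\n" } ],   # assumed alternative
);

my %result;
for my $t (@tests) {
    my ($name, $code) = @$t;
    $result{$name} = [ timed_pass($code) ];
    printf "%-10s %d read, %d matched in %.3f secs\n", $name, @{ $result{$name} };
}
```

Subtracting the "count only" time from the other two gives a rough idea of how much of the total is spent matching rather than reading.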
poj

Replies are listed 'Best First'.
Re^5: How to optimize a regex on a large file read line by line ?
by John FENDER (Acolyte) on Apr 16, 2016 at 18:01 UTC
    Sounds good to my ears. Which distribution/version are you using?
      This is perl 5, version 16, subversion 1 (v5.16.1) built for MSWin32-x64-multi-thread
      (with 1 registered patch, see perl -V for more detail)
      Binary build 1601 [296175] provided by ActiveState http://www.ActiveState.com
      Built Aug 30 2012 18:41:50
        12.6 min on my side with a newer perl, same distribution as yours:
        perl -v
        This is perl 5, version 22, subversion 1 (v5.22.1) built for MSWin32-x64-multi-thread
        (with 1 registered patch, see perl -V for more detail)
        Copyright 1987-2015, Larry Wall
        Binary build 2201 [299574] provided by ActiveState http://www.ActiveState.com
        Built Jan 4 2016 12:12:58
        Could you give me your time with this code and the same file (http://mab.to/tbT8VsPDm)?
        perl demo.pl
        open (FH, '<', "../Tests/10-million-combos.txt");
        $counter=0;
        $counter2=0;
        while (<FH>) {
          if (/123456$/) {++$counter2;}
        }
        print "\n";
        print "Num. Line : $. - Occ : $counter2\n";
        close FH;
        Thanks.
