
Re^3: Help me beat NodeJS

by rickyw59 (Novice)
on Feb 13, 2016 at 20:59 UTC ( [id://1155178]=note )


in reply to Re^2: Help me beat NodeJS
in thread Help me beat NodeJS

Wow, thanks, I'll give this a shot. I'll have to read up on MCE; it looks very useful. I should have clarified: to be fair, the "parser" function in Node.js applies the same regex as the Perl version. I've run the tests two ways: searching for a specific string (/some_string/), and using the regex from the code above (/".*?"|\S+/g), which captures every field into an array, since the lines are in this format: ' 1970-01-01 00:00:00 1.1.1.1 "A multi-word field" 2.2.2.2 '
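To make the field layout concrete, here is a minimal sketch that applies the same regex to the sample line quoted above and prints each captured field with its array index. Nothing is assumed beyond that sample line; note that the quoted field keeps its surrounding double quotes.

```perl
use strict;
use warnings;

# Sample line copied from the post above.
my $line = ' 1970-01-01 00:00:00 1.1.1.1 "A multi-word field" 2.2.2.2 ';

# Each match is either a double-quoted run or a run of non-whitespace.
my @matches = $line =~ /".*?"|\S+/g;

# Print one field per line, prefixed with its array index.
printf "%d: %s\n", $_, $matches[$_] for 0 .. $#matches;
```

With this input, the date is at index 0, the time at index 1, the first address at index 2, the quoted field at index 3, and the second address at index 4.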

Replies are listed 'Best First'.
Re^4: Help me beat NodeJS
by marioroy (Prior) on Feb 13, 2016 at 22:10 UTC

    Got it. I've updated both MCE demonstrations to account for the pattern matching. The more expensive regex (/".*?"|\S+/g) is applied only if the line matches the initial string pattern, which will likely run faster.

    Likewise, for Parallel::ForkManager.

    #!/usr/local/bin/perl

    use strict;
    use warnings;

    use Parallel::ForkManager;

    # Run up to 24 child processes, one per log file.
    my $pm = Parallel::ForkManager->new(24);

    my $dir     = '/data/logs/*.log.gz';
    my @files   = sort( glob "$dir" );
    my $pattern = "some_string";

    $pm->set_waitpid_blocking_sleep(0);

    for my $file (@files) {
        # The parent gets a true value from start() and moves on;
        # the child falls through and processes the file.
        $pm->start and next;

        open( my $fh, "-|", "/bin/zcat", $file ) or die "open error: $!\n";

        while ( my $line = <$fh> ) {
            # Cheap test first; split into fields only on a hit.
            if ( $line =~ /$pattern/ ) {
                my @matches = $line =~ /".*?"|\S+/g;
                print "$matches[0],$matches[1],$matches[3],$matches[4]\n";
            }
        }

        $pm->finish;
    }

    $pm->wait_all_children;

    Regards, Mario

      If you're just looking for a plain string rather than a regex, it should be quicker to use index:

      while ( my $line = <$fh> ) {
          if ( index( $line, $pattern ) != -1 ) {
              ...
          }
      }
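One thing to watch when swapping a regex for index: index treats the pattern as literal text, while /$pattern/ interprets regex metacharacters such as the dot. The sketch below illustrates the difference; the helper names and the sample strings are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.

# Literal substring test: no character in $pattern is special.
sub matches_literal {
    my ( $line, $pattern ) = @_;
    return index( $line, $pattern ) != -1 ? 1 : 0;
}

# Regex test: metacharacters in $pattern (such as '.') are interpreted.
sub matches_regex {
    my ( $line, $pattern ) = @_;
    return $line =~ /$pattern/ ? 1 : 0;
}

# Regex test with \Q...\E, which escapes metacharacters so the
# pattern is matched literally, like index.
sub matches_quoted {
    my ( $line, $pattern ) = @_;
    return $line =~ /\Q$pattern\E/ ? 1 : 0;
}

my $pattern = '1.1.1.1';
my $line    = 'host 1a1b1c1 up';    # no literal "1.1.1.1" in here

printf "literal: %d\n", matches_literal( $line, $pattern );
printf "regex:   %d\n", matches_regex( $line, $pattern );
printf "quoted:  %d\n", matches_quoted( $line, $pattern );
```

Here the unquoted regex reports a match because each dot matches any character, while index and the \Q...\E form do not. For a pattern with no metacharacters, such as "some_string", all three agree.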

        yes, ++
