Re^6: Searching large files a block at a time

by marioroy (Parson)
on Aug 04, 2017 at 15:46 UTC (#1196734)

in reply to Re^5: Searching large files a block at a time
in thread Searching large files a block at a time

Another possibility is sending the result to the manager process via MCE->gather. MCE::Candy provides an ordered output iterator.

Unlike the previous demonstration, this one does not require MCE 1.830 or later.

#!/usr/bin/perl

use strict;
use warnings;

use MCE::Loop;
use MCE::Candy;

my $mbnum = $ARGV[0] or die "usage: $0 mbnum\n";

my @ldif_files = qw(
   /path/to/file1.ldif.bz2
   /path/to/file2.ldif.bz2
   /path/to/file3.ldif.bz2
);

MCE::Loop->init(
   chunk_size => 1, max_workers => scalar @ldif_files,
   gather => MCE::Candy::out_iter_fh(\*STDOUT)
);

mce_loop {
   my ($mce, $chunk_ref, $chunk_id) = @_;
   my ($file, $ret) = ($chunk_ref->[0], '');

   # Must localize $/ to not stall MCE, fixed in 1.830.
   # Localizing $/ is recommended, but fixed MCE if not.
   local $/ = "";

   open my $fh, "-|", "/usr/bin/bzcat $file" or warn "open error ($file): $!";

   if (defined fileno($fh)) {
      while (<$fh>) {
         if (/uid=$mbnum/m) {
            $ret = "## $file\n";
            $ret .= $_;
            last;
         }
      }
      close $fh;
   }

   # The out_iter_fh iterator wants the chunk_id value.
   # Thus, all participating workers must call gather once only.
   # The manager process outputs the value for chunk_id 1 first,
   # then chunk_id 2, et cetera.
   MCE->gather($chunk_id, $ret);

} \@ldif_files;

MCE::Loop->finish;
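To illustrate why each worker must call gather exactly once with its chunk_id, here is a minimal core-Perl sketch of the idea behind an ordered output iterator. It is not MCE::Candy's actual implementation; `make_ordered_emitter` and the sample data are hypothetical names made up for this illustration. Results that arrive early are held until every lower chunk_id has been emitted, so a chunk that never reports would hold back everything after it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MCE): returns a closure that accepts
# ($chunk_id, $data) in any order and hands the data to $emit in
# chunk_id order, starting at 1.
sub make_ordered_emitter {
    my ($emit) = @_;
    my %pending;        # results that arrived early, keyed by chunk_id
    my $next_id = 1;    # the chunk_id we are waiting to output next

    return sub {
        my ($chunk_id, $data) = @_;
        $pending{$chunk_id} = $data;

        # Flush every consecutive result that is now available.
        while (exists $pending{$next_id}) {
            $emit->(delete $pending{$next_id});
            $next_id++;
        }
        return;
    };
}

my @out;
my $gather = make_ordered_emitter(sub { push @out, $_[0] });

# Simulate workers finishing out of order.
$gather->(2, "second\n");
$gather->(3, "third\n");
$gather->(1, "first\n");

print @out;
```

In this sketch nothing is emitted until chunk 1 arrives, after which all three buffered results are flushed in order. An empty string still counts as a result, which is why the script above gathers `$ret` even when no match was found.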

Regards, Mario
