
Re: Reduce CPU utilization time in reading file using perl

by Laurent_R (Abbot)
on Sep 30, 2013 at 11:40 UTC


in reply to Reduce CPU utilization time in reading file using perl

If your files are sorted by the comparison key, you can iterate through the two files in parallel. This can be very fast. Just a couple of hours ago, I compared two 100-MB files this way; it took less than 3 seconds to run.

$ time perl compare_files.pl
real    0m2.378s
user    0m1.384s
sys     0m0.069s

Even if they are not sorted, this may still be the solution: first sort both files, then read them in parallel. The only difficulty is getting the parallel reading exactly right.
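
To make the parallel read concrete, here is a minimal sketch of the idea. It is not the compare_files.pl timed above, which is not shown in this post. It assumes that each line starts with its key followed by a tab, that both inputs are sorted in plain string (cmp) order, and that a key appears at most once per file. The names key_of and compare_sorted and the event labels are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: the key is assumed to be everything before the first tab.
sub key_of {
    my ($line) = @_;
    my ($key) = $line =~ /^([^\t]*)/;
    return $key;
}

# Walk two filehandles in parallel, advancing whichever side has the
# smaller key. Assumes both inputs are sorted by key in string (cmp)
# order and that each key occurs at most once per file.
sub compare_sorted {
    my ($fh_a, $fh_b, $report) = @_;
    my $line_a = <$fh_a>;
    my $line_b = <$fh_b>;

    while (defined $line_a and defined $line_b) {
        chomp(my $rec_a = $line_a);
        chomp(my $rec_b = $line_b);
        my $order = key_of($rec_a) cmp key_of($rec_b);

        if ($order < 0) {        # key exists only in file A
            $report->('only_in_a', $rec_a);
            $line_a = <$fh_a>;
        }
        elsif ($order > 0) {     # key exists only in file B
            $report->('only_in_b', $rec_b);
            $line_b = <$fh_b>;
        }
        else {                   # same key: compare the full records
            $report->('differs', $rec_a, $rec_b) if $rec_a ne $rec_b;
            $line_a = <$fh_a>;
            $line_b = <$fh_b>;
        }
    }

    # One file is exhausted; whatever remains in the other is unmatched.
    while (defined $line_a) {
        chomp $line_a;
        $report->('only_in_a', $line_a);
        $line_a = <$fh_a>;
    }
    while (defined $line_b) {
        chomp $line_b;
        $report->('only_in_b', $line_b);
        $line_b = <$fh_b>;
    }
    return;
}

# Small demonstration using in-memory filehandles instead of real files.
my $data_a = "apple\t1\nbanana\t2\ncherry\t3\n";
my $data_b = "banana\t2\ncherry\t4\ndate\t5\n";
open my $fh_a, '<', \$data_a or die "cannot open in-memory file A: $!";
open my $fh_b, '<', \$data_b or die "cannot open in-memory file B: $!";

compare_sorted($fh_a, $fh_b, sub {
    my ($event, @records) = @_;
    print join(' | ', $event, @records), "\n";
});
```

The parts that are easy to get wrong are the two trailing loops, which drain whichever file is longer, and the comparison operator: it has to match the order the files were sorted in (cmp for string order, <=> for numeric keys). Only two lines are held in memory at any time, so the cost is one sequential pass over each file.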
