Re: Multithreading a large file split to multiple files

by BrowserUk (Pope)
on May 14, 2018 at 22:07 UTC (#1214510)

in reply to Multithreading a large file split to multiple files

Can I make it run on multiple cores so it runs faster?

Short answer: no.

The logic of your code dictates that the records in the input file be read in strict first-to-last sequence. Thus, any overhead from switching threads or sharing state is simply added to the time required for processing.

Even the code towards the end of the loop is dependent on state changes made earlier in that loop.

And with 15GB of input, there isn't even any mileage in accumulating output in memory to avoid disk thrash.

It's doubtful whether even MCE can help you with this.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
In the absence of evidence, opinion is indistinguishable from prejudice. Suck that fhit

Re^2: Multithreading a large file split to multiple files
by Marshall (Abbot) on May 15, 2018 at 08:53 UTC
    I agree that multiple cores will not help, because the sequential read of the input file is a blocking point.

    I am not so sure about output buffering. I really don't know in this situation, but depending upon the file system and other factors, like the intelligence of the disk controller, increasing the buffer size for writes could make a difference.
    Just an idea to try: I would benchmark 64K against the standard size (which I guess is probably 4K) and see whether there is any significant difference.
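
    A rough way to run that comparison might look like the sketch below. It is only an illustration under stated assumptions: the record contents, the record count, and the helper names timed_write and write_all are made up, and it collects output in a scalar and flushes it with syswrite rather than resizing Perl's own I/O buffer, which, as far as I know, is fixed when perl is built.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);
use File::Temp qw(tempfile);

# Hypothetical helper: write $count copies of $record to $path, collecting
# them in a scalar and flushing once at least $buf_size bytes have built up.
# Returns the elapsed wall-clock seconds.
sub timed_write {
    my ($path, $buf_size, $record, $count) = @_;
    open my $fh, '>:raw', $path or die "Cannot open $path: $!";
    my $buf   = '';
    my $start = time;
    for (1 .. $count) {
        $buf .= $record;
        if (length($buf) >= $buf_size) {
            write_all($fh, $buf);
            $buf = '';
        }
    }
    write_all($fh, $buf) if length $buf;    # flush whatever is left over
    close $fh or die "Cannot close $path: $!";
    return time - $start;
}

# Hypothetical helper: syswrite may write fewer bytes than requested,
# so keep going until the whole string has been written.
sub write_all {
    my ($fh, $data) = @_;
    my $off = 0;
    while ($off < length($data)) {
        my $n = syswrite($fh, $data, length($data) - $off, $off);
        die "syswrite failed: $!" unless defined $n;
        $off += $n;
    }
}

my $record = ('x' x 99) . "\n";    # made-up 100-byte record
my $count  = 100_000;              # about 10 MB; scale up for a real test

for my $size (4 * 1024, 64 * 1024) {
    my ($tmp_fh, $path) = tempfile(UNLINK => 1);
    close $tmp_fh;
    my $secs = timed_write($path, $size, $record, $count);
    printf "%6d-byte buffer: %.3f s, %d bytes written\n",
        $size, $secs, (-s $path);
}
```

    With a test file this small the OS cache will probably hide most of the difference, so the timings only mean something when the run is scaled up towards the real data size and repeated a few times on the actual target disk.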

      Thank you for the suggestion. I'll try that out.
