I want to write an application that reads from a file and once it's read enough (filled up a buffer of some size) does some processing on what it's read. A simple example of the processing might be to count the frequency of words in a text file. To take advantage of multiple cores, I'd like to have several threads (each responsible for words starting with a different letter, say) processing the buffered data in parallel. Once they have all finished, reading from the input file would resume, refilling the buffer, etc. until the entire (large) file has been processed. Once the entire file has been read and words counted, I'd like to print out the frequency of each word found.
I've read the thread tutorial and it seems like this should be pretty straightforward, but I'm unsure about three things: whether the pattern is best suited to a queueing model, whether the word frequency hashes should be shared data, and exactly how to manage the flow from a single file-reader thread to the parallel processing threads, back to the file-reader, and finally to a single output-writer thread.
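To make the question concrete, here is a rough sketch of the queueing variant as I currently picture it. It assumes a perl built with ithreads and Thread::Queue 3.01 or later (for `end`). The helper name `count_words` and its parameters are my own invention for illustration. Note that it simplifies what I described above: instead of splitting words by first letter, every worker counts all the words in whatever chunk it dequeues into a private hash, and the main thread sums the partial hashes at the end, so no hash is shared. The reader also keeps filling the queue while the workers run, rather than pausing until they finish.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical helper: count words read from $fh using $n_workers threads,
# handing work out in chunks of $chunk_lines lines.
sub count_words {
    my ($fh, $n_workers, $chunk_lines) = @_;
    my $queue = Thread::Queue->new;

    # Each worker counts into its own private hash and returns a
    # reference to it once the queue has been ended and drained.
    my @workers;
    for (1 .. $n_workers) {
        push @workers, threads->create(sub {
            my %count;
            while (defined(my $chunk = $queue->dequeue)) {
                $count{lc $1}++ while $chunk =~ /(\w+)/g;
            }
            return \%count;
        });
    }

    # Reader: fill a buffer, enqueue it, and repeat until end of file.
    my ($buffer, $lines) = ('', 0);
    while (my $line = <$fh>) {
        $buffer .= $line;
        if (++$lines >= $chunk_lines) {
            $queue->enqueue($buffer);
            ($buffer, $lines) = ('', 0);
        }
    }
    $queue->enqueue($buffer) if length $buffer;
    $queue->end;    # dequeue returns undef once the queue is empty

    # Merge the per-thread hashes into a single total.
    my %total;
    for my $thr (@workers) {
        my ($part) = $thr->join;
        $total{$_} += $part->{$_} for keys %$part;
    }
    return \%total;
}
```

The output step would then be something like opening the input with `open my $fh, '<', $path`, calling `my $freq = count_words($fh, 4, 1000);`, and printing `"$_: $freq->{$_}\n"` for each of `sort keys %$freq`. The worker count and chunk size there are arbitrary placeholders, not tuned values.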
Any suggestions (or pointers to previous examples) for this kind of pattern?