PerlMonks
I was thinking I could stall the find process, in order to simply buffet, rather than maintain
Well, any full lists, be they database or flat file.
The problem with flat files is that they make lousy queues. (Great FILOs but lousy FIFOs.) Removing records/lines at the beginning of a file is (for all intents and purposes) impossible; and marking records done means reading from the top each time to find the next piece of work to do. An O(n^2) process. Thus you would need a second (pointer) file that tells you how far down the first file you've processed; and that file becomes a bottleneck of contention.
As for file systems...I've often used (and advocated the use of) file systems for queues -- the producer creates small (often zero-length) files in a todo directory; consumers rename the first file they find in that directory into a /consumerN.processing/ directory whilst they process it; and then rename it into a done directory (or just delete it) once they've finished -- but again, given the size of your dataset, you'd have to very carefully manage the number of files you put into a single directory. And if you try to structure it, you're just moving the goal posts.
And what happens if your find/findfile process dies? Working out how far it got so you can avoid starting over is a problem.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
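For illustration only, the directory-as-queue scheme described above (todo, /consumerN.processing/, done) could be sketched roughly like this in Python. The function names, the directory layout under a single root, and the choice to take the lexically first file name as "the first file they find" are all assumptions made for the sketch, not anything from the thread. It leans on rename being atomic when source and destination are on the same filesystem (a POSIX guarantee), so only one consumer can win any given file.

```python
import os
import tempfile


def make_queue(root):
    """Create todo/ and done/ under root (hypothetical layout) and return their paths."""
    dirs = {"root": root,
            "todo": os.path.join(root, "todo"),
            "done": os.path.join(root, "done")}
    os.makedirs(dirs["todo"], exist_ok=True)
    os.makedirs(dirs["done"], exist_ok=True)
    return dirs


def produce(dirs, name):
    """Producer: enqueue a work item as a zero-length file in todo/."""
    open(os.path.join(dirs["todo"], name), "w").close()


def claim(dirs, consumer_id):
    """Consumer: move the first available file into <consumer_id>.processing/.

    Returns the claimed path, or None if nothing could be claimed.
    """
    mine = os.path.join(dirs["root"], consumer_id + ".processing")
    os.makedirs(mine, exist_ok=True)
    # Assumption: "first" means lexically first; names must sort in queue order.
    for name in sorted(os.listdir(dirs["todo"])):
        src = os.path.join(dirs["todo"], name)
        dst = os.path.join(mine, name)
        try:
            os.rename(src, dst)  # atomic on one filesystem; a single winner
            return dst
        except FileNotFoundError:
            continue  # another consumer renamed it first; try the next file
    return None


def finish(dirs, claimed_path):
    """Consumer: move a processed item into done/ and return its new path."""
    dst = os.path.join(dirs["done"], os.path.basename(claimed_path))
    os.rename(claimed_path, dst)
    return dst


if __name__ == "__main__":
    q = make_queue(tempfile.mkdtemp())
    for item in ("0001", "0002", "0003"):
        produce(q, item)
    work = claim(q, "consumer1")
    print("claimed", work)
    print("finished", finish(q, work))
```

A side effect of this layout is that anything left in a consumerN.processing directory after a crash shows exactly which item that consumer was working on. It does nothing about the directory-size concern raised above: listing todo on every claim gets slower as the number of files grows.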
In reply to Re^5: Splitting up a filesystem into 'bite sized' chunks
by BrowserUk