in reply to Re^2: Splitting up a filesystem into 'bite sized' chunks
in thread Splitting up a filesystem into 'bite sized' chunks
"I'm working on something that uses File::Find to send file lists to another thread (or two) that's using Thread::Queue. My major requirement is breaking down a 10 TB, 70-million-file monster filesystem."
Given the size of your dataset, an in-memory queue is a fatally flawed plan, from both a memory-consumption and a persistence/restartability point of view.
I'd strongly advocate putting your file paths into a DB of some kind and having your scanning processes remove them (or mark them done) as they process them.
That way, if any one element of the cluster fails, it can be restarted and pick up from where it left off.
It also lends itself to doing incremental scans in subsequent passes.
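The restartable DB-queue idea above can be sketched roughly as follows. This is a minimal illustration, not a production design: it assumes DBI with DBD::SQLite, and the file name `filequeue.db`, the table layout, and the batch size are all invented for the example. A real multi-worker setup would also need an atomic claim step (e.g. a `claimed_by` column updated before processing) so two workers don't grab the same batch.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use File::Find;

# Hypothetical names -- adjust for your environment.
my $db_file = 'filequeue.db';
my $root    = shift // '.';

my $dbh = DBI->connect( "dbi:SQLite:dbname=$db_file", '', '',
    { RaiseError => 1, AutoCommit => 1 } );

$dbh->do(<<'SQL');
CREATE TABLE IF NOT EXISTS files (
    path TEXT PRIMARY KEY,
    done INTEGER NOT NULL DEFAULT 0
)
SQL

# Pass 1: populate the queue. INSERT OR IGNORE makes the scan itself
# restartable, and also supports incremental re-scans in later passes.
my $ins = $dbh->prepare('INSERT OR IGNORE INTO files (path) VALUES (?)');
$dbh->begin_work;
find( sub { $ins->execute($File::Find::name) if -f }, $root );
$dbh->commit;

# Pass 2 (one worker shown): take a batch of not-yet-done paths,
# process each, and mark it done. If the worker dies, a restart
# simply picks up whatever is still marked done = 0.
my $claim = $dbh->prepare('SELECT path FROM files WHERE done = 0 LIMIT 100');
my $mark  = $dbh->prepare('UPDATE files SET done = 1 WHERE path = ?');

while (1) {
    $claim->execute;
    my $rows = $claim->fetchall_arrayref;
    last unless @$rows;
    for my $row (@$rows) {
        my ($path) = @$row;
        # ... do the real per-file work here ...
        $mark->execute($path);
    }
}
```

Because the queue lives on disk, any node in the cluster can be restarted and resume from the remaining `done = 0` rows, and a later run of pass 1 against the same DB only inserts paths that are new since the previous scan.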
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Replies are listed 'Best First'.
Re^4: Splitting up a filesystem into 'bite sized' chunks
by Preceptor (Deacon) on Jul 10, 2013 at 19:35 UTC
by BrowserUk (Patriarch) on Jul 10, 2013 at 20:58 UTC
by Preceptor (Deacon) on Jul 10, 2013 at 21:16 UTC
by BrowserUk (Patriarch) on Jul 10, 2013 at 21:46 UTC
by Preceptor (Deacon) on Jul 10, 2013 at 22:59 UTC
In Section: Seekers of Perl Wisdom