|Perl: the Markov chain saw|
Re^3: XML::Twig and threadsby BrowserUk (Pope)
|on Nov 26, 2012 at 16:36 UTC||Need Help??|
The first thing to say is that that is not valid XML. (A valid XML document must contain a single top level tag.)
That said, for the purposes of processing, that (arbitrary) XML rule works in our favour and makes writing a program that processes the large file in smallish chunks very simple:
In addition to that allowing the huge file to be processed very quickly in minimal memory, it would -- were the processing requirements of the individual chunks sufficiently taxing to warrant it -- enable multiple individual chunks to be processed in parallel with threading very easily.
But, if the example is anything like representative of the actual data, that above code will probably allow the entire file to be processed sufficiently quickly -- in a very casual test; less that 2 minutes -- that the need for considering threading disappears completely. The saving coming simply from processing the file in small chunks rather than en masse.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.