My apologies. I thought that since the procXml sub worked just fine, it would not be relevant to the discussion or potential solution. Within the procXml sub, I simply slurp the file into a hash, then operate on the hash.
I was under the impression that because I was operating on the file contents in memory (i.e. the hash), it was a mostly CPU-bound process (apart from slurping the input file and printing to the output file).
| [reply] [d/l] |
I thought that since the procXml sub worked just fine, it would not be relevant to the discussion or potential solution.
You were mostly right. The only relevance it has is that nowhere in that code do I see any sign of locking (the keyword 'lock' does not appear), which means that multiple threads are writing to a shared hash and there is nothing to prevent them from corrupting data through collisions.
You may 'get away with it', but I wouldn't want to be responsible for when things go wrong.
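To illustrate the locking point, here is a minimal sketch of serializing updates to a shared hash with `lock` from `threads::shared`. The hash `%results`, the sub `record_count`, and the key `'files'` are hypothetical stand-ins, not taken from the original procXml code, and the sketch assumes a perl built with ithreads support.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical shared hash standing in for the one the worker threads update.
my %results :shared;

# Hypothetical worker: bump a counter $n times under a lock.
sub record_count {
    my ($key, $n) = @_;
    for (1 .. $n) {
        lock(%results);      # released automatically at the end of this block
        $results{$key}++;    # read-modify-write is now safe from interleaving
    }
    return;
}

# Four threads, each incrementing the same key 1000 times.
my @workers = map { threads->create(\&record_count, 'files', 1000) } 1 .. 4;
$_->join for @workers;

print "files counted: $results{files}\n";
```

The lock is held only for the duration of the innermost enclosing block, so keeping that block small limits how long other threads have to wait.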
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP Neil Armstrong
| [reply] |
When a CPU executes billions of operations per second and draws its inputs from a large number of files, the program is I/O-bound, because I/O is what it is waiting on: nanoseconds vs. milliseconds. The completion time of this program, over the course of, say, one minute, will chiefly be governed by its ability to perform input/output, not by the speed of the processor(s). If you were to run the program on a CPU that was twice as fast, all other things being equal, it would not complete in half the time. If it truly were CPU-bound, it would not “slow down” and drop out of CPU utilization at the same time, as it is reported to be doing.
As you say in the (upvoted) earlier comment, this is a poorly thought-out program from the start. I would further guess that the hash might well have become enormous by that time, and that quite possibly the program has descended into “thrashing hell.” Something, and it can only be I/O, is utterly preventing the CPU from getting any work done during the second phase. Thrashing is about the only culprit that exists to explain that.
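One way to test the CPU-bound vs. I/O-bound question rather than argue it is to compare the CPU time a piece of work consumes against the wall-clock time it takes. The sketch below is only an illustration: `cpu_vs_wall` is a hypothetical helper, and the half-second `select` call is a stand-in for "waiting on something other than the CPU". If CPU time is a small fraction of elapsed time, the work is spending most of its life waiting.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: run a code ref, return (CPU seconds, wall-clock seconds).
sub cpu_vs_wall {
    my ($work) = @_;
    my $wall_start = time;
    my ($user0, $sys0) = times;    # process-wide user and system CPU time
    $work->();
    my ($user1, $sys1) = times;
    my $wall = time - $wall_start;
    my $cpu  = ($user1 - $user0) + ($sys1 - $sys0);
    return ($cpu, $wall);
}

# Stand-in workload that waits without using the CPU (a 0.5 s sleep).
my ($cpu, $wall) = cpu_vs_wall(sub { select(undef, undef, undef, 0.5) });

printf "cpu=%.2fs wall=%.2fs\n", $cpu, $wall;
```

Note that `times` reports CPU time for the whole process, so in a threaded program the figure covers all threads combined.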
| [reply] |
Don't get personal, sir. Neither “idiot,” nor “crap,” nor certainly “F***,” is appropriate language to use in this Monastery. Addressed to anyone. For any reason. I will trouble you henceforth to remember that very simple rule of human etiquette.
| [reply] |
What kind of I/O? Mechanical disk I/O? SSD I/O? Ethernet I/O? DRAM I/O? L1 cache I/O?
| [reply] |