Re^5: How to split file for threading?

by graff (Chancellor)
on Jul 05, 2015 at 22:36 UTC


in reply to Re^4: How to split file for threading?
in thread How to split file for threading?

The layout, and the patience needed to render it via the keyboard, are impressive. I presume that the numeric values came from a reasonably well-built benchmarking script; if you could share (at least an outline of) that, it would be very helpful. (I think so, anyway: given my current ignorance about multi-core environments, I assume that different conditions may yield different points at which adding more threads degrades performance.)

As for writing a program to produce that sort of chart (two data sets with distinct y-axes but a common x-axis), I expect that's already been done at least a few times (and I suppose most people would just load the data into MS-Excel to draw it in any number of different styles).
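For anyone who would rather stay in Perl than reach for a spreadsheet, here is a minimal sketch of that kind of chart: two series sharing an x-axis, each scaled to its own y-axis, drawn as text. The names `dual_axis_chart`, `minmax` and `row_of` are made up for illustration, the scaling is a simple linear one that assumes at least two rows, and the sample data are the elapsed and user-CPU figures from the runs in this thread, rounded to three decimals.

```perl
use strict;
use warnings;

# Hypothetical helper: smallest and largest value of a list.
sub minmax {
    my( $lo, $hi ) = ( sort { $a <=> $b } @_ )[ 0, -1 ];
    return( $lo, $hi );
}

# Hypothetical helper: map a value onto a row index 0 .. $rows-1.
sub row_of {
    my( $v, $min, $max, $rows ) = @_;
    return 0 if $max == $min;
    return int( ( $v - $min ) / ( $max - $min ) * ( $rows - 1 ) + 0.5 );
}

# Draw $left as bars ('#') against the left axis and $right as
# points ('X') against the right axis. Assumes $rows >= 2 and two
# array refs of equal length.
sub dual_axis_chart {
    my( $left, $right, $rows ) = @_;
    my( $lmin, $lmax ) = minmax( @$left );
    my( $rmin, $rmax ) = minmax( @$right );
    my @out;
    for my $r ( reverse 0 .. $rows - 1 ) {
        my $line = sprintf '%8.1f |', $lmin + ( $lmax - $lmin ) * $r / ( $rows - 1 );
        for my $i ( 0 .. $#$left ) {
            my $lrow = row_of( $left->[$i],  $lmin, $lmax, $rows );
            my $rrow = row_of( $right->[$i], $rmin, $rmax, $rows );
            $line .= $r == $rrow ? ' X ' : $r <= $lrow ? ' # ' : '   ';
        }
        $line .= sprintf '| %.1f', $rmin + ( $rmax - $rmin ) * $r / ( $rows - 1 );
        push @out, $line;
    }
    # x-axis labels: 1 .. number of data points
    push @out, ' ' x 10 . join '', map { sprintf ' %d ', $_ } 1 .. @$left;
    return @out;
}

print "$_\n" for dual_axis_chart(
    [ 207.820, 169.679, 156.498, 127.087, 115.241, 129.716, 167.008, 179.143 ],
    [ 162.296, 164.203, 164.375, 161.843, 161.890, 161.781, 162.328, 164.171 ],
    11,
);
```

The point marker takes priority over the bar when both land in the same cell, which is one arbitrary choice among several; a real charting module would give far more control over the layout.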

UPDATE: Sorry... I see that you put some command lines above the chart, along with their numeric outputs, and I just now took the time to relate those outputs to the chart. So, just to clarify (because my brain isn't working all that well today)... are those command lines just running (a slightly modified version of) the OP script? Thanks.

Replies are listed 'Best First'.
Re^6: How to split file for threading?
by BrowserUk (Patriarch) on Jul 05, 2015 at 22:45 UTC

    Oh yes. The code is essentially identical to what I posted in Re^3: How to split file for threading?, with a little additional timing code.

    This is that slightly modified code, the output from the runs above and the graph:

    #! perl -slw
    use strict;
    use Time::HiRes qw[ time ];
    use threads;

    sub worker {
        my( $filename, $target, $start, $end ) = @_;
        open my $fh, '<', $filename or die $!;
        seek $fh, $start, 0;
        <$fh> if $start > 0; ## discard first partial line
        my $count = 0;
        1+index( <$fh>, $target ) and ++$count while tell( $fh ) < $end;
        return $count;
    }

    my $start = time;
    our $T //= 4;
    my( $filename, $target ) = @ARGV;

    my $fsize = -s $filename;
    my $chunksize = int( $fsize / $T );
    my @chunks = map{ $_ * $chunksize } 0 .. $T-1;
    push @chunks, $fsize;

    my @threads = map{
        threads->new( \&worker, $filename, $target, $chunks[ $_ ], $chunks[ $_+1 ] )
    } 0 .. $T-1;

    my $total = 0;
    $total += $_->join for @threads;

    print "Found $total '$target' lines";
    printf "Took %.9f secs cpu(%s)\n", time()-$start, join ' ', times;

    __END__
    [20:39:41.83] C:\test>1131634.pl -T=1 s:\1GBx8.bin lAxc
    Found 10 'lAxc' lines
    Took 207.819706917 secs cpu(162.296 15.906 0 0)

    [20:43:09.82] C:\test>1131634.pl -T=2 s:\1GBx8.bin lAxc
    Found 10 'lAxc' lines
    Took 169.678593159 secs cpu(164.203 18.968 0 0)

    [20:46:00.25] C:\test>1131634.pl -T=3 s:\1GBx8.bin lAxc
    Found 10 'lAxc' lines
    Took 156.497795820 secs cpu(164.375 18.656 0 0)

    [20:48:37.34] C:\test>1131634.pl -T=4 s:\1GBx8.bin lAxc
    Found 10 'lAxc' lines
    Took 127.086592913 secs cpu(161.843 19.546 0 0)

    [20:50:45.04] C:\test>1131634.pl -T=5 s:\1GBx8.bin lAxc
    Found 10 'lAxc' lines
    Took 115.240671158 secs cpu(161.89 19.109 0 0)

    [20:52:40.87] C:\test>1131634.pl -T=6 s:\1GBx8.bin lAxc
    Found 10 'lAxc' lines
    Took 129.716307163 secs cpu(161.781 21.859 0 0)

    [20:54:51.19] C:\test>1131634.pl -T=7 s:\1GBx8.bin lAxc
    Found 10 'lAxc' lines
    Took 167.007865906 secs cpu(162.328 22.171 0 0)

    [20:57:38.78] C:\test>1131634.pl -T=8 s:\1GBx8.bin lAxc
    Found 10 'lAxc' lines
    Took 179.142831087 secs cpu(164.171 25.546 0 0)

      210 ---
    E 200 ...                             165.0
    l 190 ...                             164.8
    a 180 ...                         ___ 164.6
    p 170 ... ___  X                  ... 164.4 C
    s 160 ... .X.                 --- .X. 164.2 P
    e 150 ... ... ---             ... ... 164.0 U
    d 140 ... ... ...         ___ ... ... 163.8
      130 ... ... ...         ... ... ... 163.6 S
    s 120 ... ... ... ---     ... ... ... 163.4 e
    e 110 ... ... ... ... --- ... ... ... 163.2 c
    c 100 .X. ... ... ... ... ... ... ... 163.0 o
    o  90 ... ... ... ... ... ... ... ... 162.8 n
    n  80 ... ... ... ... ... ... ... ... 162.6 d
    d  70 ... ... ... ... ... ... .X. ... 162.4 s
    s  60 ... ... ... ... ... ... ... ... 162.2 .
    .  50 ... ... ... ... ... ... ... ... 162.0
       40 ... ... ... .X. .X. .X. ... ... 161.8
       30 ... ... ... ... ... ... ... ... 161.6
       20 ... ... ... ... ... ... ... ... 161.4
       10 ... ... ... ... ... ... ... ... 161.2
           1   2   3   4   5   6   7   8
                   T H R E A D S
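One way to read those runs is as speedup relative to the single-thread case. The sketch below is not part of the posted script; it only copies the elapsed times from the output above into a hash and divides. The helper name `speedups` is hypothetical.

```perl
use strict;
use warnings;

# Elapsed seconds copied from the runs above, keyed by the -T value.
my %elapsed = (
    1 => 207.819706917,
    2 => 169.678593159,
    3 => 156.497795820,
    4 => 127.086592913,
    5 => 115.240671158,
    6 => 129.716307163,
    7 => 167.007865906,
    8 => 179.142831087,
);

# Hypothetical helper: speedup of each run relative to the -T=1 run.
sub speedups {
    my( $t ) = @_;
    my $base = $t->{1};
    return map { $_ => $base / $t->{$_} } keys %$t;
}

my %s = speedups( \%elapsed );
printf "T=%d  elapsed %7.2fs  speedup %.2fx\n", $_, $elapsed{$_}, $s{$_}
    for sort { $a <=> $b } keys %s;

# The thread count with the smallest elapsed time.
my( $best ) = sort { $elapsed{$a} <=> $elapsed{$b} } keys %elapsed;
print "Fastest run: -T=$best\n";
```

With these figures the fastest run is -T=5, at roughly 1.8 times the speed of -T=1, and elapsed time climbs again from -T=6 onward. That holds only for this file, disk and machine; other conditions may move the turning point.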

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
    I'm with torvalds on this -- Agile (and TDD) debunked -- I told'em LLVM was the way to go. But did they listen!
