Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Why this code is so slow if run in thread?

by vr (Curate)
on Dec 11, 2016 at 12:04 UTC ( [id://1177606]=perlquestion: print w/replies, xml ) Need Help??

vr has asked for the wisdom of the Perl Monks concerning the following question:

The question is related to PDL: Looking for efficient way to extract sub-images, by finding bounding boxes of "objects". But I don't think it has anything to do with PDL, images, etc. The $str is just a very long string. Using $1 instead of $& doesn't help.

Perl is 5.20.2 Win32 Strawberry, or 5.24 Win64.

use strict; use warnings; use 5.020; use threads; use PDL; use PDL::IO::Image; use PDL::Image2D; use Encode qw/ decode /; PDL::no_clone_skip_warning; say "No thread\n---------"; say 'Count: ', scalar @{ test( 'test.png' )}; say "\nThread\n---------"; say 'Count: ', scalar @{ threads-> create( \&test, 'test.png' )-> join + }; sub test { my $fn = shift; my $img = PDL::IO::Image-> new_from_file( $fn ); my $pdl = $img-> pixels_to_pdl-> short; my $s = cc8compt( $pdl == 0 ); my $str = decode 'UTF16LE', ${ $s-> get_dataref }; my ( $w, $h ) = $s-> dims; my @b = map { [ [ $w, 0 ], [ $h, 0 ] ] } 0 .. $s-> max; my $t = time; for my $y ( 0 .. $h - 1 ) { my $s = substr( $str, $y * $w, $w ); while( $s =~ m[[^\0]+]g ) { my $c = ord( $& ); $b[ $c ][ 0 ][ 0 ] = $-[0] if $-[0] < $b[ $c ][ 0 +][ 0 ]; $b[ $c ][ 0 ][ 1 ] = $+[0] - 1 if $+[0] - 1 > $b[ $c ][ 0 +][ 1 ]; $b[ $c ][ 1 ][ 0 ] = $y if $y < $b[ $c ][ 1 +][ 0 ]; $b[ $c ][ 1 ][ 1 ] = $y if $y > $b[ $c ][ 1 +][ 1 ]; } } say 'Time: ', time - $t; shift @b; return \@b; }

The output:

No thread --------- Time: 2 Count: 145 Thread --------- Time: 52 Count: 145

Replies are listed 'Best First'.
Re: Why this code is so slow if run in thread?
by BrowserUk (Patriarch) on Dec 12, 2016 at 04:31 UTC

    Sorry for my earlier misdirection. By way of recompense I have what I believe (though it is essentially untested for lack of a suitable image), that addresses both the slowness of substr on utf strings within threads (which is just weird) and the problem I thought was the cause, that of cloning the returned array.

    It avoids the former by doing away with the encoding, searching instead for runs of pairs of non-null characters in the unencoded pdl; and the latter by accumulating the counts in a packed binary array stored in a scalar.

    sub test { my $fn = shift; my $img = PDL::IO::Image-> new_from_file( $fn ) or die "Failed to +load image"; my $pdl = $img->pixels_to_pdl->short; my $s = cc8compt( $pdl != 0 ); my $str = ${ $s-> get_dataref }; my ( $w, $h ) = $s-> dims; my $bounds = pack 'n4', $w, 0, $h, 0; $bounds x= $s->max; for my $y ( 0 .. $h - 1 ) { my $s = substr( $str, 2 * $y * $w, 2 * $w ); while( $s =~ m[(?:[^\0][^\0])+]g ) { my( $l, $r ) = ( $-[0]/2, (($+[0])-1)/2 ); my $c = ord( $& ); vec( $bounds, 4*$c+0, 16 ) = $l if $l < vec( $bounds, 4*$c ++0, 16 ); vec( $bounds, 4*$c+1, 16 ) = $r if $r > vec( $bounds, 4*$c ++1, 16 ); vec( $bounds, 4*$c+2, 16 ) = $y if $y < vec( $bounds, 4*$c ++2, 16 ); vec( $bounds, 4*$c+3, 16 ) = $y if $y > vec( $bounds, 4*$c ++3, 16 ); } } return $bounds; }

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
    In the absence of evidence, opinion is indistinguishable from prejudice.

        Results of my workaround on your test image:

        C:\test>1177606 vrtest.png C:\test>1177606 vrtest.png No thread --------- Took:0.818306923 Count: 145 Thread --------- Took:2.834208012 Count: 145

        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Why this code is so slow if run in thread?
by LanX (Saint) on Dec 11, 2016 at 13:44 UTC

      But then Perl's Core "Encode" is not thread safe? :(

      Suppose I split my program in two. First one creates 2 files, with single-byte encoded data and double-byte encoded data (and says "Equal", of course -- this image has less than 255 "objects").

      use strict; use warnings; use 5.020; use PDL; use PDL::IO::Image; use PDL::Image2D; use PDL::IO::FastRaw; use Encode qw/ decode /; my $img = PDL::IO::Image-> new_from_file( 'test.png' ); my $pdl = $img-> pixels_to_pdl-> short; my $s1 = cc8compt( $pdl == 0 )-> byte; my $s2 = cc8compt( $pdl == 0 ); say 'Equal' if decode( 'UTF16LE', ${ $s2-> get_dataref }) eq ${ $s1-> get_dataref }; writefraw( $s1, "fname-1" ); writefraw( $s2, "fname-2" );

      The 2nd program doesn't use PDL modules, it reads data from disk (and uses some hard-coded numbers):

      use strict; use warnings; use 5.020; use threads; use Encode qw/ decode /; say "No thread (byte)\n---------"; say 'Count: ', scalar @{ test( 1 )}; say "\nThread (byte)\n---------"; say 'Count: ', scalar @{ threads-> create( \&test, 1 )-> join }; say "\nNo thread (short)\n---------"; say 'Count: ', scalar @{ test( 2 )}; say "\nThread (short)\n---------"; say 'Count: ', scalar @{ threads-> create( \&test, 2 )-> join }; sub test { my $arg = shift; open my $fh, '<', "fname-$arg"; binmode $fh; my $str = do { local $/; <$fh> }; close $fh; $str = decode( 'UTF16LE', $str ) if $arg == 2; my ( $w, $h ) = ( 7616, 1200 ); my @b = map { [ [ $w, 0 ], [ $h, 0 ] ] } 0 .. 145; my $t = time; for my $y ( 0 .. $h - 1 ) { my $s = substr( $str, $y * $w, $w ); while( $s =~ m[[^\0]+]g ) { my $c = ord( $& ); $b[ $c ][ 0 ][ 0 ] = $-[0] if $-[0] < $b[ $c ][ 0 +][ 0 ]; $b[ $c ][ 0 ][ 1 ] = $+[0] - 1 if $+[0] - 1 > $b[ $c ][ 0 +][ 1 ]; $b[ $c ][ 1 ][ 0 ] = $y if $y < $b[ $c ][ 1 +][ 0 ]; $b[ $c ][ 1 ][ 1 ] = $y if $y > $b[ $c ][ 1 +][ 1 ]; } } say 'Time: ', time - $t; shift @b; return \@b; }

      The output:

      No thread (byte) --------- Time: 0 Count: 145 Thread (byte) --------- Time: 1 Count: 145 No thread (short) --------- Time: 1 Count: 145 Thread (short) --------- Time: 51 Count: 145

        With Perl threads, the thing that is usually the slowest is creating a thread. This is because creating a Perl thread requires cloning every data structure you have created. So I would first determine if that is where the time is being spent. You can report that by comparing the time at the start of the routine running against the time just before you ask for a thread to be created.

        If that is indeed where most of the time is spent, then you should do what just about anybody who has gotten good at using Perl threads does: Create your threads very early and just ship work to them, usually via a thread queue such as Thread::Queue.

        Though, looking at your code, I don't see where you would have a large data structure that should not have been destroyed before you create the thread. But, that data could be something cached (perhaps by accident) in some module you have pulled in, for example.

        Update: Or it could be time spent destroying the second interpreter instance.

        - tye        

        Did you look at your results? They're mostly useless!

        We can tell that shorts are 50x slower with threads, but we have no idea how much slower bytes are with threads. Could be there not slower at all (0.9999s vs 1.0000s). Could be billions time slower (0.000000001 vs 1.000000000).

        Please re-run the tests with use Time::HiRes qw( sleep );. Or provide the data files.

Re: Why this code is so slow if run in thread? (Wrong!)
by BrowserUk (Patriarch) on Dec 11, 2016 at 19:41 UTC

    Update:I just looked back at your code as opposed to my mock-up of it and realised this can't be the cause because you are not timing the return.

    Why this code is so slow if run in thread?

    Because you are returning a ref to an array of arrays, which in the unthreaded code means simply returning a scalar; but in the threaded code, means duplicating the entire contents of that AoAs into the calling threads memory and then returning a reference to the copy.

    This can be demonstrated without PDL or Encode like this:

    #! perl -slw use strict; use threads; use Time::HiRes qw[ time ]; sub test { my @b = map[[0,0],[0,0]], 0 .. 255; return \@b; } my $start = time; my $nothread = @{ test() }; printf "Unthreaded took: %.6f\n", time() - $start; $start = time; my $thread = @{ threads->create( \&test )->join }; printf "Threaded took: %.6f\n", time() - $start; __END__ c:\test>junk Unthreaded took: 0.000851 Threaded took: 0.012510

    The simple solution (for your specific posted example, which isn't using the contents of the AoA, but only the count of elements on the outer array ) ):

    say 'Count: ', scalar @{ threads-> create( \&test, 'test.png' )-> join + };

    would be to derive the count internal to the function and thus return a scalar rather than a compound data structure:

    ... shift @b; return scalar @b; }

    If your real code actually needs the AoAs returned to the main thread, then show an example of what you are doing with it there and I can perhaps suggest something else.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
    In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1177606]
Approved by marto
Front-paged by Discipulus
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others studying the Monastery: (7)
As of 2024-04-23 15:50 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found