http://www.perlmonks.org?node_id=1138699


in reply to Optimizing perl code performance

Update: My next attempt compares running with threads and non-threads on the Mac and Linux. There is something strange about strftime that causes the script to slow down either with threads or non-threads depending on the OS.

Update: The serial code runs faster on a Linux VM. For some reason, the strftime function degrades in performance when running with many workers (even threads on Linux). I'm not sure why.

In my testing, strftime performs poorly when many workers call it simultaneously. Running with threads works, but the number of workers must be limited.

On my laptop (running Mac OS X), the serial code completes in 19.131 seconds for a 500 MB file, while the MCE version completes in 6.569 seconds. Most of that time is spent in strftime. I verified this by replacing $A = strftime with $A = $Y, after which the script completes in 1.842 seconds.
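The same comparison can be run in isolation, away from file I/O and MCE. The following is only a rough sketch using the core Benchmark module; the sample epoch values are made up, and the absolute numbers will differ from the timings quoted above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);
use Benchmark qw(cmpthese);

# Hypothetical sample data: 1000 ten-digit epoch values, standing in for
# the $Y values the script extracts from each input line.
my @epochs = map { 1439653200 + $_ } 0 .. 999;

my %tests = (
    # Same format string as in the script.
    with_strftime => sub {
        my $A;
        $A = strftime "%M,%Y,%m,%d,%H,%j,%W,%u,%A", gmtime $_ for @epochs;
        return $A;
    },
    # The substitution used to confirm where the time goes.
    plain_assign => sub {
        my $A;
        $A = $_ for @epochs;
        return $A;
    },
);

# Run each variant for at least one CPU second and print a comparison table.
my $rows = cmpthese(-1, \%tests);
```

This measures a single process only, so it shows the per-call cost of strftime but not the slowdown seen when several workers call it at once.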

#!/usr/bin/perl

use strict;
use warnings;

use threads;
use threads::shared;

use POSIX qw(strftime);
use MCE::Loop;
use MCE::Candy;

my $infile  = $ARGV[0];
my $outfile = $ARGV[1];

open(DATAOUT, ">", $outfile) or die "Cannot open $outfile: $!";

## Workers process chunks in parallel until completed.
## Output order is preserved via MCE::Candy::out_iter_fh

MCE::Loop::init {
   chunk_size => "2m", max_workers => 4, use_slurpio => 1,
   gather => MCE::Candy::out_iter_fh(\*DATAOUT),
   use_threads => 1
};

mce_loop_f {
   my ($mce, $chunkRef, $chunkID) = @_;
   my ($output, @Fields, $X, $Y, $A, $B, $C, $D) = ("");

   ## With use_slurpio, $chunkRef is a scalar ref; read it as an in-memory file.
   open my $CHUNKIN, "<", $chunkRef or die "Cannot open chunk: $!";

   while( my $line = <$CHUNKIN> ) {
      chomp $line;
      @Fields = split(',', $line, 9);
      $X = $Fields[8];
      $Y = substr $X, 0, 10;
      $A = strftime "%M,%Y,%m,%d,%H,%j,%W,%u,%A", gmtime $Y;
      $B = substr($A, 0, index($A, ','));
      $C = int($B/5);
      $D = int($B/15);
      $output .= $line.",$Y,$A,$C,$D\n";
   }

   close $CHUNKIN;

   MCE->gather($chunkID, $output);

} $infile;

close(DATAOUT);
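To make the per-line work concrete, here is a small sketch of what the loop body does to a single record. The sample line and the enrich_line helper are hypothetical; only the field handling and the format string are taken from the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical record: nine comma-separated fields, the ninth starting with
# a 10-digit epoch timestamp (followed here by made-up extra digits).
my $line = "a,b,c,d,e,f,g,h,1439653200123";

# Hypothetical helper wrapping the loop body from the script.
sub enrich_line {
    my ($line) = @_;
    my @Fields = split(',', $line, 9);
    my $Y = substr $Fields[8], 0, 10;          # epoch seconds
    my $A = strftime "%M,%Y,%m,%d,%H,%j,%W,%u,%A", gmtime $Y;
    my $B = substr($A, 0, index($A, ','));     # minute, the first strftime field
    my $C = int($B / 5);                       # 5-minute bucket within the hour
    my $D = int($B / 15);                      # 15-minute bucket within the hour
    return ($Y, $A, $C, $D);
}

my ($Y, $A, $C, $D) = enrich_line($line);
print "$line,$Y,$A,$C,$D\n";
```

For this sample the minute is 40, so $C is 8 and $D is 2.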

Kind regards, Mario.

Re^2: Optimizing perl code performance
by marioroy (Parson) on Aug 15, 2015 at 15:40 UTC

    Update: The disparity is coming from strftime.

    Update: One must use threads on the Mac and non-threads on Linux for best performance. This is mind-boggling to me. Replacing the strftime line with $A = $Y completes in a couple seconds for threads or non-threads on the Mac and Linux.

    The same 500 MB input file is used on both operating systems.

    Mac OS X     Serial:    18.185s
    Mac OS X     Parallel:   6.687s  threads
    Mac OS X     Parallel:  42.526s  non-threads

    CentOS 7 VM  Serial:    10.832s
    CentOS 7 VM  Parallel:  23.849s  threads
    CentOS 7 VM  Parallel:   2.993s  non-threads

    #!/usr/bin/perl

    use strict;
    use warnings;

    use threads;    # Comment out threads for child processes

    use POSIX qw(strftime);
    use MCE::Loop;
    use MCE::Candy;

    my $mutex :shared = 0;

    my $infile  = $ARGV[0];
    my $outfile = $ARGV[1];

    open(DATAOUT, ">", $outfile) or die "Cannot open $outfile: $!";

    ## Workers process chunks in parallel until completed.
    ## Output order is preserved via MCE::Candy::out_iter_fh

    MCE::Loop::init {
       chunk_size => "2m", max_workers => 4, use_slurpio => 1,
       gather => MCE::Candy::out_iter_fh(\*DATAOUT)
    };

    mce_loop_f {
       my ($mce, $chunkRef, $chunkID) = @_;
       my ($output, @Fields, $X, $Y, $A, $B, $C, $D, @G) = ("");

       ## With use_slurpio, $chunkRef is a scalar ref; read it as an in-memory file.
       open my $CHUNKIN, "<", $chunkRef or die "Cannot open chunk: $!";

       while( my $line = <$CHUNKIN> ) {
          chomp $line;
          @Fields = split(',', $line, 9);
          $X = $Fields[8];
          $Y = substr $X, 0, 10;
          @G = gmtime $Y;
          $A = strftime "%M,%Y,%m,%d,%H,%j,%W,%u,%A", @G;
          $B = substr($A, 0, index($A, ','));
          $C = int($B/5);
          $D = int($B/15);
          $output .= $line.",$Y,$A,$C,$D\n";
       }

       close $CHUNKIN;

       MCE->gather($chunkID, $output);

    } $infile;

    close(DATAOUT);
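
If one script has to run on both systems, the worker type could be chosen from the OS name. This is only a sketch: pick_use_threads is a hypothetical helper, and the rule simply mirrors the timings measured here on one Mac and one CentOS 7 VM, so it may not hold on other machines. It also assumes that passing use_threads => 0 to MCE behaves like the non-threads run, which was not measured in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: threads on Mac OS X ($^O is 'darwin' there),
# child processes everywhere else, following the timings in this thread.
sub pick_use_threads {
    my ($os) = @_;
    return $os eq 'darwin' ? 1 : 0;
}

my $use_threads = pick_use_threads($^O);
print "OS: $^O, use_threads => $use_threads\n";

# Assumed usage; "use threads;" must be loaded before MCE when threads are wanted:
#   MCE::Loop::init { max_workers => 4, use_threads => $use_threads };
```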