
Re: Poor performances with threads

by BrowserUk (Pope)
on Oct 23, 2009 at 11:18 UTC ( #802877 )

in reply to Poor performances with threads

It doesn't help that you are making so many copies of your data and results:

sub do_work {
    my @working_block = @input_data[$start .. $stop];    ## COPY 1
    my @partial_output_data;
    $, = undef;
    $\ = undef;
    my $output_line = 0;
    for (my $i = 0; $i < @working_block; $i += $files) {
        # calculates the stddev of each slice and outputs the formatted results
        if (max(@working_block[$i .. $i + $files - 1]) == 0) {    ## COPY 2
            $partial_output_data[$output_line] = '0.00000E+00';
        }
        else {
            $partial_output_data[$output_line] = sprintf('%.5E',
                stddev(@working_block[$i .. $i + $files - 1]) /   ## COPY 3
                mean(  @working_block[$i .. $i + $files - 1]));   ## COPY 4
        }
        $output_line++;
    }
    @ReturnData[$start / $files .. $stop / $files] = @partial_output_data;  ## COPY 5
    return;
}

And three of those copies are as lists onto the stack, which then get copied again once and often twice into data structures internal to Statistics::Basic.

I suspect, though I cannot verify without re-creating your script (why not post the whole thing if you want help?), that you are spending most of your time allocating memory and copying data rather than calculating.
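For illustration, here is one way the biggest copy could be avoided: hand the worker a reference to the input array plus a range, and index into it directly rather than copying the slice into a private working array. The sub name and signature here are hypothetical, not the OP's:

```perl
use strict;
use warnings;
use List::Util qw[ sum ];

# Hypothetical rework (not the OP's code): the worker receives a
# reference to the shared input array plus a range, and indexes into it
# directly instead of copying the slice into a private @working_block.
sub coef_of_var_in_place {
    my ( $aref, $start, $stop ) = @_;
    my $n    = $stop - $start + 1;
    my $mean = sum( @{$aref}[ $start .. $stop ] ) / $n;
    return 0 unless $mean;
    my $var  = sum( map { ( $aref->[$_] - $mean )**2 } $start .. $stop ) / $n;
    return sqrt( $var ) / $mean;
}
```

The small slices passed to sum still flatten onto the stack, but the large per-worker copy of the whole block is gone.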

Here is a simple script that calculates the coefficient of variation of 55k 7-element datasets in just over half a second:

#! perl -slw
use strict;
use Time::HiRes qw[ time ];
use Data::Dump qw[ pp ];
use List::Util qw[ sum ];

sub coffOfVar {
    my $mean = sum( @_ ) / @_;
    return 0 unless $mean;
    my @dif2s = map{ ( $_ - $mean )**2 } @_;
    my $var = sum( @dif2s ) / $#_;
    return sqrt( $var ) / $mean;
}

my @data;
push @data, [ map rand( 2000 )-1000, 1..7 ] for 1 .. 55*1024;

my $start = time;
my @results;
push @results, sprintf '%.5E', coffOfVar( @$_ ) for @data;

printf "Took %.2f seconds to find the coffOfVar of %d datasets\n",
    time - $start, scalar @data;

__END__
C:\test>
Took 0.59 seconds to find the coffOfVar of 56320 datasets

Re-using other people's code is all well and good. But sometimes it leads to you writing and supporting far more complex code yourself in order to meet your operational performance requirements, because you are constantly restructuring the natural state of your data to force-fit it to that of the modules you use, and you have no choice but to inherit a bunch of functionality (or preparation for that functionality) that you do not require. When that happens, it is time to consider extracting just the code you actually need from the module and making your life simpler.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: Poor performances with threads
by moritz (Cardinal) on Oct 23, 2009 at 11:50 UTC

    One can calculate the standard deviation with just one pass over the data set by iterating once over the numbers, calculating both the sum and the sum of squares in one go.
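    The two-accumulator approach described above can be sketched like this (the sub name is illustrative, not moritz's actual code):

```perl
use strict;
use warnings;

# One-pass variant: accumulate the sum and the sum of squares in a
# single loop, then use the population identity
#   variance = E[x**2] - E[x]**2.
sub coef_of_var_one_pass {
    my ( $sum, $sumsq ) = ( 0, 0 );
    for (@_) { $sum += $_; $sumsq += $_ * $_ }
    my $n    = @_;
    my $mean = $sum / $n;
    return 0 unless $mean;
    my $var  = $sumsq / $n - $mean * $mean;
    return sqrt( $var ) / $mean;
}
```

    One caveat: this textbook identity can lose precision through cancellation when the mean dwarfs the spread; Welford's online algorithm is the numerically stable alternative if that matters.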

    In the case of only 7 numbers to sum over, that doesn't make much of a difference (0.57s vs. 0.48s on my machine). The difference becomes more pronounced when there are more numbers in each set, once the whole array no longer fits into the cache (2.94s vs. 4.04s for 1k lists with 7k items each).

    But when such micro optimizations become important, it's worth pointing out that writing things in C (via XS or Inline::C) might be much faster.

    Perl 6 - links to (nearly) everything that is Perl 6.
      doesn't make much of a difference (0.57s vs. 0.48s on my machine)

      Agreed. But once you've reduced the time from 60 seconds to 0.6 of a second, the extra saving hardly seems worth it :) But I guess that's up to the OP.

      But it does highlight the problems with modules like Statistics::Basic. It offers an OO interface, storing copies of the data internally, with the intention that things like the mean need only be calculated once, whilst providing access to many types of statistic that re-use that calculated value.

      The problem is that calculating the mean of 7 values manually is quicker than calling a nested, inherited method to retrieve it from storage. If you are processing single large datasets, the amortisation can lead to savings (assuming you actually use the OO interface, which the OP isn't). But for processing many small datasets, there are no amortisation gains.

      The upshot is that users pay the penalties of the OO without gaining any of the benefits, and then go off looking at complex solutions (threads, in this case) to try to recover the losses.
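      That per-set overhead is easy to measure with the core Benchmark module. The ToyStats class below is a hypothetical stand-in that merely mimics the copy-and-cache design under discussion; it is not Statistics::Basic's actual code:

```perl
use strict;
use warnings;
use Benchmark qw[ cmpthese ];
use List::Util qw[ sum ];

# A toy stand-in for a Statistics::Basic-style object: it copies the
# data into itself and caches the computed mean. This mimics the design
# being discussed; it is NOT the real module's code.
package ToyStats;
sub new  { my ( $class, @data ) = @_; bless { data => [ @data ] }, $class }
sub mean {
    my $self = shift;
    $self->{mean} //= sum( @{ $self->{data} } ) / @{ $self->{data} };
}

package main;

sub mean_inline { sum( @_ ) / @_ }

my @set = map { rand 1000 } 1 .. 7;

# For many small datasets an object is built per set, so the cached
# mean never pays back the cost of construction and copying:
cmpthese( 50_000, {
    oo     => sub { ToyStats->new( @set )->mean },
    inline => sub { mean_inline( @set ) },
} );
```

      Relative numbers will vary by machine, but constructing and blessing an object per 7-element set typically costs several times more than the inline sum.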

      So: 41 installed files and three dozen lines of code (and more) to achieve what 6 lines do far more productively.

Re^2: Poor performances with threads
by olafmar (Novice) on Oct 26, 2009 at 10:01 UTC
    In the first version of the script I used hashes of arrays to iterate over the different input files and data; then I realised I didn't need the hashes at all and switched to a single 1D array containing interleaved data from the different inputs. 20% faster.

    I really don't know why I kept the Statistics module... for such a simple computation. I will now apply your suggestions and I will study them a bit...

    Thank you very much.
      Done: I used the manual calculation of the stddev on slices of the original array containing the interleaved data. It takes longer than your version, which computes the stddev over the whole set at once, but it still runs in 9 seconds against the original 60 (meaning 3 seconds of actual calculation against 54).

      Thank you.

      I post the whole code, even if I am already satisfied. If you have time to read it, maybe I can learn something more.
      The script requires an input file on the command line. That input file lists the actual starting data files, each one with a weight.
      The header of one input data file is simply copied to the output; the following values (4 per line) are split, multiplied by the weight factor and interleaved into the data array (reading is slow, so I tried to put some work into this step).
      In the last part I recreate the structure of the input file, with 4 results per line after the headers. In between is the code I asked you about.

#!/usr/bin/perl -w
# (C) Olaf Marzocchi, 2009
# Calculates the std dev of different tallies, the size must be the same
# It assumes a standard order of the values: after LOOKUP_TABLE always the data
require 5.004;
use strict;
use List::Util qw[ sum ];
# use Data::Dump qw[ pp ];
use POSIX qw(strtod setlocale LC_NUMERIC);
setlocale(LC_NUMERIC, "en_US");

$\ = "\n";
$, = "\t";

if (@ARGV < 1) {
    print "Usage: $0 config_file [output_file]";
    print "  Config file must contain on each line the relative path to one";
    print "  .vtk file and the weight of each one, separated with tabulation.";
    exit;
}
print "Processing $ARGV[0]...";

# Formats the filenames in order to always get the extension
my %inputs;
my $output_name = "";
my $weights = 0;
open(INPUTS, '<', $ARGV[0]) or die "Error opening $ARGV[0]: $!";
while (<INPUTS>) {
    next if /^#/;
    s|\.vtk\t|\t|;
    /^([^\t]+)\t+([0-9eE+-.,]+)/;
    $weights += strtod($2);
    $inputs{$1 . '.vtk'} = strtod($2);
}
close(INPUTS);
$output_name = $ARGV[0] . ".vtk";
$output_name =~ s/\.txt//g;
$output_name = $ARGV[1] if (@ARGV > 1);

# Sum of the weights normalised to 1
foreach (keys(%inputs)) {
    $inputs{$_} /= $weights;
    next if -e $_;    # the key is the filename; skip the error if it exists
    print "$_ is not available.";
    exit;
}

# The header of the first file is also copied to the output.
my @input_data;
my @output_data;
my $header_lines = 0;
my $elements;
my $filename = each(%inputs);
open(INPUT, '<', $filename) or die "Error opening $filename: $!";
print "Using $filename for headers.\n";
while (<INPUT>) {
    chomp;
    push(@output_data, $_);
    last if (/lookup_table/i);
    $elements = $1 if (/point_data +(\d+)/i);
    $header_lines++;
}
close(INPUT);

# Saves the actual data of the different input files into an array with
# values already interleaved.
my $files = keys(%inputs);
my $current_file = 0;
foreach $filename (sort(keys(%inputs))) {
    print $filename, $inputs{$filename};
    open(INPUT, '<', $filename) or die "Error opening $filename: $!";
    my $i = 0;
    while (<INPUT>) {
        last if ($i == $header_lines);
        $i++;
    }
    my $j = 0;
    while (<INPUT>) {
        chomp;
        my @records = split;
        foreach (@records) {
            # multiplies by the weight and puts in the array
            $input_data[$files * $j + $current_file] = $_ * $inputs{$filename};
            $j++;
        }
    }
    close(INPUT);
    $current_file++;
}

my @ReturnData;
for (my $i = 0; $i < @input_data; $i += $files) {
    push @ReturnData, sprintf '%.5E', coffOfVar( @input_data[$i .. $i + $files - 1] );
}

# file generation
for (my $i = 0; $i < @ReturnData; $i += 4) {
    if (@ReturnData - $i < 4) {
        push(@output_data, join(' ', @ReturnData[$i .. $#ReturnData]));
        last;
    }
    push(@output_data, join(' ', @ReturnData[$i .. $i + 3]));
}

$, = " ";
$\ = "\n";
print $output_name;
open(OUTPUT, '>', $output_name) or die "Error opening output: $!";
foreach (@output_data) {
    print OUTPUT $_;
}
close(OUTPUT);
exit(0);

sub coffOfVar {
    my $mean = sum( @_ ) / $files;
    return 0 unless $mean;
    my @dif2s = map { ( $_ - $mean )**2 } @_;
    my $stddev = sum( @dif2s ) / $files;
    return $stddev**0.5 / $mean;
}
