http://www.perlmonks.org?node_id=1017512


in reply to Optimization Help on Perl Hash Traversal (eval use)

The idea behind the eval blocks is to add a layer of abstraction so that when I need to add additional statistical analyses I can add entries to mysql with the proper eval string.

The most effective optimisation would be to avoid (re)-evaling your snippets for every id.

And the easiest way to do that would be to construct your snippets so that they can be eval'd into subroutines once each, and then you can call the appropriate subroutine for each ID instead.

As you haven't posted your snippets, I can't offer a realistic example, but by way of giving you an idea, something like:

$_ = sub{ $_ } for keys %{$agg_snippets};

Would (assuming the snippets are correctly defined) turn the snippets into subroutines.

You then just invoke the appropriate subroutine passing the data as arguments; and your code should run substantially faster.
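
A minimal sketch of that idea, assuming a flat hash that maps a name to the text of an anonymous sub (the hash, the snippet bodies and the sample record are made up for illustration; they are not the OP's data). It loops over `values`, whose elements are aliases to the stored values, so assigning through the loop variable puts the coderef back into the hash; `keys` returns copies, so an assignment to those would be lost.

```perl
use strict;
use warnings;

# Hypothetical snippet table: name => source text of an anonymous sub.
my %snippets = (
    minutes => 'sub { my( $rec ) = @_; $rec->{duration_milliseconds} / 60_000 }',
    is_503  => 'sub { my( $rec ) = @_; $rec->{release_code} == 503 ? 1 : 0 }',
);

# Compile each snippet exactly once.
for my $src ( values %snippets ) {    # $src is an alias to the hash value
    my $code = eval $src
        or die "Snippet failed to compile: $@";
    $src = $code;                     # the hash now holds a coderef
}

# Later, per record: a plain subroutine call, no string eval.
my $rec    = { duration_milliseconds => 90_000, release_code => 503 };
my %result = map { $_ => $snippets{$_}->( $rec ) } keys %snippets;

print "$_ = $result{$_}\n" for sort keys %result;
```

Checking the result of each eval at load time also means a broken snippet is reported once, up front, rather than silently yielding undef for every record.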


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Optimization Help on Perl Hash Traversal (eval use)
by mwb613 (Beadle) on Feb 06, 2013 at 22:32 UTC

    Thanks so much!

    Here are two examples of the snippets:

    '{ceil(($call_ids->{$this_call_id}->{$this_index}->{\"duration_milliseconds\"} / 1000.) / 6)*6 / 60;}'

    and

    '{if($call_ids->{$this_call_id}->{$this_index}->{"release_code"} == 503){1;} else{0;}}'

    and this bit of ugliness used to create the grouping values in the hash

    '{my $route = $call_ids->{$this_call_id}->{$this_index}->{''route''};my $src_state = $call_ids->{$this_call_id}->{$this_index}->{''o_state''}; my $dest_state = $call_ids->{$this_call_id}->{$this_index}->{''t_state''}; my $juris_indicator = "f";if(!$src_state){$juris_indicator = "c";}elsif($src_state eq $dest_state){$juris_indicator = "a";}else {$juris_indicator = "b";};$route =~ /^[a|b|c](1[2-9][0-9]{2}[2-9][0-9]{2})/;$corrected_route = $juris_indicator .  $1;$corrected_route = $route if $route =~ /loop|none|lnp_error|no_juris_digits/;$corrected_route = $route if !$corrected_route;$ret_val = "$call_ids->{$this_call_id}->{$this_index}->{''day''},$call_ids->{$this_call_id}->{$this_index}->{''day_chunk''},$call_ids->{$this_call_id}->{$this_index}->{''o_trunk''},$call_ids->{$this_call_id}->{$this_index}->{''t_trunk''},$call_ids->{$this_call_id}->{$this_index}->{''route''},$corrected_route";}'

    I'm a little fuzzy on what you're describing as I've never attempted it before but are you creating a dynamic, anonymous function? If there is a name for what you're describing let me know and I'll do some research. I'm sure it would go a long way in clarifying that last block of code.

    Thanks!

      Your 3 snippets can be easily converted to (far more readable) subroutines thus:

      'sub {
          my( $call_ids, $call_id, $index ) = @_;
          ceil( ( $call_ids->{$call_id}->{$index}->{duration_milliseconds} / 1000. ) / 6 ) * 6 / 60;
      }'

      'sub {
          my( $call_ids, $call_id, $index ) = @_;
          if( $call_ids->{$call_id}->{$index}->{ release_code } == 503 ){ 1; } else{ 0; }
      }'

      'sub {
          my( $call_ids, $call_id, $index ) = @_;
          my $route      = $call_ids->{$call_id}{$index}{ route };
          my $src_state  = $call_ids->{$call_id}{$index}{ o_state };
          my $dest_state = $call_ids->{$call_id}{$index}{ t_state };
          my $juris_indicator = 'f';
          if( !$src_state ){
              $juris_indicator = 'c';
          }
          elsif( $src_state eq $dest_state ){
              $juris_indicator = 'a';
          }
          else {
              $juris_indicator = 'b';
          };
          $route =~ /^[a|b|c](1[2-9][0-9]{2}[2-9][0-9]{2})/;
          $corrected_route = $juris_indicator . $1;
          $corrected_route = $route if $route =~ /loop|none|lnp_error|no_juris_digits/;
          $corrected_route = $route if !$corrected_route;
          join ',',
              $call_ids->{$call_id}{$index}{ day },
              $call_ids->{$call_id}{$index}{ day_chunk },
              $call_ids->{$call_id}{$index}{ o_trunk },
              $call_ids->{$call_id}{$index}{ t_trunk },
              $call_ids->{$call_id}{$index}{ route },
              $corrected_route;
      }'

      Once you have loaded them into your $agg_snippets hash, those text snippets can be replaced by instantiated subroutines in one pass using eval like this:

      $agg_snippets->{ $_ }{snippet} = eval $agg_snippets->{ $_ }{snippet} for keys %{ $agg_snippets };

      Then later, when you are processing the %$call_ids hash, you can invoke them like this:

      for my $this_call_id ( sort keys %$call_ids ) {
          $count++;
          next if !$this_call_id;
          my $route_attempts = 0;
          for my $this_index ( sort keys %{ $call_ids->{ $this_call_id } } ) {
              $route_attempts++;
              foreach $aggregate_name ( keys %{ $agg_snippets } ){
                  my $this_group_data = eval $grouping_data_eval;
                  $summary_data->{$this_group_data}{$aggregate_name} = 0
                      if !$summary_data->{$this_group_data}->{$aggregate_name};
                  $summary_data->{$this_group_data}{$aggregate_name} +=
                      $agg_snippets->{$aggregate_name}{snippet}->( $call_ids, $this_call_id, $this_index );
                  #   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
              }
          }
      }

      That still leaves the eval of the $grouping_data_eval, which you give no information for, but it should also be possible to eliminate that eval by instantiating it into a subroutine once near the top of the code.
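
      Since the contents of $grouping_data_eval were not posted, the following is only a sketch: the grouping snippet and the sample record are invented stand-ins, and $grouping_sub is a hypothetical name. It shows the same compile-once pattern applied to the grouping code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the unposted $grouping_data_eval text,
# written as the source of an anonymous sub that takes its data as arguments.
my $grouping_data_eval = q{
    sub {
        my( $call_ids, $call_id, $index ) = @_;
        my $rec = $call_ids->{$call_id}{$index};
        join ',', $rec->{day}, $rec->{o_trunk}, $rec->{t_trunk};
    }
};

# Compile once, near the top of the program.
my $grouping_sub = eval $grouping_data_eval
    or die "Grouping snippet failed to compile: $@";

# Made-up sample data in the nested shape used in this thread.
my $call_ids = {
    c1 => { 0 => { day => '2013-02-06', o_trunk => 'A', t_trunk => 'B' } },
};

# Inside the loops, a cheap call replaces the per-record string eval.
my $this_group_data = $grouping_sub->( $call_ids, 'c1', 0 );
print "$this_group_data\n";
```

      The grouping value does not depend on the aggregate name, so the call could also be hoisted out of the innermost loop and made once per index.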

      The overall effect should be to substantially speed up the processing. (BTW: Note how much clearer things are with: a) a little formatting; b) the omission of unnecessary punctuation; c) a little extra horizontal whitespace.)
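
      One way to check the size of the gain on your own machine is the core Benchmark module. This is a rough sketch with a made-up one-line snippet and record, not a measurement of the OP's code; the actual ratio will depend on the snippets and the data.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

# Hypothetical snippet, in both forms: a bare expression to be string-eval'd
# on every call, and the same expression wrapped as anonymous-sub source.
my $body = '$r->{release_code} == 503 ? 1 : 0';
my $src  = 'sub { my( $r ) = @_; $r->{release_code} == 503 ? 1 : 0 }';

my $rec      = { release_code => 503 };
my $compiled = eval $src or die "Snippet failed to compile: $@";

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    string_eval => sub { my $r = $rec; my $v = eval $body },   # recompiled every call
    compiled    => sub { my $v = $compiled->( $rec ) },        # compiled once above
} );
```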



        Thanks again for your suggestions. Unfortunately, due to the "on demand" nature of my job I often have to drop projects until I have time to dig into them again.

        Anyway, your suggestions and examples, along with BrowserUK's suggestion to store the code in YAML config files, greatly helped to speed up and organize the project when I got back to it last week.

        Cheers!

Re^2: Optimization Help on Perl Hash Traversal (eval use)
by Anonymous Monk on Feb 07, 2013 at 01:28 UTC
    Your example does not seem quite clear since it seems that the value of the subroutine is lost each time. We make a subroutine but where do we put the coderef? Do you mean, say:
    $subs{$_} = sub { $_} for ... ?