
Speeding up named capture buffer access

by SBECK (Monk)
on Dec 01, 2009 at 13:33 UTC (#810388, perlquestion)
SBECK has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to optimize Date::Manip where I've started to use named capture buffers (which I REALLY like), and I end up with something like the following:

     $string =~ $re;
     ($h,$mn,$s) = ($+{'h'},$+{'mn'},$+{'s'});

If I run a test case with 10,000 strings, profiling it shows that I call Tie::Hash::NamedCapture::FETCH 30,000 times at a significant time cost.

I'd like to optimize this somehow (but I'm not willing to give up named buffers). Ideally, I'd like to be able to do something like:

     %tmp = %+;
and copy out the entire hash with a single call, or something like:
     ($h,$mn,$s) = Tie::Hash::NamedCapture::FETCH('h','mn','s')
(which I realize doesn't work) so that I can reduce the problem to one call instead of 3.

Any suggestions on how this might be done?

UPDATE: Based on some of the comments, I want to clarify a couple things. First, I'm not willing to go back to numbered matches. The example I included here is VERY simple, and doesn't illustrate how much nicer a complicated regexp can be when using named buffers... but they are, and I'll take the named buffers (with the ease of maintainability that they bring) over the speed of numbered matches.

I realize that, in the example above, I'm forced to have 3 FETCHes... I was just looking for a way to make them as optimal as possible, and if I could get an entire hash, as opposed to getting it one key at a time, the FETCHes could all be done in the Tie::Hash::NamedCapture module (which is basically written in C with a simple Perl wrapper), so it should be significantly faster. Unfortunately, this doesn't seem to be possible.

What's left, then, is a call for any suggestions to optimize things within the constraint of using named buffers. One of the suggestions (using hash slices) resulted in a significant speedup (though not as much as I would expect if I could reduce the number of FETCHes), probably due to a Perl optimization that I wasn't aware of, so the code now reads:

     ($h,$mn,$s) = @+{qw(h mn s)};

Thanks for all the comments and suggestions!
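
For readers who want to try the hash-slice form in isolation, here is a minimal sketch. The pattern and the `parse_time` helper are assumptions made up for illustration; the real Date::Manip regexps are more involved.

```perl
use strict;
use warnings;

# Hypothetical time pattern, for illustration only.
my $re = qr/^(?<h>\d{1,2}):(?<mn>\d{2}):(?<s>\d{2})$/;

sub parse_time {
    my ($string) = @_;

    # Only read %+ after a successful match; after a failed match it
    # still reflects whatever matched last (or nothing at all).
    return unless $string =~ $re;

    # @+{...} is a hash slice of %+, yielding the three captures at once.
    my ($h, $mn, $s) = @+{qw(h mn s)};
    return ($h, $mn, $s);
}

my @parts = parse_time('12:34:56');
print join('-', @parts), "\n";    # prints "12-34-56"
```

The slice still goes through the tie interface once per key, so this changes how the lookups are written, not how many there are.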

Re: Speeding up named capture buffer access
by BrowserUk (Pope) on Dec 01, 2009 at 14:02 UTC

    I'm fairly sure that you are out of luck. You need to access 3 keys, and that will always involve 3 calls to FETCH.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Speeding up named capture buffer access
by moritz (Cardinal) on Dec 01, 2009 at 14:05 UTC
    My first attempt would be to use hash slices:
    ($h, $mn, $s) = @+{qw(h mn s)};

    and see if it's actually faster.

      Good news/bad news.

      The number of calls didn't change at all (I didn't really expect it to).

      Oddly enough, though, the time required did decrease significantly, so I'll definitely switch to using hash slices. Probably some internal optimization that I wasn't aware of.

      I'm still going to try to reduce the number of calls though... that's where the big speedup would come.
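
A timing difference like this is easy to check on your own Perl build with the core Benchmark module. The sketch below is only an assumption about how one might set that up: the pattern, the input string, and the `by_key` and `by_slice` names are invented here, and the numbers will vary by Perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $re     = qr/^(?<h>\d+):(?<mn>\d+):(?<s>\d+)$/;    # hypothetical pattern
my $string = '12:34:56';

# Three separate element lookups on %+, as in the original question.
sub by_key {
    return unless $string =~ $re;
    my ($h, $mn, $s) = ($+{h}, $+{mn}, $+{s});
    return ($h, $mn, $s);
}

# One hash slice on %+, as suggested in the reply.
sub by_slice {
    return unless $string =~ $re;
    my ($h, $mn, $s) = @+{qw(h mn s)};
    return ($h, $mn, $s);
}

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, { by_key => \&by_key, by_slice => \&by_slice });
```

Both subs include the cost of the match itself, so the table shows the difference relative to the whole operation rather than the lookups alone.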

        If you are going to (have to?) immediately assign the named captures to local/global variables (rather than using the named captures themselves), wouldn't you be better off avoiding the overhead of the ties completely by sticking with unnamed captures?

        I just can't see any advantage in:

        $string =~ $re; ($h,$mn,$s) = ($+{'h'},$+{'mn'},$+{'s'})

        Over (with unnamed captures):

        ($h,$mn,$s) = $string =~ $re;

        Just a not inconsiderable overhead.
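
        A possible middle ground, not proposed by anyone in the thread: named groups in Perl are also numbered, so a list-context match returns their contents even when the pattern uses names. The sketch below assumes a made-up pattern and a hypothetical `split_time` helper. The catch is that the variables on the left must follow the order of the groups in the pattern, which gives back some of the maintainability that named captures buy.

```perl
use strict;
use warnings;

# Named groups keep the pattern readable; they are numbered 1..3 as well.
my $re = qr/^(?<h>\d+):(?<mn>\d+):(?<s>\d+)$/;    # hypothetical pattern

sub split_time {
    my ($string) = @_;

    # In list context a successful match returns the captures in group
    # order, and a failed match returns the empty list. %+ is not touched.
    my @caps = $string =~ $re;
    return @caps;
}

my ($h, $mn, $s) = split_time('12:34:56');
print "$h $mn $s\n";    # prints "12 34 56"
```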



        I don't see anything wrong with:

        %tmp = %+;

        Did you try this and count the number of calls?

        BTW, I first thought of the following, which would update/append to an existing hash:

        @tmp{ keys %+ } = values %+;

        but I'd predict that the number of calls would increase.
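
        For what it's worth, copying %+ into an ordinary hash does work as a plain assignment. The sketch below assumes a made-up pattern and a hypothetical `capture_hash` helper. It shows only that the copy works; whether it reduces the number of tie-method calls would have to be measured with a profiler.

```perl
use strict;
use warnings;

my $re = qr/^(?<h>\d+):(?<mn>\d+):(?<s>\d+)$/;    # hypothetical pattern

sub capture_hash {
    my ($string) = @_;
    return unless $string =~ $re;

    # Copy the tied %+ into a plain hash. Later reads of %tmp are
    # ordinary hash lookups and survive subsequent matches.
    my %tmp = %+;
    return \%tmp;
}

my $caps = capture_hash('12:34:56');
print "$caps->{h} $caps->{mn} $caps->{s}\n";    # prints "12 34 56"
```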
