PerlMonks  

Re^2: Simple regex question. Grouping with a negative lookahead assertion.

by BrowserUk (Pope)
on Jul 14, 2013 at 06:42 UTC (#1044203)


in reply to Re: Simple regex question. Grouping with a negative lookahead assertion.
in thread Simple regex question. Grouping with a negative lookahead assertion.

Sorry pal. Most of your posts -- especially those regarding regex -- get an upvote from me, but this one got --. It's a crock.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^3: Simple regex question. Grouping with a negative lookahead assertion.
by AnomalousMonk (Abbot) on Jul 14, 2013 at 22:47 UTC
    Sorry pal. Most of your posts -- especially those regarding regex -- get an upvote from me, but this one got --. It's a crock.

    Apparently it was so bad, you tried to -- it three times!

    I was curious about the locus of crockitudinousness and decided to do some benchmarking, which is usually at the root of these squabbles. (Update: Benchmarked variations include some of those used by kcott here.) I must admit I was shocked, shocked by the results. There were no big surprises until I looked at the effect of the //p regex modifier. Simply adding this modifier to
        m{ atg ([acgt]+?) (?= taa|tag|tga) }xmsg
    in the push @ra, $1 variation ($push_cg below, which otherwise performs roughly comparably to the other variations) slows its performance by orders of magnitude, so much so that I didn't have the patience to run the benchmark to completion.

    Am I doing this right? (Update: I.e., is the effect of the use of //p as in the $push_KM sub below, which I don't even have the patience to benchmark, really so egregious?) Is this all down to the //p modifier? And if so, have the proper authorities been notified? If you've touched on this in other threads, I have not been following these discussions as carefully as I ought. Anyway, here's my benchmark code. As always, I would be interested in any comments you might have.
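A minimal sketch of the with/without //p comparison described above, using the core Benchmark module. This is not the benchmark from the post: the sub names push_cg and push_cg_p, the random test string and its length are assumptions made up for illustration. Also note that perlre documents //p as ignored from Perl 5.20 onward, so any large difference should only be expected on older perls.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Capture-group variation: collect $1 for every match of the quoted regex.
sub push_cg {
    my ($s) = @_;
    my @ra;
    push @ra, $1 while $s =~ m{ atg ([acgt]+?) (?= taa|tag|tga) }xmsg;
    return \@ra;
}

# Identical regex with only the /p modifier added.
sub push_cg_p {
    my ($s) = @_;
    my @ra;
    push @ra, $1 while $s =~ m{ atg ([acgt]+?) (?= taa|tag|tga) }xmsgp;
    return \@ra;
}

# Hypothetical test data: a pseudo-random DNA string (not the thread's data).
srand 42;
my $dna = join '', map { (qw(a c g t))[rand 4] } 1 .. 20_000;

# Sanity check: both variations must return the same captures.
die "variations disagree\n"
    unless "@{ push_cg($dna) }" eq "@{ push_cg_p($dna) }";

# Run each variation for at least one CPU second and print a comparison table.
cmpthese(-1, {
    push_cg   => sub { push_cg($dna) },
    push_cg_p => sub { push_cg_p($dna) },
});
```

Raising the length of the test string is probably the quickest way to see whether the cost of //p grows with the size of the input on a given perl.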

      "Benchmarked variations include some of those used by kcott"

      I'm assuming you're referring to cg_ncg with (?: ... ) and cg_atomic with (?> ... ).

      Prior to posting yesterday, and purely out of curiosity, I ran /atg(.+?)(?:taa|tag|tga)/ and /atg(.+?)(?>taa|tag|tga)/ through Regexp::Debugger looking at the matching process step-by-step. From memory, ?: took 64 steps (in total) to complete the match while ?> took 66 steps. That probably accounts for the cg_atomic vs. cg_ncg 3% (66/64 = 1.03125).

      Again from memory, the two extra steps occurred after failing to match taa|tag|tga after either the 'a' or 't' of 'atg'. For the ?: case, the steps were something like: "(?:" start non-capture group; "taa" no match; "|" next alt; ...; "tga" no match. For the ?> case: "(?>" start non-backtracking group; ... as for ?: ...; (then the additional step) ")" end non-backtracking group.

      Obviously, you can check that yourself if you're so inclined. I wasn't inclined to repeat the process. :-)
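For anyone who does want to check, here is a small sketch that runs the two patterns quoted above against the same input and confirms they capture the same text. The sample sequence is a made-up assumption, not the string used in the thread. Adding the core pragma use re 'debug'; at the top would print the regex engine's own trace, though its output is not the same thing as the step counts Regexp::Debugger reports.

```perl
use strict;
use warnings;

# Hypothetical sample sequence, chosen only for illustration.
my $seq = 'ccatgccctaagg';

# The two patterns from the comment above: non-capturing vs. atomic group.
my %re = (
    ncg    => qr/atg(.+?)(?:taa|tag|tga)/,
    atomic => qr/atg(.+?)(?>taa|tag|tga)/,
);

# Match in list context so the single capture lands in %got.
my %got;
for my $name (sort keys %re) {
    ($got{$name}) = $seq =~ $re{$name};
    printf "%-6s captured '%s'\n", $name, $got{$name} // '(no match)';
}

# The ratio of the remembered step counts, as quoted in the comment.
printf "66/64 = %.5f\n", 66 / 64;
```

Both groups wrap the same fixed-length alternatives with nothing after them, so the atomic version has nothing to backtrack into and the two patterns should always agree on what they match; only the bookkeeping differs.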

      [I haven't analysed your benchmarking further.]

      -- Ken
