Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

Re: Why would one want in a regex a class with only a single entry?

by hipowls (Curate)
on Mar 25, 2008 at 07:53 UTC ( #676069=note: print w/ replies, xml ) Need Help??


in reply to Why would one want in a regex a class with only a single entry?

Character classes don't involve backtracking as this example shows.

my $string = 'bet'; use re qw(debug); my $rx1 = qr/b(?:a|e)t/; $string =~ /$rx1/; my $rx2 = qr/b[ae]t/; $string =~ /$rx2/;
output of $rx1 which shows the branch
Compiling REx `b(?:a|e)t' size 12 Got 100 bytes for offset annotations. first at 1 1: EXACT <b>(3) 3: BRANCH(6) 4: EXACT <a>(10) 6: BRANCH(9) 7: EXACT <e>(10) 9: TAIL(10) 10: EXACT <t>(12) 12: END(0) anchored "b" at 0 (checking anchored) minlen 3 Offsets: [12] 1[1] 0[0] 4[1] 5[1] 0[0] 6[1] 7[1] 0[0] 7[0] 9[1] 0[0] 10[0] Guessing start of match, REx "b(?:a|e)t" against "bet"... Found anchored substr "b" at offset 0... Guessed: match at offset 0 Matching REx "b(?:a|e)t" against "bet" Setting an EVAL scope, savestack=7 0 <> <bet> | 1: EXACT <b> 1 <b> <et> | 3: BRANCH Setting an EVAL scope, savestack=13 1 <b> <et> | 4: EXACT <a> failed... 1 <b> <et> | 7: EXACT <e> 2 <be> <t> | 10: EXACT <t> 3 <bet> <> | 12: END Match successful!
output of $rx2 where there is no branching.
Compiling REx `b[ae]t' size 16 Got 132 bytes for offset annotations. first at 1 1: EXACT <b>(3) 3: ANYOF[ae](14) 14: EXACT <t>(16) 16: END(0) anchored "b" at 0 (checking anchored) minlen 3 Offsets: [16] 1[1] 0[0] 2[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[ +0] 6[1] 0[0] 7[0] Guessing start of match, REx "b[ae]t" against "bet"... Found anchored substr "b" at offset 0... Guessed: match at offset 0 Matching REx "b[ae]t" against "bet" Setting an EVAL scope, savestack=9 0 <> <bet> | 1: EXACT <b> 1 <b> <et> | 3: ANYOF[ae] 2 <be> <t> | 14: EXACT <t> 3 <bet> <> | 16: END Match successful! Freeing REx: `"b(?:a|e)t"' Freeing REx: `"b[ae]t"'
These examples were run under perl 5.8.8. 5.9.5 introduced trie optimization which gets rid of the branch. Output of $rx1 under 5.10.0
Compiling REx "b(?:a|e)t" Final program: 1: EXACT <b> (3) 3: TRIE-EXACT[ae] (10) <a> <e> 10: EXACT <t> (12) 12: END (0) anchored "b" at 0 (checking anchored) minlen 3 Guessing start of match in sv for REx "b(?:a|e)t" against "bet" Found anchored substr "b" at offset 0... Guessed: match at offset 0 Matching REx "b(?:a|e)t" against "bet" 0 <> <bet> | 1:EXACT <b>(3) 1 <b> <et> | 3:TRIE-EXACT[ae](10) 1 <b> <et> | State: 1 Accepted: 0 Charid: +2 CP: 65 After State: 3 2 <be> <t> | State: 3 Accepted: 1 Charid: +2 CP: 0 After State: 0 got 1 possible matches only one match left: #2 <e> 2 <be> <t> | 10:EXACT <t>(12) 3 <bet> <> | 12:END(0) Match successful! Freeing REx: "b(?:a|e)t"


Comment on Re: Why would one want in a regex a class with only a single entry?
Select or Download Code
Replies are listed 'Best First'.
Re^2: Why would one want in a regex a class with only a single entry?
by ack (Deacon) on Mar 26, 2008 at 03:57 UTC

    Ah, of course! I mistakenly was thinking of the alternating classes rather than classes themselves. My mistake...and a novice one at that.

    Thanks, hipowls.

    ack Albuquerque, NM

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://676069]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (10)
As of 2015-07-08 00:13 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (93 votes), past polls