Concatenating regexs?

Micz has asked for the wisdom of the Perl Monks concerning the following question:

I have a bunch of regexes which operate on the same string, doing s/// and tr///. Can I put this all into one regex? It needs to be a Perl 5 or POSIX-compliant regex, as I want to use it in Perl, PHP, Java etc., which is why I would like one regex: it would be easier to transport. The code currently looks like:

$print =~ s/([b-z]|[A-Z]):/\1/g;
$print =~ s/a |V /A /g;
$print =~ s/a: /A: /g;
$print =~ s/2:/2 %/g;
$print =~ s/9|3:/&/g;
$print =~ s/([a,e,O])I/\1j/g;
$print =~ tr/QRC@6{/Orx**E/;
...

You get the point. Please excuse my coding. Thanks for your help!


Replies are listed 'Best First'.
Re: Concatenating regexs? by japhy (Canon) on May 04, 2001 at 17:48 UTC
It's hard to just "group" regexes together into one regex, unless you make a "dispatch table".

s/([b-zA-Z]):/\1/g;
s/[aV] /A /g;
s/a: /A: /g;
s/2:/2 %/g;
s/9|3:/&/g;
s/([aeO])I/\1j/g;
tr/QRC@6{/Orx**E/;

my %trans;
$trans{"$_:"} = $_ for 'b' .. 'z', 'A' .. 'Z';
$trans{"$_ "} = "A " for 'a', 'V';
$trans{"a: "} = "A: ";
$trans{"2:"} = "2 %";
$trans{$_} = "&" for 9, '3:';
$trans{"${_}I"} = "${_}j" for 'a', 'e', 'O';

my $regex = join '|', map quotemeta($_), keys %trans;

### WARNING -- this does not allow for nested
### translations (like (A => 'BC', B => 'D') on
### the string "A" will yield "BC", not "DC")
$string =~ s/($regex)/$trans{$1}/g;
$string =~ tr/QRC@6{/Orx**E/;

japhy -- Perl and Regex Hacker
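The dispatch-table idea and its single-pass caveat can be tried on a small table. This is only a sketch: the `translate` helper, the sample strings and the longest-key-first sort are illustrative assumptions, not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build one alternation from a
# lookup table and apply it in a single pass over the string.
sub translate {
    my ($string, $trans) = @_;
    # Longest keys first, so a longer key wins over a key that is its prefix.
    my $regex = join '|',
        map  { quotemeta }
        sort { length($b) <=> length($a) || $a cmp $b }
        keys %$trans;
    $string =~ s/($regex)/$trans->{$1}/g;
    return $string;
}

my %trans = ('2:' => '2 %', '9' => '&', '3:' => '&');
print translate('x2: y9 z3:', \%trans), "\n";   # x2 % y& z&

# The caveat from the WARNING comment: replacement text is not rescanned.
my %nested = (A => 'BC', B => 'D');
print translate('A', \%nested), "\n";           # BC, not DC
```

Because the string is scanned only once, a chain of separate s/// statements that feed each other (one rule producing text that a later rule rewrites) will not behave the same way after being folded into a table like this.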
Re: Concatenating regexs? by suaveant (Parson) on May 04, 2001 at 17:45 UTC
Not easily, since they all do quite different things... much easier as you have it, but if you are doing them often maybe you should look into using qr{} to precompile the regexps.

Also, you realize that `$print =~ s/([a,e,O])I/\1j/g;` matches , as well as a, e and O? If you don't want to match the comma, use `[aeO]`. `[]` is a character class, so it looks at characters, and a separator is not necessary. Also, I believe that using $1 in the second half of the substitution is preferred to \1... perl -w would tell you that.

You can do `$print =~ s/([b-zA-Z]):/\1/g;` which should be much faster than `$print =~ s/([b-z]|[A-Z]):/\1/g;`. Character classes can have as many ranges as you want in them, but the (|) will slow you down compared to a straight character class.

Update: Oops... 9|3: does match 9 or 3:... my bad, ignore the next line.

Not sure, do you want 9|3: to match '9|3:' or do you want it to match '9' or '3:'? It will match the former...

- Ant
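A small sketch of the points above about qr{}, commas inside a character class, and $1 in the replacement. The variable names and sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Compile the patterns once with qr// and reuse them.
my $with_commas = qr/([a,e,O])I/;   # the class also contains a literal comma
my $no_commas   = qr/([aeO])I/;     # the class contains only a, e and O

for my $sample ('aI', ',I') {
    printf "%-3s with commas: %d  without: %d\n", $sample,
        ($sample =~ $with_commas ? 1 : 0),
        ($sample =~ $no_commas   ? 1 : 0);
}

# $1 in the replacement, which perl -w suggests in place of \1.
(my $fixed = 'aI eI ,I') =~ s/$no_commas/${1}j/g;
print "$fixed\n";   # aj ej ,I
```

Whether precompiling with qr// gives a measurable speed-up depends on how the patterns are used; for constant patterns like these it mainly helps readability and reuse.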
Re: Re: Concatenating regexs? by Micz (Beadle) on May 04, 2001 at 18:56 UTC
Thanks for your help, I have a lot to learn. I have taken the first points and used them. How would I find either '9' or '3:'? Using 9\|3:? Thanks again, jan
Re: Re: Re: Concatenating regexs? by suaveant (Parson) on May 04, 2001 at 19:06 UTC
Ack! Oops, sorry... my bad, /9|3:/ will do the or... - Ant
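To make the difference concrete, here is a minimal sketch comparing a bare `|`, an escaped `\|`, and a grouped alternation on a made-up sample string (the variable names are illustrative only):

```perl
use strict;
use warnings;

my $text = '9 3: 9|3:';

(my $alternation = $text) =~ s/9|3:/&/g;       # bare | means '9' or '3:'
(my $literal     = $text) =~ s/9\|3:/&/g;      # \| is a literal pipe character
(my $grouped     = $text) =~ s/(?:9|3):/&/g;   # '9:' or '3:', for comparison

print "$alternation\n";   # & & &|&
print "$literal\n";       # 9 3: &
print "$grouped\n";       # 9 & 9|&
```

Alternation has the lowest precedence in a regex, so `9|3:` splits into the two whole branches `9` and `3:` without any extra grouping.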

