PerlMonks  

Concatenating regexs?

by Micz (Beadle)
on May 04, 2001 at 17:34 UTC ( [id://77948]=perlquestion )

Micz has asked for the wisdom of the Perl Monks concerning the following question:

I have a bunch of regexes which operate on the same string, doing s/// and tr///. Can I put all of this into one regex? It needs to be a Perl 5 or POSIX compliant regex, as I want to use it in Perl, PHP, Java, etc., which is why I would like one regex: it would be easier to port. The code currently looks like:
    $print =~ s/([b-z]|[A-Z]):/\1/g;
    $print =~ s/a |V /A /g;
    $print =~ s/a: /A: /g;
    $print =~ s/2:/2 %/g;
    $print =~ s/9|3:/&/g;
    $print =~ s/([a,e,O])I/\1j/g;
    $print =~ tr/QRC@6{/Orx**E/;
    ...
You get the point. Please excuse my coding. Thanks for your help!

Replies are listed 'Best First'.
Re: Concatenating regexs?
by japhy (Canon) on May 04, 2001 at 17:48 UTC
    It's hard to just "group" together regexes into one regex, unless you make a "dispatch table".
    s/([b-zA-Z]):/\1/g;
    s/[aV] /A /g;
    s/a: /A: /g;
    s/2:/2 %/g;
    s/9|3:/&/g;
    s/([aeO])I/\1j/g;
    tr/QRC@6{/Orx**E/;

    my %trans;
    $trans{"$_:"} = $_ for 'b' .. 'z', 'A' .. 'Z';
    $trans{"$_ "} = "A " for 'a', 'V';
    $trans{"a: "} = "A: ";
    $trans{"2:"} = "2 %";
    $trans{$_} = "&" for 9, '3:';
    $trans{"${_}I"} = "${_}j" for 'a', 'e', 'O';

    my $regex = join '|', map quotemeta($_), keys %trans;

    ### WARNING -- this does not allow for nested
    ### translations (like (A => 'BC', B => 'D') on
    ### the string "A" will yield "BC", not "DC")
    $string =~ s/($regex)/$trans{$1}/g;
    $string =~ tr/QRC@6{/Orx**E/;
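The caveat in the WARNING comment can be seen with a tiny table. The sketch below is only an illustration: the two-entry `%trans` comes from the comment itself, while the input string `'AB'` and the longest-first key sort are assumptions added here (the sort is a precaution for tables where one key is a prefix of another, and is not part of the code above).

```perl
use strict;
use warnings;

# Two-entry table taken from the WARNING comment; not the real table.
my %trans = (A => 'BC', B => 'D');

# Assumption: try longer keys first, in case one key is a prefix of another.
my $regex = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } keys %trans;

# Single pass: each match is replaced once; the output is not rescanned.
(my $single = 'AB') =~ s/($regex)/$trans{$1}/g;

# Sequential substitutions: the second one sees the output of the first.
my $sequential = 'AB';
$sequential =~ s/A/BC/g;
$sequential =~ s/B/D/g;

print "$single\n";      # BCD
print "$sequential\n";  # DCD
```

So a single combined pass is not always a drop-in replacement for a chain of substitutions; it is one only when no replacement text can be matched by a later rule.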


    japhy -- Perl and Regex Hacker
Re: Concatenating regexs?
by suaveant (Parson) on May 04, 2001 at 17:45 UTC
    not easily, since they all do quite different things... much easier as you have it, but if you are doing them often maybe you should look into using qr{} to precompile the regexps...
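A minimal sketch of the qr{} suggestion, assuming a hypothetical helper named `fixup` and only two of the rules from the question. Perl already compiles a constant pattern literal once, so the gain is mostly in keeping the patterns in named variables that can be reused or stored in a list.

```perl
use strict;
use warnings;

# Precompiled patterns; the variable names are illustrative only.
my $colon_rule = qr/([b-zA-Z]):/;
my $glide_rule = qr/([aeO])I/;

# Hypothetical helper that applies both rules in order.
sub fixup {
    my ($text) = @_;
    $text =~ s/$colon_rule/$1/g;      # drop the colon after a letter
    $text =~ s/$glide_rule/${1}j/g;   # turn a trailing I into j
    return $text;
}

my $result = fixup('b: aI');
print "$result\n";   # b aj
```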

    also, you realize that $print =~ s/([a,e,O])I/\1j/g; matches ',' as well as a, e and O? if you don't want to match the comma, use [aeO]. [] is a character class, so it looks at individual characters, and a separator is not necessary.
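A short check of the comma point, using a made-up input string `',I aI'`; only the two character classes come from the thread.

```perl
use strict;
use warnings;

my $input = ',I aI';   # made-up test string

# Class with commas: the comma itself is one of the accepted characters.
(my $with_commas = $input) =~ s/([a,e,O])I/${1}j/g;

# Class without commas: only a, e and O are accepted.
(my $without_commas = $input) =~ s/([aeO])I/${1}j/g;

print "$with_commas\n";      # ,j aj
print "$without_commas\n";   # ,I aj
```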

    also I believe that using $1 in the second half of the substitution is preferred to \1... perl -w would tell you that...

    you can do $print =~ s/([b-zA-Z]):/\1/g; which should be much faster than $print =~ s/([b-z]|[A-Z]):/\1/g; character classes can have as many ranges as you want in them.... but the (|) will slow you down compared to a straight character class...

    Update: Oops... 9|3: does match 9 or 3:... my bad, ignore the next line
    not sure, do you want 9|3: to match '9|3:' or do you want it to match '9' or '3:'... it will match the former...
                    - Ant

      thanks for your help, I have a lot to learn. I have taken the first points and used them. How would I find either '9' or '3:' ? using 9\|3: ? thanks again, jan
        Ack! oops, sorry... my bad /9|3:/ will do the or...
                        - Ant
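To make the correction concrete, here is a small comparison of the unescaped and escaped forms; the three input strings are chosen just for the demonstration.

```perl
use strict;
use warnings;

my (%alternation, %literal);

for my $input ('9', '3:', '9|3:') {
    # Unescaped |: matches either "9" or "3:".
    ($alternation{$input} = $input) =~ s/9|3:/&/g;

    # Escaped \|: matches only the literal five characters "9|3:".
    ($literal{$input} = $input) =~ s/9\|3:/&/g;

    print "input=$input alternation=$alternation{$input} literal=$literal{$input}\n";
}
# input=9 alternation=& literal=9
# input=3: alternation=& literal=3:
# input=9|3: alternation=&|& literal=&
```

So /9|3:/ is already the form that finds either '9' or '3:'; escaping the bar is only needed to match a literal '|'.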
