Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

How can I hope to use regular expressions without creating illegible and unmaintainable code?

by faq_monk (Initiate)
on Oct 08, 1999 at 00:25 UTC ( [id://656]=perlfaq nodetype: print w/replies, xml ) Need Help??

Current Perl documentation can be found at perldoc.perl.org.

Here is our local, out-dated (pre-5.6) version:

Three techniques can make regular expressions maintainable and understandable.

Comments Outside the Regexp

Describe what you're doing and how you're doing it, using normal Perl comments.

    # turn the line into the first word, a colon, and the
    # number of characters on the rest of the line
    s/^(\w+)(.*)/ lc($1) . ":" . length($2) /meg;
Comments Inside the Regexp

The /x modifier causes whitespace to be ignored in a regexp pattern (except in a character class), and also allows you to use normal comments there, too. As you can imagine, whitespace and comments help a lot.

/x lets you turn this:

    s{<(?:[^>'"]*|".*?"|'.*?')+>}{}gs;

into this:

    s{ <                    # opening angle bracket
        (?:                 # Non-backreffing grouping paren
             [^>'"] *       # 0 or more things that are neither > nor ' nor "
                |           #    or else
             ".*?"          # a section between double quotes (stingy match)
                |           #    or else
             '.*?'          # a section between single quotes (stingy match)
        ) +                 #   all occurring one or more times
       >                    # closing angle bracket
    }{}gsx;                 # replace with nothing, i.e. delete

It's still not quite so clear as prose, but it is very useful for describing the meaning of each part of the pattern.

Different Delimiters

While we normally think of patterns as being delimited with / characters, they can be delimited by almost any character. the perlre manpage describes this. For example, the s/// above uses braces as delimiters. Selecting another delimiter can avoid quoting the delimiter within the pattern:

    s/\/usr\/local/\/usr\/share/g;      # bad delimiter choice
    s#/usr/local#/usr/share#g;          # better

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (5)
As of 2024-03-19 10:01 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found