PerlMonks  

I'm having trouble matching over more than one line. What's wrong?

by faq_monk (Initiate)
on Oct 08, 1999 at 00:25 UTC (#657, perlfaq)

Current Perl documentation can be found at perldoc.perl.org.

Here is our local, out-dated (pre-5.6) version:

Either you don't have more than one line in the string you're looking at (probably), or else you aren't using the correct modifier(s) on your pattern (possibly).

There are many ways to get multiline data into a string. If you want it to happen automatically while reading input, you'll want to set $/ (probably to '' for paragraphs or undef for the whole file) to allow you to read more than one line at a time.
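As a rough sketch of how the three settings of $/ differ, the snippet below reads the same text line by line, by paragraph, and as a whole file. The in-memory filehandle, the sample text, and the helper name read_records are assumptions made for illustration; they are not part of the FAQ.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the FAQ): read every record from $text
# using the given input record separator.
sub read_records {
    my ($text, $separator) = @_;
    local $/ = $separator;              # restored when the sub returns
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my @records = <$fh>;                # list context: one element per record
    close $fh;
    return @records;
}

my $text = "one\ntwo\n\nthree\nfour\n";

my @lines      = read_records($text, "\n");    # default: one line per record
my @paragraphs = read_records($text, '');      # paragraph mode
my @whole      = read_records($text, undef);   # slurp mode: the whole file

printf "%d lines, %d paragraphs, %d whole-file record\n",
    scalar @lines, scalar @paragraphs, scalar @whole;
```

Using local keeps the change to $/ confined to the helper, so code elsewhere still reads line by line.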

Read the perlre manpage to help you decide which of /s and /m (or both) you might want to use: /s allows dot to include newline, and /m allows caret and dollar to match next to a newline, not just at the end of the string. You do need to make sure that you've actually got a multiline string in there.
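A small sketch of what each modifier changes, using a made-up two-line string (the variable names are illustrative only):

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without /s, dot stops at a newline, so the pattern cannot span both lines.
my $dot_plain = $text =~ /first.*second/  ? 1 : 0;
# With /s, dot also matches a newline.
my $dot_s     = $text =~ /first.*second/s ? 1 : 0;

# Without /m, ^ matches only at the very start of the string.
my $caret_plain = $text =~ /^second/  ? 1 : 0;
# With /m, ^ also matches just after an embedded newline.
my $caret_m     = $text =~ /^second/m ? 1 : 0;

print "dot: plain=$dot_plain /s=$dot_s; caret: plain=$caret_plain /m=$caret_m\n";
```

Only the /s and /m variants match here, and the two modifiers are independent: /s affects dot, /m affects the anchors.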

For example, this program detects duplicate words, even when they span line breaks (but not paragraph ones). For this example, we don't need /s because we aren't using dot in a regular expression that we want to cross line boundaries. Neither do we need /m because we aren't wanting caret or dollar to match at any point inside the record next to newlines. But it's imperative that $/ be set to something other than the default, or else we won't actually ever have a multiline record read in.

    $/ = '';            # read a whole paragraph at a time, not just one line
    while ( <> ) {
        while ( /\b([\w'-]+)(\s+\1)+\b/gi ) {   # word starts alpha
            print "Duplicate $1 at paragraph $.\n";
        }
    }
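To try the pattern without an input file, the same regular expression can be wrapped in a sub and run on a string. This is only a sketch: the helper name duplicate_words and the sample paragraph are assumptions, not part of the FAQ.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the FAQ): return each word that is
# immediately repeated in $paragraph, even across a line break.
sub duplicate_words {
    my ($paragraph) = @_;
    my @found;
    # \s+ matches the newline between the repeated words, so neither /s
    # nor /m is needed; \1 refers back to the captured word.
    while ( $paragraph =~ /\b([\w'-]+)(\s+\1)+\b/gi ) {
        push @found, $1;
    }
    return @found;
}

my @dups = duplicate_words("Paris in the\nthe spring is is nice");
print "Duplicate: $_\n" for @dups;
```

The duplicate "the" is found even though a newline separates the two copies, because the string passed in holds the whole paragraph.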

Here's code that finds lines that begin with "From " (which would be mangled by many mailers):

    $/ = '';            # read a whole paragraph at a time, not just one line
    while ( <> ) {
        while ( /^From /gm ) { # /m makes ^ match next to \n
            print "leading from in paragraph $.\n";
        }
    }

Here's code that finds everything between START and END in a file:

    undef $/;           # read in whole file, not just one line or paragraph
    while ( <> ) {
        # /s makes . cross line boundaries; /g moves on to the next match
        # each time through the loop (without it the loop never ends)
        while ( /START(.*?)END/sgm ) {
            print "$1\n";
        }
    }
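The same extraction can also be written as a single match in list context. The following is a sketch under assumptions: the helper name between_markers and the sample text are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the FAQ): return every chunk of text
# found between a START marker and the next END marker.
sub between_markers {
    my ($text) = @_;
    # In list context, /g returns the captured group from every match;
    # /s lets . match newlines so a chunk may span several lines.
    return $text =~ /START(.*?)END/sg;
}

my $sample = "xx START a\nb END yy START c END";
my @chunks = between_markers($sample);
print "[$_]\n" for @chunks;
```

The non-greedy .*? matters here: a greedy .* would run from the first START to the last END and return one large chunk.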
