Re: Passing variables into regular expressions

by Tanalis (Curate)
on Apr 20, 2006 at 08:23 UTC (#544545)


in reply to Passing variables into regular expressions

Hi,

There are a couple of issues with your code that you should be aware of.

Firstly, when you slurp a file into a variable like that, the newline characters (\n) remain within the variable (i.e., you get exactly what was in the file).

To have your regular expression take that into account, you need to add an m alongside the g on the right-hand side of the expression, which has Perl's regular expression engine match over multiple lines, rather than stopping when it hits the newline.

You can simplify the regular expression too to make your life a little easier. The following (working, but only partially tested) code snippet seems to do the job for me:

#! /usr/bin/perl
use strict;
use warnings;

my $hn    = 'localhost.localdomain';
my $hosts = `cat /etc/hosts`;

if ( $hosts =~ /^([\d\.]+)\s+($hn)/mg ) {
    print "true: $1 $2\n";
}
else {
    print "false!\n";
}

Having said that, there's almost certainly a better way to parse this file, but I'm not caffeinated enough to think of it at the minute.
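
On the original question of passing a variable into a pattern: the dots in the hostname are regex metacharacters once interpolated, so an unquoted `$hn` can match more than intended. Below is a minimal sketch of quoting the variable with `\Q...\E`. The sample text stands in for `/etc/hosts`, and both `$hosts_text` and `find_ip` are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the contents of /etc/hosts (not real data).
# The first line would be matched by an unquoted "localhost.localdomain".
my $hosts_text = join "\n",
    '10.0.0.5    localhostXlocaldomain',
    '127.0.0.1   localhost.localdomain localhost',
    '';

# Hypothetical helper: return the address on the line whose first name
# is exactly $name, or undef if there is no such line.
sub find_ip {
    my ($text, $name) = @_;
    # /m lets ^ match at the start of every line in the slurped text;
    # \Q...\E quotes regex metacharacters (such as ".") in $name;
    # (?!\S) requires the name to end at whitespace or end of string.
    return $text =~ /^([\d.]+)\s+\Q$name\E(?!\S)/m ? $1 : undef;
}

my $ip = find_ip($hosts_text, 'localhost.localdomain');
print defined $ip ? "true: $ip\n" : "false!\n";
```

This sketch only checks the first name after the address, so aliases later on the line are not considered.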

Hope that helps!


Re^2: Passing variables into regular expressions
by johngg (Abbot) on Apr 20, 2006 at 13:02 UTC
    I could be misunderstanding something but I thought you had to use the s flag to get the regular expression to match across newlines. The following script

    use strict;
    use warnings;

    my $str = "ab12c\nde34f\ngh56i\njk78l";

    my @digits = $str =~ /(\d+)/g;
    print "\@digits -- @digits\n\n";

    my ($span_d) = $str =~ /(\d\d.*?\d\d)/;
    print "\$span_d -- $span_d\n\n";

    my ($span_m) = $str =~ /(\d\d.*?\d\d)/m;
    print "\$span_m -- $span_m\n\n";

    my ($span_s) = $str =~ /(\d\d.*?\d\d)/s;
    print "\$span_s -- $span_s\n\n";

    produces

    @digits -- 12 34 56 78

    Use of uninitialized value in concatenation (.) or string at reSorM line 12.
    $span_d --

    Use of uninitialized value in concatenation (.) or string at reSorM line 15.
    $span_m --

    $span_s -- 12c
    de34

    The m flag doesn't seem to do the trick. Have I missed something?

    Cheers,

    JohnGG

      Interesting.

      The docs indicate that the s flag causes the input string to be treated as if it's a single line. The m flag, on the other hand, causes the string to be treated as if it's multi-line.

      It seems that .*? can't cross the newline character when using the m flag:

      # doesn't match
      my ($span_m) = $str =~ /(\d\d.*?\d\d)/m;

      # matches
      my ($span_m) = $str =~ /(\d\d.*?\n.*?\d\d)/m;

      # so does
      my ($span_m) = $str =~ /(\d\d\w+\s\w+\d\d)/m;

      Treating the \n as whitespace (or explicitly naming it in the regex) seems to solve the problem. Any ideas why that'd be the case?
        Looking at the Camel book, 3rd edn., table 5-1 on page 150, the entry for /s says "Let . match newline ... " which sort of implies that /m doesn't. So it is the treatment of the "." metacharacter that changes between the two. The following, with no modifying flag, also matches:

        ($span_d) = $str =~ /(\d\d\w+\s\w+\d\d)/;

        This might imply that the default behaviour of m/.../ with no modifying flag is the same as m/.../m. I will delve into the documentation when I get a chance.
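
One way to test that guess is to count how often `^` matches with and without the flag. The sketch below reuses the `$str` from earlier in the thread; the variable names `$starts_default` and `$starts_m` are made up for illustration.

```perl
use strict;
use warnings;

my $str = "ab12c\nde34f\ngh56i\njk78l";

# A list-context /g match returns every match; assigning that list to ()
# and then to a scalar yields the number of matches.
my $starts_default = () = $str =~ /^\w/g;    # no flag: ^ is start of string only
my $starts_m       = () = $str =~ /^\w/mg;   # /m: ^ is start of every line

print "default: $starts_default, /m: $starts_m\n";
```

This prints `default: 1, /m: 4`, which suggests the default is not the same as /m: the two differ in where `^` and `$` may match, while `.` stops at a newline in both.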

        Cheers,

        Johngg

        Update:

        This passage is in the "perlre" manual page

        ...
        m   Treat string as multiple lines. That is, change "^" and "$" from matching the start or end of the string to matching the start or end of any line anywhere within the string.

        s   Treat string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.

        The "/s" and "/m" modifiers both override the $* setting. That is, no matter what $* contains, "/s" without "/m" will force "^" to match only at the beginning of the string and "$" to match only at the end (or just before a newline at the end) of the string. Together, as /ms, they let the "." match any character whatsoever, while still allowing "^" and "$" to match, respectively, just after and just before newlines within the string.
        ...

        (perlre, perl v5.8.4, last change 2004-01-17)
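
The passage above can be checked against the `$str` used earlier in the thread. This is only a rough sketch: the patterns and the `show` helper are illustrative additions, not part of the original posts.

```perl
use strict;
use warnings;

my $str = "ab12c\nde34f\ngh56i\njk78l";

# /s only: "." may cross the newline, but ^ still means start of string.
my ($with_s) = $str =~ /(\d\d.*?\d\d)/s;

# /m only: $ matches before each newline, while "." still stops at "\n".
my @line_ends = $str =~ /(\d\d\w)$/mg;

# /ms: ^ anchors at the start of the second line, and "." then runs
# across the following newlines until the lazy match reaches "78".
my ($with_ms) = $str =~ /^(de.*?78)/ms;

# Hypothetical helper: make newlines visible when printing.
sub show { my ($s) = @_; $s =~ s/\n/\\n/g; return $s }

print "/s:  ", show($with_s),  "\n";
print "/m:  @line_ends\n";
print "/ms: ", show($with_ms), "\n";
```

The last pattern needs both flags: without /m the `^` cannot match at the start of the second line, and without /s the `.` cannot get past the first newline.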
