Smaug has asked for the wisdom of the Perl Monks concerning the following question:

Hello all knowing PerlMonks,

I'm having a problem which I don't quite understand. I am searching a string and matching on certain values. Once I find the values (foo xx xx xx bar), I grab the text in between and add it to an array.

I saw some code on perlmonks today which has made my life both easier and more painful. It is:
@matches = $data =~ m/VALUE: (.+?) OID/g;

Perfect for my needs. I just changed it to:
@matches = $data =~ m/foo (.+?) bar/g;

The problem I am having now is that sometimes there is nothing between foo and bar. There should always be 15 values, but if I have two instances where there is nothing between foo and bar, @matches contains only 13 values, and the first value after the null looks like "foo bar 'next value'".

Is this right? Have I gone mad? (yes) and could somebody help me get around this?

Re: Null scalars in array
by ysth (Canon) on Sep 20, 2006 at 17:10 UTC
    The + means match . (any non-\n character) one or more times. Use * instead, which means 0 or more times. You'll still be requiring two spaces between foo and bar; if you have just one, /foo ?(.*) bar/ may do what you want.
      Thanks, that's most likely my problem. The "null values" are actually not nulls; they are spaces, sometimes 1, but it could be up to 10.
      Does this mean the piece of regex will not work?
      Thanks again for the help I don't know how I would cope or avoid killing people without perlmonks.

        Your "nulls" weren't nulls at all. They aren't spaces either. They are just cases where your regex didn't match, so it didn't return anything. + requires at least one occurrence of the thing it is quantifying. So if you say "A B" =~ /A(.+?)B/ you'll get a hit, because there is at least one thing between the A and B. If you do "AB" =~ /A(.+?)B/ it won't match, because there isn't at least one thing between A and B. * however matches 0 or more things, so it would match in both cases.

        use strict;
        use warnings;

        print "A-B =~ /A.+B/ --> ";
        print "worked" if "A-B" =~ /A.+B/;
        print "\n";

        print "A-B =~ /A.*B/ --> ";
        print "worked" if "A-B" =~ /A.*B/;
        print "\n";

        print "AB =~ /A.+B/ --> ";
        print "worked" if "AB" =~ /A.+B/;
        print "\n";

        print "AB =~ /A.*B/ --> ";
        print "worked" if "AB" =~ /A.*B/;
        print "\n";
        A-B =~ /A.+B/ --> worked
        A-B =~ /A.*B/ --> worked
        AB =~ /A.+B/ -->
        AB =~ /A.*B/ --> worked

        Eric Hodges
        You've confused me now. What is your input and what are your desired and actual outputs?
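The difference between + and * discussed in this subthread can also be seen in list context with /g, which is how the original question uses the match. The following is only a minimal sketch: the sample string is made up, and it assumes the empty pair is written with exactly two spaces between foo and bar.

```perl
use strict;
use warnings;

# Hypothetical sample: the first foo/bar pair has nothing between its
# two spaces, the second pair holds "next value".
my $data = 'foo  bar junk foo next value bar';

# .+? needs at least one character, so the empty pair cannot match on
# its own and the capture runs on to the next " bar" in the string.
my @plus = $data =~ m/foo (.+?) bar/g;

# .*? may match nothing, so the empty pair yields an empty string.
my @star = $data =~ m/foo (.*?) bar/g;

print scalar(@plus), " match(es) with +: [", join('][', @plus), "]\n";
print scalar(@star), " match(es) with *: [", join('][', @star), "]\n";
```

With + the two pairs collapse into a single value that begins with " bar", which resembles the symptom described in the question; with * there is one entry per pair, the first of them empty.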
Re: Null scalars in array
by lyklev (Pilgrim) on Sep 20, 2006 at 22:05 UTC
    It looks like you have a string containing some words (or numbers) between "foo" and "bar". The words themselves are separated with spaces, so

    foo a b cd efg bar
    is what you want to search for, I assume.

    The regular expression you mention expects "foo" and "bar" to occur more than once, so it might not be the regexp you are looking for. You could use two steps: first, find everything between "foo" and "bar", then split that into separate words or numbers. The code would then be:

    my @matches;
    if ($data =~ m/foo (.*) bar/) {
        # everything between foo and bar goes into $1
        @matches = split(' ', $1);    # split on whitespace
    }

    I cannot think of a regexp that does it in one go, so I welcome smarter solutions.
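One possible way to combine the two steps, when foo and bar occur several times, is to collect every capture with /g and split each one in turn. This is only a sketch: the sample string and the @groups name are made up for illustration, and the non-greedy .*? is swapped in for the greedy .* above so that each capture stops at the nearest bar.

```perl
use strict;
use warnings;

# Hypothetical sample with two foo ... bar spans.
my $data = 'foo a b cd efg bar junk foo x y bar';

# In list context, m//g returns every capture; each capture is then
# split on whitespace into its own array reference.
my @groups = map { [ split ' ', $_ ] } $data =~ m/foo (.*?) bar/g;

print join(',', @$_), "\n" for @groups;
```

Each element of @groups is a reference to the list of words found between one foo/bar pair.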

Re: Null scalars in array
by Smaug (Pilgrim) on Sep 21, 2006 at 06:01 UTC
    Thanks again for the help! Here is some more detail. The input looks like this:
    foo some string bar unneeded data foo another string bar data data foo + bar data data foo s bar data foo string bar data data data foo + bar data foo 123 string bar
    What I wanted from this was @matches where:
    $matches[0] = some string
    $matches[1] = another string
    $matches[2] =
    $matches[3] = s
    $matches[4] = string
    $matches[5] =
    $matches[6] = 123 string
    I'm busy going through all the suggestions.
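For the input and desired output described above, a pattern that tolerates any amount of whitespace between foo and bar might look like the sketch below. It rests on assumptions: the sample string is a reconstruction in which the empty fields are runs of spaces of varying length, and the \b word boundaries are an addition to keep foo and bar from matching inside longer words.

```perl
use strict;
use warnings;

# Reconstructed sample input; the empty fields are assumed to be runs
# of spaces (three and six here, chosen arbitrarily).
my $data = 'foo some string bar unneeded data foo another string bar data data '
         . 'foo   bar data data foo s bar data foo string bar data data data '
         . 'foo      bar data foo 123 string bar';

# \s+ absorbs the whitespace after foo, (.*?) may capture nothing, and
# \s* absorbs whatever whitespace is left before bar.
my @matches = $data =~ m/\bfoo\s+(.*?)\s*\bbar\b/g;

printf "\$matches[%d] = %s\n", $_, $matches[$_] for 0 .. $#matches;
```

On this sample the match returns seven values, with empty strings at indexes 2 and 5, so the positions of the remaining values stay aligned with the foo/bar pairs.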