PerlMonks  

Re: Regular expression double grouping negation headache

by dreadpiratepeter (Priest)
on Jun 29, 2002 at 14:10 UTC ( [id://178237]=note )


in reply to Regular expression double grouping negation headache

Ok, color me confused.
Your script as written does exactly what you claim it doesn't when I run it (under Linux). However, your output fails to expose the flaw in your parsing. The script throws away some of the values. You only capture the value up to the first space, but a space is a valid character in the value.
I simplified your expression a little. I removed unnecessary backslashes, parens, and quotes, and I replaced [^\s] with \S. The new regex is:
%defaults = map {/([^=]+)=(\S+)/?($1=>$2):()} @ARGV;

Since you already know (thanks to the shell's processing of your arguments) that everything to the right of the equals sign is a valid part of the value, you can correct the problem by simplifying to this:
%defaults = map {/([^=]+)=(.*)/?($1=>$2):()} @ARGV;
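To see the difference between the two patterns, here is a small sketch. The sample arguments are made up for illustration; they stand in for what the shell would hand the script in @ARGV after it has processed the quoting.

```perl
use strict;
use warnings;

# Hypothetical arguments, as the shell would deliver them after quote processing.
my @args = ('name=John Smith', 'color=blue');

# \S+ stops at the first whitespace, so part of the value is lost.
my %truncated = map { /([^=]+)=(\S+)/ ? ($1 => $2) : () } @args;

# .* takes everything after the equals sign.
my %complete = map { /([^=]+)=(.*)/ ? ($1 => $2) : () } @args;

print "truncated: $truncated{name}\n";   # John
print "complete:  $complete{name}\n";    # John Smith
```

Values without whitespace, such as color=blue, come out the same either way, which is why the flaw is easy to miss.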
And, I would also simplify the left hand side to be:
%defaults = map {/(.*?)=(.*)/?($1=>$2):()} @ARGV;
And, actually, you can take advantage of the fact that a regex match in list context returns the submatches as a list, and use:
%defaults = map {(/(.*?)=(.*)/)} @ARGV;
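A short sketch of why the ternary is no longer needed: a match that fails in list context returns the empty list, so map adds nothing for that argument. The arguments below are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical arguments; 'verbose' has no equals sign and should be skipped.
my @args = ('foo=bar', 'verbose', 'empty=');

# In list context a successful match returns ($1, $2); a failed match
# returns the empty list, so nothing is added for that argument.
my %defaults = map { (/(.*?)=(.*)/) } @args;

for my $key (sort keys %defaults) {
    print "$key => '$defaults{$key}'\n";
}
```

Note that an argument like empty= still matches and yields an empty string as the value.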
Hope this helps

-pete
"Pain heals. Chicks dig scars. Glory lasts forever."

Replies are listed 'Best First'.
Doesn't seem complete
by tlhf (Scribe) on Jun 29, 2002 at 16:21 UTC
    Okay, so we've got;

    %defaults = map {(/(.*?)=(.*)/)} @ARGV;
    Which is cute. Except it fails under some conditions. Let's say you want to give the variable 'foo' the value of 'moo shoo= coo'. I'd assume it would be written 'foo=moo shoo\= coo'. Except your example doesn't allow for this.

    So I tried a little, and this is what I got;

    I was working from the param line as a single scalar variable for simplicity, and my first efforts seemed moderately successful;

    $line = 'foo=moo sho=clue\ woo\=moo'; @pairs = split(/(?<!\\)\s/, $line);


    However, this system has the problem of only checking whether the previous character is a backslash - it doesn't allow for backslashed backslashes. The expression would need to understand that '\\=' signified a backslash and then a non-escaped literal. But it couldn't just check for two slashes and cancel, for it should accept '\\\=' as a backslashed backslash and a backslashed equals symbol. It would, effectively, need to look behind for an even number of slashes, and only split on that.

    I had;
    @pairs = split(/(?<!(\\\\+))\s/, $line);
    Which seemed perfect. Except we're not allowed variable-length lookbehind, because it's not been implemented yet. Which was very annoying to find out. So, we could match the preceding backslashes normally, but strap them back on. But that would be horrible. I have tried further routes, but found nothing. *sigh*
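One route that avoids lookbehind altogether is to match the tokens rather than split on the separators. This is a sketch of that alternative, not something from the post above: it treats a backslash followed by any character as one escaped unit, so runs of backslashes pair off naturally. The sample string in $line is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical input. In single quotes '\\\\' is two literal backslashes,
# so the string is:  a=one\ two b=x\\ c=y
my $line = 'a=one\ two b=x\\\\ c=y';

# The single-character lookbehind is fooled by the escaped backslash.
my @naive = split /(?<!\\)\s/, $line;

# Match tokens instead: each piece is either a backslash plus the character
# it escapes, or any character that is neither a backslash nor whitespace.
my @pairs = $line =~ /((?:\\.|[^\\\s])+)/g;

print scalar(@naive), " naive pieces: @naive\n";
print scalar(@pairs), " matched pieces: @pairs\n";
```

The naive split yields two pieces because it refuses to split after the escaped backslash, while the token match yields three. The backslashes are left in the tokens here; removing them would be a separate step.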

    tlhf
    xxx
      No, it would be written foo=moo\ shoo=coo and will be parsed right by my code. The big limitation is that you can't have an = in the key part but I can't see that being much of a problem. And actually an assertion would handle that case.
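The assertion mentioned here could be a negative lookbehind on the equals sign. That reading is an assumption, and the sample arguments are invented. As a sketch:

```perl
use strict;
use warnings;

# Hypothetical arguments: the first has an equals sign in the value,
# the second has a backslash-escaped equals sign in the key.
my @args = ('foo=moo shoo=coo', 'a\=b=c');

# (?<!\\)= only accepts an equals sign that is not preceded by a backslash.
my %defaults = map { (/(.*?)(?<!\\)=(.*)/) } @args;

for my $key (sort keys %defaults) {
    print "$key => $defaults{$key}\n";
}
```

The key from the second argument still carries its backslash, and a key that ends in an escaped backslash would still defeat the single-character lookbehind.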

      -pete
      "Pain heals. Chicks dig scars. Glory lasts forever."
        Er, the limitation of not being able to parse an equals sign was the one I was talking about.

        Which is cute. Except it fails under some conditions. Let's say you want to give the variable 'foo' the value of 'moo shoo= coo'. I'd assume it would be written 'foo=moo shoo\= coo'. Except your example doesn't allow for this.
        Your code wouldn't parse it correctly, although an assertion would be eminently acceptable.

        tlhf
        xxx
      Let's say you want to give the variable 'foo' the value of 'moo shoo= coo'. I'd assume it would be written 'foo=moo shoo\= coo'.

      I would think that would be foo=moo\ shoo=coo in a Unixish shell and "foo=moo shoo=coo" using an MSish one. In either case /(.*?)=(.*)/ does the job just fine since it will unconditionally use the first equals sign as the delimiter of the variable name, and then just slurp the rest of the string as the value, equals signs or not.

      An inadequacy does therefore arise when one wants an equals sign in the variable name, but I don't know if allowing for that special case is even desired, and even if so whether it's worth going through the tremendous pain of handling escapes properly (a problem I've repeatedly broken my teeth on - it's not possible without a true, if simple, parser).
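For a sense of what such a simple parser might look like, here is a minimal sketch. The name parse_pair and the escaping rule (backslash makes the next character literal) are assumptions for illustration, not something specified in the thread.

```perl
use strict;
use warnings;

# parse_pair is a hypothetical helper. It walks the argument one token at a
# time, treating backslash + character as a literal character, and splits
# at the first unescaped equals sign. Returns () if there is no such sign.
sub parse_pair {
    my ($arg) = @_;
    my ($key, $value, $seen_equals) = ('', '', 0);
    while ($arg =~ /\G(?:\\(.)|(.))/gs) {
        my ($escaped, $plain) = ($1, $2);
        if (!$seen_equals && defined $plain && $plain eq '=') {
            $seen_equals = 1;    # delimiter: switch from key to value
            next;
        }
        my $char = defined $escaped ? $escaped : $plain;
        if ($seen_equals) { $value .= $char } else { $key .= $char }
    }
    return $seen_equals ? ($key, $value) : ();
}

# Hypothetical arguments, including an escaped equals sign in a key.
my @args = ('foo=moo shoo=coo', 'a\=b=c', 'verbose');
my %defaults = map { parse_pair($_) } @args;

for my $key (sort keys %defaults) {
    print "$key => $defaults{$key}\n";
}
```

Because every backslash consumes the character after it, an escaped backslash followed by a real equals sign is handled without any lookbehind. Whether that is worth the extra code over /(.*?)=(.*)/ depends on whether keys containing equals signs are needed at all.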

      Makeshifts last the longest.
