PerlMonks  

Re: Text::Balanced question

by shmem (Chancellor)
on Oct 20, 2006 at 23:57 UTC


in reply to Text::Balanced question

Try:
($ext, $rem, $pre) = extract_bracketed($string, '()', '[^()]+');

Seems that /.*?/ also matches parens, now doesn't it?
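
A minimal sketch of that call, assuming a made-up input without backslashes (the sample string is illustrative only). In list context extract_bracketed returns the extracted text, the remainder, and the skipped prefix, in that order:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical sample input, chosen only for illustration.
my $string = 'Param1(TYPE,abc),Param2(TYPE,abc)';

# The third argument is a prefix pattern: skip a run of non-paren characters
# before looking for the opening '('.
my ($ext, $rem, $pre) = extract_bracketed($string, '()', '[^()]+');

print "prefix:    $pre\n";   # Param1
print "extracted: $ext\n";   # (TYPE,abc)
print "remainder: $rem\n";   # ,Param2(TYPE,abc)
```

Calling it again on the remainder (with a prefix that also allows the leading comma, which '[^()]+' does) would pick up the second parameter.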

--shmem

_($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                              /\_¯/(q    /
----------------------------  \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}

Replies are listed 'Best First'.
Re^2: Text::Balanced question
by embirath (Novice) on Oct 21, 2006 at 00:44 UTC
    Now I just ran into another problem. Some of my strings contain backslashes. This seems to break everything...

    If I use:

    $string = 'Param1(TYPE,\abc\),Param2(TYPE,\abc\)';
    things break :-(

    I tried using double quotation marks instead, but of course that makes it interpret the backslashes differently/incorrectly.

    Any ideas?

    Thanks again.

    Emma
      You are (semi)hosed, as far as extract_bracketed() is concerned. The problem with backslashes is that they're used to escape things. That lets you represent non-printable characters, such as \n or \t, in a printable manner, as well as allowing one to say things like:

          my $s = "This string \" has an escaped quote";

      If you then print $s, you get:

          This string " has an escaped quote

      If you are trying to match balanced quotes on that string, you need to skip over the escaped quote inside. This is one of the reasons responders to your original post suggested that parsing balanced thingies is difficult to do with regular expressions.

      Looking at the source code in Text::Balanced, there is, indeed, a line that always eats the next character following a backslash:

          next if $$textref =~ m/\G\\./gcs;
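
A small sketch of what that pattern does on its own, outside the module (the three-character test string is an assumption made for illustration): the backslash and the character after it are consumed together, so a closing paren right after a backslash is never seen as a delimiter.

```perl
use strict;
use warnings;

my $text = '(\)';          # three characters: '(', backslash, ')'
pos($text) = 1;            # start scanning just after the '('

# Same pattern as the quoted line: a backslash plus whatever follows it.
my $skipped   = $text =~ m/\G\\./gcs ? 1 : 0;
my $pos_after = pos($text);

# The match position has moved past the ')' as well as the backslash.
print "skipped: $skipped, position now: $pos_after\n";   # skipped: 1, position now: 3
```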

      Your sample suggests that the input is using backslashes as some form of quoting operator, rather than as an escape character. If that assumption is true, you might try normalizing your input to change the backslashes into something else (and then changing them back after you're done parsing). For example:

          $string = 'Param1(TYPE,\abc\),Param2(TYPE,\abc\)';
          $string =~ tr{\\}{:};
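
The normalize, parse, and restore steps might be sketched end to end as follows. This assumes ':' never occurs in the real input (otherwise pick a different placeholder), and it reuses the '[^()]+' prefix pattern from earlier in the thread:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

my $string = 'Param1(TYPE,\abc\),Param2(TYPE,\abc\)';

# Assumption: ':' does not appear in the input, so it is safe as a stand-in.
(my $normalized = $string) =~ tr{\\}{:};

my ($ext, $rem, $pre) = extract_bracketed($normalized, '()', '[^()]+');

# Map the placeholder back to backslashes in each returned piece.
tr{:}{\\} for $ext, $rem, $pre;

print "prefix:    $pre\n";   # Param1
print "extracted: $ext\n";   # (TYPE,\abc\)
print "remainder: $rem\n";   # ,Param2(TYPE,\abc\)
```

Working on a copy keeps the original $string untouched, so nothing has to be undone if parsing fails.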

      Cheers,

        Thanks a million! Got it to work by replacing the backslashes with "::". :-)

        Emma
Re^2: Text::Balanced question
by embirath (Novice) on Oct 21, 2006 at 00:23 UTC
    Aha. Wonderful. That works. Thanks!!!!!

    Though I would have thought '.*?' would try to match a pattern as short as possible (because of the "?") and thus not include the parentheses? Anyway, I guess I still don't fully understand the regex stuff.. but I'm glad my code now works. :-)

    Thanks again! Emma
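
On the non-greedy question: the "?" makes ".*" prefer the shortest match that still lets the whole pattern succeed, but "." itself still matches parentheses. A small sketch in plain Perl, with no Text::Balanced involved (how the module applies its prefix pattern internally is not shown here, and the sample string is made up):

```perl
use strict;
use warnings;

my $string = 'Param1(TYPE,abc),Param2(TYPE,abc)';

# With nothing after it, a non-greedy .*? is satisfied by the empty string.
my ($lazy_alone) = $string =~ /^(.*?)/;

# Followed by a literal '(', it stops just before the first paren.
my ($lazy_then_paren) = $string =~ /^(.*?)\(/;

# Forced to reach the end of the string, it consumes parens like any other character.
my ($lazy_to_end) = $string =~ /^(.*?)$/;

print "alone:      '$lazy_alone'\n";       # ''
print "then paren: '$lazy_then_paren'\n";  # 'Param1'
print "to end:     '$lazy_to_end'\n";      # the whole string
```

A character class such as [^()] rules parentheses out explicitly, which is why it behaves more predictably as a prefix pattern.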
