
Re^3: split on delimiter unless escaped

by yrp001 (Initiate)
on Nov 10, 2010 at 00:19 UTC


in reply to Re^2: split on delimiter unless escaped
in thread split on delimiter unless escaped

Hi ikegami,

Thanks for your example. I'm still trying to figure it all out. I'm running it as shown below, and it doesn't quite do what I want. I only want the escape character to be treated specially if it's in !+; (a run of ! immediately before a ;). That is, a!!b should stay a!!b, whereas a!!!;b should become a!;b.

Also, I seem to be getting an empty field at the end. One or more semicolons at the end seem to be parsed properly, though.

One test string returns a blank result, which I can't explain.

    sub dequote {
        my $x = $_[0];
        $x =~ s/!(.)/$1/sg;
        return $x;
    }

    while (<>) {
        chomp;
        my @fields = map dequote($_), /\G((?:[^!;]+|!.)*)(?:;|\z)/sg;
        print "$_ => " . join( '|', @fields ) . "\n";
        # print "$_ => @fields\n";
    }

Sample results:

    aval!!!!;bval => aval!!|bval|
    aval!!!!!;bval => aval!!;bval|
    a!!val!!!!!;bval! =>
    !a!!!val!!!!!;bval!! => a!val!!;bval!|
    a!val!;bva!l; => aval;bval|
    a!!val!!;;bv!!al;; => a!val!||bv!al||

Replies are listed 'Best First'.
Re^4: split on delimiter unless escaped
by ikegami (Patriarch) on Nov 10, 2010 at 01:10 UTC

    I only want the escape character to be treated specially if it's in !+;

    Yuck! I hope you're being forced to deal with this format.

    It's not only trickier for a human to understand, it's trickier to code. In particular, the definition of a field varies based on whether it's the last field or not, and the function of the "!" varies based on its position in the field.

        sub unescape {
            my $x = $_[0];
            # (.*?) is non-greedy so that (!*) captures the whole trailing run of "!"
            my ($base, $end) = $x =~ /^(.*?)(!*)\z/s;
            return $base . ('!' x (length($end)/2));
        }

        my $last_field  = qr/ [^;]* /x;
        # Plain text excludes ";" so that a field cannot run past a delimiter
        my $other_field = qr/ (?: [^!;]+ | (?: ![^!] )+ )* (?:!!)* /x;

        # Validation
        my $record = qr/^ (?: $other_field ; )* $last_field \z/x;

        # Extraction
        my @fields =
            map unescape($_),
            / \G ( $other_field (?= ; ) | $last_field (?= \z ) ) (?:;|\z) /xg;

    You are free to skip the validation.
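
For comparison, here is a minimal tokenizer-style sketch of the rule as stated in the question. It rests on an assumed reading of that rule: a run of "!" is special only when it is immediately followed by ";"; within such a run each "!!" pair stands for one "!", and an odd run escapes the ";". Any other "!" is taken literally. The name split_escaped is hypothetical and does not come from the thread, and this is not the code either poster used.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread). Assumed rule: a run of "!" is
# special only when it is immediately followed by ";".
sub split_escaped {
    my ($line) = @_;
    my @fields = ('');
    # Tokens: "!"-run followed by ";", plain text, any other "!"-run, bare ";".
    while ( $line =~ / \G (?: (!+) ; | ([^!;]+) | (!+) | ; ) /xg ) {
        if ( defined $1 ) {
            my $n = length $1;
            $fields[-1] .= '!' x int( $n / 2 );    # each "!!" pair yields one "!"
            if ( $n % 2 ) { $fields[-1] .= ';' }   # odd run: the ";" is escaped
            else          { push @fields, '' }     # even run: the ";" ends the field
        }
        elsif ( defined $2 ) { $fields[-1] .= $2 }    # ordinary characters
        elsif ( defined $3 ) { $fields[-1] .= $3 }    # "!" not before ";" is literal
        else                 { push @fields, '' }     # unescaped delimiter
    }
    return @fields;
}

# Prints, one per line: a!!b, a!;b, aval!!|bval
print join( '|', split_escaped($_) ), "\n" for 'a!!b', 'a!!!;b', 'aval!!!!;bval';
```

Because every token consumes at least one character, the loop never produces a zero-length match at the end of the string, so no extra empty field appears unless the input really ends with an unescaped ";".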
