
Re: Complex Splitting - Parse::RecDescent

by imp (Priest)
on Feb 06, 2007 at 15:02 UTC (#598556)


in reply to Complex Splitting

If the string being split will always be well-formed, then I would go with one of the regex solutions provided above. If there is a possibility that the data will be malformed, then you may be better off with a parser approach, as it allows more flexibility in error handling.

Here is a solution using Parse::RecDescent.

use Parse::RecDescent;
use strict;
use warnings;

my $str = "ABC[GHI]XY[Z]1A";

my $grammar = <<'GRAMMAR';
token    : '[' /[A-Z]*/ ']' {$return = $item[2]}
         | /[A-Z]/
anything : /./
GRAMMAR

my $parser = Parse::RecDescent->new($grammar);

# When a reference to a scalar is passed to Parse::RecDescent it will
# consume the tokens as they are matched. To avoid modifying the original
# string a copy will be used
my $copy = $str;
while ($copy ne '') {
    if (my $token = $parser->token(\$copy)) {
        print "Token: $token\n";
    }
    else {
        my $token = $parser->anything(\$copy);
        print "Invalid symbol: $token\n";
    }
}


Re^2: Complex Splitting - /\G.../gc
by ikegami (Pope) on Feb 07, 2007 at 08:00 UTC

    That silently ignores whitespace, because Parse::RecDescent skips optional whitespace before every terminal by default (read up on <skip>).
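To make the whitespace point concrete, here is a minimal sketch of one way to turn the skipping off. It assumes that setting $Parse::RecDescent::skip to an empty string before the parser is built disables the default whitespace prefix; the tokenize() helper and the sample input are hypothetical and only serve the illustration.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Assumption: this package variable holds the default prefix that is
# skipped before each terminal. An empty pattern means "skip nothing".
$Parse::RecDescent::skip = '';

my $grammar = q{
    token    : '[' /[A-Z]*/ ']' { $return = $item[2] }
             | /[A-Z]/
    anything : /./s
};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar\n";

# Hypothetical helper: returns one report line per token or stray character.
sub tokenize {
    my ($text) = @_;
    my @out;
    while ($text ne '') {
        my $token = $parser->token(\$text);
        if (defined $token) {
            push @out, "Token: $token";
        }
        else {
            my $bad = $parser->anything(\$text);
            last unless defined $bad;    # nothing matched at all, give up
            push @out, "Invalid symbol: '$bad'";
        }
    }
    return @out;
}

print "$_\n" for tokenize("AB [C] D");
```

With skipping disabled, the spaces in the sample input should be reported as invalid symbols instead of disappearing. A <skip:''> directive inside a production is the more local alternative, but note that a directive occupies a slot in @item, so the index used in the action would have to change.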

    Also, P::RD is rather slow. I'd even say inexcusably slow if you're just using it as a tokenizer. May I suggest a much faster tokenizer?

    use strict;
    use warnings;

    sub process_token {
        my ($token) = @_;
        print("Token: $token\n");
    }

    {
        my $str = "ABC[GHI]XY[Z]1A";
        for ($str) {
            /\G \[ ([A-Z]*) \] /xgcs && do { process_token("$1"); redo };
            /\G ([A-Z]) /xgcs && do { process_token("$1"); redo };
            /\G (.) /xgcs && do { printf("Unexpected '%s' at pos %d\n", $1, pos()-length($1)); redo };
        }
    }
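The same /\G.../gc technique can also be wrapped in a function that collects its results instead of printing them, which may be easier to reuse. This is only a sketch using core Perl; the tokenize() name and its return shape (two array references) are assumptions made for the example, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (\@tokens, \@errors) for the given string.
sub tokenize {
    my ($str) = @_;
    my (@tokens, @errors);

    pos($str) = 0;
    while (pos($str) < length($str)) {
        # /g with \G continues where the previous match stopped;
        # /c keeps pos() intact when a match fails, so the next
        # alternative is tried at the same position.
        if ($str =~ /\G \[ ([A-Z]*) \] /xgc) {
            push @tokens, $1;
        }
        elsif ($str =~ /\G ([A-Z]) /xgc) {
            push @tokens, $1;
        }
        elsif ($str =~ /\G (.) /xgcs) {
            push @errors,
                sprintf("Unexpected '%s' at pos %d", $1, pos($str) - 1);
        }
    }
    return (\@tokens, \@errors);
}

my ($tokens, $errors) = tokenize("ABC[GHI]XY[Z]1A");
print "Token: $_\n" for @$tokens;
print "$_\n"        for @$errors;
```

The last branch matches any single character, so every pass through the loop advances pos() and the loop always terminates.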
