PerlMonks  

Re: Determing what part of a regex matched.

by blakem (Monsignor)
on Mar 07, 2003 at 10:25 UTC (#241101)


in reply to Determing what part of a regex matched.

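Here is how I would tokenize it. Note that \d matches a subset of what \w matches, so any tokenizer that uses both as separate token classes is probably broken: a digit satisfies either one, and whichever branch is tried first wins.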
#!/usr/bin/perl -wT
use strict;

my $text = 'The world is foo 2!';
my (@words, @numbers, @spaces, @others);

# Walk the string left to right. \G anchors each match at pos($text),
# and /gc leaves pos() where it was when a branch fails to match.
while ((pos($text) || 0) < length($text)) {
    if ($text =~ /\G([a-zA-Z_]+)/gc) {
        push @words, $1;    # or call whatever handler you want
    }
    elsif ($text =~ /\G(\d+)/gc) {
        push @numbers, $1;
    }
    elsif ($text =~ /\G(\s+)/gc) {
        push @spaces, $1;
    }
    elsif ($text =~ /\G([^\w\s]+)/gc) {
        push @others, $1;
    }
    else {
        warn "tokenizer is broken\n";
        last;    # nothing matched, so pos() cannot advance; bail out rather than loop forever
    }
}

print "W: @words\n";
print "N: @numbers\n";
print "S: @spaces\n";    # whitespace tokens only, so this line looks empty
print "O: @others\n";

__END__
W: The world is foo
N: 2
S:
O: !
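To illustrate the \d versus \w overlap mentioned above, here is a minimal sketch. The helper names naive_class and strict_class are hypothetical and not part of the original post. They classify a single token with an alternation, once using \w+ for words and once using the explicit [a-zA-Z_]+ class from the tokenizer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: \w+ is listed before \d+, so digits are
# swallowed by the first alternative and never reach the second.
sub naive_class {
    my ($token) = @_;
    return 'none' unless $token =~ /\A(?:(\w+)|(\d+))\z/;
    return defined $1 ? 'word' : 'number';
}

# Hypothetical helper: the word class is spelled out without digits,
# so the two alternatives can no longer match the same token.
sub strict_class {
    my ($token) = @_;
    return 'none' unless $token =~ /\A(?:([a-zA-Z_]+)|(\d+))\z/;
    return defined $1 ? 'word' : 'number';
}

for my $token (qw(foo 2 42)) {
    printf "%-4s naive=%-6s strict=%s\n",
        $token, naive_class($token), strict_class($token);
}
```

With the naive pattern, 2 and 42 are both reported as words. With the explicit letter class they come back as numbers, which is why the tokenizer above avoids a bare \w for its word branch.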

-Blake
