PerlMonks  

Re: Determing what part of a regex matched.

by blakem (Monsignor)
on Mar 07, 2003 at 10:25 UTC (#241101)


in reply to Determing what part of a regex matched.

Here is how I would tokenize it. Note that \d matches a subset of what \w matches, so any tokenizer that uses both as separate token classes is probably broken: a digit satisfies whichever of the two is tried first.

#!/usr/bin/perl -wT
use strict;

my $text = 'The world is foo 2!';
my (@words, @numbers, @spaces, @others);

# \G anchors each match at the current pos(); /c keeps pos() where it is
# when a match fails, so the next branch retries at the same offset.
while ((pos($text) || 0) < length($text)) {
    if ($text =~ /\G([a-zA-Z_]+)/gc) {
        push @words, $1;    # or call whatever handler you want
    }
    elsif ($text =~ /\G(\d+)/gc) {
        push @numbers, $1;
    }
    elsif ($text =~ /\G(\s+)/gc) {
        push @spaces, $1;
    }
    elsif ($text =~ /\G([^\w\s]+)/gc) {
        push @others, $1;
    }
    else {
        warn "tokenizer is broken\n";
        last;    # nothing matched, so pos() cannot advance; stop instead of looping forever
    }
}

print "W: @words\n";
print "N: @numbers\n";
print "S: @spaces\n";
print "O: @others\n";

__END__
W: The world is foo
N: 2
S:        
O: !
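If you want to know which rule matched each piece, in order, rather than sorting the pieces into separate arrays, the same \G and /gc loop can be driven from a table. This is only a sketch under my own assumptions: the sub name tokenize, the rule labels, and the [type, text] return shape are made up for illustration, and the character classes are assumed to see ASCII input, as in the code above. It also checks the \d versus \w overlap directly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): returns a list of
# [type, text] pairs, trying the rules in order at each position.
sub tokenize {
    my ($text) = @_;

    # For ASCII input these four classes are disjoint and cover every
    # character, so exactly one rule matches at each position.
    my @rules = (
        [ word   => qr/[a-zA-Z_]+/ ],
        [ number => qr/\d+/        ],
        [ space  => qr/\s+/        ],
        [ other  => qr/[^\w\s]+/   ],
    );

    my @tokens;
    pos($text) = 0;
    TOKEN: while (pos($text) < length $text) {
        for my $rule (@rules) {
            my ($type, $re) = @$rule;
            # /c leaves pos() untouched on failure, so the next rule
            # is tried at the same offset.
            if ($text =~ /\G($re)/gc) {
                push @tokens, [ $type, $1 ];
                next TOKEN;
            }
        }
        # No rule matched: pos() cannot advance, so stop here.
        die "no rule matches at offset " . pos($text) . "\n";
    }
    return @tokens;
}

# The overlap mentioned above: collect the digits that do NOT match \w.
my @digits_not_word = grep { !/^\w$/ } 0 .. 9;
print "digits that fail \\w: ", scalar(@digits_not_word), "\n";

# Print each token with the label of the rule that matched it.
printf "%-6s '%s'\n", @$_ for tokenize('The world is foo 2!');
```

Adding a token class then means adding one row to @rules. If two classes ever overlap, the row listed first wins, which is exactly why mixing \w and \d needs care.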

-Blake

