Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation
 
PerlMonks  

Re: Split a sentence into words

by akho (Hermit)
on May 30, 2009 at 04:45 UTC ( #767010=note: print w/ replies, xml ) Need Help??


in reply to Split a sentence into words

my @vocabulary = qw(abd abcd abc a bc); my $sentence = 'abdaabc'; my $pattern = join '|', @vocabulary; my @words = $sentence =~ /($pattern)/g;

note that @vocabulary has to be sorted in such a way that "longer" words come earlier; i.e. if word x is a prefix of word y, word y must come earlier in the list.

Upd Does not actually work; i.e. it works only for some vocabularies; say (abcd, abc, de) will not split 'abcde' right. Things get complicated and computer-sciencey. See bart and ikegami's replies below.


Comment on Re: Split a sentence into words
Select or Download Code
Re^2: Split a sentence into words
by bart (Canon) on May 30, 2009 at 12:27 UTC
    note that @vocabulary has to be sorted in such a way that "longer" words come earlier
    If you depend on a module like Regex::PreSuf, not only will it have the same effect, i.e. matching the longest match possible, but likely, it'll match faster, at least for longer lists, and pre 5.10 perl.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://767010]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (16)
As of 2014-08-27 20:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (252 votes), past polls