Re: What are the monks doing with Perl and Linguistics?

by Mur (Pilgrim)
on May 09, 2003 at 19:21 UTC


in reply to What are the monks doing with Perl and Linguistics?

Well, I don't know if this qualifies, but we're using Lingua::* modules to analyze the words on web pages for indexing. Specifically, if a user searches for "advertising", we look for words that share a common stem, and so also find:

  • ... advert
  • ... advertise
  • ... advertised
  • ... advertiser
  • ... advertisers
  • ... advertises
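
A minimal sketch of the stem-grouping idea above, written in Python for illustration rather than with the Lingua::* modules mentioned in the post. The `crude_stem` function and its suffix lists are assumptions invented for this example: it is a toy suffix stripper, not the algorithm any Lingua module implements, and it will over-stem plenty of ordinary words.

```python
from collections import defaultdict

# Hypothetical suffix lists for this sketch only; checked in order, longest first.
INFLECTIONAL = ("ers", "ing", "er", "ed", "es", "s", "e")
VERB_FORMING = ("is", "iz")  # what is left of -ise / -ize after the first pass


def strip_one(word, suffixes, min_stem=3):
    """Remove the first matching suffix, keeping at least min_stem letters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def crude_stem(word):
    """Reduce a word to a rough stem by stripping two layers of suffixes."""
    stem = strip_one(word.lower(), INFLECTIONAL)
    return strip_one(stem, VERB_FORMING)


def build_index(words):
    """Map each stem to the set of indexed words that share it."""
    index = defaultdict(set)
    for word in words:
        index[crude_stem(word)].add(word.lower())
    return index


def search(index, query):
    """Return every indexed word whose stem matches the query's stem."""
    return sorted(index.get(crude_stem(query), ()))


page_words = ["advert", "advertise", "advertised", "advertiser",
              "advertisers", "advertises", "banner"]
index = build_index(page_words)
matches = search(index, "advertising")
print(matches)
```

With the sample words above, a search for "advertising" returns the six advert-related forms and leaves "banner" out, because all six reduce to the stem "advert".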
--
Jeff Boes
Database Engineer
Nexcerpt, Inc.
vox 269.226.9550 ext 24
fax 269.349.9076
 http://www.nexcerpt.com
...Nexcerpt...Connecting People With Expertise


Re: Re: What are the monks doing with Perl and Linguistics?
by allolex (Curate) on May 09, 2003 at 19:50 UTC

    Very interesting stuff. I had a look at the "nexcerpts" on your site. Yes, the Lingua derivational morphology modules (it looks like Stem, Infinitive, and Inflect) have provided some good results. It made me think about how I might go about doing something similar.

    One thing that might make your searches better is some way to account for morphology that is not just stem + ending, like pronounce/pronunciation/pronouncement. Also, grouping (near-)synonyms like "brotherly" and "fraternal" may improve your results. Of course my examples are a bit textbookish, but I'm sure that you can refine things using your expert knowledge about what sort of information your clients might want to look up.
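
One way the two suggestions above could be sketched, in Python for illustration. Everything here is an assumption made for the example: the `IRREGULAR` table, the `SYNONYM_GROUPS` list, and the `expand_query` helper are hypothetical, hand-maintained structures, not part of any Lingua module. The idea is simply to consult an exception table before (or alongside) suffix-based stemming, then widen the query with near-synonyms.

```python
# Hypothetical hand-maintained table: irregular form -> base form.
IRREGULAR = {
    "pronunciation": "pronounce",
    "pronouncement": "pronounce",
}

# Hypothetical groups of near-synonyms that should match one another.
SYNONYM_GROUPS = [
    {"brotherly", "fraternal"},
]


def expand_query(term):
    """Return the term plus its irregular relatives and near-synonyms."""
    term = term.lower()
    base = IRREGULAR.get(term, term)

    # Collect the base form and every irregular form that maps to it.
    forms = {term, base}
    forms.update(word for word, lemma in IRREGULAR.items() if lemma == base)

    # Add whole synonym groups that overlap with what we have so far.
    for group in SYNONYM_GROUPS:
        if forms & group:
            forms |= group

    return sorted(forms)


print(expand_query("pronunciation"))
print(expand_query("brotherly"))
```

Each expanded term could then be passed through whatever stemmer is already in use, so the exception table only has to cover the cases that plain stem + ending misses.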

    --
    Allolex
