PerlMonks  

Re: word similarity measure

by tilly (Archbishop)
on Feb 27, 2009 at 16:32 UTC


in reply to word similarity measure

I think you want to read Building a Vector Search Engine in Perl.

Re^2: word similarity measure
by Gavin (Archbishop) on Feb 28, 2009 at 12:32 UTC

    Or perhaps Vector Space

    This module takes a list of documents (in English) and builds a simple in-memory search engine using a vector space model. Documents are stored as PDL objects, and after the initial indexing phase, the search should be very fast. This implementation applies a rudimentary stop list to filter out very common words, and uses a cosine measure to calculate document similarity.
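    To make the description above concrete, here is a minimal sketch of the same idea: term-count vectors, a small stop list, and a cosine measure. It is not the module's code (the module is written in Perl and stores documents as PDL objects). It is written in Python purely for illustration, and the names STOP_WORDS, tokenize, vectorize, cosine, and search are all made up for this example, as is the tiny stop list.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text, keep runs of letters, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def vectorize(text):
    """Represent a document as a sparse term-count vector."""
    return Counter(tokenize(text))


def cosine(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(count * v[term] for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query, docs):
    """Return (score, document) pairs sorted from most to least similar."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(d)), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = [
        "The cat sat on the mat",
        "Dogs chase the cat",
        "Stock prices fell sharply",
    ]
    for score, doc in search("cat on a mat", docs):
        print(f"{score:.3f}  {doc}")
```

    This sketch re-vectorizes every document on each query. An engine like the one described would build the vectors once in an indexing phase and only compute the cosines at search time, which is where the speed comes from.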
