PerlMonks  

Perl Search Engine

by zylstra555 (Initiate)
on Jul 22, 2010 at 06:15 UTC

zylstra555 has asked for the wisdom of the Perl Monks concerning the following question:

Hello. A while back, I used a Perl-based, open-source search engine on my website as well as on a client's. It has been a long time now, and I have forgotten the name! Does anyone know of a good Perl-based search engine that can follow PHP links?

Replies are listed 'Best First'.
Re: Perl Search Engine
by moritz (Cardinal) on Jul 22, 2010 at 06:19 UTC
    A search engine doesn't follow links - that's what a crawler does.

    A search engine has an index of documents, and looks for matches therein. Perl search engines are KinoSearch and Plucene, for example.

    Perl 6 - links to (nearly) everything that is Perl 6.
Re: Perl Search Engine
by leocharre (Priest) on Jul 22, 2010 at 07:30 UTC

    What do you mean by "follow PHP links"?

    Your question is vague. Maybe you could refine it. Are you thinking of a CGI script in cgi-bin that searches the content visitors to the website can see?

    Are you familiar with Unix? Do you know that you can run a find command and search the contents of files? For example:

    find ~/public_html -type f -exec grep carl {} +

    Well, that is a sloppy kind of search. It runs in real time and is CPU-hungry, because every query rereads every file. It would probably cripple your system if you let multiple people use it as a 'search engine' backend.

    What you likely have in mind is something more advanced. As moritz pointed out, the tasks are divided. One part of the process finds or lists *what* you will be searching through: for example, the images, PDF files, HTML files, or whatever else is on your website that you want users to be able to search for.

    The next step is another program, or another part of the 'search engine', that actually looks at those resources and stores information about them, perhaps in a db file or a MySQL database. It may store text, filenames, file sizes, or anything else that is useful to search on.
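
    The storing step can be sketched as a small inverted index: a map from each word to the set of documents that contain it. This is only an illustration in Python, not how KinoSearch or Plucene work internally; the names tokenize and build_index and the sample documents are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it.

    `documents` is a dict of {name: text}; on a real site the text would
    come from reading the files found by the listing step.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


# Hypothetical sample data standing in for files under public_html.
docs = {
    "about.html": "Carl maintains this site",
    "contact.html": "Write to Carl or Anna",
    "news.html": "Nothing new this week",
}
index = build_index(docs)
print(sorted(index["carl"]))
```

    The work of reading every file happens once, at indexing time, rather than on every query. The same structure could equally be kept in a db file or a database table.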

    Yet another step is the user interface, which might present the user with a form where they type in something like 'look for these words'. It consults the information stored about the selected resources, matches it against what the user requested, and reports where each match is so that they can go and view or download it.
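
    The lookup behind such a form can be sketched as follows, assuming an index that maps each word to the set of documents containing it has already been stored. The function name search and the sample index are hypothetical, and requiring every word to match is just one possible policy.

```python
def search(index, query):
    """Return the documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start from the documents matching the first word, then narrow down.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pre-built index: word -> documents containing it.
index = {
    "carl": {"about.html", "contact.html"},
    "anna": {"contact.html"},
    "site": {"about.html"},
}

print(sorted(search(index, "Carl Anna")))
```

    A CGI script would read the query from the form, call something like search, and print the matching documents as links.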

    Even crazier, and much more fun, is remotely indexing or cataloguing data found somewhere else, say on another website to which you only have HTTP access. That is a little (a lot) like what search engines such as Google do on the backend.
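
    The link-following part of that remote case can be sketched with Python's standard library. To an HTTP client, a link to a .php page is an ordinary URL, so nothing PHP-specific is needed. The class and function names, the sample page, and the example.com host are all hypothetical, and the sketch parses a string rather than fetching anything over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href of every <a> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def same_site_links(html, base_url):
    """Return links in `html` that stay on the same host as `base_url`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    return [u for u in collector.links if urlparse(u).netloc == host]


# Hypothetical page content; example.com is a placeholder host.
page = '''
<a href="/about.php">About</a>
<a href="news.php?id=3">News</a>
<a href="http://other.example.org/">Elsewhere</a>
'''
for url in same_site_links(page, "http://example.com/index.php"):
    print(url)
```

    A real crawler would fetch each returned URL in turn, remember which ones it has already visited, and hand the page text to the indexing step.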

    Anyway, I don't know if this helps you rewrite your question. Consider trying; you'll get good answers and valuable suggestions here.

Re: Perl Search Engine
by nvivek (Vicar) on Jul 22, 2010 at 08:40 UTC

    There are a lot of search engines written in Perl. Most of them use a CGI script to display the pages. Some of the free and proprietary ones are as follows.
    Dansie Search Engine
    Extropia Site Search (free)
    F3DSearch
    FluffySearch (free but no support)
    Fluid Dynamics Search
    Global Data SiteSearch (free)
    Htgrep
    HTTP::Index module (free)
    KSearch
    Matt's Simple Search (free)
    Perlfect Search (free)
    RuterSearch (free)
    RiSearch (some versions free, some paid)
    Selena Sol's Keyword Search (now Extropia Site Search, free)
    Sphinx (new; Java; open source; Unix, Windows, Mac OS X)
    WebSearch Perl Script

Node Type: perlquestion [id://850778]
Approved by sflitman