Re: Perl Search Engine
by leocharre (Priest) on Jul 22, 2010 at 07:30 UTC
What do you mean by "follow PHP links"?
The question you posted is vague. Maybe you could refine it. Are you thinking of a CGI script in cgi-bin that searches the content visitors to the website can see?
Are you familiar with Unix? Do you know that you can use a find command to search the content of files? For example:
find ~/public_html -type f | xargs grep carl
Well, that is a sloppy search. It runs in real time on every query and is CPU hungry, and it would probably cripple your system if you let multiple people use it as a 'search engine' backend.
What you (likely) have in mind is something more advanced. As moritz pointed out, the tasks are divided. One part of the process is something that finds or lists *what* you will be searching through: for example, which images, PDF files, HTML files, and so on in your website you want users to be able to search for.
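To make that first step concrete, here is a minimal sketch using the core File::Find module. The function name list_searchable, the extension list, and the ~/public_html path in the commented example are my own assumptions for illustration, not anything from your setup.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return every file under $root whose extension
# is one we have decided to make searchable.
sub list_searchable {
    my ($root, @extensions) = @_;
    my %wanted_ext = map { lc($_) => 1 } @extensions;
    my @found;
    find(
        {
            no_chdir => 1,    # keep $_ as the full path
            wanted   => sub {
                return unless -f $_;
                my ($ext) = $_ =~ /\.([^.\/]+)\z/;
                push @found, $_ if defined $ext && $wanted_ext{ lc $ext };
            },
        },
        $root,
    );
    @found = sort @found;
    return @found;
}

# Example call (the path is an assumption):
# print "$_\n" for list_searchable("$ENV{HOME}/public_html", qw(html pdf jpg));
```

The point is only that this step produces a list of candidates; it does not look inside the files yet.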
Another step, afterwards, is another part of the 'search engine' that actually looks at these resources, the *what* you let people search for, and stores information about them, perhaps in a db file or a MySQL database. It may store text, filenames, file sizes, whatever you need.
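A rough sketch of that indexing step, assuming the resources are plain-text or HTML files and that a word => { file => count } table is enough to start with. The name build_index, the crude tag stripping, and the search_index.db filename are assumptions of mine; a real setup might use a MySQL table instead of Storable.

```perl
use strict;
use warnings;
use Storable qw(nstore);

# Hypothetical helper: build a word => { file => count } table
# from a list of plain-text or HTML files.
sub build_index {
    my (@files) = @_;
    my %index;
    for my $file (@files) {
        open my $fh, '<', $file or do { warn "skip $file: $!"; next };
        local $/;                   # slurp the whole file at once
        my $text = <$fh>;
        close $fh;
        next unless defined $text;
        $text =~ s/<[^>]*>/ /g;     # crude tag stripping, not a real HTML parser
        # count every word of two or more characters, lowercased
        $index{ lc $1 }{$file}++ while $text =~ /(\w{2,})/g;
    }
    return \%index;
}

# Example (the filename is an assumption):
# my $index = build_index(@files);
# nstore($index, 'search_index.db');
```

You would run something like this from cron or by hand, not on every search request, which is what makes it cheaper than grepping live.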
Yet another step is the user interface, which presents the user with a form, perhaps, where they type in things like 'look for these words'. That part looks at the information we stored about our picked-out resources, finds matches between what the user requested and what we know about the stuff, and then reports where the stuff is so that they may go and see or download it.
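The matching part of that step could look roughly like this, assuming the stored information is a hash of word => { file => count }. The name search_index and the "every word must match, highest count first" ranking are my own choices for illustration; the form handling itself is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: return the files that contain every query word,
# best match (highest total word count) first.
sub search_index {
    my ($index, $query) = @_;
    my @words = map { lc $_ } $query =~ /(\w+)/g;
    return unless @words;
    my %score;
    my $first = 1;
    for my $word (@words) {
        my $hits = $index->{$word} || {};
        if ($first) { %score = %$hits; $first = 0; next }
        # keep only files that also contain this word
        for my $file (keys %score) {
            if (exists $hits->{$file}) { $score{$file} += $hits->{$file} }
            else                       { delete $score{$file} }
        }
    }
    my @ranked = sort { $score{$b} <=> $score{$a} || $a cmp $b } keys %score;
    return @ranked;
}

# Example with a tiny hand-made index:
# my $index = { carl => { 'a.html' => 2, 'b.html' => 1 } };
# print "$_\n" for search_index($index, 'Carl');
```

A CGI script would take the query string from the form, call something like this, and print the results as links.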
Even crazier, and much more fun, is if you're talking about remotely indexing/cataloguing data seen somewhere else, say, on another website to which you only have HTTP access. That's a little (a lot) like what search engines such as Google do on the backend.
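If "follow links" means this remote case, the core of it is fetching a page and pulling out the links to visit next. A minimal sketch, assuming HTTP::Tiny (core since Perl 5.14) for the fetch and a regex for the links. The names fetch_page and extract_links are mine, and a regex is a shortcut here; a proper HTML parser module would be the more robust choice.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: fetch a URL, return its body or undef on failure.
sub fetch_page {
    my ($url) = @_;
    my $res = HTTP::Tiny->new->get($url);
    return $res->{success} ? $res->{content} : undef;
}

# Hypothetical helper: return the unique href values of <a> tags,
# in order of appearance, skipping anchors and non-page links.
sub extract_links {
    my ($html) = @_;
    my (%seen, @links);
    while ($html =~ /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gis) {
        my $href = $2;
        next if $href =~ /^(?:#|mailto:|javascript:)/i;
        push @links, $href unless $seen{$href}++;
    }
    return @links;
}

# Example (the URL is an assumption):
# my $html = fetch_page('http://example.com/index.php');
# print "$_\n" for extract_links($html) if defined $html;
```

Links come back exactly as written in the page, so relative ones still need resolving against the page URL before you fetch them, and a polite crawler also needs to respect robots.txt and avoid revisiting pages.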
Anyway, I don't know if this helps you rewrite your question. Consider trying; you'll get good answers and valuable suggestions here.