PerlMonks  

Search Engine Suggestions

by Anonymous Monk
on May 23, 2023 at 03:12 UTC ( [id://11152378]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I'm looking for suggestions on a personal project. I'm writing a program to fetch a page listing items for sale on an auction site, then parse it into an array of arrays (AoA) with title, link and price for each item (this part is done). The idea is to then search the AoA for items I'm looking for, based on search parameters I set in a config file, and build an HTML file listing just those items so I can view it in a browser later. I'll use a cron job to run the program nightly.

I want to be able to do better than the basic search engine that you find on those types of sites. Maybe using something similar to SQL, where you can have logical operators such as AND, OR, NOT, and also be able to deal with variations in spelling of words. The search function is where I'm looking for suggestions. Speed is not an issue and I'd like to avoid using a database. Are there any modules or examples that you can point me to? Thanks.
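
To make the requirement concrete, here is a minimal sketch of that kind of matching with no database at all. It is written in Python purely for illustration; the sample rows, the `matches` and `has_word` helpers, and the 0.8 similarity cutoff are all assumptions, not part of the original program. Longer alphabetic terms are compared fuzzily to tolerate spelling variations, while short terms and terms containing digits (sizes, model numbers) must match exactly.

```python
import difflib
import re

# Hypothetical sample rows in the [title, link, price] layout described above.
items = [
    ["Seagate 4TB hard drive", "https://example.com/1", 79.99],
    ["Seagte 4TB hard disk, boxed", "https://example.com/2", 85.00],  # seller typo
    ["Vintage camera lens", "https://example.com/3", 40.00],
    ["Seagate 4TB drive - BROKEN, for parts", "https://example.com/4", 15.00],
]

def words(text):
    """Lower-case a title and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def has_word(title_words, term, cutoff=0.8):
    """True if the title contains `term`, allowing close spellings for longer words."""
    term = term.lower()
    if len(term) < 5 or not term.isalpha():
        # Short terms and terms with digits (e.g. "4tb") must match exactly.
        return term in title_words
    return bool(difflib.get_close_matches(term, title_words, n=1, cutoff=cutoff))

def matches(title, all_of=(), any_of=(), none_of=()):
    """AND over all_of, OR over any_of (when given), NOT over none_of."""
    tw = words(title)
    return (all(has_word(tw, t) for t in all_of)
            and (not any_of or any(has_word(tw, t) for t in any_of))
            and not any(has_word(tw, t) for t in none_of))

hits = [row for row in items
        if matches(row[0],
                   all_of=["seagate", "4tb"],
                   any_of=["disk", "drive"],
                   none_of=["broken"])]

for title, link, price in hits:
    print(f"{title}  {price:.2f}  {link}")
```

The three term lists could come straight from the config file, one search per entry. The cutoff is a tuning knob: lower values catch more misspellings but also more false matches.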

Replies are listed 'Best First'.
Re: Search Engine Suggestions
by Corion (Patriarch) on May 23, 2023 at 07:14 UTC

    My recommendation is to use DBD::SQLite with its full text search. That way, you can add synonyms etc. to your data to search/find stuff that is not verbatim in your scraped listings.
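
A rough sketch of what this could look like, written in Python's `sqlite3` module for illustration only. The same SQL should carry over to DBD::SQLite, provided the bundled SQLite has the FTS5 extension enabled (an assumption in both cases). The table name `listing`, the `extra` column for synonyms, and the sample rows are hypothetical. An in-memory database is used, so nothing has to be installed or kept on disk.

```python
import re
import sqlite3

# Hypothetical sample rows: [title, link, price].
items = [
    ["Seagate 4TB hard drive", "https://example.com/1", 79.99],
    ["WD 4TB HDD, boxed", "https://example.com/2", 85.00],
    ["Seagate 4TB hard drive - BROKEN", "https://example.com/3", 15.00],
    ["Vintage camera lens", "https://example.com/4", 40.00],
]

# Hand-maintained synonyms that get stored alongside each listing.
SYNONYMS = {"hdd": "hard drive hard disk"}

def extra_terms(title):
    """Collect the synonym text for every known word in the title."""
    found = re.findall(r"[a-z0-9]+", title.lower())
    return " ".join(SYNONYMS[w] for w in found if w in SYNONYMS)

con = sqlite3.connect(":memory:")  # assumes this SQLite build includes FTS5
con.execute(
    "CREATE VIRTUAL TABLE listing "
    "USING fts5(title, extra, link UNINDEXED, price UNINDEXED)"
)
con.executemany(
    "INSERT INTO listing (title, extra, link, price) VALUES (?, ?, ?, ?)",
    [(t, extra_terms(t), l, p) for t, l, p in items],
)

# FTS5 query syntax: AND, OR, NOT, and double-quoted phrases.
query = '"4tb" AND "hard drive" NOT broken'
hits = con.execute(
    "SELECT title, link, price FROM listing WHERE listing MATCH ? ORDER BY rank",
    (query,),
).fetchall()

for title, link, price in hits:
    print(title, price, link)
```

The second listing never says "hard drive" in its title; it is found because the synonym text was added to the `extra` column when the row was loaded.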

Re: Search Engine Suggestions
by harangzsolt33 (Chaplain) on May 23, 2023 at 06:00 UTC
    Hmm... That's not an easy job...because I imagine that in some situations, you would want your search engine to find synonyms. For example, let's say I am searching for a "4TB hdd." It should understand the word "hdd" and automatically look for the words "hard drive" and "hard disk" and "ssd" as well. BUT if I were to search for the words "2007 Dodge Caravan rear brake pads" then it would be really annoying if it also showed me Ford brake pads just because they are both cars.
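
One way to get the first behaviour without the second is to expand only the terms that appear in a hand-maintained table, so "hdd" is widened but "Dodge" never turns into "Ford". The following is a minimal sketch in Python, for illustration only; the `SYNONYMS` table and the `expand` and `matches` helpers are hypothetical names.

```python
import re

# Hypothetical, hand-maintained table: only the listed terms are ever expanded.
SYNONYMS = {"hdd": ["hdd", "hard drive", "hard disk", "ssd"]}

def expand(query):
    """Turn each query word into the list of alternatives that may stand in for it."""
    return [SYNONYMS.get(w, [w]) for w in re.findall(r"[a-z0-9]+", query.lower())]

def matches(title, query):
    """Every query word must appear in the title, as itself or as a listed synonym."""
    # Pad with spaces so that only whole words (or whole phrases) match.
    text = " " + " ".join(re.findall(r"[a-z0-9]+", title.lower())) + " "
    return all(any(f" {alt} " in text for alt in group) for group in expand(query))

print(matches("Seagate 4TB hard drive", "4TB hdd"))
print(matches("Ford Focus rear brake pads", "2007 Dodge Caravan rear brake pads"))
```

Words with no table entry, such as "dodge", only ever match themselves, so the Ford listing is rejected.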

    One time I wrote a stock search engine in JavaScript. But it's very, very simple. It wants you to type in a name or part of the name of a company. Then it lists all the companies that have that word in their name. So, for example, you enter the word "nano" and it will display every company whose name contains the word "nano." It loads all the company names into memory first, which takes a few seconds. But once the page has loaded, the search results appear at lightning speed. http://wzsn.net/stocksearch.html
