Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister

[OT] Weighted Search Results

by CiceroLove (Monk)
on May 20, 2004 at 13:13 UTC ( #354937=perlquestion: print w/replies, xml ) Need Help??

CiceroLove has asked for the wisdom of the Perl Monks concerning the following question:

Hello all,

I am trying to come up with a better search result for an application I build. This application gets resold to clients and they want the option to restrict columns that can be searched on for weighted results. So for instance, let's say my table looks like this (yes, MYSQL):

name varchar(50)
age int
city varchar(50)
state char(2)
lastmoved date

Obviously, my table is larger than this (by about 85 columns). What I need is to allow the user to provide search terms and then provide a weighted ranking based on how many columns are matched (Thank God, no LIKE operations will be permitted). So if someone wants to search on 'Greg', 32, 'TX', I want it to find all rows that match all three terms and give it 100% weight. Then return all rows with only two of the terms to match and give it 66%, etc. all the way down to a single term.

This application gets sold to clients over and over again so the number of columns that will be searched on is dynamic which means the data types to check are also dynamic. I don't really want to have to retrieve all rows in a long OR operation and process them locally. I am hoping for a more elegant solution (perhaps some hidden/unknown feature in MYSQL?)

All help is of course greatly appreciated

P.S. I searched Google for "Weighted search results MYSQL" and got a bunch of links for search engine rankings. So there's help there possibly but the result set was over 500 pages long. Ack!

Fates! We will know your pleasures: That we shall die, we know; 'Tis but the time, and drawing days out, that men stand upon. - Act III,I, Julius Caesar

Replies are listed 'Best First'.
Re: [OT] Weighted Search Results
by Zaxo (Archbishop) on May 20, 2004 at 14:16 UTC

    I think something like this will work. Warning - really untested.

    my $sql = <<ESQL; SELECT *, (name = 'Greg') + (age = 32) + (state = 'TX') AS score FROM datable WHERE score > 0 ORDER BY score DESC ESQL
    (how's that for a perl fig leaf? ;-)

    Update: What was I thinking of? This looks much better,

    my $sql = <<ESQL; SELECT * FROM datable WHERE ((name = ?) + (age = ?) + (state = ?)) > 0 ESQL
    You lose the score information and ordering, but that can be recovered in perl. You gain placeholders! as well as WHERE clause optimizations mentioned in replies.

    After Compline,

      Brilliant idea. I poped into MySQL Control Center and discovered two things. You need parenthesis around the whole forumula and then you also need to use having instead of where. Where wont work with column aliases and having will.

      SELECT *, ((State = 'CA') + (City = "Chico")) AS score FROM `Address +` HAVING score > 0 ORDER BY score DESC

      Eric Hodges
        This worked like an absolute champ! What a wonderful day this has turned out to be! As was noted by a colleague of mine, this of course coudl slow down dramatically for large datasets since the having is an unindexed column and woudl require some shuffling by MySQL in the background.

        Fates! We will know your pleasures: That we shall die, we know; 'Tis but the time, and drawing days out, that men stand upon. - Act III,I, Julius Caesar

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://354937]
Approved by herveus
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (2)
As of 2021-06-25 04:15 GMT
Find Nodes?
    Voting Booth?
    What does the "s" stand for in "perls"? (Whence perls)

    Results (133 votes). Check out past polls.