Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

Many-to-many relationships in databases : SOLVED

by CountZero (Bishop)
on Oct 28, 2005 at 06:18 UTC ( #503568=perlquestion: print w/replies, xml ) Need Help??

CountZero has asked for the wisdom of the Perl Monks concerning the following question:

My Dear Fellow PerlMonks!

I have a question which relates mainly to SQL-syntax, but as I'm not sure it can be solved purely in SQL alone and most probably need some aggregating and munching in our beloved Perl language, I post this question here.

I have a database which has several tables (say "Books" and "Authors" to keep it simple) amongst which exists a "many-to-many" relationship. To set this relationship up, I added the usual "BooksAuthors" table, which has foreign keys pointing to the "Books" and the "Authors" table.

This works nice and I can collect easily the names of the authors for a certain book or the books one author has written. This is all standard stuff as found in many tutorials on SQL. Even Class::DBI and its brethren knows how to do it.

Now in my application I want to include a search function which will find me the book(s) written by more than one author: "Give me the list of books written by Jones AND Smith". 'Jones' and 'Smith' being co-authors for these books as it were.

Of course an SQL such as SELECT book FROM BooksAuthors WHERE Author='Jones' AND Author='Smith' will not work since each row of this table only has one Author and that field cannot be 'Jones' and 'Smith' at the same time.

So I vaguely remembered something about self-joins and tried the following: SELECT book FROM BooksAuthors AS B1 JOIN BooksAuthors AS B2 WHERE B1.Author = 'Jones' AND B2.Authors = 'Smith' AND B1.Book = B2.Book

That seems to work, but gets me into problems if I want to check for three (or more) authors. It seems I will have to add more and more self-joins and I wonder if that is The Right Way. Or is there perhaps a more Perlish-way to handle this?

UPDATE: Thanks all for the answers and good suggestions. The solution suggested by Corion seems best esp. when combined with the 'IN' syntax.

CountZero

"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Replies are listed 'Best First'.
Re: Many-to-many relationships in databases
by Corion (Pope) on Oct 28, 2005 at 06:30 UTC

    As I'm working my way through Celkos "SQL for Smarties", he points out an interesting way to map such boolean queries by using GROUP BY and HAVING:

    -- find all book ids coauthored by Smith and Jones: -- (let's assume that the author_name is unique) select book_id, count(*) from authors where author_name = 'Smith' or author_name = 'Jones' group by book_id having count(*) = 2

    Whether the performance of this is better or worse than selecting all books by one and all books by the other and scanning the lists in Perl, or worse than dynamically creating the self joins needs to be benchmarked. Another way to make this slightly less dynamic could be to use the IN predicate:

    -- find all book ids coauthored by Smith and Jones: -- (let's assume that the author_name is unique) select book_id, count(*) from authors where author_name in ('Smith', 'Jones') group by book_id having count(*) = 2

    To make it completely static, you could insert the authors into a (session-local) temporary table and cross-join that table.

    Update: Oops - this approach will also return books co-authored by 'Smith', 'Jones' and 'Miller'. Depending on whether that's what you want or not, you will need to use a subselect to make sure you get all of the authors. But I'm not sure if I can come up with the correct subselect at the moment :-)

      Well, you can fix it by using except on a query that gives all the book_id's where the author isn't one of 'Smith' or 'Jones'.

      select book_id from authors where author_name in ('Smith','Jones') group by book_id having count(*)=2 except select book_id from authors where author_name not in ('Smith','Jones');

      You could also alter the where clause to make certain book_id doesn't have a different author:

      select book_id from authors where author_name in ('Smith','Jones') and book_id not in ( select book_id from authors where author_name not in ('Smith','Jones') ) group by book_id having count(*)=2

      By the way, if anyone wants to read some of Joe Celko's old articles from DBMS magazine, here are some that are available from their website (the puzzles are really fun).

Re: Many-to-many relationships in databases
by davorg (Chancellor) on Oct 28, 2005 at 08:51 UTC

    This is really off-topic for this site, but let's have a go anyway.

    You need to join the BookAuthor table with itself. I'm old-skool so I do joins in the WHERE clause, but there's probably a simpler way to do it using an explicit JOIN clause.

    select a.book from BookAuthor a, BookAuthor b where a.author = 'Smith' and b.author = 'Jones' and a.book = b.book;

    Update: The JOIN syntax looks like this:

    select a.book from BookAuthor a inner join BookAuthor b on a.book = b.book where a.author = 'Smith' and b.author = 'Jones';

    You should be able to omit the "INNER", but I was testing on a rather old version of MySQL which didn't support the abbreviated syntax.

    --
    <http://dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

Re: Many-to-many relationships in databases
by leriksen (Curate) on Oct 28, 2005 at 06:32 UTC
    I'm not the strongest on SQL, but I think you can use 'in' to solve your problem i.e.

    SELECT book FROM BooksAuthors WHERE Author in ('Jones', 'Smith');

    SQL syntax may not be perfect - not my main field of expertise

    Also look at SQL::Abstract and the extension to Class::DBI that uses it, Class::DBI::AbstractSearch, for an easy way to build these queries dynamically using perl data structures, for any length list of authors.

    ...reality must take precedence over public relations, for nature cannot be fooled. - R P Feynmann

      That's close but will also return books authored by only one of 'Jones' or 'Smith' alone. Of course, some quick Perl filtering might be easier than doing it in SQL.

        I don't think that's true, unless different SQL engines implement IN differently.
        On all the engines I've used, IN is equivalent to a list of OR statements.
Re: Many-to-many relationships in databases
by Delusional (Beadle) on Oct 28, 2005 at 12:06 UTC
    Just my 2 cents here. The IN is CaSe sensitive, so a search for Smith will only return Smith, not smith (atleast not in the MySQL version I have here). Sounds trivial, however, if your allowing the user to enter a name, then its no longer trivial. On the other hand LIKE returns Smith, smith, sMiTh and any other variants, however, each queried name must be entered indivuially.... (The good with the bad I guess).
    SELECT book FROM BooksAuthors AS B1 JOIN BooksAuthors AS B2 WHERE B1.Author LIKE '%Jones%' AND B2.Authors LIKE '%Smith%' AND B1.Book = B2.Book
    To note here, the % sign is a wild card. Thus, if searching for John using John%, would also return items like Johny. Use only if you think it may be necessary.

    You could make a for loop to create a query variable that, essentually does something like:
    Assign select query up to WHERE
    Loop through user input adding B1.Author LIKE '$input' AND B2.Authors LIKE '$input' (don't forget an AND or OR claus between each pass)
    Add the AND B1.Book = B2.Book to the end of the variable, then run the query, process the output.

      Just convert everything to upper case. Far more efficient than using LIKE unnecessarily.

      SELECT book FROM BooksAuthors AS B1 JOIN BooksAuthors AS B2 ON B1.Book = B2.Book WHERE UPPER(B1.Author) = 'JONES' AND UPPER(B2.Author) = 'SMITH'
      --
      <http://dave.org.uk>

      "The first rule of Perl club is you do not talk about Perl club."
      -- Chip Salzenberg

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://503568]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (5)
As of 2020-02-20 07:37 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    What numbers are you going to focus on primarily in 2020?










    Results (86 votes). Check out past polls.

    Notices?