Beefy Boxes and Bandwidth Generously Provided by pair Networks
Come for the quick hacks, stay for the epiphanies.
 
PerlMonks  

Yet Another Half Perl Half mySQL Question

by Anonymous Monk
on Jul 06, 2001 at 02:47 UTC ( [id://94322]=perlquestion: print w/replies, xml ) Need Help??

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

What is the best way to store a list of things. Lets say I have a name and a list of related IPs, whats the best way to store the list? Fields such as "ipone","iptwo","ipthree" etc would seem like a waste of space and very clumsy to work with. Using split() and join() would seem to defeat the purpose of the mySQL DB in the first place, but it would seem to me that it would be the best way to store a list of things. Am I missing something?

Replies are listed 'Best First'.
Re: Yet Another Half Perl Half mySQL Question
by Masem (Monsignor) on Jul 06, 2001 at 03:01 UTC
    Here's how I've done it:

    Make sure that you have some unique ID for each row of the main table in which you need this 'set'. Call this "mainUniqueID"

    Create a new table, with a different uniqueID (this is optional, but I still like unique ID's on all my tables), "otherUniqueID", so that the columsn will be "otherUniqueID" (your index), "mainUniqueID" which will NOT be unique, and "yourData". For each item that is in your list for that mainUniqueID, you'll have a row in this second table.

    To get the list, just "SELECT yourData FROM secondTable WHERE mainUniqueID=?". Other options are similar.

    I know that some databases have SETs that you can use, but this is not necessarily reliable.

    Optionally, you can always store the list as a string delimiated by characters you know will not be in it. If you have IPs, for example, you can easily convert these from 2^32 to ints, and the join ';' them. Mind you, these makes it a bit harder to select a specific IP out of a list.


    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
Re: Yet Another Half Perl Half mySQL Question
by lestrrat (Deacon) on Jul 06, 2001 at 03:13 UTC

    Note: I'm a Postgres user, so don't know much about mySQL -- somethings below may not make sense in mySQL

    since everybody else is suggesting an approach with two tables, I'll try to be creative and suggest a way with one table. I don't know the pros and cons between two tables and one table, but here goes...

    If you want to push the processing to the backend DB, then you probably will have make table like so:

    CREATE TABLE name2ip ( name text -- don't make this a primary key! ip text -- or inet, or whatever mySQL has );

    Then just insert all the IP's in name:ip pairs. So it would be like ( John Doe, ip1 ), ( John Doe, ip2 ), ( Mary Jane, ip3 ), etc.... When you retrieve all IP's for a name, you do

    SELECT ip FROM name2ip WHERE name = somename;

    This way you can push the processing of the ip's to the DB, withou doing any manual comparison. Like this

    SELECT name FROM name2ip WHERE ip = someip; -- or, LIKE blah or wha +tever

    Other ways that I can think of is maybe using an array for the type of the IP address column, but that suffers the same disadvantages as just storing a long text, with each IP address concatnated with some delimiter. Namely, since you can't do a "Does this array contain element X?" type of call from the SQL statement ( at least not in Postgres! ), you're back to processing your data manualy in perl, and you also most likely have some sort of upper limit to the size of the column. ( I think in Postgres it was 4k or 8k )

    You may be able to use large objects as well, but again, you need to do some manual processing in perl, plus it's kind of annoying ( in my experience dealing with large objects, anyway )

Re: Yet Another Half Perl Half mySQL Question
by Abigail (Deacon) on Jul 06, 2001 at 03:29 UTC
    The "best" way of storing things depends on what you want to do with them. On several levels even, from the low level storage (disks? RAID?) to high level issues like what kind of datastructure, or table structure.

    split and join could do the task very well, or they are totally unsuitable. But that depends on how you are going to use the data. If you always want to return all IP numbers, and are never using them in a (database) join, then putting them in a string might work very well. If you need to search on IP number, or need them in a join, you need to do something else. Perhaps a one to many table, with two or three columns: first column is the name, second column is an IP number. You repeat the name for each IP number belonging to said name. You could use an optional third column with a sequence number.

    -- Abigail

Re: Yet Another Half Perl Half mySQL Question
by eejack (Hermit) on Jul 06, 2001 at 03:09 UTC
    I like creating multiple tables...
    create table tbl_names ( id_name INT NOT NULL auto_increment, str_name VARCHAR(255), ); create table tbl_ipaddresses ( str_ipaddress VARCHAR(255), id_name INT, );

    Then you can do schtuff like

    @names = qw( firstname secondname ); @ips = ( "123.132.123.121" , "198.21.252.2" , "123.123.123.132"); $sql_statement = qq|INSERT INTO tbl_names (str_name) VALUES (?)|; $sth = $dbh->prepare($sql_statement); $sql_statement = qq|INSERT INTO tbl_ipaddresses (id_name, str_ipaddres +s) VALUES (?,?)|; $sth2 = $dbh->prepare($sql_statement); foreach $name (@names){ $rv = $sth->execute($name); my $id_name = $sth->{'insert_id'}; foreach $ip (@ips){ $rv = $sth2->execute($id_name, $ip); } }
    To get a name's worth of ips out you could then do..
    $sql_statement = qq|select str_ipaddress from tbl_ipaddresses, tbl_names where tbl_names.str_name = 'firstname' and tbl_names.id_name = tbl_ipaddresses.id_name|; $sth = $dbh->prepare($sql_statement); $rv = $sth->execute; while ( ($ipaddress) = $sth->fetchrow_array ){ # do whatever }

    EEjack

      This really isn't a great table design since you're not actually gaining anything by splitting the information this way. The lookups are slower than on a single table using a straighforward SELECT query along the lines of "SELECT ipaddress WHERE name=?", but you haven't gained any additional flexibility since everything is bound to the name_id.,

      If you really want to use ids to track this information a better schema would be:

      create table names ( name_id INT NOT NULL auto_increment, name_str VARCHAR(255) ); create table ipaddresses ( ipaddress_id INT NOT NULL auto_increment, ipaddress_value VARCHAT(255) ); create table ip_names ( name_id INT NOT NULL, ipaddress_id INT NOT NULL );

      This gives you a lot more flexibility -- reverse look ups on virtual hosts, for instance, and allows for extensibility at a later date (associating machine-specific information with an IP address id, and domain specific information with the name id).

        I would tend to agree with you, but you really only need the two tables if the assumption is a one to many relationship.

        But assuming a many to many relationship your schema is nicer, and certainly if you expand the scope, having an id on both the ip address and name is a good thing (most probably a necessary thing).

        One minor point though, you can still do reverse lookups on my schema:

        $sql_statement = qq|select tbl_names.str_name from tbl_ipaddresses, tbl_names where tbl_ipaddresses.str_ipaddress = '201.2.23.2' and tbl_names.id_name = tbl_ipaddresses.id_name|; $sth = $dbh->prepare($sql_statement); $rv = $sth->execute; while ( ($str_name) = $sth->fetchrow_array ){ # do whatever }
        But thanks for pointing out a nicer way of doing it than I did.

        Thanks,

        EEjack

Re: Yet Another Half Perl Half mySQL Question
by Brovnik (Hermit) on Jul 06, 2001 at 17:49 UTC
    This is what I used for a module written recently.

    Table Creation :

    $sql = "CREATE TABLE depend (\ usedby VARCHAR(80) NOT NULL ,\ uses VARCHAR(80) NOT NULL \ )"; $dbh->do($sql);
    Then INSERT rows. And later, to access :
    $sql = "SELECT usedby FROM depend where uses=$name"; $aref = $dbh->selectall_arrayref($sql); my @uses = map {$_->[0]} @$aref; $sql = "SELECT uses FROM depend where usedby=$name"; $aref = $dbh->selectall_arrayref($sql); my @usedby = map {$_->[0]} @$aref;
    Note that the table is created without unique IDs or primary keys since this is a many-to-many relationship. One of the columns could be an IP address in your case.

    For the pedants, there is replication of the name in the columns, but the duplication is outweighed by the ease of access (see how easy in lines of code it is to retrieve a list of names) and reduction in complexity by not having a name-to-id mapping.

    BTW, the previous iteration of the module used a single entry in another table and used join/split, which is quicker, but less elegant in DB terms.

    If you need raw speed, use Perl to do the split, if you want it cleaner and more DBish, use a table as above.

    Full module is Depend::Module
    --
    Brovnik

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://94322]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others imbibing at the Monastery: (2)
As of 2024-04-26 01:03 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found