Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
What is the best way to store a list of things. Lets say I have a name and a list of related IPs, whats the best way to store the list? Fields such as "ipone","iptwo","ipthree" etc would seem like a waste of space and very clumsy to work with. Using split() and join() would seem to defeat the purpose of the mySQL DB in the first place, but it would seem to me that it would be the best way to store a list of things. Am I missing something?
Re: Yet Another Half Perl Half mySQL Question
by Masem (Monsignor) on Jul 06, 2001 at 03:01 UTC
|
Here's how I've done it:
Make sure that you have some unique ID for each row of the main table in which you need this 'set'. Call this "mainUniqueID"
Create a new table, with a different uniqueID (this is optional, but I still like unique ID's on all my tables), "otherUniqueID", so that the columsn will be "otherUniqueID" (your index), "mainUniqueID" which will NOT be unique, and "yourData". For each item that is in your list for that mainUniqueID, you'll have a row in this second table.
To get the list, just "SELECT yourData FROM secondTable WHERE mainUniqueID=?". Other options are similar.
I know that some databases have SETs that you can use, but this is not necessarily reliable.
Optionally, you can always store the list as a string delimiated by characters you know will not be in it. If you have IPs, for example, you can easily convert these from 2^32 to ints, and the join ';' them. Mind you, these makes it a bit harder to select a specific IP out of a list.
Dr. Michael K. Neylon - mneylon-pm@masemware.com
||
"You've left the lens cap of your mind on again, Pinky" - The Brain
| [reply] |
Re: Yet Another Half Perl Half mySQL Question
by lestrrat (Deacon) on Jul 06, 2001 at 03:13 UTC
|
Note: I'm a Postgres user, so don't know much about
mySQL -- somethings below may not make sense in mySQL
since everybody else is suggesting an approach with
two tables, I'll try to be creative and suggest a way
with one table. I don't know the pros and cons between
two tables and one table, but here goes...
If you want to push the processing to the backend DB,
then you probably will have make table like so:
CREATE TABLE name2ip (
name text -- don't make this a primary key!
ip text -- or inet, or whatever mySQL has
);
Then just insert all the IP's in name:ip pairs. So
it would be like ( John Doe, ip1 ), ( John Doe, ip2 ),
( Mary Jane, ip3 ), etc.... When you
retrieve all IP's for a name, you do
SELECT ip FROM name2ip WHERE name = somename;
This way you can push the processing of the ip's to the
DB, withou doing any manual comparison. Like this
SELECT name FROM name2ip WHERE ip = someip; -- or, LIKE blah or wha
+tever
Other ways that I can think of is maybe using an array
for the type of the IP address column, but that suffers
the same disadvantages as just storing a long text, with
each IP address concatnated with some delimiter. Namely,
since you can't do a "Does this array contain element X?"
type of call from the SQL statement ( at least not in
Postgres! ), you're back to
processing your data manualy in perl, and you also most
likely have some sort of upper limit to the size of the
column. ( I think in Postgres it was 4k or 8k )
You may be able to use large objects as well, but
again, you need to do some manual processing in perl, plus
it's kind of annoying ( in my experience dealing with large
objects, anyway )
| [reply] [d/l] [select] |
Re: Yet Another Half Perl Half mySQL Question
by Abigail (Deacon) on Jul 06, 2001 at 03:29 UTC
|
The "best" way of storing things depends on what you want
to do with them. On several levels even, from the low level
storage (disks? RAID?) to high level issues like what kind
of datastructure, or table structure.
split and join could do the task
very well, or they are totally unsuitable. But that depends
on how you are going to use the data. If you always want to
return all IP numbers, and are never using them in a (database)
join, then putting them in a string might work very well.
If you need to search on IP number, or need them in a join,
you need to do something else. Perhaps a one to many table,
with two or three columns: first column is the name,
second column is an IP number. You repeat the name for each
IP number belonging to said name. You could use an optional
third column with a sequence number.
-- Abigail | [reply] [d/l] [select] |
Re: Yet Another Half Perl Half mySQL Question
by eejack (Hermit) on Jul 06, 2001 at 03:09 UTC
|
I like creating multiple tables...
create table tbl_names (
id_name INT NOT NULL auto_increment,
str_name VARCHAR(255),
);
create table tbl_ipaddresses (
str_ipaddress VARCHAR(255),
id_name INT,
);
Then you can do schtuff like
@names = qw( firstname secondname );
@ips = ( "123.132.123.121" , "198.21.252.2" , "123.123.123.132");
$sql_statement = qq|INSERT INTO tbl_names (str_name) VALUES (?)|;
$sth = $dbh->prepare($sql_statement);
$sql_statement = qq|INSERT INTO tbl_ipaddresses (id_name, str_ipaddres
+s) VALUES (?,?)|;
$sth2 = $dbh->prepare($sql_statement);
foreach $name (@names){
$rv = $sth->execute($name);
my $id_name = $sth->{'insert_id'};
foreach $ip (@ips){
$rv = $sth2->execute($id_name, $ip);
}
}
To get a name's worth of ips out you could then do..
$sql_statement = qq|select str_ipaddress
from tbl_ipaddresses, tbl_names
where tbl_names.str_name = 'firstname'
and tbl_names.id_name = tbl_ipaddresses.id_name|;
$sth = $dbh->prepare($sql_statement);
$rv = $sth->execute;
while ( ($ipaddress) = $sth->fetchrow_array ){
# do whatever
}
EEjack | [reply] [d/l] [select] |
|
This really isn't a great table design since you're not actually gaining anything by splitting the information this way. The lookups are slower than on a single table using a straighforward SELECT query along the lines of "SELECT ipaddress WHERE name=?", but you haven't gained any additional flexibility since everything is bound to the name_id.,
If you really want to use ids to track this information a better schema would be:
create table names (
name_id INT NOT NULL auto_increment,
name_str VARCHAR(255)
);
create table ipaddresses (
ipaddress_id INT NOT NULL auto_increment,
ipaddress_value VARCHAT(255)
);
create table ip_names (
name_id INT NOT NULL,
ipaddress_id INT NOT NULL
);
This gives you a lot more flexibility -- reverse look ups on virtual hosts, for instance, and allows for extensibility at a later date (associating machine-specific information with an IP address id, and domain specific information with the name id). | [reply] [d/l] |
|
I would tend to agree with you, but you really only need the two tables if the assumption is a one to many relationship.
But assuming a many to many relationship your schema is nicer, and certainly if you expand the scope, having an id on
both the ip address and name is a good thing (most probably a necessary thing).
One minor point though, you can still do reverse lookups on my schema:
$sql_statement = qq|select tbl_names.str_name
from tbl_ipaddresses, tbl_names
where tbl_ipaddresses.str_ipaddress = '201.2.23.2'
and tbl_names.id_name = tbl_ipaddresses.id_name|;
$sth = $dbh->prepare($sql_statement);
$rv = $sth->execute;
while ( ($str_name) = $sth->fetchrow_array ){
# do whatever
}
But thanks for pointing out a nicer way of doing it than I did.
Thanks,
EEjack
| [reply] [d/l] |
Re: Yet Another Half Perl Half mySQL Question
by Brovnik (Hermit) on Jul 06, 2001 at 17:49 UTC
|
This is what I used for a module written recently.
Table Creation :
$sql = "CREATE TABLE depend (\
usedby VARCHAR(80) NOT NULL ,\
uses VARCHAR(80) NOT NULL \
)";
$dbh->do($sql);
Then INSERT rows.
And later, to access :
$sql = "SELECT usedby FROM depend where uses=$name";
$aref = $dbh->selectall_arrayref($sql);
my @uses = map {$_->[0]} @$aref;
$sql = "SELECT uses FROM depend where usedby=$name";
$aref = $dbh->selectall_arrayref($sql);
my @usedby = map {$_->[0]} @$aref;
Note that the table is created without unique IDs or primary
keys since this is a many-to-many relationship. One of the
columns could be an IP address in your case.
For the pedants, there is replication of the name in the columns,
but the duplication is outweighed by the ease of access (see
how easy in lines of code it is to retrieve a list of names)
and reduction in complexity by not having a name-to-id mapping.
BTW, the previous iteration of the module used a single entry in
another table and used join/split, which is quicker, but less
elegant in DB terms.
If you need raw speed, use Perl to do the split, if you want it cleaner
and more DBish, use a table as above.
Full module is Depend::Module
-- Brovnik | [reply] [d/l] [select] |
|
|