Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Hi All,
I have created a simple sql script to write an email address (from form input) to a MySQL database, the database itself is very simple, and only stores just email addresses. Below is my code to read the data, line by line, but this script is expected to store upto 20,000 email addresses (will this cause problems?) so printing them line by line will take a while to load and create an extremely large page, could anyone advise me as to how I could read the data and split it in different pages, say 30 emails per page ?
$dbh = DBI->connect("DBI:mysql:$database:localhost","$username","$pass
+word");
my $sth = $dbh->prepare("SELECT * FROM emails");
$sth->execute(); # iterate through resultset
while(my $ref = $sth->fetchrow_hashref()) {
print "Email: $ref->{'contest'}\n\n";
} # clean up
$dbh->disconnect();
Thats how I do it at the moment, but splitting it into pages of 30 emails would make it a lot more organised.
Any help would be great appreciated Thanks, John.
(Ovid: Death to Select Star!) Re: MySQL DBI Help
by Ovid (Cardinal) on Sep 22, 2001 at 21:51 UTC
|
On a note totally unrelated to your question, I couldn't help but notice your SQL:
SELECT * FROM emails
I do a lot of heavy database work and I've been bitten by this type of SQL many times. As a general rule, when trying to write maintainable code, it's a good idea to be very specific about what you will allow. For example, consider the following SQL statement:
SELECT userid, email, contest FROM emails
That SQL statement has a variety of benefits:
- It's immediately obvious to the maintenance programmer exactly what is being requested.
- It's more efficient than selecting a bunch of columns you don't need, if you have more columns in the table than you're actually using.
- If you do a $sth->fetchrow_arrayref, you don't have to worry about columns in the database being reordered.
- If someone renames a column, you will understand faster why your $ref->{'contest'} autovivified a hash entry and produced an undef value.
- If someone removes a column from the database, you get the same benefit as mentioned in the fourth point.
If you're just writing a small program, or use very simple logic as what you have above, these issues may be less likely to impact you. However, as you do more and more work with databases, getting in the habit of stating explicitly what you want will reap tremendous rewards.
Cheers,
Ovid
Vote for paco!
The title of this node is a reference to Death to Dot Star!, for those who may be confused.
Join the Perlmonks Setiathome Group or just click on the the link and check out our stats. | [reply] [d/l] [select] |
Re: MySQL DBI Help
by larsen (Parson) on Sep 22, 2001 at 20:34 UTC
|
You could use LIMIT directive in your SQL
query. Since you're using Mysql, this is a good place to learn how. I hope this could be useful. | [reply] [d/l] |
|
| [reply] |
Re: MySQL DBI Help
by Anonymous Monk on Sep 22, 2001 at 23:14 UTC
|
Thanks for your suggestions, and your tips Ovid - I will remember that for future use.
On the suggestion of flat files, I am currently running a duplicate of this script on another server which uses flat files, I was advised to switch to a mysql database solution so that it eased the load on the system and would speed up the script.
In what instances does mysql really become the beneficial choice if its more suitable to hold 20,000 email addresses in a flat file?
Thanks
| [reply] |
|
I understand that MySQL has recently had some improvements to bring it closer to ACID compliance, but I am still leery of it. MySQL, as I have used it, is as best a file system with SQL, not a database. An excellent explanation of the limitations of MySQL is at Why Not MySQL (and explains what "ACID compliant" means). As for Open Source alternatives to MySQL, you can check out Postgresql or Interbase.
I do not know who recommended MySQL to you or why it was recommended. If the only consideration is performance, I would strongly recommend that you examine this decision more carefully. One thing that many people misunderstand in making technology decisions is that the technology chosen is part of the answer. In order to answer a question properly, you need to get enough information to understand what the question is. With technology choices, we need to identify the problem solved, determine what resources we can apply to these problems and determine how the pros and cons of those resources affect our short and long term goals. Then, and only then, do we begin to have an adequately defined question. Once we have the parameters of the problem and possible solutions clearly defined, then we can begin to decide what technology solutions are an appropriate fit. Here are a few questions off the top of my head regarding that would affect whether or not MySQL is an appropriate solution for your problem:
- What, exactly, is the problem that you need to solve?
- Do you already have technologies in-house that may be applicable (this is important because just throwing another technology at a solution means more technologies to support -- too many technologies is like too many cooks!)
- If you need a database, how many users are likely going to be using the system?
- Will the database be primarily "read only"? (MySQL can be fine for that purpose)
- Do you have the in-house capability to adequately support MySQL? Is there only one person who can support it? What happens if they leave?
The last question is particularly vexing. At work, we primarily use MS SQL Server because that is the database the most of us are familiar with. For some of our projects, we are thinking about switching to Postgres, but there's some resistance due to the lack of in-house knowledge of the product. Unfortunately, that can be a valid complaint if we have acceptable alternative solutions.
Cheers,
Ovid
Vote for paco!
Join the Perlmonks Setiathome Group or just click on the the link and check out our stats.
| [reply] |
|
I agree with CubicSpine and Ovid that you might want to
re-evaluate the decision to use mysql. The only advantage
to using mysql in these circumstances would be to enhance
your perl/database skills. If this were the only
consideration then by all means, continue doing so. But,
OTOH, if the following are true:
- each record in the file contains only one field, e-mail address.
- once you get the file, you are the only person using it.
- you don't expect that format of the file to change
anytime soon (like by adding new fields per record).
Then there is nothing that mysql can do that you can't do
faster working with plain text files.
I work with a file that has 48,193 94-byte records. On our
ancient Sun Sparc 10, it takes 3 whole seconds to load the
entire file and store it in an array. On our Sun
Enterprise 350, it takes less than a second.
In my case, however, I have to break each record into 10
fields, be able to sort across multiple non-contiguous
fields, perform merges and complex selects. So I should
consider using a database implementation even though the
file is used only by me and is never updated (except
when I replace the entire file weekly).
"Make everything as simple as possible, but not simpler." -- Albert Einstein
| [reply] |
|
AM - I think that you will find that using a SQL database will give you a major advantage in situations where you are dealing with:
- More complex data than just a single field
- An application that does more than just "get everything in the database".
In your situation, it might be worth doing some benchmarking. It wouldn't be too hard to set up a time test comparing reading in all the records from the DB and reading in all the records from a file. I think that if you can prove to your coworkers that the load is minimal then you should go with the file. Of course, this does depend on how often you expect to be performing this operation.
| [reply] |
Re: MySQL DBI Help
by CubicSpline (Friar) on Sep 22, 2001 at 20:40 UTC
|
Why don't you handle this from within your script? If all you want is to print out 30 email addresses per page, just select all the addresses and only print out 30 at a time.
The way you've set up your database really limits your options of pulling out data. | [reply] |
|
| [reply] |
|
| [reply] |
|
|
|