Re: Which is quicker - writing to a file or inserting into a db?
by Joost (Canon) on Dec 13, 2007 at 23:13 UTC
Assuming the actual disk speed is the same (say, the database runs on the same machine and writes to the same disk), I would guess that appending to a file is much quicker than inserting a record into the database.
All bets are off once you need to check against existing records (basically, as soon as you need to see whether an email has already been registered). For simple cases it's still relatively easy to beat mysql, but you're going to have to build a lot more infrastructure yourself, or use some simpler, faster (probably non-relational) database system, like BerkeleyDB.
Update: you are already using mod_perl or some other persistent processing system, right? Otherwise there's really not much point in even thinking about this.
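To illustrate the "check against existing records" case: a minimal sketch using a tied on-disk hash. I'm using SDBM_File here because it ships with core Perl; DB_File (the BerkeleyDB binding) works the same way through the same tie interface. The file path and email are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;

# Deduplicating signups with a tied on-disk hash instead of a
# relational database. SDBM_File is core Perl; swap in DB_File
# for BerkeleyDB. The path '/tmp/pm_emails' is illustrative.
tie my %seen, 'SDBM_File', '/tmp/pm_emails', O_RDWR | O_CREAT, 0644
    or die "Cannot tie /tmp/pm_emails: $!";

my $email = 'user@example.com';
if ( exists $seen{$email} ) {
    print "already registered\n";
}
else {
    $seen{$email} = time;    # remember when we first saw it
    print "registered\n";
}
untie %seen;
```

The exists-then-store check is the part a flat append file can't give you cheaply; everything else stays as fast as plain hashing.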
Re: Which is quicker - writing to a file or inserting into a db?
by eserte (Deacon) on Dec 13, 2007 at 23:08 UTC
Writing to a plain file is probably faster. Talking to a database involves overhead such as passing data through a network/socket connection and parsing and interpreting SQL statements (although repeated parsing can be avoided by using a prepared statement). There's also additional overhead in the database's underlying file structures, and possibly index overhead.
There's no performance penalty in appending to a large file.
But you should probably think about locking, as multiple processes simultaneously writing to the same file may cause problems. Or just make sure that each process writes to its own file, e.g. by appending $$ to the file name.
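A minimal sketch of the locked-append approach described above; the file name and record are assumptions, not from the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Append one record under an exclusive advisory lock so that
# concurrent writers can't interleave partial lines.
# '/tmp/signups.txt' and the email are illustrative.
my $file = '/tmp/signups.txt';
open my $fh, '>>', $file or die "Cannot open $file: $!";
flock $fh, LOCK_EX       or die "Cannot lock $file: $!";
print {$fh} 'user@example.com', "\n";
flock $fh, LOCK_UN;
close $fh or die "Cannot close $file: $!";
```

The per-process alternative needs no locking at all: `open my $fh, '>>', "signups.$$.txt"` gives each process its own file, at the cost of merging them later.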
Re: Which is quicker - writing to a file or inserting into a db?
by KurtSchwind (Chaplain) on Dec 14, 2007 at 00:33 UTC
There seems to be an assumption that the overhead of the db plus the disk I/O of the db will be greater than the disk I/O alone. That is most likely true.
However, it's entirely possible to cache db writes so that you can essentially subtract the disk I/O from the db. RAM is orders of magnitude faster than disk, and with extensive caching on the db it really can out-perform straight disk I/O.
I'm just sayin . . .
--
I used to drive a Heisenbergmobile, but every time I looked at the speedometer, I got lost.
True, but (and this is partially a response to the post right above): once you've opened the file in a mod_perl or equivalent persistent process, there's no need to close and re-open it between requests, which may or may not shift a lot of IO into RAM anyway. And on an OS designed for high performance, open/close may not even make much difference (apart from the two extra system calls).
Which is to say, a RAM disk will probably not gain you much (if anything) once you've got the basic IO strategy optimized, provided your system isn't completely overloaded. And you really don't want to use a RAM disk or any serious write-caching in a situation where you're pushing the whole system to the limit, assuming you actually want to store the data somewhere safe.
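A sketch of that keep-it-open pattern in a persistent process; the package name, file path, and autoflush choice are my assumptions, not something from the thread:

```perl
package SignupLog;
use strict;
use warnings;

# In a persistent environment (mod_perl, FastCGI, ...) this
# lexical survives across requests, so the file is opened at most
# once per process. '/tmp/signup_log.txt' is illustrative.
my $fh;

sub log_email {
    my ($email) = @_;
    unless ($fh) {
        open $fh, '>>', '/tmp/signup_log.txt'
            or die "Cannot open signup log: $!";
        my $old = select $fh;    # enable autoflush so each line
        $| = 1;                  # reaches the OS promptly
        select $old;
    }
    print {$fh} "$email\n";
    return 1;
}

1;
```

Every request after the first pays only for the `print`, not for open/close.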
Re: Which is quicker - writing to a file or inserting into a db?
by naChoZ (Curate) on Dec 14, 2007 at 03:37 UTC
I recently did a poor man's benchmark just for kicks. The results might at least be interesting to you.
I wrote this little script to populate a db table of three columns (one just an autoinc id) with 50,000 rows.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper; $Data::Dumper::Indent = 3;
use Digest::MD5 qw/md5_base64/;
use base qw/DBIx::Class::Schema::Loader/;

DBIx::Class::Schema::Loader->loader_options( relationships => 1 );

my $schema = DBIx::Class::Schema::Loader->connect('dbi:SQLite:/tmp/test');
#my $schema = DBIx::Class::Schema::Loader->connect('dbi:Pg:dbname=test', 'postgres', '' );
#my $schema = DBIx::Class::Schema::Loader->connect('dbi:mysql:dbname=test', 'mysql', '' );

my @inserts;
for ( 1 .. 50000 ) {
    push @inserts,
        {
            foo => md5_base64( $_ . 'foo' ),
            bar => md5_base64( $_ . 'bar' )
        };
}

my $rs = $schema->resultset('TestTable');
$rs->populate( \@inserts );
Loading to postgresql-server-8.1.9-2.1 took just under 8 minutes, mysql-5.0.26-12 about 35 seconds, and sqlite-3.3.8-14, even with synchronous mode turned off, couldn't seem to handle 1000 rows a minute (I didn't wait for it to finish all 50,000 rows). I reused this script just now to write 50,000 lines to a flat text file, and it was done almost before I pressed enter.
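For comparison, the flat-file version of the same loop might look something like this (the output path is illustrative); it does the identical md5 work and just appends each pair as a tab-separated line:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw/md5_base64/;

# Flat-file counterpart of the benchmark script above: the same
# 50,000 md5 pairs, written as tab-separated lines.
open my $fh, '>', '/tmp/test.txt' or die "Cannot open /tmp/test.txt: $!";
for ( 1 .. 50000 ) {
    print {$fh} md5_base64( $_ . 'foo' ), "\t",
        md5_base64( $_ . 'bar' ), "\n";
}
close $fh or die "Cannot close /tmp/test.txt: $!";
```

So the databases and the flat file are doing the same hashing work; the difference in the timings is almost entirely insert overhead.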
Obviously YMMV. My workstation is a Sun Ultra 40 with dual Opterons and a pair of SATA drives that I striped into a RAID 0, running openSUSE 10.2.
--
naChoZ
Therapy is expensive. Popping bubble wrap is cheap. You choose.
Re: Which is quicker - writing to a file or inserting into a db?
by aquarium (Curate) on Dec 14, 2007 at 00:26 UTC
Re: Which is quicker - writing to a file or inserting into a db?
by chromatic (Archbishop) on Dec 14, 2007 at 07:53 UTC
Which is quicker and less likely to break if a deluge of visitors come to the site all at one time: Writing the email address to a file or inserting the email into a mysql db?
Your network connection is most likely to get saturated, unless you run out of file descriptors or available database connections first.
Re: Which is quicker - writing to a file or inserting into a db?
by pajout (Curate) on Dec 14, 2007 at 18:36 UTC
I think the solution depends on several factors:
1. How many requests do you REALLY need to serve? (10/sec? 1000/sec?)
2. Does the saved data need to be available immediately for further use? If not, how long a gap between saving and availability is acceptable?
3. How many http listeners will provide this service?
4. Are there requirements for data consistency and integrity?
5. Are there scenarios for HW/OS/SW crashes?