note
Marshall
<i>In the past (last time I tried was, I think, a couple of years ago) SQLite always proved prohibitively slow: loading multimillion-row data was so ridiculously slow (even on fast hardware), that I never bothered with further use. </i><p>
I showed the framework of some code at [id://942945] which is an abbreviated version of some code that I'm currently working on. One table has a million records of 50 fields. So I ran a couple of tests.<p>
First test was with all the speed-up stuff turned off:
<c>
Starting: Fri Dec 16 13:56:56 2011
Creating new Database!! - HD table
records inserted: 1,117,526
Ending: Sun Dec 18 02:15:30 2011
</c>
Now I figure that qualifies as <u>"ridiculously slow!"</u>. I actually had to run it twice because I got one of those "Windows Automatic Reboot was required" things! Bummer when that happens after one day of processing!<p>
Using the optimizations and by FAR and away the biggest effect is to do a single transaction! Results in:
<c>
Starting: Sun Dec 18 15:26:53 2011
Creating new Database!! - HD table
records inserted: 1,117,526
Ending: Sun Dec 18 15:29:44 2011
</c>
Or about 3 minutes instead of 2 days! A lot better! This is fast enough for my needs. Using the bulk import utility would probably be faster, but I haven't tested that because 3 minutes doesn't bother my application. <p>
I have another app that builds a 500K record table and it builds it from 1,000 input files. Takes about 1/2 the time or about 90 seconds. Its not worth my programming effort to emit an intermediate file in the whatever the bulk import utility needs - I just put the data into the DB right away. A reasonable programming tradeoff. Mileage varies.<p>
It should be noted that my machine is an older one, a hyper threaded one (before the multi-core days), the Prescott stepping - last one with PGA (pin grid array) and my disks are only 5K rpm (not 7K+). A more modern machine can run a single process at least 4x this fast or about 45 seconds instead of 3 minutes (I've bench marked my machine vs a friend's on similar tasks before).<p>
The time scales linearly, so 11M records would take 10x as long. Is that "ridiculously slow?", I don't know. I guess that depends upon the application. <p>
I do have a MySQL server running on my machine and in the past I've done some benchmarking vs SQLite. For a complicated query, MySQL is faster, but for my current projects, SQLite "wins" due to admin simplicity (none required!).
942908
942949