Re^5: DBI::SQLite slowness by erix (Parson)
on Sep 22, 2013 at 05:12 UTC
That's actually the same behaviour that other databases show.
But only now do I see the initial thread/problem (Scaling Hash Limits). (It's useful to link to original threads in follow-up posts, you know.) With the relatively small sizes involved, a database doesn't seem necessary.
If the problem is that simple, can't you just run
on your id list? Perhaps not very interesting, or fast (took about 7 minutes in a 100M test run here), but about as simple as it gets.
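For a plain list of ids, one per line, de-duplication with standard tools can be sketched as below. This is only an illustration under assumptions: the file names `ids.txt` and `ids.uniq.txt` and the sample values are hypothetical, and `sort -u` is one possible tool for the job, not necessarily the command meant above.

```shell
# Hypothetical sample input: one id per line, with duplicates.
printf '%s\n' 42 7 42 19 7 > ids.txt

# sort -u sorts the lines and keeps one copy of each distinct line.
# LC_ALL=C forces plain byte-wise comparison, which is usually faster
# than locale-aware collation on large inputs.
LC_ALL=C sort -u ids.txt > ids.uniq.txt

# Count the distinct ids that remain.
wc -l < ids.uniq.txt
```

Because the comparison is byte-wise, the output is ordered as text rather than numerically, which does not matter if the only goal is to drop duplicates.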
BTW, just another datapoint, as I did the test already: PostgreSQL (9.4devel) loads about 9000 rows/s on a slowish, low-end desktop. That's with the laborious INSERT method that your script uses; bulk loading with COPY loads ~1 million rows/second (excluding any de-duplication):
That's a rate of just under 2 million rows per second.
UPDATE: added 'unlogged' and adjusted the timings (it makes the load twice as fast).