Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Re^5: DBI::SQLite slowness

by erix (Vicar)
on Sep 22, 2013 at 05:12 UTC ( #1055165=note: print w/ replies, xml ) Need Help??


in reply to Re^4: DBI::SQLite slowness
in thread DBI::SQLite slowness

That's actually the same behaviour as other DB's have.

But only now do I see the initial thread/problem (Scaling Hash Limits). (It's useful to link to original threads in follow-up posts, you know). With the relatively small sizes involved, a database doesn't seem necessary.

If the problem is that simple, can't you just run

sort -u dupslist > no_dupslist

on your id list? Perhaps not very interesting, or fast (took about 7 minutes in a 100M test run here), but about as simple as it gets.

(BTW, just another datapoint (as I did the test already): PostgreSQL (9.4devel) loads about 9000 rows/s, on a slowish, low-end desktop. That's with the laborious INSERT-method that your script uses; bulk-loading (with COPY) loads ~ 1 million rows /second (excluding any de-duplication):

perl -e 'for (1..50_000_000){ printf "%012d\n", $_; }' > t_data.txt; echo " drop table if exists t; create unlogged table t(klout integer); " | psql; echo "copy t from '/tmp/t_data.txt'; " | psql time < t_data.txt psql -c 'copy t from stdin' real 0m25.661s
That's a rate of just under 2 million per second

)

UPDATE: added 'unlogged', adjusted timings (it makes the load twice as fast)


Comment on Re^5: DBI::SQLite slowness
Select or Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1055165]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (12)
As of 2015-07-03 11:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (51 votes), past polls