Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Re^4: DBI::SQLite slowness

by Endless (Beadle)
on Sep 21, 2013 at 18:20 UTC ( #1055143=note: print w/ replies, xml ) Need Help??


in reply to Re^3: DBI::SQLite slowness
in thread DBI::SQLite slowness

Thanks for the tip; it will come in handy when I switch to other databases. Looks like DBI::SQLite keeps that all under the hood, though, with transactions implicitly invoked if autocommit is off. See https://metacpan.org/module/DBD::SQLite#Transactions


Comment on Re^4: DBI::SQLite slowness
Re^5: DBI::SQLite slowness
by erix (Vicar) on Sep 22, 2013 at 05:12 UTC

    That's actually the same behaviour as other DB's have.

    But only now do I see the initial thread/problem (Scaling Hash Limits). (It's useful to link to original threads in follow-up posts, you know). With the relatively small sizes involved, a database doesn't seem necessary.

    If the problem is that simple, can't you just run

    sort -u dupslist > no_dupslist

    on your id list? Perhaps not very interesting, or fast (took about 7 minutes in a 100M test run here), but about as simple as it gets.

    (BTW, just another datapoint (as I did the test already): PostgreSQL (9.4devel) loads about 9000 rows/s, on a slowish, low-end desktop. That's with the laborious INSERT-method that your script uses; bulk-loading (with COPY) loads ~ 1 million rows /second (excluding any de-duplication):

    perl -e 'for (1..50_000_000){ printf "%012d\n", $_; }' > t_data.txt; echo " drop table if exists t; create unlogged table t(klout integer); " | psql; echo "copy t from '/tmp/t_data.txt'; " | psql time < t_data.txt psql -c 'copy t from stdin' real 0m25.661s
    That's a rate of just under 2 million per second

    )

    UPDATE: added 'unlogged', adjusted timings (it makes the load twice as fast)

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1055143]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (10)
As of 2014-08-20 20:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (124 votes), past polls