Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Re^4: DBI::SQLite slowness

by Endless (Beadle)
on Sep 21, 2013 at 18:20 UTC ( #1055143=note: print w/ replies, xml ) Need Help??


in reply to Re^3: DBI::SQLite slowness
in thread DBI::SQLite slowness

Thanks for the tip; it will come in handy when I switch to other databases. Looks like DBI::SQLite keeps that all under the hood, though, with transactions implicitly invoked if autocommit is off. See https://metacpan.org/module/DBD::SQLite#Transactions


Comment on Re^4: DBI::SQLite slowness
Replies are listed 'Best First'.
Re^5: DBI::SQLite slowness
by erix (Vicar) on Sep 22, 2013 at 05:12 UTC

    That's actually the same behaviour as other DB's have.

    But only now do I see the initial thread/problem (Scaling Hash Limits). (It's useful to link to original threads in follow-up posts, you know). With the relatively small sizes involved, a database doesn't seem necessary.

    If the problem is that simple, can't you just run

    sort -u dupslist > no_dupslist

    on your id list? Perhaps not very interesting, or fast (took about 7 minutes in a 100M test run here), but about as simple as it gets.

    (BTW, just another datapoint (as I did the test already): PostgreSQL (9.4devel) loads about 9000 rows/s, on a slowish, low-end desktop. That's with the laborious INSERT-method that your script uses; bulk-loading (with COPY) loads ~ 1 million rows /second (excluding any de-duplication):

    perl -e 'for (1..50_000_000){ printf "%012d\n", $_; }' > t_data.txt; echo " drop table if exists t; create unlogged table t(klout integer); " | psql; echo "copy t from '/tmp/t_data.txt'; " | psql time < t_data.txt psql -c 'copy t from stdin' real 0m25.661s
    That's a rate of just under 2 million per second

    )

    UPDATE: added 'unlogged', adjusted timings (it makes the load twice as fast)

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1055143]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (11)
As of 2015-07-08 00:54 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (93 votes), past polls