PerlMonks  

Re^4: DBI::SQLite slowness

by Endless (Beadle)
on Sep 21, 2013 at 18:20 UTC


in reply to Re^3: DBI::SQLite slowness
in thread DBI::SQLite slowness

Thanks for the tip; it will come in handy when I switch to other databases. It looks like DBD::SQLite keeps all of that under the hood, though, with transactions begun implicitly when AutoCommit is off. See https://metacpan.org/module/DBD::SQLite#Transactions

Re^5: DBI::SQLite slowness
by erix (Prior) on Sep 22, 2013 at 05:12 UTC

    That's actually the same behaviour as other DBs have.

    But only now do I see the initial thread/problem (Scaling Hash Limits). (It's useful to link to the original thread in follow-up posts, you know.) With the relatively small sizes involved, a database doesn't seem necessary.

    If the problem is that simple, can't you just run

    sort -u dupslist > no_dupslist

    on your id list? Perhaps not very interesting, or fast (took about 7 minutes in a 100M test run here), but about as simple as it gets.
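    A minimal sketch of what that command does, run on a tiny made-up id list. The sample ids and the temporary directory are assumptions for illustration only; LC_ALL=C is an optional addition that usually speeds up sort by comparing raw bytes instead of using locale collation.

```shell
# Hypothetical sample data: five ids, two of them duplicated.
workdir=$(mktemp -d)
printf '%s\n' 42 7 42 19 7 > "$workdir/dupslist"

# sort -u sorts the lines and keeps a single copy of each distinct line.
# LC_ALL=C (an assumption, not in the original command) selects byte-wise comparison.
LC_ALL=C sort -u "$workdir/dupslist" > "$workdir/no_dupslist"

# Count lines before and after to see how many duplicates were dropped.
total=$(wc -l < "$workdir/dupslist" | tr -d ' ')
unique=$(wc -l < "$workdir/no_dupslist" | tr -d ' ')
echo "total=$total unique=$unique duplicates=$((total - unique))"
```

    On this sample it prints total=5 unique=3 duplicates=2. Note that the output is ordered as text, not numerically (19, 42, 7), which doesn't matter if all you want is the de-duplicated set.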

    (BTW, just another data point, since I already ran the test: PostgreSQL (9.4devel) loads about 9000 rows/s on a slowish, low-end desktop. That's with the laborious INSERT method that your script uses; bulk-loading with COPY loads ~1 million rows/second (excluding any de-duplication):

    perl -e 'for (1..50_000_000){ printf "%012d\n", $_; }' > t_data.txt;
    echo " drop table if exists t; create unlogged table t(klout integer); " | psql;
    echo "copy t from '/tmp/t_data.txt'; " | psql

    time < t_data.txt psql -c 'copy t from stdin'
    real 0m25.661s

    That's a rate of just under 2 million per second.

    )
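    For anyone checking the arithmetic, here is a small sketch that derives the rate from the two numbers quoted above (50,000,000 rows and 25.661 s). The variable names are made up for illustration, and awk is used only because plain shell arithmetic is integer-only.

```shell
rows=50000000     # row count produced by the perl generator above
seconds=25.661    # "real" time reported for the COPY run

# Rows per second, expressed in millions and rounded to two decimals.
rate=$(awk -v r="$rows" -v s="$seconds" 'BEGIN { printf "%.2f", r / s / 1000000 }')
echo "$rate million rows/s"
```

    That prints 1.95 million rows/s, which matches "just under 2 million per second".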

    UPDATE: added 'unlogged', adjusted timings (it makes the load twice as fast)
