http://www.perlmonks.org?node_id=1055127


in reply to Re^2: DBI::SQLite slowness
in thread DBI::SQLite slowness

If I simply have "$dbh->commit" following the For loop, is that effectively the same as explicitly using transactions?

No; BEGIN WORK (or BEGIN TRANSACTION, or just plain BEGIN) starts a transaction in SQL. Then, normally after one or more INSERT/DELETE/UPDATE statements, COMMIT commits the transaction (or, when an error occurs, ROLLBACK rolls the transaction back to the state just before the BEGIN).

See DBI's begin_work method.

(I don't use SQLite; in the above I am assuming SQLite handles this the same way other RDBMSs do.)
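
For concreteness, here is a minimal sketch of that pattern with DBI; the database file test.db and the table ids are placeholders for this example, not taken from your script:

    use strict;
    use warnings;
    use DBI;

    # placeholder database file and table, just for illustration
    my $dbh = DBI->connect( "dbi:SQLite:dbname=test.db", "", "",
                            { RaiseError => 1, AutoCommit => 1 } );
    my $sth = $dbh->prepare("INSERT INTO ids (id) VALUES (?)");

    $dbh->begin_work;            # AutoCommit off until commit/rollback
    eval {
        $sth->execute($_) for 1 .. 100_000;
        $dbh->commit;            # one transaction around all the inserts
    };
    if ($@) {
        warn "transaction failed, rolling back: $@";
        $dbh->rollback;          # back to the state just before begin_work
    }

With AutoCommit left on and no begin_work, each execute is its own transaction, which is exactly what makes row-by-row inserts so slow.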

Re^4: DBI::SQLite slowness
by Endless (Beadle) on Sep 21, 2013 at 18:20 UTC

      That's actually the same behaviour as other DBs have.

      But only now do I see the initial thread/problem (Scaling Hash Limits). (It's useful to link to the original thread in follow-up posts, you know.) With the relatively small sizes involved, a database doesn't seem necessary.

      If the problem is that simple, can't you just run

      sort -u dupslist > no_dupslist

      on your id list? Perhaps not very interesting, or fast (it took about 7 minutes in a 100M-row test run here), but it's about as simple as it gets.
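
      (If you need to keep the original order, the classic Perl idiom does the same de-duplication in one pass, at the cost of holding every unique id in memory, which was of course the original problem; dupslist/no_dupslist as above:)

      perl -ne 'print unless $seen{$_}++' dupslist > no_dupslist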

      (BTW, just another datapoint (as I did the test already): PostgreSQL (9.4devel) loads about 9000 rows/s on a slowish, low-end desktop. That's with the laborious INSERT method that your script uses; bulk-loading with COPY loads just under 2 million rows/second (excluding any de-duplication):

      # generate 50M test ids
      perl -e 'for (1..50_000_000){ printf "%012d\n", $_; }' > t_data.txt
      # (re)create the target table
      echo "drop table if exists t; create unlogged table t(klout integer);" | psql
      # server-side copy:  echo "copy t from '/tmp/t_data.txt';" | psql
      # or, timed, client-side copy via stdin:
      time < t_data.txt psql -c 'copy t from stdin'

      real    0m25.661s
      That's a rate of just under 2 million rows per second.

      )

      UPDATE: added 'unlogged', adjusted timings (it makes the load twice as fast)
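
      (For the curious: the same COPY path can also be driven from Perl via DBD::Pg's pg_putcopydata/pg_putcopyend. A minimal sketch, assuming the table t created above and a local database named test:)

      use strict;
      use warnings;
      use DBI;

      # assumed connection values; adjust dbname/user as needed
      my $dbh = DBI->connect( "dbi:Pg:dbname=test", "", "",
                              { RaiseError => 1, AutoCommit => 1 } );

      $dbh->do("COPY t (klout) FROM STDIN");   # put the connection in COPY mode
      open my $fh, '<', 't_data.txt' or die "t_data.txt: $!";
      while ( my $line = <$fh> ) {
          $dbh->pg_putcopydata($line);         # stream each row to the server
      }
      $dbh->pg_putcopyend;                     # finish the COPY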