Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Re^3: SQLite Slow insert

by mmittiga17 (Scribe)
on Feb 18, 2013 at 17:43 UTC ( #1019378=note: print w/ replies, xml ) Need Help??


in reply to Re^2: SQLite Slow insert
in thread SQLite Slow insert

Thank you for your replay guys!!! I think the issue is trying to check if data exists first before inserting, so it seems I have to do one row of data at a time, unless there is a way to test if data row exists then how I was doing it. I am reading all of the suggested links everyone has passed along, This is my first crack on using a DB like this.

my $sth = $dbh->prepare("INSERT INTO HLDdata (Data) VALUES (?)"); foreach $txt (@TXT) { print "$txt\n"; open(IN,"$txt") || warn("cant open $txt"); while(<IN>){ $line = $_; my $sth = $dbh->prepare( "select count(*) from HLDdata where Data += '$line'"); $sth->execute(); my ($data) = $sth->fetchrow(); print "$data \n"; if ($data ne 0){ print "record exist not adding\n"; &Logit("record exist not adding"); }else{ my ($Data) = $line; $sth->execute(); $line =~s/\|/\t/g; # print "$line\n"; print OUT "$line"; } }


Comment on Re^3: SQLite Slow insert
Download Code
Replies are listed 'Best First'.
Re^4: SQLite Slow insert
by Anonymous Monk on Feb 19, 2013 at 10:50 UTC

    There were a few issues in your code. I fixed them below.

    my $sth = $dbh->prepare("INSERT INTO HLDdata (Data) VALUES (?)"); my $count = $dbh->prepare( "select count(*) from HLDdata where Data = +?"); foreach $txt (@TXT) { print "$txt\n"; open(IN,"$txt") || warn("cant open $txt"); while(<IN>){ $line = $_; $count->execute($line); my ($data) = $count->fetchrow(); print "$data \n"; if ($data != 0){ print "record exist not adding\n"; Logit("record exist not adding"); next; } my ($Data) = $line; $sth->execute($Data); $line =~s/\|/\t/g; # print "$line\n"; print OUT $line; } }

    Yes, you're probably correct that the checking is taking the most of the time. This is because you're missing an index on HLDdata (data) -- sqlite has to scan the whole table every time you check for a row's existence. You can create the missing index with CREATE UNIQUE INDEX HLDdata_uniq ON HLDdata (Data);

    There's a trade-off: you'll now use double the disk space. If $Data tends to be big (above a hundred bytes or so), you are better off storing a hash (sha1, md5, or similar) of $Data in the table in a second column and checking for its existence instead. (SQLite does not appear to have support for functional indices or a hashing function, so you'll have to do it on the Perl side.)

      Thank you for your help, it got me sorted. I appreciate the help and have learned from it. Cheers!

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1019378]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others examining the Monastery: (8)
As of 2015-07-31 03:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (274 votes), past polls