DBI strangeness

by jimbus (Friar)
on Oct 13, 2005 at 21:52 UTC ( #500058=perlquestion )
jimbus has asked for the wisdom of the Perl Monks concerning the following question:

hello

I'm having an issue with insertions/updates using DBI. With Blokhead's help (sorry for getting snippy, Blokhead, but I do try to look things up before I ask) I turned up tracing and I'm not understanding what I'm seeing.

This is the table:

+---------+--------------+------+-----+-------------------+
| Field   | Type         | Null | Key | Default           |
+---------+--------------+------+-----+-------------------+
| tstamp  | timestamp    | YES  | PRI | CURRENT_TIMESTAMP |
| url     | varchar(64)  |      | PRI |                   |
| average | double(17,1) | YES  |     | NULL              |
| maximum | bigint(20)   | YES  |     | NULL              |
| total   | double(17,0) | YES  |     | NULL              |
+---------+--------------+------+-----+-------------------+

Here is the code in question:

foreach $key (sort keys %urls) {
    my @tmpData = @{$urls{$key}};
    print "\nDATA: @tmpData\n";
    eval {
        $i2_stmt->execute($tmpData[0], $tmpData[1], $tmpData[2],
                          $tmpData[3], $tmpData[4]);
    };
    if ($@) {
        $u2_stmt->execute($tmpData[2], $tmpData[4], $tmpData[4],
                          $tmpData[3], $tmpData[4], $tmpData[0],
                          $tmpData[1]);
    }
}

The general idea is to insert the incoming data, but if a row already exists, update it instead. For test data I'm loading the same data twice, so in the trace output I should be seeing a nice flow of insert, fail, update, insert, fail, update... but instead I'm seeing insert, fail, update, insert, insert, insert, fail, update, fail, exit.
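The prepare calls for $i2_stmt and $u2_stmt aren't shown above; reconstructed from the SQL that appears in the trace below (with the connection details assumed, since the post doesn't include them), they would look something like this:

    use DBI;

    # Hypothetical connection -- not in the original post. RaiseError
    # must be on for the duplicate-key insert to die inside the eval.
    my $dbh = DBI->connect('dbi:mysql:dbname', $user, $pass,
                           { RaiseError => 1 });

    # SQL reconstructed from the "Binding parameters" lines in the trace.
    my $i2_stmt = $dbh->prepare(q{
        insert tmp_urls (tstamp, url, average, maximum, total)
        values (?, ?, ?, ?, ?)
    });
    my $u2_stmt = $dbh->prepare(q{
        update tmp_urls
           set average = round((average * total + ? * ?) / (total + ?), 2),
               maximum = greatest(maximum, ?),
               total   = total + ?
         where tstamp = ? and url = ?
    });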

DATA: 05-09-30 10:00 cnet.com 1.0000 1 2
    -> execute for DBD::mysql::st (DBI::st=HASH(0x838fd60)~0x804d26c '05-09-30 10:00' 'cnet.com' '1.0000' '1' '2')
    -> dbd_st_execute for 0815d350
  Binding parameters: insert tmp_urls (tstamp,url,average,maximum,total) values ('05-09-30 10:00','cnet.com','1.0000','1','2')
Duplicate entry '2005-09-30 10:00:00-cnet.com' for key 1 error 1062 recorded: Duplicate entry '2005-09-30 10:00:00-cnet.com' for key 1
    <- dbd_st_execute -2 rows
    !! ERROR: 1062 'Duplicate entry '2005-09-30 10:00:00-cnet.com' for key 1' (err#0)
    <- execute= undef at coalData.pl line 68 via ./coalData.pl line 67
    -> execute for DBD::mysql::st (DBI::st=HASH(0x83c3110)~0x838fe20 '1.0000' '2' '2' '1' '2' '05-09-30 10:00' 'cnet.com')
    -> dbd_st_execute for 0838fe14
  Binding parameters: update tmp_urls set average = round((average*total+'1.0000'*'2')/(total+'2'),2), maximum = greatest(maximum,'1'), total = total + '2' where tstamp = '05-09-30 10:00' and url = 'cnet.com'
    <- dbd_st_execute 1 rows
    <- execute= 1 at coalData.pl line 70

DATA: 05-09-30 12:00 cnet.com 1.0000 1 3
    -> execute for DBD::mysql::st (DBI::st=HASH(0x838fd60)~0x804d26c '05-09-30 12:00' 'cnet.com' '1.0000' '1' '3')
    -> dbd_st_execute for 0815d350
  Binding parameters: insert tmp_urls (tstamp,url,average,maximum,total) values ('05-09-30 12:00','cnet.com','1.0000','1','3')
    <- dbd_st_execute 1 rows
    <- execute= 1 at coalData.pl line 68 via ./coalData.pl line 67

DATA: 05-09-30 13:00 cnet.com 1.2000 2 6
    -> execute for DBD::mysql::st (DBI::st=HASH(0x838fd60)~0x804d26c '05-09-30 13:00' 'cnet.com' '1.2000' '2' '6')
    -> dbd_st_execute for 0815d350
  Binding parameters: insert tmp_urls (tstamp,url,average,maximum,total) values ('05-09-30 13:00','cnet.com','1.2000','2','6')
Duplicate entry '2005-09-30 13:00:00-cnet.com' for key 1 error 1062 recorded: Duplicate entry '2005-09-30 13:00:00-cnet.com' for key 1
    <- dbd_st_execute -2 rows
    !! ERROR: 1062 'Duplicate entry '2005-09-30 13:00:00-cnet.com' for key 1' (err#0)
    <- execute= undef at coalData.pl line 68 via ./coalData.pl line 67
    -> execute for DBD::mysql::st (DBI::st=HASH(0x83c3110)~0x838fe20 '1.2000' '6' '6' '2' '6' '05-09-30 13:00' 'cnet.com')
    -> dbd_st_execute for 0838fe14
  Binding parameters: update tmp_urls set average = round((average*total+'1.2000'*'6')/(total+'6'),2), maximum = greatest(maximum,'2'), total = total + '6' where tstamp = '05-09-30 13:00' and url = 'cnet.com'
Duplicate entry '2005-10-13 15:58:42-cnet.com' for key 1 error 1062 recorded: Duplicate entry '2005-10-13 15:58:42-cnet.com' for key 1
    <- dbd_st_execute -2 rows
    !! ERROR: 1062 'Duplicate entry '2005-10-13 15:58:42-cnet.com' for key 1' (err#0)
    <- execute= undef at coalData.pl line 70
DBD::mysql::st execute failed: Duplicate entry '2005-10-13 15:58:42-cnet.com' for key 1 at ./coalData.pl line 72.
    -> DESTROY for DBD::mysql::st (DBI::st=HASH(0x838fe20)~INNER)
       ERROR: 1062 'Duplicate entry '2005-10-13 15:58:42-cnet.com' for key 1' (err#0)
    <- DESTROY= undef
    -> DESTROY for DBD::mysql::st (DBI::st=HASH(0x804d26c)~INNER)
       ERROR: 1062 'Duplicate entry '2005-10-13 15:58:42-cnet.com' for key 1' (err#0)
    <- DESTROY= undef
    -> DESTROY for DBD::mysql::st (DBI::st=HASH(0x804d254)~INNER)
       ERROR: 1062 'Duplicate entry '2005-10-13 15:58:42-cnet.com' for key 1' (err#0)
    <- DESTROY= undef
    -> DESTROY for DBD::mysql::db (DBI::db=HASH(0x838fb20)~INNER)
       &imp_dbh->mysql: 806405c
       ERROR: 1062 'Duplicate entry '2005-10-13 15:58:42-cnet.com' for key 1' (err#0)
    <- DESTROY= undef

The DATA lines are a print of the input data. I'm also seeing fresh rows being inserted with the current time... which might indicate I'm getting null timestamps... but I don't see that in the DATA lines or in the bind values for the update.

thanks

Jimbus

Never moon a werewolf!

Re: DBI strangeness
by pg (Canon) on Oct 14, 2005 at 01:46 UTC

    Quite possibly it is because you are using a timestamp as part of the primary key.

    Don't do this, NEVER do this! I have dealt with this kind of database design before. When your program executes fast enough, the timestamps of two operations will be the same and cause a failure (duplicate key); when it is not that fast, you get two different timestamps and your program will work. Unpredictable is the word, and you don't want it.

      Yes, in 99% of cases primary keys can be numbers generated from a sequence (which guarantees their uniqueness). Exceptionally, you can use characters or some composite types.
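      In MySQL, the usual way to get such a sequence is an AUTO_INCREMENT surrogate key. A minimal sketch (column layout borrowed from the table above; the id column and index name are assumptions, not from the thread):

          -- Sketch: row identity no longer depends on the timestamp; a unique
          -- key on (tstamp, url) still lets insert-else-update detect dupes.
          create table tmp_urls (
              id      bigint not null auto_increment primary key,
              tstamp  datetime,
              url     varchar(64),
              average double(17,1),
              maximum bigint(20),
              total   double(17,0),
              unique key tstamp_url (tstamp, url)
          );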
      Good theme for a little flame, but not perl-related :)

        You don't think time is numeric, sequential and unique when managed in a discrete manner?

        That's crazy talk

        Jimbus

        Never moon a werewolf!

      Never say NEVER

      I probably should have detailed it better, but posting here is often a balancing act between providing enough information to get relevant answers and providing so much that a million and six tangents question every aspect of the code except the question asked. On the other hand, if you had read as far as the code or the traces, you would have seen that your issue was anticipated and that there was more involved.

      The application is not real time; it is processing timestamped log data. The most important information coming from the file is urls/second. I'm deriving urls/second both overall and by url. I build hashes with the per-second counts keyed on epoch seconds and dump the data out to a database. Collisions can only occur if there is a time overlap between incoming logs. The insert is inside an eval block so that if it fails, the original and new records are summed with an update. This is logical, and it has been tested and works.
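      Worked through with the first row in the trace above (the same data merged into itself: average 1.0, maximum 1, total 2), the update's running-average arithmetic comes out as:

          new_average = round((average*total + new_avg*new_total)
                              / (total + new_total), 2)
                      = round((1.0*2 + 1.0*2) / (2 + 2), 2) = 1.00
          new_maximum = greatest(maximum, new_max)  = greatest(1, 1) = 1
          new_total   = total + new_total           = 2 + 2 = 4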

      In the end, the app's users only see data on an hourly basis, so keeping the data at a per-second granularity was expensive in both disk space and query time. In a fit of laziness, I decided to leave the original script alone, run the coalescing query in a new script, and create a new database based on the hourly averages, peaks and totals. I leveraged the eval-insert-else-update logic from the previous script, and it works for the overall table, which has a single-component key, the timestamp.

      One difference between the new and old scripts is the database they point at. The original one uses datetime for the tstamp field, but the new one uses timestamp. Are you implying that there is an issue with the timestamp data type that makes it unstable? I did notice, when I made it a primary key, that its default is the current time, but if you look at the data I print out and at the trace's bind statements, I'm never putting in a null tstamp.

      I googled tstamp and didn't find much. MySQL's reference page pretty much ignores it: it says it is like datetime and that it will cover the differences, but when you follow the link to timestamp properties as of 4.1, it mostly talks about 5.0. I'll alter the table so that tstamp is a datetime and test that, but I still don't see what's wrong with what I am doing.
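      (For the record, the alteration described would be something along these lines; the table name is taken from the trace, and MySQL will insist on NOT NULL for a primary-key column:)

          -- Sketch: datetime has no automatic set-to-current-time behavior.
          alter table tmp_urls modify tstamp datetime not null;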

      Jimbus

      Never moon a werewolf!
        The UPDATE statement can set a timestamp field to the current time automatically; maybe this is what's causing the problem.
        poj
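        A minimal demonstration of that behavior (table and column names hypothetical; in MySQL of this era the rule applies to the first TIMESTAMP column of a table):

            -- The first TIMESTAMP column is implicitly
            -- DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, so any
            -- UPDATE that does not assign it rewrites it to the current time.
            create table t (ts timestamp, n int);
            insert into t (ts, n) values ('2005-09-30 10:00:00', 1);
            update t set n = n + 1;           -- ts silently becomes now()
            update t set n = n + 1, ts = ts;  -- explicit assignment keeps ts

        Under that rule the first successful update would silently restamp the row with the current time, which matches the '2005-10-13 15:58:42' keys in the trace.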
Re: DBI strangeness
by jimbus (Friar) on Oct 14, 2005 at 18:15 UTC

    Changing the column to datetime fixed the problem. There's one mythical man-day I'll never get back. Now, after acknowledging my laziness in my previous comment, I feel obligated to revisit this and do it all in one step... that should finish off the rest of this day.

    Jimbus

    Never moon a werewolf!
