Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid

DBI: select distinct returns same values

by tukusejssirs (Beadle)
on Sep 21, 2019 at 10:45 UTC ( #11106471=perlquestion: print w/replies, xml ) Need Help??

tukusejssirs has asked for the wisdom of the Perl Monks concerning the following question:

I have strange problem.

I have Pg database table with 20,000+ rows. The only important column from this table is the timestamp column called date.

I need to get all unique dates from the table, so I run the following code in the terminal:

mydb=# select distinct cast(date as date) from myschema.mytable; -[ RECORD 1 ]----------- date | 2019-09-18 -[ RECORD 2 ]----------- date | 2019-09-20 -[ RECORD 3 ]----------- date | 2019-09-19

which is correct. However, when I run the following code, it output three times the same date:

$sth_dates_uniq = $dbh->prepare("select distinct cast(date as date) fr +om myschema.mytable;") or die; $sth_dates_uniq->execute(); my @all_rows; while ( my $ref = $sth_dates_uniq->fetchrow_arrayref()) { push(@all_rows, $ref); } foreach my $row_ref (@all_rows) { print "@{$row_ref}\n"; } # Output 2019-09-19 2019-09-19 2019-09-19

Do I do something wrong or is this a bug? What else could I do? I could get all the data from the table, sort them and get all unique values, however, the table is quite big to do so. Id like to have it as fast as possible.

Replies are listed 'Best First'.
Re: DBI: select distinct returns same values
by FreeBeerReekingMonk (Deacon) on Sep 21, 2019 at 11:08 UTC
    The "date" is probably a timedate, and eventhough you see it as the only-the-date, the time makes it unique. Try to put also a groupby clause:
    select distinct cast(DATE as date) from myschema.mytable GROUP BY DATE


    Edit: somewhat tested on

    create table foo (IDENTITY INT, LoginTime DATETIME); INSERT INTO foo VALUES( 1, '2019-09-29 17:00:48.000' ); INSERT INTO foo VALUES( 2, '2019-09-29 17:29:02.000' ); INSERT INTO foo VALUES( 3, '2019-09-30 10:22:46.000' ); INSERT INTO foo VALUES( 4, '2019-09-30 11:16:23.000' ); --Select unique SELECT distinct cast(LoginTime as date) FROM foo GROUP BY LoginTime;

    returns: 2019 (just the year, strange)

    EDIT 2: Ok, tried sqlite from Perl and SQLite does not have DATE... and returns the same "2019" as the online version. But it DOES have strftime and that made it work

    #!/usr/bin/perl use strict; use warnings; use DBI; my $dbfile = "sample.db"; my $dsn = "dbi:SQLite:dbname=$dbfile"; my $user = "x"; my $password = "x"; my $dbh = DBI->connect($dsn, $user, $password, { PrintError => 0, RaiseError => 1, AutoCommit => 1, FetchHashKeyName => 'NAME_lc', }); if(-z $dbfile){ my $sql = <<'END_SQL'; CREATE TABLE foo ( id INTEGER PRIMARY KEY, LoginTime DATE ) END_SQL $dbh->do($sql); my @DATES = ('2019-09-29 17:00:48.000', '2019-09-29 17:29:02.000', ' +2019-09-30 10:22:46.000', '2019-09-30 11:16:23.000'); for(my $i=0;$i<=$#DATES;$i++){ $dbh->do('INSERT INTO foo (LoginTime) VALUES (?)', undef,$DATES[$i]); } } # This is what I thought would work #my $sql = 'select distinct cast(LoginTime as date) from foo GROUP BY +LoginTime'; # this actually works in SQLite (strftime) my $sql = 'select distinct strftime("%d-%m-%Y",LoginTime) from foo'; my $sth = $dbh->prepare($sql); $sth->execute(); while (my @row = $sth->fetchrow_array) { print "[@row]\n"; } $dbh->disconnect;


    [29-09-2019] [30-09-2019]

    Ok, after reading @Erix's comment, he's right. And Anonymous Monk (too bad for the points) pointed the error out, and it also occurs with SQLite. Data::Dumper shows the following after injecting 29th then 30th:

    $VAR1 = [ [ '30-09-2019' ], $VAR1->[0] ];

    So all old values become a reference to the first entry of itself. So if you inject n fetchrow_arrayref() then you get n times the last pushed entry.
    So to copy an arrayref: [@$ref] to get an AoA in @all_rows.

    $sth->execute(); my @all_rows; while ( my $ref = $sth->fetchrow_arrayref()) { # Needs to be de-referenced (Anonymous monk's fix) push(@all_rows, [@{$ref}]); } foreach my $row_ref (@all_rows) { print "$row_ref->[0]\n"; }

    And my first example worked because I used fetchrow_array() instead of fetchrow_arrayref() and went around the referencing problem (but references are faster in general, not by much, so keep using fetchrow_arrayref)

      select distinct cast(DATE as date) from myschema.mytable GROUP BY DATE

      Sorry, but that GROUP BY really doesn't make sense. No point in using the GROUP BY clause here.

      Thank you for your answer!

      You presumed right that date contains datetime. However, I have no idea why do you think I use sqliteI said I have a Pg (i.e. PostgreSQL) database table. :)

      The thing is that Pg does not support strftime() function.

      So, according to your answer I believe that DBI does not get the selected values (i.e. the output from select distinct cast(...)) or it does not understand/use the cast() function, instead it selects the timestamp value (the actual value) instead of the casted one. Is that right? Is this expected? I believe it is not. And if it is so, I would say it is a bug. What do you think?

Re: DBI: select distinct returns same values
by erix (Parson) on Sep 21, 2019 at 13:09 UTC

    The only important column from this table is the datestamp column called date.

    Data type 'datestamp' does not exist in PostgreSQL. What's the data type of column 'record_date'? (try \d itens.lbd_w300 in psql)

      By datestamp I meant timestamp. ;)

      And I tried to use general names for db/schema/table, but have missed some of them. I have just modified the OP to make it more consistent.

      To answer your question: the date/record_date column is of timestamp type.

        I can't get your results. In my hands, I get the distinct values, as expected, also with that little chunk of perl code of yours. If I remove the 'DISTINCT' I get the doubles (I made a 4 row table).

        So you must be overlooking some detail. Maybe you can write up a small but complete) code example (including data) with data (!) that can be run against a postgres database (remove/change the user/password, of course). I expect that while you prepare that, you will see what went wrong.

        (by the way: the Pg don't-do-this page advises against this data type )

        edit: I was mistaken; I probably futzed my own try-out code, but I see you got good answers already so I'll leave it at that.

        I have just modified the OP ...

        Please see How do I change/delete my post? for site etiquette and protocol regarding such changes. Bottom line: Do Not Destroy Context!

        Give a man a fish:  <%-{-{-{-<

Re: DBI: select distinct returns same values
by Anonymous Monk on Sep 21, 2019 at 14:19 UTC

    You might try replacing push( @all_rows, $ref ); with push( @all_rows, [ @$ref ] );.

    The DBI docs for fetchrow_arrayref() say "Note that the same array reference is returned for each fetch, so don't store the reference and then use it after a later fetch.".

    The [ @$ref ] effectively does a shallow clone of $ref, so you should get distinct values again. Of course, if it were me I would push $ref->[0], and then in the following foreach loop just print "$row_ref\n";.

      This is correct: your results-array contains three references to the same result record. That is why you see three copies of the last row. (The SQL query is presented as literal text and you have already showed us what the result is.) You must make a deep copy of each row that is returned ... or, simply skip the step of putting them into a separate array entirely and work directly from the first loop.

        AnomalousMonk: Sorry I have done againt the site rules. Ill mend my ways, I promiss! :)

        Anonymous Monk: You are right about the push(@all_rows, [@$ref]). It really fixed the output. Thanks!

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://11106471]
Approved by haukex
Front-paged by haukex
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chilling in the Monastery: (6)
As of 2020-09-19 09:58 GMT
Find Nodes?
    Voting Booth?
    If at first I dont succeed, I

    Results (114 votes). Check out past polls.