PerlMonks  

DBD::Pg pg_putcopydata

by LiquidT (Initiate)
on Jun 30, 2010 at 07:07 UTC ( [id://847265]=perlquestion )

LiquidT has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I have very little experience with Perl, and I am also new to this forum. I would greatly appreciate any direct or indirect help with a problem I am currently facing.

My program has several arrays that I intend to load into a PostgreSQL database. Each array corresponds to one table in the database. I have used flat files as a transport medium in the past, but the scale of data in this project requires something more efficient. After some research I have settled on using COPY to move the data into the database.

So far I have found very few code examples that use the pg_putcopydata method, and those I have found I either do not understand or they do not use a data structure like mine (arrays).

Each element in the array @ArrayInMemory will be a row in the db table, with multiple columns delimited by /. This is what I imagine the code will look like:

# @ArrayInMemory is filled earlier in the program
$dbh->do("COPY mytable(col1, col2, col3) FROM STDIN WITH DELIMITER '/'");
foreach my $row (@ArrayInMemory) {
    $dbh->pg_putcopydata($row);
}
$dbh->pg_putcopyend();

Again, thanks for any input on this topic and I hope we can generate a few quality examples for other users who are unfamiliar with the pg_putcopydata database handle method.

Replies are listed 'Best First'.
Re: DBD::Pg pg_putcopydata
by stefbv (Curate) on Jun 30, 2010 at 07:57 UTC

    Here is a quick hack of a working example:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    my $dbname = 'testdb';
    my $server = 'localhost';
    my $user   = 'user';
    my $pass   = 'pass';

    my $dbh = DBI->connect(
        "DBI:Pg:dbname=" . $dbname . ";host=" . $server,
        $user, $pass
    );

    # The \n at the end of each row is required
    my @ArrayInMemory = (
        "1/value 12/value 13\n",
        "2/value 22/value 23\n",
        "3/value 32/value 33\n",
    );

    $dbh->do("COPY testcp (col1, col2, col3) FROM STDIN WITH DELIMITER '/'");
    foreach my $row (@ArrayInMemory) {
        # Alternative if the array elements don't contain \n
        # $row .= "\n";
        $dbh->pg_putcopydata($row);
    }
    $dbh->pg_putcopyend();

    Regards, Stefan

      Many Thanks, Stefan!

      In particular, thank you for pointing out the requirement for the newline character.

      One of the things I love about Perl is having many ways to do the same thing. It can be frustrating for the inexperienced, but it can also be very rewarding.

      If anyone has a working alternative to our foreach loop method, I would encourage you to share it. I love seeing how different programmers approach a simple problem in different ways.
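
One possible alternative to the explicit foreach block, as a minimal sketch: map appends the newline to a copy of each row, and a statement-modifier for hands the lines to a callback. The helper name send_lines and the collecting callback are assumptions made for illustration so that the sketch runs without a database; only pg_putcopydata and pg_putcopyend in the trailing comment come from the examples above.

```perl
use strict;
use warnings;

# Rows as in the question: one '/'-delimited string per row, no trailing newline.
my @ArrayInMemory = (
    '1/value 12/value 13',
    '2/value 22/value 23',
);

# map returns new strings with "\n" appended; @ArrayInMemory is left unchanged.
my @lines = map { $_ . "\n" } @ArrayInMemory;

# Hypothetical helper: hands each line to a callback and returns the row count.
sub send_lines {
    my ($put, @rows) = @_;
    $put->($_) for @rows;
    return scalar @rows;
}

# Stand-in callback that only collects the lines (no database needed here).
my @sent;
my $count = send_lines(sub { push @sent, $_[0] }, @lines);
print "queued $count rows\n";

# With a connected DBD::Pg handle in $dbh, the same helper could be used as:
#   $dbh->do("COPY testcp (col1, col2, col3) FROM STDIN WITH DELIMITER '/'");
#   send_lines(sub { $dbh->pg_putcopydata($_[0]) }, @lines);
#   $dbh->pg_putcopyend();
```

Unlike appending "\n" to $row inside the foreach loop, this leaves the original array elements untouched.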

Re: DBD::Pg pg_putcopydata
by james2vegas (Chaplain) on Jun 30, 2010 at 18:56 UTC
    You can always use Text::CSV to prepare your data for your COPY FROM (and use the CSV mode of COPY FROM).

    Something like:

    use Text::CSV;

    # Assumes $dbh is already connected, as in the previous example
    my @ArrayInMemory = (
        [1, 'value 12', 'value 13'],
        [2, 'value 22', 'value 23'],
        [3, 'value 32', 'value 33'],
    );

    my $csv = Text::CSV->new ({ binary => 1, eol => $/ });

    $dbh->do(q{COPY testcp (col1, col2, col3) FROM STDIN CSV});
    foreach my $row (@ArrayInMemory) {
        # combine() builds one CSV line (with eol appended) from the fields
        $csv->combine(@$row) or die 'Error '.$csv->error_input;
        $dbh->pg_putcopydata($csv->string);
    }
    $dbh->pg_putcopyend();

      James,

      This is a very interesting approach, and one I had not considered. I am hoping you might provide a little insight into the combine function. I am a little confused by the variable @$row in particular. $csv is a scalar, as is $row, but I believe $csv->combine is expecting an array. I would have guessed that some work would be needed to split the scalar into an array.

      This is what happens when a T-SQL SPROC junkie like me decides to break out of the safe bubble I have been in for years and learn some new tricks in an open environment!

      Also, in what situations would you recommend this method over the first example? I personally like the idea of processing it as CSV, because I built the arrays to mimic a delimited flat file; it was an easy way to express my intended end result.

      Thanks!

        Sure. Also be sure to check out perlreftut, perlref, perllol, perldata and perldsc for details on references and their use in data structures.

        @ArrayInMemory is what is called an AoA (array of arrays). In order to include an array inside another without it being converted into a single flat array, you make it an array reference (delimited by [ and ] instead of ( and )).

        In the foreach loop, $row gets assigned to an element of the @ArrayInMemory array. That element is an array reference, not the array (list of fields) that Text::CSV's combine method requires. Luckily we can 'dereference' an array reference back into an array by adding an @ to the beginning (@$row), indicating we are interested in the dereferenced array, not the array reference in $row.
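
The dereferencing described above can be seen in isolation in a small sketch. It uses only core Perl; the @summaries array is an assumption added for illustration and is not part of the thread's code.

```perl
use strict;
use warnings;

# An AoA: each element is an array reference built with [ ... ].
my @ArrayInMemory = (
    [1, 'value 12', 'value 13'],
    [2, 'value 22', 'value 23'],
);

# Each element is a scalar that holds a reference to an array.
print ref($ArrayInMemory[0]), "\n";    # prints "ARRAY"

my @summaries;
foreach my $row (@ArrayInMemory) {
    my @fields = @$row;            # dereference: copy the fields into a real array
    my $count  = scalar @{$row};   # @{$row} is the longer spelling of @$row
    my $first  = $row->[0];        # arrow syntax reaches a single element
    push @summaries, "$count fields, first is $first, all: @fields";
}
print "$_\n" for @summaries;
```

Passing @$row to a method such as combine hands it the individual fields as a list, which is why no splitting of a scalar is needed.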

        The reason I would choose this method is that it seems, to me, a natural fit: COPY FROM accepts CSV, and Text::CSV produces it. This is especially helpful if your data might contain characters that need to be escaped before being sent to COPY FROM; Text::CSV handles all the ugly details of that for you. The first solution assumes you already have a string for each row you are submitting to Pg, which I find difficult to imagine being the case. You are more likely to have a collection of fields (in an AoA or similar structure) that you need to combine. Instead of joining them yourself and worrying about escaping rules (which differ between Perl, Pg and CSV), use a well-tested and recommended module.
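
To illustrate what "joining them yourself" involves, here is a rough sketch of manual escaping for COPY's default text format (not CSV mode). The helper names copy_text_field and copy_text_row are hypothetical. The rules applied are the ones PostgreSQL documents for text format: backslash, newline, carriage return and the delimiter must be backslash-escaped, and \N is assumed to be the NULL marker (the default). Treat it as an illustration of the bookkeeping a module saves you, not as a replacement for one.

```perl
use strict;
use warnings;

# Hypothetical helper: escape one field for COPY text format.
sub copy_text_field {
    my ($value, $delimiter) = @_;
    return '\\N' unless defined $value;          # undef becomes the default NULL marker \N
    $value =~ s/\\/\\\\/g;                       # escape backslashes first
    $value =~ s/\n/\\n/g;                        # literal newline -> backslash, n
    $value =~ s/\r/\\r/g;                        # literal carriage return -> backslash, r
    $value =~ s/\Q$delimiter\E/\\$delimiter/g;   # backslash in front of the delimiter
    return $value;
}

# Hypothetical helper: build one complete COPY line from a list of fields.
sub copy_text_row {
    my ($delimiter, @fields) = @_;
    return join($delimiter, map { copy_text_field($_, $delimiter) } @fields) . "\n";
}

# A field containing the delimiter, one containing a newline, and a NULL.
print copy_text_row('/', 1, 'either/or', "two\nlines", undef);
```

A plain join('/', ...) on the same fields would produce a row that COPY reads as having too many columns, because the embedded / would count as a separator.
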
Re: DBD::Pg pg_putcopydata
by erix (Prior) on Jul 01, 2010 at 11:02 UTC

    Keep in mind that you can also directly pipe data to postgres' COPY via psql:

    # for instance, a program producing tab-delimited output:
    perl my_tsv.pl | psql -d dbname -c "copy mytable from stdin csv delimiter E'\t'"

    # or if the data is in a tsv file already:
    < data.tsv psql -d dbname -c "copy mytable from stdin csv delimiter E'\t'"
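
For completeness, a sketch of what a producer script such as my_tsv.pl might look like. The script's contents are not given in the thread, so the helper name tsv_csv_line and the sample rows are assumptions. Because the COPY command above runs in CSV mode with a tab delimiter, the sketch applies CSV quoting (double quotes around fields containing a tab, quote or line break, with embedded quotes doubled) and writes undef as an empty unquoted field, which CSV mode reads as NULL by default.

```perl
use strict;
use warnings;

# Hypothetical helper: build one tab-delimited, CSV-quoted line from a list of fields.
sub tsv_csv_line {
    my @fields = @_;
    my @out = map {
        my $f = defined $_ ? $_ : '';   # undef -> empty unquoted field (NULL in CSV mode)
        if ($f =~ /[\t"\r\n]/) {
            $f =~ s/"/""/g;             # double any embedded quotes
            $f = qq{"$f"};              # then wrap the whole field in quotes
        }
        $f;
    } @fields;
    return join("\t", @out) . "\n";
}

# Sample rows (made up for illustration).
my @rows = (
    [1, 'value 12', 'value 13'],
    [2, "has\ttab", 'say "hi"'],
);

# Write every row to STDOUT so the output can be piped into psql.
print tsv_csv_line(@$_) for @rows;
```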

Node Type: perlquestion [id://847265]
Approved by Corion