http://www.perlmonks.org?node_id=657938

GertMT has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I have a small script that selects data from a CSV text file (currently 4000 lines). Everything works smoothly, except that the first line of the output is a row of headers describing what each column contains.

My attempt to get rid of this first line is shown in the script below. It doesn't work as expected: the file 'blah_no_first_line.csv' gets created before 'blah.csv' has been completely written, so 'blah_no_first_line.csv' lacks not only the first line but also several lines at the end.

Is someone willing to share an idea that will avoid this?

Thanks for any reply,

Gert
#!/usr/bin/perl -w
use strict;
use diagnostics;
use Data::Table;
use DBI;

open( OUT, ">blah.csv" ) or die "cant open out $!";
open( OUT_no_fl, ">blah_no_first_line.csv" ) or die "cant open out $!";

my $dbh = DBI->connect( 'dbi:AnyData(RaiseError=>1):' or die $DBI::errstr );
$dbh->func(
    'test_me', 'CSV', 'big_data_file.csv',
    {
        sep_char  => ',',
        eol       => "\015",
        col_names => 'ID,Seizoen,Brand,Model,Color,Size,Material,Stock'
    },
    'ad_catalog'
);

my $select_from_data = Data::Table::fromSQL( $dbh,
    "SELECT ID, Size, Brand, Color FROM test_me WHERE Stock>'0'" );
$dbh->disconnect();

$select_from_data = $select_from_data->match_pattern(
    '$_->[1] =~ /^\d+$/ && $_->[1] >= 18 && $_->[1] <= 41');

print OUT $select_from_data->csv;

open( FH, "blah.csv" ) or die "$!\n";
readline(FH);    #<-- read the first line effectively skipping it
while (<FH>) {
    print OUT_no_fl $_;
}

Replies are listed 'Best First'.
Re: Get output AnyData script without header or data_row
by jZed (Prior) on Dec 19, 2007 at 18:30 UTC
    DBD::AnyData can handle column names in two ways. By default it looks for them in the first record of the file and treats that row as a header, not as a data row. However, if you set the column names explicitly (as you do in the script above), it assumes that the first row is data, not a header. So to get it to work as you want, just take out the col_names specification in your ad_catalog() call.
      Okay, I'm going to 'close' the file, but obviously I'll also take advantage of this explanation.

      Thanks, great module.
        I think what you'll want to do is have two ad_catalog() calls: the first, to read the existing files with headers, should not specify col_names; the second, to write a new file without headers, should specify col_names. That way you do it all with the DBD and don't have to do any explicit file opening, writing, or closing.
Re: Get output AnyData script without header or data_row
by jrsimmon (Hermit) on Dec 19, 2007 at 18:20 UTC
    Two things jump out at me immediately: First, you may want to turn off buffering.
    open( OUT, ">blah.csv" ) or die "cant open out $!";
    select OUT;
    $|++;
    Second, you should close your first filehandle (OUT) to blah.csv before opening a new handle (FH) to it. This is really just good practice given the way you are using the file.
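
    A minimal sketch of the close-then-reopen approach, using only core Perl. The $csv_text string is a hypothetical stand-in for the output of $select_from_data->csv; the lexical filehandle names are likewise only illustrative. The file names are the ones from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the text returned by $select_from_data->csv.
my $csv_text = "ID,Size,Brand,Color\n1,38,Acme,red\n2,40,Acme,blue\n";

# Write the full output, then close the handle so that all buffered
# data is written out before anything tries to read the file.
open( my $out, '>', 'blah.csv' ) or die "can't open blah.csv: $!";
print {$out} $csv_text;
close($out) or die "can't close blah.csv: $!";

# Only now reopen the file for reading and copy everything except line one.
open( my $in, '<', 'blah.csv' ) or die "can't open blah.csv: $!";
open( my $out_no_fl, '>', 'blah_no_first_line.csv' )
    or die "can't open blah_no_first_line.csv: $!";

my $header = <$in>;    # read the header line and discard it
while ( my $line = <$in> ) {
    print {$out_no_fl} $line;
}

close($in);
close($out_no_fl) or die "can't close blah_no_first_line.csv: $!";
```

    Because the write handle is closed before the read handle is opened, the second file should no longer be missing lines at the end.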

    Another option would be to reset the file pointer to the beginning of the file and just keep using the same filehandle.
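
    The same-filehandle idea might look roughly like this. It is a sketch under a few assumptions: the file is opened in read/write mode ('+>'), $csv_text is a hypothetical stand-in for the output of $select_from_data->csv, and the explicit flush is there for clarity before rewinding.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;    # provides the flush method on filehandles

# Hypothetical stand-in for the text returned by $select_from_data->csv.
my $csv_text = "ID,Size,Brand,Color\n1,38,Acme,red\n2,40,Acme,blue\n";

# '+>' opens the file for both reading and writing, truncating it first.
open( my $fh, '+>', 'blah.csv' ) or die "can't open blah.csv: $!";
print {$fh} $csv_text;

# Push buffered output to the file, then move back to the beginning.
$fh->flush;
seek( $fh, 0, 0 ) or die "can't seek in blah.csv: $!";

open( my $out_no_fl, '>', 'blah_no_first_line.csv' )
    or die "can't open blah_no_first_line.csv: $!";

my $header = <$fh>;    # read the header line and discard it
while ( my $line = <$fh> ) {
    print {$out_no_fl} $line;
}

close($fh);
close($out_no_fl) or die "can't close blah_no_first_line.csv: $!";
```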
      Alright, thanks, that works. I never really paid much attention to closing files, but I will from now on!