http://www.perlmonks.org?node_id=1059485


in reply to Re^2: Data validation and blank spaces in tab formatted csv file
in thread Data validation and blank spaces in tab formatted csv file

There are several approaches. You can have the parser strip those spaces, which results in empty fields:

    my $csv = Text::CSV_XS->new ({
        binary           => 1,
        sep_char         => "\t",
        auto_diag        => 1,
        allow_whitespace => 1,
    });

But as you are dealing with tab-separated data, I would personally choose to do it inside the loop:

    while (my $row = $csv->getline ($fh)) {
        # Check if the 5th field contains data
        if ($row->[4] =~ m/\S/) {
            # more than just whitespace
            $csv_o->print ($fhv, $row);
        }
        else {
            # sorry, this is not filled: invalid
            $csv_o->print ($fhi, $row);
        }
    }

Enjoy, Have FUN! H.Merijn

Re^4: Data validation and blank spaces in tab formatted csv file
by Ma (Novice) on Oct 24, 2013 at 15:34 UTC
    Thanks, one issue is still there. If the field is empty (no white space at all), then it is valid because the value was not provided, and the record should go to the valid file. But if the field contains only blank spaces, then it is invalid. I just want to remove all white space from the field and make it null before doing the comparison. I have tried this:
    chomp($pending_move_out_dt); $pending_move_out_dt =~ s/\s//g;
    But the record is still going to the invalid file, as the field $pending_move_out_dt has no data but still contains white space. It should contain nothing where there is no data (just a null value).
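The rule described here (a truly empty field is acceptable, a whitespace-only field is not) can be tested directly, without stripping the field first. A minimal sketch in core Perl, assuming a hypothetical helper named classify_field; the helper name and the three labels are illustrative and do not come from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper: classify a single field value.
#   "empty" : undef or a zero-length string (value not provided)
#   "blank" : one or more characters, all of them whitespace
#   "data"  : at least one non-whitespace character
sub classify_field {
    my ($value) = @_;
    return "empty" if !defined $value || $value eq "";
    return "blank" if $value !~ /\S/;
    return "data";
}

# Show the classification for a few sample values.
for my $field (undef, "", "   ", "2013-10-24") {
    my $shown = defined $field ? "'$field'" : "undef";
    print "$shown => ", classify_field($field), "\n";
}
```

With such a helper, only the "blank" case would be routed to the invalid file; whether "empty" and "data" should both count as valid is an assumption based on the description above.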
Re^4: Data validation and blank spaces in tab formatted csv file
by Ma (Novice) on Oct 24, 2013 at 15:54 UTC
    Thanks. Does Perl distinguish between white space and null? For example, this field contains no data but white space:
    chomp($pending_move_out_dt); $pending_move_out_dt =~ s/\s//g;
    I want to remove all white space from that field so that the field appears as null (undefined) if there is no data. Is there any way to do this?
      $variable = undef unless $variable =~ /\S/;
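One caveat with the one-liner above: if the variable is already undef, the pattern match itself emits an "uninitialized value" warning under `use warnings`. A small sketch that guards against that, assuming a hypothetical helper named blank_to_undef (not part of Text::CSV_XS or the thread):

```perl
use strict;
use warnings;

# Hypothetical helper: return the value unchanged if it contains at least
# one non-whitespace character, otherwise return undef. The defined check
# runs first, so an undef input never reaches the pattern match.
sub blank_to_undef {
    my ($value) = @_;
    return (defined $value && $value =~ /\S/) ? $value : undef;
}

# Normalize a sample row; "<undef>" is printed for fields that became undef.
my @row   = ("a", "   ", "", undef, " b ");
my @clean = map { blank_to_undef($_) } @row;
print join("|", map { defined $_ ? $_ : "<undef>" } @clean), "\n";
```

Note that this treats the empty string and a whitespace-only string the same way; if the two need different handling, they have to be told apart before this step.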
Re^4: Data validation and blank spaces in tab formatted csv file
by Ma (Novice) on Oct 24, 2013 at 17:45 UTC
    Thank you. It works fine when the field has a value. But when the field contains blank spaces, I am getting this error message:
    Use of uninitialized value $pending_move_out_dt in concatenation (.) or string at pp6.txt line 52, <FH> line 3.
    pending_move_out_dt =
    Use of uninitialized value $pending_move_out_dt in concatenation (.) or string at pp6.txt line 64, <FH> line 3.

      That is highly unlikely to come from my example code snippets: $csv->print (...) won't warn on undefined values. I see that the message shows <FH>, which, to me, indicates that you used a global file handle (open FH, ">", ...; instead of open my $fh, ">", ...;), and I bet that you are using the variable in a print statement.


      Enjoy, Have FUN! H.Merijn
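The warning quoted in the thread is what Perl emits when an undefined value is interpolated into a string. A minimal sketch that reproduces it and shows one way to avoid it with the defined-or operator (`//`, available since Perl 5.10); the warning handler is there only to make the effect visible, and the empty-string fallback is an assumption about the desired output:

```perl
use strict;
use warnings;

# Collect warnings in an array instead of sending them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# The field after a whitespace-only value has been replaced by undef.
my $pending_move_out_dt = undef;

# Interpolating the undefined value triggers the warning.
my $noisy = "pending_move_out_dt = $pending_move_out_dt\n";

# Defined-or substitutes an empty string, so no warning is raised here.
my $quiet = "pending_move_out_dt = " . ($pending_move_out_dt // "") . "\n";

print "warnings captured: ", scalar(@warnings), "\n";
print $quiet;
```

Only the plain interpolation is recorded by the handler; the guarded version prints the label with nothing after the equals sign.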