PerlMonks  

Re^3: Data validation and blank spaces in tab formatted csv file

by Tux (Monsignor)
on Oct 24, 2013 at 14:46 UTC (#1059485)


in reply to Re^2: Data validation and blank spaces in tab formatted csv file
in thread Data validation and blank spaces in tab formatted csv file

There are several approaches. You can skip those spaces while parsing, which results in empty fields:

    my $csv = Text::CSV_XS->new ({
        binary           => 1,
        sep_char         => "\t",
        auto_diag        => 1,
        allow_whitespace => 1,
    });

but as you are dealing with tab-separated data, I would personally choose to do it inside the loop:

    while (my $row = $csv->getline ($fh)) {
        # Check if the 5th field contains data
        if ($row->[4] =~ m/\S/) {
            # more than just whitespace
            $csv_o->print ($fhv, $row);
        }
        else {
            # sorry, this is not filled: invalid
            $csv_o->print ($fhi, $row);
        }
    }

Enjoy, Have FUN! H.Merijn


Re^4: Data validation and blank spaces in tab formatted csv file
by Ma (Novice) on Oct 24, 2013 at 15:34 UTC
    Thanks, one issue is still there. If the field is empty (no white space), then it is valid, because the value was simply not provided, and the record should go to the valid file. But if the field contains only blank spaces, then it is invalid. I just want to remove all white space from the field and make it null before doing the comparison. I have tried this:
    chomp($pending_move_out_dt); $pending_move_out_dt =~ s/\s//g;
    But the record still goes to the invalid file, because the field $pending_move_out_dt has no data but does contain white space. It should contain nothing (just a null value) where there is no data.
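The rule described in this post (a truly empty field is valid, a whitespace-only field is invalid) can be checked on the raw field before any whitespace is stripped; once the whitespace is removed, the two cases can no longer be told apart. A minimal sketch, assuming the field has already been taken out of the row; `classify_field` is a hypothetical helper name, not part of Text::CSV_XS:

```perl
use strict;
use warnings;

# Hypothetical helper: classify one raw field value.
sub classify_field {
    my ($field) = @_;
    return "valid"   if !defined $field || $field eq "";  # value not provided
    return "invalid" if $field !~ /\S/;                   # whitespace only
    return "valid";                                       # real data present
}

print classify_field(""),           "\n";   # valid
print classify_field("   "),        "\n";   # invalid
print classify_field("2013-10-24"), "\n";   # valid
```

Inside the parsing loop, the same test would be applied to `$row->[4]` to pick the output file.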
Re^4: Data validation and blank spaces in tab formatted csv file
by Ma (Novice) on Oct 24, 2013 at 15:54 UTC
    Thanks. Does Perl distinguish between white space and null? For example, this field contains no data, only white space:
    chomp($pending_move_out_dt); $pending_move_out_dt =~ s/\s//g;
    I want to remove all white space from that field so that it appears as null (undefined) if there is no data. Is there any way to do this?
      $variable = undef unless $variable =~ /\S/;
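The one-liner above leaves the variable undefined, so any later string interpolation or concatenation of it triggers an "uninitialized value" warning under `use warnings`. A small sketch of one way to handle both steps, assuming Perl 5.10 or later for the defined-or operator `//`; `blank_to_undef` is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: return undef for undef, empty, or whitespace-only
# input; return any other value unchanged.
sub blank_to_undef {
    my ($value) = @_;
    return (defined $value && $value =~ /\S/) ? $value : undef;
}

my $pending_move_out_dt = blank_to_undef("   ");

# Supply a default when printing, so an undefined value does not warn.
print "pending_move_out_dt = ", $pending_move_out_dt // "", "\n";
```

The guard `defined $value` also keeps the regex match itself from warning when the field was already undefined.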
Re^4: Data validation and blank spaces in tab formatted csv file
by Ma (Novice) on Oct 24, 2013 at 17:45 UTC
    Thank you. It works fine when the field has a value. But when the field contains blank spaces, I am getting this error message:
      Use of uninitialized value $pending_move_out_dt in concatenation (.) or string at pp6.txt line 52, <FH> line 3.
      pending_move_out_dt =
      Use of uninitialized value $pending_move_out_dt in concatenation (.) or string at pp6.txt line 64, <FH> line 3.

      Which is highly unlikely to come from my example code snippets: $csv->print (...) won't warn on undefined values. I see the message shows <FH>, which, to me, indicates that you used a global filehandle (open FH, ">", ...; instead of open my $fh, ">", ...;), and I bet that you are using it in a print statement.


      Enjoy, Have FUN! H.Merijn
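The lexical-filehandle form mentioned in the reply, combined with a guarded print, might look like the sketch below. This is an illustration only: the handle is opened on an in-memory scalar so the example runs without touching disk (a real script would pass a file name instead), and the undefined `$pending_move_out_dt` stands in for a field that was blanked out earlier.

```perl
use strict;
use warnings;

my $output = "";

# Lexical filehandle; opened on an in-memory scalar for this demo only.
open my $fh, ">", \$output or die "Cannot open: $!";

my $pending_move_out_dt;    # undef, as after blanking a whitespace-only field

# printf with a defined-or default avoids the "uninitialized value" warning.
printf {$fh} "pending_move_out_dt = %s\n", $pending_move_out_dt // "";

close $fh or die "Cannot close: $!";

print $output;
```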
