print line not showing up

by Anonymous Monk
on Oct 29, 2007 at 13:36 UTC ( [id://647838]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello wise Monks,

again I have a question:
I have to convert several CSV files to some kind of XML. The Perl script reads all the CSV files and builds several lists and hashes.

To sum it up:
There are 1500 messages overall, but the script only works for 1495! The remaining 5 messages aren't properly handled. Why?

In detail:
The CSV lines all look like this:

Fehler_SWCA_UgaUgeMSG ,M_sw_canInter,506,KL_FAT,BHI,AUS_DEFAULT,AKT_NOTAUS,"""MSG Ueberlauf""","""UGA UGE MSG Ueberlauf""",""" """
This is one of those that DON'T work, but I haven't spotted any difference from any other line so far.

The text between the groups of '"""' should show up in the result file as XML elements. The structure should look like this:

<MSG id="E_Fehler_SWCA_UgaUgeMSG">
    <TXT_S>
        <TXT lang="de">"MSG Ueberlauf"</TXT>
    </TXT_S>
    <TXT_L>
        <TXT lang="de">"UGA UGE MSG Ueberlauf"</TXT>
    </TXT_L>
But it actually looks like this:
<MSG id="E_Fehler_SWCA_UgaUgeMSG">
    <TXT_S>
    </TXT_S>
    <TXT_L>
    </TXT_L>
We see that not only the content of the <TXT> elements is missing, but the WHOLE LINE! What's going on there??

I added some debug output to my Perl script, which looks like this:

foreach $txt (@{$txtById{$id}})
{
    print OUTFILE "\t\t<TXT_S>\n";
    if ($msg->{id} eq "E_Fehler_SWCA_UgaUgeMSG") #DEBUG OUTPUT
    {
        print "$msg->{id}\nTXT =\n$txt->{id}\n$txt->{lang}\n$txt->{txtS}\n$txt->{txtL}\n";
    }
    print OUTFILE "\t\t\t<TXT lang=\"$txt->{lang}\">\"$txt->{txtS}\"</TXT>\n";
    print OUTFILE "\t\t</TXT_S>\n";
    print OUTFILE "\t\t<TXT_L>\n";
    print OUTFILE "\t\t\t<TXT lang=\"$txt->{lang}\">\"$txt->{txtL}\"</TXT>\n";
    print OUTFILE "\t\t</TXT_L>\n";
}
In my shell window I get this output:
E_Fehler_SWCA_UgaUgeMSG
TXT =
E_Fehler_SWCA_UgaUgeMSG
de
MSG Ueberlauf
UGA UGE MSG Ueberlauf
which is what I expected.
Each field is filled with proper data. It prints to the screen but not to the file, and I don't have a clue why. :-(

Thanks in advance for helping,
Faltblatt

Replies are listed 'Best First'.
Re: print line not showing up
by vlademonkey (Pilgrim) on Oct 29, 2007 at 15:38 UTC
    Are you sure the CSV data is good? Check that it doesn't contain any extraneous special characters. For instance, a line with just a carriage return and no line feed may cause that problem:
    my $cr = chr 0xD;
    my $lf = chr 0xA;

    print "Line before\n";
    print "Line with crlf$cr$lf";
    print "Line after\n";

    print "Line before\n";
    print "Line with cr$cr";
    print "Line after\n";

    print "Line before\n";
    print "Line with lf$lf";
    print "Line after\n";
    This prints:
    Line before
    Line with crlf
    Line after
    Line before
    Line aftercr
    Line before
    Line with lf
    Line after
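To make the check above concrete, here is a minimal sketch that lists stray control characters in a line. The helper name `control_chars` is hypothetical (it is not from the thread), and the sample line is made up; the assumption is that a bare carriage return survives `chomp` when the file has CRLF line endings.

```perl
use strict;
use warnings;

# Hypothetical helper: describe every control character (other than tab)
# found in a string. Returns an empty list when the string is clean.
sub control_chars {
    my ($line) = @_;
    my @found;
    while ($line =~ /([\x00-\x08\x0A-\x1F\x7F])/g) {
        # pos() is the offset just past the match, so subtract one.
        push @found, sprintf('0x%02X at offset %d', ord $1, pos($line) - 1);
    }
    return @found;
}

# Made-up sample: a CRLF-terminated line keeps its CR after chomp().
my $line = "MSG Ueberlauf\r\n";
chomp $line;
print "$_\n" for control_chars($line);
```

Running something like this over the five failing CSV lines would show whether they carry a character the other 1495 lines do not.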
Re: print line not showing up
by thezip (Vicar) on Oct 29, 2007 at 16:59 UTC

    How are you reading the CSV? You might want to use Text::CSV_XS, and check the status after reading each line, as in:

    #!/perl/bin/perl
    use strict;
    use warnings;
    use Text::CSV_XS;

    my $csv = Text::CSV_XS->new( { binary => 1 } );
    my $infile = "foo.txt";
    open(IFH, "<", $infile) or die "Could not open infile for reading.\n";
    while (my $line = <IFH>) {
        my $success = $csv->parse($line);
        if ($success) {
            # ... stuff it in your hash ...
        }
        else {
            # ... complain about it ...
        }
    }
    close IFH;

    This splits the troubleshooting in half, since it's not clear whether the problem arises from the CSV parsing or in the hash insertion.

    Good luck!


    Where do you want *them* to go today?
      I currently use this:

      open(INFILE,"<$cvsDir/$file") || die("can't open datafile $file: $!");
      $csv = Text::CSV_XS->new();
      while(<INFILE>)
      {
          $status = $csv->parse($_);
          if ($status == 0)
          {
              $bad_argument = $csv->error_input();
              print "\t\tBad Argument: $bad_argument\n";
              exit;
          }
          ... stuff it in my hash ...

      1.) I use Text::CSV_XS, but without the { binary => 1 }. Does that make a big difference? How is { binary => 1 } expected to help with my problem? *)

      2.) I thought my CSV parsing was okay, because the expected content shows up in my debug print lines. I even thought the hash *insertion* was okay, for the same reason.

      *) I just tried that; it doesn't change anything.

        Per the Text::CSV_XS docs, turning on "binary" allows the parser to accept a wider range of characters. "To cover the widest range of parsing options, you will always want to set binary."

        I just was not making any assumptions about the data that you are trying to parse.

        Given that, I'd say your next step would be to isolate which specific lines are failing. You can do this by writing the good lines into one file, and if possible, the bad lines into another file. If you can't write the bad lines into another file, then you'll have to determine which lines are missing from the good file.

        To aid in this process, you might prepend a unique index (source line # ?) to the beginning of each line.


        Where do you want *them* to go today?
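The isolation step suggested above (good lines to one file, bad lines to another, each prefixed with its source line number) could be sketched as follows. The helper `split_lines`, its parameters, and the file names are all hypothetical; the pass/fail check is passed in as a code reference so the sketch does not assume any particular parser.

```perl
use strict;
use warnings;

# Hypothetical helper: copy each input line, prefixed with its source line
# number, to the "good" or the "bad" file depending on what $is_ok returns.
sub split_lines {
    my ($in_name, $good_name, $bad_name, $is_ok) = @_;
    open my $in,   '<', $in_name   or die "can't read $in_name: $!";
    open my $good, '>', $good_name or die "can't write $good_name: $!";
    open my $bad,  '>', $bad_name  or die "can't write $bad_name: $!";

    my ($n_good, $n_bad) = (0, 0);
    while (my $line = <$in>) {
        chomp $line;
        # $. holds the current line number of the handle last read ($in).
        if ($is_ok->($line)) {
            print {$good} "$.: $line\n" or die "print failed: $!";
            $n_good++;
        }
        else {
            print {$bad} "$.: $line\n" or die "print failed: $!";
            $n_bad++;
        }
    }
    close $good or die "close failed: $!";
    close $bad  or die "close failed: $!";
    close $in;
    return ($n_good, $n_bad);
}
```

With the parser from this thread, the check would presumably be something like `sub { $csv->parse($_[0]) }`. Since the failing lines here parse without error, a check on the generated output (for example, whether the expected `<TXT ...>` line was written) may be the more useful predicate.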
Re: print line not showing up
by meraxes (Friar) on Oct 29, 2007 at 13:59 UTC

    Your data does indeed look like it's being slurped fine... but can you show what you're using to actually output the XML? Basic print statements? A particular module? We can't help with what we can't see.

    --
    meraxes
      I simply print to an outfile, basically as shown in the Perl snippet in my initial post: I open (OUTFILE,">>$cnvFile"); and then print OUTFILE "\t\t<TXT_S>\n"; as necessary.
Re: print line not showing up
by Anonymous Monk on Oct 29, 2007 at 14:00 UTC
    You have more than one line with E_Fehler_SWCA_UgaUgeMSG
      No, I don't. ;-) And even if I did, they would be distinguished by the CSV file they're contained in. That's assured in advance.
Re: print line not showing up
by Anonymous Monk on Oct 29, 2007 at 14:08 UTC
    print OUTPUT "blah " or warn "COULDN'T PRINT $!";
      I also tried that - and guess what? Neither the warning nor the line is printed! :-(
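Extending the `print ... or warn` idea to every I/O step, a minimal sketch might look like this. The helper `append_lines` is hypothetical and uses a lexical filehandle rather than the thread's OUTFILE; it is not a diagnosis of the problem above, only a way to rule out a silently failed open, print, or close.

```perl
use strict;
use warnings;

# Hypothetical helper: append lines to a file, checking every I/O step.
sub append_lines {
    my ($path, @lines) = @_;
    open my $out, '>>', $path or die "can't open $path for append: $!";
    for my $line (@lines) {
        print {$out} $line or die "can't print to $path: $!";
    }
    # Output is buffered, so a write error may only surface at close().
    close $out or die "can't close $path: $!";
    return scalar @lines;
}
```

Because the file is opened in append mode, as in the thread, output from earlier runs stays in the file, which is worth keeping in mind when comparing results between runs.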

Node Type: perlquestion [id://647838]
Approved by Corion