PerlMonks  

Re: Read text file - Encoding problem?

by Kenosis (Priest)
on Mar 17, 2013 at 02:05 UTC (#1023867)


in reply to Read text file - Encoding problem?

Please note Text::CSV's documentation:

open my $fh, "<:encoding(utf8)", "test.csv" or die "test.csv: $!";
while ( my $row = $csv->getline( $fh ) ) {
    $row->[2] =~ m/pattern/ or next; # 3rd field should match
    push @rows, $row;
}

Let the $csv object read from the file handle (and parse the line), instead of:

...
while (my $line = <CSV>) {
    chomp $line;
    if ($csv->parse($line)) {
        ...

In the documentation, $row is an array reference, pointing to the array that contains the line's parsed fields. You can just use @$row to get all those fields.
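
A minimal sketch of that dereferencing, in case it helps. The sample data and variable names are made up for illustration, and the array reference is built by hand here rather than returned by getline:

```perl
use strict;
use warnings;

# Hypothetical stand-in for what $csv->getline($fh) hands back:
# a reference to an array holding one line's parsed fields.
my $row = [ 'ID001', 'some text', 'more text' ];

my @fields = @$row;        # copy all fields into a plain array
my $id     = $row->[0];    # or pick out a single field by index

print "ID: $id\n";
print "Number of fields: ", scalar(@fields), "\n";
```

Working through the reference directly ($row->[0]) avoids the copy when only one or two fields are needed.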

It's not clear to me what chomp (@fields); is doing in your script. It looks like you're expecting @fields to contain the parsed fields from the currently-read line. But there wouldn't be any newlines in @fields, so chomping it is unnecessary.

Also, as a side note, consider using lexical variables (my) for file handles, instead of barewords. For example:

open my $CSV, '<:encoding(utf8)', $file or die "Cannot open $file: $!\n";
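
A rough, self-contained sketch of reading through a lexical file handle. The sample data is invented, and it is read from an in-memory string only so the snippet runs without a file on disk:

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of a real file.
my $data = "ID001,foo\nID002,bar\n";

# A lexical handle ($in) is scoped like any other 'my' variable and is
# checked by 'use strict', unlike a bareword handle such as CSV.
open my $in, '<', \$data or die "Cannot open in-memory file: $!\n";

my @lines;
while ( my $line = <$in> ) {
    chomp $line;           # here the raw line does end in a newline
    push @lines, $line;
}
close $in;

print "$_\n" for @lines;
```

With a real file, the second and third arguments to open would be the mode with its encoding layer and the file name, as in the line above.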


Re^2: Read text file - Encoding problem?
by better (Acolyte) on Mar 17, 2013 at 11:32 UTC

    Hi Kenosis,

    thanks again for your support.

    Of course, you are right to remind me to use lexical variables for file handles. I changed that and added 'use strict;', this time without getting errors like "requires explicit package name".

    The chomp command was intended to cut a newline, if there should be a field including one (it doesn't make a difference, so I deleted it).

    I can't follow you on implementing the matching command here. If I understand it correctly, it compares the strings in a row with a certain pattern? What I want to achieve is to get the string (ID) in the first field of the first column and use it as a regex for matching the fh of read_dir. Then get the second string of the first column, etc. (all IDs are listed in the first column only).

    The reason it is not working seems to be something invisible at the end of each line. Following McA's hint to look at the text file in hex: the result is that each line ends with the letter "d". This might cause the problem, because the script works with another text file in which the lines don't end with "d".

    better
