Printing a String (TOO stupid?)

by jimson (Initiate)
on Jun 18, 2014 at 05:55 UTC (id://1090256)

jimson has asked for the wisdom of the Perl Monks concerning the following question:

It's actually a routine job, but today I noticed something odd.

open(FH1, "test.txt") || die ("Failed to open!\n");
my $line;
while ($line = <FH1>) {
    chomp($line);
    print "It was $line!!!\n";
}

When using a test.txt file like this:

01234567

01234567

01234567

-----------------

The code prints strange output like:

!!!was 01234567

!!!was 01234567

!!!was 01234567

-----------------

I just cannot understand why the "!!!" ends up overwriting the start of the string, "It ...".

Replies are listed 'Best First'.
Re: Printing a String (TOO stupid?)
by AnomalousMonk (Archbishop) on Jun 18, 2014 at 06:16 UTC

    Possibly the chomp-ed  $line still has a carriage-return at the end of it.

    c:\@Work\Perl\monks>perl -wMstrict -le "my $line = qq{01234567\x0d}; print qq{It was $line!!!\n}; "
    !!!was 01234567

    This might arise if, e.g., the file had been created on a Windows system (with  \x0d\x0a line-enders) and then transferred to a *nix system without proper newline translation. On the *nix system, chomp will take care of the  \x0a but leave the  \x0d untouched.
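
The behavior described above can be checked without any file at all. This is only a minimal sketch: the string stands in for a line read on a *nix system from a file with \x0d\x0a line-enders, and it assumes the default value of $/.

```perl
use strict;
use warnings;

# Stand-in for a line read from a Windows-style file on a *nix system.
my $line = "01234567\x0d\x0a";

# With the default $/ ("\n"), chomp removes only the trailing \x0a.
chomp $line;

printf "length after chomp: %d\n", length $line;    # 8 digits plus the \x0d
print "still ends with a carriage return\n" if $line =~ /\x0d\z/;
```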

    (BTW: It may or may not be too stupid, but the reason I know about this sort of thing is that I've done it myself — and more than once!)

      That seems to be exactly it. I tested using 'chop' twice, which finally worked.

      However, this would make the code work only for Windows files. Any suggestions on how to make such code portable to Unix?

        Probably the best approach would be to do a proper system-to-system transfer in the first place. (Update: On second thought, see below for probable BP.) E.g., the ftp utility common to most (all?) OSen has an ascii mode (automatic line-end translation) as well as a binary (don't touch nuthin') mode.

        Otherwise, doing something like
            $line =~ s{ [\x0d\x0a]+ \z }{}xms;
        to each and every line would do the trick to eliminate any possible combination of  \x0d and  \x0a characters from the end of the line. Other regexes can handle more particular combinations of line-enders.
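
As a rough illustration of the regex above, here is a sketch that applies it to the usual line-ender combinations. The helper name strip_eol is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution shown above.
sub strip_eol {
    my ($line) = @_;
    $line =~ s{ [\x0d\x0a]+ \z }{}xms;    # drop any trailing run of CR/LF
    return $line;
}

# Unix, Windows, lone carriage return, and no line-ender at all.
for my $raw ("abc\x0a", "abc\x0d\x0a", "abc\x0d", "abc") {
    my $clean = strip_eol($raw);
    printf "%d chars in, %d chars out\n", length $raw, length $clean;
}
```

Every input comes out as the three characters "abc", so the same loop body can handle files from either kind of system.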

        Updated

        I was wrong. Somehow, I remembered it working differently.

        But { local $/ = ''; chomp($x); } seems to work. Needs more testing and I'm not sure there would still be a performance benefit.


        Try chomp; chomp; instead of chop; chop;

        chomp is conditional. Being built in, I would expect it to be more efficient than a regex.

Re: Printing a String (TOO stupid?)
by moritz (Cardinal) on Jun 18, 2014 at 07:49 UTC

    If some strings do fishy things (like overwriting existing output, messing up your console etc.), a good way to debug that is this:

    use Data::Dumper;
    $Data::Dumper::Useqq = 1;
    print Dumper $yourstring;

    This will probably reveal a \r, which is a carriage return, which causes the console to go back to the start of the line, overwriting existing output.
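
For a concrete picture, here is a small sketch of that debugging step applied to a string like the one in the question. The sample value is an assumption, standing in for a chomp-ed line that kept its carriage return.

```perl
use strict;
use warnings;
use Data::Dumper;

# Ask Data::Dumper for double-quoted output with escapes for control characters.
$Data::Dumper::Useqq = 1;

# Assumed sample: a chomp-ed line that still carries a trailing carriage return.
my $line = "01234567\x0d";

my $dump = Dumper($line);
print $dump;    # the carriage return shows up as a visible \r escape
```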

Re: Printing a String (TOO stupid?)
by Laurent_R (Canon) on Jun 18, 2014 at 18:51 UTC
    This is obviously a problem with Windows files used under Unix (I have been through that several times and now know where it comes from, but the first time it took me a while to figure out). Under Unix, the chomp function removes only the "\n" part of the line ending; the "\r" that is also present in Windows line endings stays there and moves the subsequent characters back to the start of the line (overwriting whatever is there). Use this:
    chomp $line; $line =~ s/\r//g;
    or
    $line =~ s/[\r\n]//g;
    Update: Corrected two typos (a silly colon before the g regex modifier and a missing /) in the second regex. Thanks to poj for having pointed them out to me.
      ... a problem with Windows files used under Unix ...

      If this is certain to be the situation, I think I would take the approach of local-ly changing  $/ (see perlvar) to  "\x0d\x0a" to control both the reading of Windoze-ish lines from a filehandle and the behavior of chomp in removing their line-enders. Something like:

      open my $fh ... or die ...;
      ...
      {
          local $/ = "\x0d\x0a";
          while (defined(my $line = <$fh>)) {
              chomp $line;
              do_something_with($line);
          }
      }
      Both problems handled in one swell foop. (Of course, you have to be aware that the change to the global  $/ propagates 'into' the  do_something_with() call and into anything that function may call, but that's a post for another day.)
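
The caveat about propagation can be seen in a small sketch. It reads from an in-memory filehandle rather than a real file, and do_something_with() here is a hypothetical stand-in that only records the value of $/ it sees.

```perl
use strict;
use warnings;

my @seen;

# Hypothetical stand-in: records each line together with the $/ in effect.
sub do_something_with {
    my ($line) = @_;
    push @seen, [ $line, $/ ];
}

# In-memory "file" with Windows-style line-enders.
my $data = "one\x0d\x0atwo\x0d\x0a";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

{
    local $/ = "\x0d\x0a";    # affects both readline and chomp in this block
    while (defined(my $line = <$fh>)) {
        chomp $line;
        do_something_with($line);
    }
}
close $fh;

for my $rec (@seen) {
    my ($line, $sep) = @$rec;
    printf "%s: callee saw %s\n", $line, $sep eq "\x0d\x0a" ? 'CRLF' : 'another separator';
}
print "separator restored after the block\n" if $/ eq "\n";
```

Any code called inside the block that reads from another filehandle or calls chomp would see the changed separator too, which is the risk mentioned above.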

        Yes, that's another interesting way to do it. As for whether it is certain that it is a problem with Windows files used under Unix, I of course cannot be sure without having seen the file itself, but I know for a fact that I have met that problem a few times and got exactly the same symptoms: the words added to the end of the lines were coming over the start of the lines. And it took me at least an hour to figure out what the issue was the first time I got it.
Re: Printing a String (TOO stupid?)
by TGI (Parson) on Jun 20, 2014 at 05:06 UTC

    Printing is a red herring here. The real issue is how you open the file.

    Instead of doing a plain-vanilla open, you want to turn on one of Perl's amazing IO layers, in this case :crlf.

    open my $fh, '<:crlf', "test.txt" or die "Failed to open $!\n";
    my $line;
    while ($line = <$fh>) {
        chomp($line);
        print "It was $line!!!\n";
    }

    The awesome thing about this method is that \r\n is automatically translated to \n, but if the file has plain old \n line endings, they just cruise on through to your code without a problem.
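
To try this without hunting for a Windows-made file, the following sketch writes a temporary file with mixed line endings and reads it back through the :crlf layer. The temporary file and its contents are assumptions made purely for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write one CRLF-terminated line and one LF-terminated line, byte for byte.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out, ':raw';
print {$out} "dos line\x0d\x0a", "unix line\x0a";
close $out or die "Cannot close $path: $!";

# Read the file back with the :crlf layer turned on.
open my $in, '<:crlf', $path or die "Failed to open $path: $!";
my @lines;
while (defined(my $line = <$in>)) {
    chomp $line;
    push @lines, $line;
}
close $in;

print "[$_]\n" for @lines;    # brackets make any stray \r easy to spot
```

Both lines should come out with nothing left over after chomp, whichever style of line ending they had on disk.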

    You may also notice that I used a 3-argument open with a lexical file handle. This is generally considered to be a good idea.


    TGI says moo
