Given that this question, asked anonymously the day before, contains test data with the identical first two lines, namely:
nick 5
nick 10
I'm guessing you are the same anonymonk. If so, it would've been good to mention that.
Further to hv's excellent suggestion of using s{\s+$}{} (that presumably fixed your problem),
you might consider writing a standalone program that does nothing more than
verify that your input data files are well-formed.
Though you didn't rigorously define the format of your input files in either of your questions,
I'm guessing that to be well-formed, each line in your data files must match:
^[a-z]+\t\d+$
Is that right? If so, to avoid future pain,
such a data validation program could be as simple as:
use strict;
use warnings;
my $fname = shift or die "usage: $0 file\n";
open( my $fh, '<', $fname ) or die "error: open '$fname': $!";
my $lcnt = 0;
my $line;
while ( defined($line = <$fh>) ) {
    ++$lcnt;
    chomp $line;
    $line =~ /^\s+/ and die "error: line $. contains leading whitespace\n";
    $line =~ /\s+$/ and die "error: line $. contains trailing whitespace\n";
    length($line) or die "error: line $. is empty\n";
    $line =~ /^[a-z]+\t\d+$/ or die "error: line $. ($line) does not match word TAB number\n";
}
close $fh;
warn "file '$fname': $lcnt lines, no data format errors detected\n";
Running this program on Linux against a CRLF-terminated Windows file produces:
error: line 1 contains trailing whitespace
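The reason is that chomp removes only the trailing "\n" (the default value of $/), so the "\r" of each CRLF pair is left on the end of the line, and \s matches "\r". A minimal sketch of that behaviour, using an in-memory string in place of a real file (the sample line is an assumption based on your test data):

```perl
use strict;
use warnings;

# A CRLF-terminated line, as read on Linux without a :crlf layer.
my $line = "nick\t5\r\n";

chomp $line;    # removes only "\n", because $/ defaults to "\n"

# The carriage return survives chomp and counts as whitespace.
print "trailing whitespace found\n" if $line =~ /\s+$/;

# Stripping all trailing whitespace (hv's suggestion) leaves a valid line.
( my $clean = $line ) =~ s{\s+$}{};
print "clean line is well-formed\n" if $clean =~ /^[a-z]+\t\d+$/;
```

Both messages should be printed, which is why s{\s+$}{} copes with CRLF files where a plain chomp does not.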
Obviously, you could make the crude data validation program above more elaborate.
Alternatively, you might add more rigorous file format checks to your original program.
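As one possible elaboration, here is a rough sketch that reports every bad line instead of dying at the first one. The helper names check_line and check_file are hypothetical, chosen only for illustration, and the format is still my guess of word TAB number:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a description of the first problem found
# in an already-chomped line, or undef if the line is well-formed.
sub check_line {
    my ($line) = @_;
    return 'is empty'                     unless length $line;
    return 'contains leading whitespace'  if $line =~ /^\s/;
    return 'contains trailing whitespace' if $line =~ /\s$/;
    return 'does not match word TAB number'
        unless $line =~ /^[a-z]+\t\d+$/;
    return;
}

# Hypothetical helper: returns a list of error messages for one file;
# an empty list means no data format errors were detected.
sub check_file {
    my ($fname) = @_;
    open( my $fh, '<', $fname ) or die "error: open '$fname': $!";
    my @errors;
    while ( defined( my $line = <$fh> ) ) {
        chomp $line;
        my $problem = check_line($line);
        push @errors, "line $.: $problem" if defined $problem;
    }
    close $fh;
    return @errors;
}
```

A caller could then print each message returned by check_file with warn and exit with a non-zero status if the list is not empty. Keeping the per-line check in its own subroutine also makes it easy to reuse the same check inside your original program.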