PerlMonks  

Is this expected behavior for chomp/split...?!

by Krambambuli (Deacon)
on May 20, 2009 at 09:24 UTC (#765153=perlquestion)
Krambambuli has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I've just stumbled over a small issue with chomp/split (or just my understanding of them). Here's the code to test:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $line = "A\tB\tC\tD\t\t\t\t\n";

    print "Test string line: $line";

    print "Split without chomp: ";
    count_fields( $line );

    print "Split _after_ chomp: ";
    chomp $line;
    count_fields( $line );

    exit;

    sub count_fields {
        my ($line) = @_;
        my @x = split( /\t/, $line);
        my $count = @x;
        print "$count fields.\n";
    }

I expected it to print the same field count, 8, whether or not the 'chomp' is applied. Instead, I see 8 and 4. Can someone shed some light on it? Why do the two field counts differ?

Many thanks,

Krambambuli
---

Re: Is this expected behavior for chomp/split...?!
by Corion (Pope) on May 20, 2009 at 09:28 UTC

    See split. Most likely you want split /\t/, $_, 8 if you always want 8 columns.
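
A minimal sketch of the suggestion above, reusing the test string from the question. The helper name fields_with_limit is hypothetical and only serves the illustration. With a positive LIMIT, split does not strip trailing empty fields, so the raw and the chomped line should report the same count for this string. Note that the limit is a maximum: a line with fewer tabs would still yield fewer fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split on tabs into at most $limit fields.
sub fields_with_limit {
    my ($line, $limit) = @_;
    return split /\t/, $line, $limit;
}

my $line = "A\tB\tC\tD\t\t\t\t\n";

my @raw = fields_with_limit($line, 8);      # last field is "\n"
chomp $line;
my @chomped = fields_with_limit($line, 8);  # last field is ""

print scalar(@raw), " fields before chomp, ",
      scalar(@chomped), " fields after chomp.\n";
```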

      I don't know the exact number of fields, and I do not want the newline in the last element.

      So I decided to chomp _after_ the split:
          @x = split(...);
          chomp $x[-1];
      Thanks!

      Krambambuli
      ---
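
The split-then-chomp approach described above could look roughly like this. The helper name split_then_chomp and the guard against an empty list are assumptions added for the sketch. It relies on the newline still being present when split runs, since that is what keeps the trailing empty fields from being dropped; a last line without a newline would lose them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split first, then strip the newline
# from the last field only.
sub split_then_chomp {
    my ($line) = @_;
    my @x = split /\t/, $line;
    chomp $x[-1] if @x;    # an empty input line yields no fields at all
    return @x;
}

my @x = split_then_chomp("A\tB\tC\tD\t\t\t\t\n");
print scalar(@x), " fields; last field is '$x[-1]'\n";
```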

Re: Is this expected behavior for chomp/split...?!
by johngg (Abbot) on May 20, 2009 at 09:36 UTC

    Because without a third argument split will discard empty trailing fields which, as soon as you chomp away the newline, you have four of.

    I hope this is helpful.

    Cheers,

    JohnGG

      Got it. _Now_ it seems sooo obvious :)

      Many thanks!

      Krambambuli
      ---

      This is a good and concise answer.
Re: Is this expected behavior for chomp/split...?!
by Anonymous Monk on May 20, 2009 at 09:39 UTC
    The split docs say: "By default, empty leading fields are preserved, and empty trailing ones are deleted. (If all fields are empty, they are considered to be trailing.)"

    For clarity, add

        use Data::Dumper;
        print Data::Dumper->new([\@x])->Indent(0)->Useqq(1)->Dump, "\n";
    You'll get
        Split without chomp: 8 fields.
        $VAR1 = ["A","B","C","D","","","","\n"];
        Split _after_ chomp: 4 fields.
        $VAR1 = ["A","B","C","D"];
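
The three cases in the quoted documentation can be checked with a short sketch. The sample strings and variable names are made up for illustration; no LIMIT is passed, so the default stripping of trailing empty fields applies.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Empty leading fields are preserved.
my @leading   = split /\t/, "\t\tA\tB";

# Empty trailing fields are deleted.
my @trailing  = split /\t/, "A\tB\t\t";

# If every field is empty, all of them count as trailing.
my @all_empty = split /\t/, "\t\t\t";

printf "leading: %d, trailing: %d, all empty: %d\n",
    scalar(@leading), scalar(@trailing), scalar(@all_empty);
```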
Re: Is this expected behavior for chomp/split...?!
by Utilitarian (Vicar) on May 20, 2009 at 09:54 UTC
    There is an extra character, "\n", at the end of the line, and it becomes the final element of the array. Change sub count_fields so that it uses my @x = split( /\t+/, $line); to see what I mean. The empty elements on the way to the last legitimate element are of no interest. Try the following with and without the + modifier on \t in the split.

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $line = "A\tB\tC\tD\t\t\t\t\n";

        print "Test string line: $line";

        print "Split without chomp: ";
        my @x = count_fields( $line );
        check_for_null_elements(@x);

        print "Split _after_ chomp: ";
        chomp $line;
        @x = count_fields( $line );
        check_for_null_elements(@x);

        exit;

        sub check_for_null_elements {
            foreach my $element (@_) {
                print "Empty element found\n" if ($element eq "");
            }
        }

        sub count_fields {
            my ($line) = shift;
            my @x = split( /\t+/, $line);
            my $count = @x;
            print "$count fields.\n";
            return (@x);
        }
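
To compare the two patterns side by side, a small sketch follows. The helper name count_with and the "A\t\tC" sample are assumptions for illustration. With /\t+/ a run of tabs acts as one delimiter, so no empty fields appear between columns, but genuinely empty columns in the middle of a line are merged away as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: number of fields produced by a given pattern.
sub count_with {
    my ($re, $line) = @_;
    my @x = split $re, $line;
    return scalar @x;
}

my $raw     = "A\tB\tC\tD\t\t\t\t\n";
my $chomped = "A\tB\tC\tD\t\t\t\t";
my $gap     = "A\t\tC";    # empty column in the middle

for my $re (qr/\t/, qr/\t+/) {
    printf "%-12s raw: %d, chomped: %d, gap: %d\n", $re,
        count_with($re, $raw),
        count_with($re, $chomped),
        count_with($re, $gap);
}
```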
Re: Is this expected behavior for chomp/split...?!
by jwkrahn (Monsignor) on May 20, 2009 at 12:50 UTC

    If you use a negative number for the third argument of split you will get the same results for both cases:

        $ perl -e'
        my $line ="A\tB\tC\tD\t\t\t\t\n";

        print "Test string line: $line";

        print "Split without chomp: ";
        count_fields( $line );

        print "Split _after_ chomp: ";
        chomp $line;
        count_fields( $line );

        exit 0;

        sub count_fields {
            my ($line) = @_;
            my @x = split /\t/, $line, -1;
            my $count = @x;
            print "$count fields.\n";
        }
        '
        Test string line: A B C D
        Split without chomp: 8 fields.
        Split _after_ chomp: 8 fields.
