in reply to Re^2: selecting columns from a tab-separated-values file
in thread selecting columns from a tab-separated-values file
What's the first arg to split()? It appears to be a single blank char. How does that work to split upon tab chars?
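It's a space enclosed within single quotes. That's a special case that tells split to split on runs of whitespace (e.g., \t, \n, space) and to ignore leading whitespace. Note that it will also split inside any field that contains a space, so /\t/ is the safer pattern for tab-separated data.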
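Here's a small sketch of the difference between the two patterns. The sample record is made up for illustration (it isn't from the original question); one of its fields contains a space:

```perl
use strict;
use warnings;

# Hypothetical record: the street name contains a space.
my $line = "John\tQ\tPublic\t12\tMain Street\tSpringfield";

# ' ' (a single space in quotes) is split's special whitespace mode:
# it splits on runs of any whitespace and skips leading whitespace.
my @by_space = split ' ', $line;

# /\t/ splits on tab characters only, so embedded spaces survive.
my @by_tab = split /\t/, $line;

print scalar(@by_space), " fields with ' '\n";    # "Main Street" gets broken in two
print scalar(@by_tab),   " fields with /\\t/\n";
print "Street: $by_tab[4]\n";
```

With ' ' the street name ends up in two separate elements, which shifts every index after it; with /\t/ the field count matches the number of columns.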
In trySplitSliceLimit, wouldn't it be better to set LIMIT to 3, or in general to the number of fields you expect to extract?
It should be set to one more than the position of the last field you want. For example, using your original string:
"FIRST\tMIDDLE\tLAST\tSTRNO\tSTRNAME\tCITY\tSTATE\tZIP"
   1      2      3      4       5      6    7 ---->
my @capture = ( split /\t/, $line, 7 )[ 2, 0, 5 ];
You want LAST, FIRST, and CITY, and CITY is the sixth field. Setting the LIMIT to seven returns the first six fields, with the unsplit remainder of the string as the seventh element. The slice is then applied to those seven elements to get only the three you want.
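To make the effect of LIMIT concrete, here's a short sketch using the same string. The @parts array is only there for illustration, to show what split returns before the slice is applied:

```perl
use strict;
use warnings;

my $line = "FIRST\tMIDDLE\tLAST\tSTRNO\tSTRNAME\tCITY\tSTATE\tZIP";

# With LIMIT 7, split stops after six fields; the seventh element
# is whatever is left of the string, tabs included.
my @parts = split /\t/, $line, 7;
print scalar(@parts), " elements\n";

# Slice indices may be given in any order; these pick LAST, FIRST, CITY.
my @capture = ( split /\t/, $line, 7 )[ 2, 0, 5 ];
print "@capture\n";
```

The last element of @parts is "STATE\tZIP" with its tab intact, which is why a LIMIT of six would not be enough here: the sixth element would then be "CITY\tSTATE\tZIP" rather than just CITY.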
And one observation for the record: the indices can appear in any order. To extract LAST, FIRST, and CITY, you'd write [2, 0, 5]
You're correct!
Update: Changed splitting on ' ' to \t. Thanks CountZero.
Re^4: selecting columns from a tab-separated-values file
by CountZero (Bishop) on Jan 22, 2013 at 07:28 UTC
by Kenosis (Priest) on Jan 22, 2013 at 07:56 UTC
Re^4: selecting columns from a tab-separated-values file
by Kenosis (Priest) on Jan 22, 2013 at 07:55 UTC