http://www.perlmonks.org?node_id=448726


in reply to cut vs split (suggestions)

Lots of examples in this thread, but everyone seems to be using a split and join of some sort. I doubt cut will do this much work just to print out the first 15 columns.

When looking for speed the first thing I usually look to is removing the need for any regexps where index and substr will do just as well! So why not just look for the 15th ',' and print everything before that?

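while (<>) {
    chomp;
    my $index = -1;    # start at -1 so a comma at position 0 is found
    for (1 .. 15) {
        $index = index($_, ',', $index + 1);
        last if $index < 0;    # fewer than 15 commas: keep the whole line
    }
    $_ = substr($_, 0, $index) if $index >= 0;
    print $_, "\n";
}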

This doesn't do any unnecessary string manipulation, and avoids expensive regexps as well. It could easily be extended to not start at the first column.
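As a rough sketch of that extension: the same comma-counting idea can remember where the first wanted field begins and then take a substr from there. The helper name cut_fields and its 1-based, inclusive arguments are made up for illustration, and it returns an empty string when the line has fewer fields than the starting column, which is not necessarily what cut does in that case.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return fields
# $first..$last (1-based, inclusive) of a comma-separated $line,
# using only index and substr.
sub cut_fields {
    my ($line, $first, $last) = @_;
    my $start = 0;     # offset where field $first begins
    my $pos   = -1;    # offset of the most recently found comma
    for my $n (1 .. $last) {
        $pos = index($line, ',', $pos + 1);
        if ($pos < 0) {
            # The line ends at field $n: no wanted field exists if
            # $n < $first, otherwise take everything from $start on.
            return $n < $first ? '' : substr($line, $start);
        }
        # The comma that closes field $first - 1 marks where field $first starts.
        $start = $pos + 1 if $n == $first - 1;
    }
    return substr($line, $start, $pos - $start);
}

print cut_fields("a,b,c,d,e", 2, 4), "\n";    # prints b,c,d
```

In the read loop this would be called on each chomped line, for example cut_fields($_, 3, 15), in place of the fixed "first 15 columns" logic.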

This proved faster than some of the examples above that I tried out (I didn't try them all).

$ time perl cut.pl numbers.csv > /dev/null
real 0m5.577s
user 0m4.792s
sys 0m0.055s

$ time cut -d, -f"1-15" numbers.csv > /dev/null
real 0m1.081s
user 0m0.866s
sys 0m0.042s

- Cees