PerlMonks  

Can this script be improved so that it is comparable to the UNIX cut command in performance? If the Perl script can finish in 10 seconds that will be great (50% drop in performance)! I am happy to take this performance drop because it keeps the script clean and portable (typically I work on UNIX machines so this is not a huge requirement)
I don't think there's any way to speed up the perl approach. (I tried BrowserUK's idea -- not a rigorous benchmark, but no evidence that it made any difference.) I just have two reactions to your comments:

(1) The unix-style "cut" is portable -- you can find free ports of unix command line utils for ms-windows, and macosx is unix, and "cut" behaves the same everywhere. What more portability do you need?

(2) The reason to choose a perl approach over a common, compiled utility would be that the perl approach makes it a lot easier to provide a lot more flexibility, and the performance hit is a small price to pay for the extra power. I wrote my own perl version of cut years ago and use it all the time (as well as using the original "cut" when it seems quicker), because with perl I get to use a regex for the split, and output the columns in whatever order I choose, and have the input field separator be different from the output field separator (e.g. using "\n" to output one field per line), and insert arbitrary quoted strings between columns when this is convenient, and ... anything else I feel like doing, because perl makes it easy to do. Compared to the time it would take to work around the limitations of standard "cut", perl makes things really efficient.
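To make the kind of flexibility described above concrete, here is a minimal sketch. It is not graff's actual script; the helper name cut_line and its argument order are made up for illustration. It splits on a regex, emits 1-based columns in whatever order is requested, and joins them with an output separator that differs from the input one.

```perl
use strict;
use warnings;

# Hypothetical helper (an illustration, not the script mentioned in the post):
# split a line on a regex, pick 1-based columns in the requested order,
# and join them with a separate output separator.
sub cut_line {
    my ( $line, $split_re, $cols, $out_sep ) = @_;
    my @fields = split $split_re, $line, -1;    # -1 keeps trailing empty fields
    # Columns past the end of the line come out as empty strings.
    return join $out_sep, map { $fields[ $_ - 1 ] // '' } @$cols;
}

# Input fields separated by a comma or semicolon with optional spaces;
# output column 3 followed by column 1, tab-separated.
print cut_line( "a, b;c", qr/\s*[,;]\s*/, [ 3, 1 ], "\t" ), "\n";
```

Passing "\n" as the output separator gives the one-field-per-line output mentioned above, and a quoted string can be inserted between columns by using it as the separator.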

... would you typically consider piping output from cut when the script does not require all the columns for processing? i.e. say the script only needs 3 columns instead of a possible 200 columns then would you pipe the 3 column output from cut instead of splitting the 200 columns in Perl and keeping only the 3 that are required?
Would a bear typically consider defecating in its natural habitat? If processing 3 columns out of 200 were something I intended to do with any regularity, I would probably write and save a perl script that does something like:
open( IN, "cut -d, -f13-15 numbers.csv |" ) or die "can't run cut: $!";
while (<IN>) {
    chomp;
    @row = split /,/;
    # this is only cols 13, 14, 15 from numbers.csv
    # and now, do something
}
close IN;
(update: naturally, I would have this perl script accept command-line options to specify the field separator and column selections for running the "cut" command, assuming this sort of flexibility were useful.)
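A rough sketch of what that option handling might look like, assuming the option letters -d and -f are borrowed from cut itself and that the default separator is a comma (both are choices made for this example, as is the helper name cut_command). It uses the list form of a piped open so the separator never passes through a shell.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# 'bundling' lets single-letter options take attached values, e.g. -f13-15.
Getopt::Long::Configure('bundling');

# Hypothetical helper: turn command-line arguments into the separator
# plus the argument list for running "cut".
sub cut_command {
    my @args = @_;
    my ( $delim, $fields ) = ( ',', '1-' );    # assumed defaults
    GetOptionsFromArray( \@args, 'd=s' => \$delim, 'f=s' => \$fields )
        or die "usage: $0 [-d delim] [-f fields] file ...\n";
    return ( $delim, 'cut', "-d$delim", "-f$fields", @args );
}

if (@ARGV) {
    my ( $delim, @cmd ) = cut_command(@ARGV);
    # List-form pipe open: runs cut directly, with no shell quoting issues.
    open( my $in, '-|', @cmd ) or die "can't run cut: $!";
    while ( my $line = <$in> ) {
        chomp $line;
        my @row = split /\Q$delim\E/, $line, -1;
        # @row holds only the selected columns; do something with them here
        print scalar(@row), " fields: @row\n";
    }
    close $in or warn "cut exited with status $?\n";
}
```

Saved under a name of your choosing, it would be run along the lines of: perl script.pl -d, -f13-15 numbers.csv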

In reply to Re: cut vs split (suggestions) by graff
in thread cut vs split (suggestions) by sk
