Re^4: cut vs split (suggestions)

by sk (Curate)
on Apr 17, 2005 at 07:01 UTC (#448601=note)

in reply to Re^3: cut vs split (suggestions)
in thread cut vs split (suggestions)

Very interesting! Thanks for the new idea!

I lost my server connection for some reason, so I tested this on my laptop, and I do see a very good improvement with your modification.

Update: replaced <> with $_, per pijll's post.

C:\>perl -lne "BEGIN{$,=','} print+(split',',$_)[0..14] " > junk
This finishes in about 14 seconds (corrected timing).

C:\>perl -lanF, -e "BEGIN{ $,=\",\"} print @F[0..14];" numbers.csv > junk
this takes about 18 seconds

I don't have a timing utility in Windows so the times are just wallclock times.

I guess Windows is faster because the process runs at 100% CPU (or whatever it needs). On the UNIX servers the process might be more time-shared?

My laptop: 1.6 GHz Centrino, 1 GB RAM, perl v5.6.1


SK

Update: Thanks pijll, the time it takes to run your version of the code is almost the same as the one that uses -n.

Replies are listed 'Best First'.
Re^5: cut vs split (suggestions)
by pijll (Beadle) on Apr 17, 2005 at 07:54 UTC
    You are using both the -n switch and <> in the first line! This means you lose half of your lines...

    Anyway: with -l, the -n switch does an unnecessary chomp on every line, so remove that; and use a limit on split, since it doesn't actually need to split all 25 fields:

    perl -le 'BEGIN{$,=","} print+(split",",$_,16)[0..14]for <>' numbers.csv
    Update: But for <> reads all lines in at once; you may not want that with large files, so use while <> instead.
