Re: Performance challenges


by dragonchild (Archbishop)

on Mar 22, 2006 at 12:21 UTC

in reply to Performance challenges

If the regex solution provided by Melly isn't as fast as you need, you might try:

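open( my $in,   '<', "Your_data_here" ) or die "Cannot open input file: $!";
open( my $good, '>', "Good_file_here" ) or die "Cannot open good file: $!";
open( my $bad,  '>', "Bad_file_here" )  or die "Cannot open bad file: $!";

while (<$in>) {
    # Records are tab-separated; only fields 14 and 15 (zero-based) are checked.
    my @row = split /\t/, $_;
    if ( length($row[14]) == 5 && length($row[15]) == 7 ) {
        print {$good} $_;
        next;
    }
    print {$bad} $_;
}

close $bad;
close $good;
close $in;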
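
One caveat with the length test above: if field 15 happens to be the last field on the line, it still carries the trailing newline, so its length comes out one higher than expected. Here is a minimal sketch of the same check pulled into a sub, assuming 16-field records. The name is_good and the sample records are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: true when fields 14 and 15 have lengths 5 and 7.
sub is_good {
    my ($line) = @_;
    chomp $line;                       # drop the newline so it cannot inflate the last field
    my @row = split /\t/, $line, -1;   # -1 keeps trailing empty fields
    return 0 if @row < 16;             # too short to contain fields 14 and 15
    return length( $row[14] ) == 5 && length( $row[15] ) == 7 ? 1 : 0;
}

# Made-up 16-field records: 14 filler fields plus the two checked ones.
my @filler = ('x') x 14;
my $good = join( "\t", @filler, 'abcde', 'abcdefg' ) . "\n";
my $bad  = join( "\t", @filler, 'abcde', 'abc' ) . "\n";

print is_good($good) ? "good\n" : "bad\n";
print is_good($bad)  ? "good\n" : "bad\n";
```

If your records always have more than 16 fields, the chomp makes no difference to the test, and you can skip it.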

Note: awk processing a million rows/minute probably isn't that bad. I'm not sure Perl is going to be much faster. This is a very I/O-bound activity.

My criteria for good software:

Does it work?
Can someone else come in, make a change, and be reasonably certain no bugs were introduced?


Replies are listed 'Best First'.
Re^2: Performance challenges by Eimi Metamorphoumai (Deacon) on Mar 22, 2006 at 18:24 UTC
One suggestion: if each record has more than 16 fields, you might find slightly better performance with `my @row = split /\t/, $_, 17;` which tells perl to split into at most 17 fields (0 to 15, leaving the trailing data in 16).
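
To see what the limit does, here is a small sketch using a made-up 20-field record (the f0 .. f19 field names are purely illustrative). It only shows how split behaves; whether the limit buys a measurable speedup on real data is something to benchmark yourself.

```perl
use strict;
use warnings;

# Hypothetical record with 20 tab-separated fields: f0 .. f19.
my $line = join( "\t", map { "f$_" } 0 .. 19 ) . "\n";

my @all     = split /\t/, $line;        # splits on every tab
my @limited = split /\t/, $line, 17;    # stops splitting after 17 fields

printf "unlimited: %d fields\n", scalar @all;
printf "limited:   %d fields\n", scalar @limited;
print  "field 15:  $limited[15]\n";
print  "remainder: $limited[16]";       # the unsplit tail, newline included
```

Fields 0 to 15 are identical in both arrays; the limited version just leaves everything after the 16th tab as one string in element 16.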
Re^2: Performance challenges by Anonymous Monk on Mar 22, 2006 at 13:31 UTC
Thanks very much folks! Much appreciate the fast response. dragonchild: I had a script similar to the one you wrote here; but I was not sure if that was the most optimal one. Sounds like it is. Thanks again! :) -Kris
