http://www.perlmonks.org?node_id=1041123


in reply to Help with code optimization

G'day hyu968,

One thing that immediately leapt out at me was this block of code:

if($data=~m/^\s*FORM\s*TYPE:\s*(.*$)/m) {$form_type=$1;} if($data=~m/^\s*CENTRAL\s*INDEX\s*KEY:\s*(\d*)/m) {$cik=$1;} if($data=~m/^\s*CONFORMED\s*PERIOD\s*OF\s*REPORT:\s*(\d*)/m) {$rep +ort_date=$1;} if($data=~m/^\s*FILED\s*AS\s*OF\s*DATE:\s*(\d*)/m) {$file_date=$1; +} #if($data=~m/^\s*SEC\s*FILE\s*NUMBER:\s*([0-9-]*)/m) {$file_number +=$1;} if($data=~m/^\s*COMPANY\s*CONFORMED\s*NAME:\s*(.*$)/m) {$name=$1;} if($data=~m/^\s*STANDARD\s*INDUSTRIAL\s*CLASSIFICATION:.*?\[(\d{4} +)/m) {$sic=$1;}

Each of those conditions is mutually exclusive: once you have a match, all the remaining conditions will be FALSE. So you could change all but the first if to elsif, use one of the forms shown in perlsyn - Basic BLOCKs, or use something like this:

for ($data) { /^\s*FORM .../ and do { $whatever = $1; last }; /^\s*CENTRAL .../ and do { $whatever = $1; last }; /^\s*CONFORMED .../ and do { $whatever = $1; last }; }

If you have some idea of which matches are more likely, test them earlier.

Also, I'd advise against any of the given/when constructs, such as those shown in the Switch Statements section (immediately following the Basic BLOCKs section I linked to above), as they're experimental and not really suitable for production code.

-- Ken

Replies are listed 'Best First'.
Re^2: Help with code optimization
by Anonymous Monk on Jun 28, 2013 at 13:10 UTC
    Those conditions are NOT mutually exclusive. The m modifier will cause the ^ to match any beginning of line, not just at the beginning of the string. Since $data contains the entire file, each of the expressions could match.
Re^2: Help with code optimization
by sundialsvc4 (Abbot) on Jun 28, 2013 at 03:13 UTC

    last?   Or maybe should be next?   Do you intend to jump out of the loop, or don’t you instead really mean, continue it?

      "last? Or maybe should be next? Do you intend to jump out of the loop, or don’t you instead really mean, continue it?"

      The loop for ($data) { ... } only executes once.

      Neither last nor next will make it execute any more or less times.

      When a match is found and the captured data is assigned to a variable, that's the last thing to be done.

      Here's what the last documentation says:

      "The last command is like the break statement in C (as used in loops); it immediately exits the loop in question."

      That's what we want to do here: immediately exit the loop.

      Here's what the next documentation says:

      "The next command is like the continue statement in C; it starts the next iteration of the loop"

      That's not what we want to do here: it's a one-pass loop; there are no more iterations.

      [You might like to follow the link I provided in my original reply. It has two more code examples where last is used to exit one-pass loops.]

      -- Ken