Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling

Re^3: simple table join script

by NetWallah (Abbot)
on Jun 01, 2012 at 14:15 UTC ( #973776=note: print w/ replies, xml ) Need Help??

in reply to Re^2: simple table join script
in thread simple table join script

Addressing your points:

  • It changes the order of the rows. : Yes - it SORTS them. No change if they are already sorted.
  • It drops a value from the first row.: Agreed - I found and fixed this bug this AM (original post was late at night)
  • (Minor) It splits on spaces, not tabs.: Correction - it splits on WHITESPACE, that includes tab.

I agree - it is not the easiest code to read, but I think it is more data-tolerant than yours, which complains under "use warnings", if the data contains blank lines (which are present in the OP).

Regarding efficiency - sorting a million pre-sorted records takes less than a second, on modern computers, so I don't see an issue.

I do appreciate your critique, and enjoy the discussion, but, at times, i have a low threshold for responding to nits, so I apologize in advance if I appear to be un-responsive. to subsequent posts.

             I hope life isn't a big joke, because I don't get it.

Comment on Re^3: simple table join script
Re^4: simple table join script
by sauoq (Abbot) on Jun 01, 2012 at 14:44 UTC
    Yes - it SORTS them.

    Where did he say changing the order in any manner was what he wanted? The issue is that your code essentially requires reordering in some manner because of your poor choice to use hashes which will destroy the original order. You left yourself with the choice to either sort or go with whatever reordering perl gives you.

    That's not a nit. Not being able to retain the original order is a fundamental issue. Perhaps he doesn't need to. He didn't say. But if he does need to, your code is worthless.

    The efficiency issue may or may not be of practical importance. But its only an issue at all because of the same aforementioned poor design decision. The straight forward approach I showed isn't better because it's more efficient... it's more efficient because it's better.

    Correction - it splits on WHITESPACE, that includes tab.

    That's right. And thanks for picking a nit with my point... I like precision. It's still a potential problem because he didn't specify whether his fields could have other whitespace... and if they did, your code would break. That's a little less data tolerant, arguably, than some warnings on blank lines.† I called it "minor" because it is so easily corrected, unlike your code's other issues.

    Note that adding sorting to the straight forward approach is easily done with a single sort if desired.

    Note also that skipping blank lines is easily added to the straight forward approach with an appropriately placed next unless /\S/; or similar.

    † Warnings on blank lines may actually be desirable. It's often a good think to throw a warning on data anomalies.

    "My two cents aren't worth a dime.";

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://973776]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (5)
As of 2014-09-21 17:41 GMT
Find Nodes?
    Voting Booth?

    How do you remember the number of days in each month?

    Results (173 votes), past polls