sort utility key specification

by BrowserUk (Patriarch)
on Apr 21, 2012 at 13:16 UTC ( [id://966354]=perlquestion )

BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing a sort utility. I need to decide how keys should be specified on the command line. I'm looking for two pieces of guidance:

  1. Would what I have allow you to specify the keys for sorts that you commonly use?

    Note: I'd prefer comments based upon real, actual requirements rather than speculative ones.

  2. Comments, criticisms, and suggestions regarding the format of the command line switches.

My current thinking is that there will be two different, mutually exclusive key specification switches:

  • -Kp[ndsi][l][,p[ndsi][l]][,...]

    Where:

    • p -- absolute offset within the record, of the start of the key field;
    • [ndsi] -- field type numeric (integer) | double (float) | string (case sensitive) | ignore (case insensitive).

      Optional: default 's';

    • l -- length

      Optional: default is the rest of the record for 's' & 'i'; for 'n' & 'd', whatever the string-to-number conversion consumes when applied at that point in the record.

  • -Ff[ndsi][o][,f[ndsi][o]][,...]

    Where:

    • f -- field number (1-based fields separated by -T"...");
    • [ndsi] -- field type numeric (integer) | double (float) | string (case sensitive) | ignore (case insensitive).

      Optional: default 's';

    • o -- character offset (1-based) into field

      Optional: default is the rest of the field for 's' & 'i'; for 'n' & 'd', whatever the string-to-number conversion consumes when applied at that point in the field.
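
As a rough illustration of the grammar described above (not BrowserUk's implementation), the text following -K or -F could be parsed along these lines. The names `KeySpec` and `parse_key_specs` are hypothetical. One assumption made by this sketch: a trailing number (the length or offset) is only recognised after an explicit type letter, since a bare run of digits such as `105` cannot otherwise be split unambiguously.

```python
import re
from typing import List, NamedTuple, Optional

# One comma-separated element: a number, an optional type letter,
# and an optional second number (only reachable after a type letter).
_ELEMENT = re.compile(r"(\d+)([ndsi]?)(\d*)")


class KeySpec(NamedTuple):
    start: int            # -K: record offset p; -F: field number f
    kind: str             # 'n', 'd', 's' or 'i'
    extra: Optional[int]  # -K: length l; -F: offset o; None if omitted


def parse_key_specs(arg: str) -> List[KeySpec]:
    """Parse the text following -K or -F into a list of KeySpec tuples."""
    specs = []
    for element in arg.split(","):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"bad key specification: {element!r}")
        start, kind, extra = m.groups()
        # The type letter defaults to 's' when omitted, as in the description.
        specs.append(KeySpec(int(start), kind or "s",
                             int(extra) if extra else None))
    return specs
```

For example, `parse_key_specs("10n5,20i,1")` yields three keys: a numeric key at 10 with length 5, a case-insensitive key at 20, and a default string key at 1.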

To save time, my response to "Why not use the same switch format as GNU sort?": because I find its switches ambiguous, verbose and inflexible.

Thanks for your time.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Replies are listed 'Best First'.
Re: sort utility key specification
by roboticus (Chancellor) on Apr 21, 2012 at 13:53 UTC

    BrowserUk:

    I occasionally need to sort financial files from mainframes, so I'd love to see flags for packed decimal fields. I have a script that I use to patch up these files into text for sorting, but it would be nice to skip that step.

    Also, I don't know whether your n flag means that the sort routine expects binary values in your records, or text strings representing numbers. If it means your records contain binary values, I'd really like to sort a numeric text column in natural order:

    unsorted    sorted
    ----------  ----------
    xxx1 xxx    xxx1 xxx
    xxx11 xxx   xxx 5 xxx
    xxx 5 xxx   xxx 010xxx
    xxx010 xxx  xxx11 xxx
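
The ordering roboticus asks for can be sketched as follows. This is only an illustration: the helper name `numeric_column_key` is hypothetical, and the sample records assume the number sits in a fixed three-character column at 0-based offset 3, padded with either spaces or zeros (the exact spacing of the sample is an assumption).

```python
def numeric_column_key(record: str, start: int, length: int) -> int:
    """Return the integer value of a fixed-width text column.

    int() ignores surrounding whitespace and leading zeros, so ' 11',
    '  5' and '010' compare by value rather than character by character.
    """
    return int(record[start:start + length])


# Assumed layout: a 3-character numeric column at 0-based offset 3.
records = ["xxx  1 xxx", "xxx 11 xxx", "xxx  5 xxx", "xxx010 xxx"]

by_value = sorted(records, key=lambda r: numeric_column_key(r, 3, 3))
by_string = sorted(records)
```

Here `by_value` orders the records 1, 5, 010, 11, while the plain string sort in `by_string` places the zero-padded 010 last because a space sorts before '0'.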

    Regarding your comment about gnu sort: I know what you mean. I use it infrequently enough that I can't remember just how to specify what I want, but often enough to wish it was easy to remember. I generally have to sort the file a few times before I can get the switches just right.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      I'd love to see flags for packed decimal fields.

      Hm. I was under the impression that packed decimal fields sorted in correct numeric order when treated as strings?

      Also, I don't know whether your n flag means that the sort routine expects binary values in your records, or text strings representing numbers.

      Text representing numbers.



        BrowserUk:

        They do, until you throw in some negative numbers. They munge the sign into the last character.
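
A small sketch of why the sign matters, assuming the usual packed decimal (COMP-3) layout: two BCD digits per byte, with the low nibble of the last byte holding the sign (0xD or 0xB for negative, otherwise positive or unsigned). The function name `unpack_decimal` and the sample values are illustrative, and any implied decimal point is ignored.

```python
def unpack_decimal(field: bytes) -> int:
    """Decode a packed decimal field into a signed integer."""
    digits = []
    for byte in field[:-1]:
        digits.append(byte >> 4)    # high nibble: one digit
        digits.append(byte & 0x0F)  # low nibble: next digit
    digits.append(field[-1] >> 4)   # last byte: high nibble is a digit
    sign = field[-1] & 0x0F         # last byte: low nibble is the sign
    value = int("".join(str(d) for d in digits))
    return -value if sign in (0x0B, 0x0D) else value


# Two-byte fields holding +123, -123, +45 and -5.
fields = [b"\x12\x3C", b"\x12\x3D", b"\x04\x5C", b"\x00\x5D"]

as_bytes = [unpack_decimal(f) for f in sorted(fields)]
as_numbers = [unpack_decimal(f) for f in sorted(fields, key=unpack_decimal)]
```

Sorting the raw bytes gives -5, 45, 123, -123, because the sign nibble is only compared after all of the digits; sorting on the decoded value gives -123, -5, 45, 123.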

        ...roboticus


Re: sort utility key specification
by flexvault (Monsignor) on Apr 21, 2012 at 14:21 UTC

    BrowserUk,

    I think your command line switches are fine. One thing I would add is the ability to only have 1 copy of the line in the output. I use that all the time and I personally prefer '-1' over '-u' which I have to look up all the time.

    Thank you

    "Well done is better than well said." - Benjamin Franklin

      the ability to only have 1 copy of the line in the output.... prefer '-1' over '-u'

      -u is already enabled -- it's not mentioned as this post was only in reference to the key specifications. I can see no reason not to have -1 as an alias.



        Great . . .

        "Well done is better than well said." - Benjamin Franklin
