PerlMonks  

Re^2: Bug in Sort::Fields?

by cmv (Chaplain)
on Jul 20, 2010 at 18:39 UTC


in reply to Re: Bug in Sort::Fields?
in thread Split(), Initial Spaces, & a limit?

ikegami-

I'm sorry, but I don't believe I understand your point. It seems that all you did to fix the problem was to remove the initial spaces in the original data.

In my opinion Sort::Fields should sort the data the same way, regardless of where the data is (field 1 or field 2). If you try to numerically sort the output of an 'ls -s' command, you can see the problem clearly:

    use strict;
    use warnings;
    use Sort::Fields;
    use Data::Dumper;

    my @data = `ls -s`;
    chomp(@data);
    my @sorted = fieldsort( ['1n'], @data);
    print(Dumper(\@sorted));

This doesn't do what is intended, which is why I reported it to the author. I'm sure I could remove the initial spaces before calling Sort::Fields, then put them back after it's done, but that doesn't seem right to me.

-Craig


Re^3: Bug in Sort::Fields?
by ikegami (Pope) on Jul 20, 2010 at 19:02 UTC

    regardless of where the data is (field 1 or field 2).

    The key must be either in field 1 or in field 2. It can't vary by row. You're providing

    Line              Field 1      Field 2      Field 3
    ----------------  -----------  -----------  -----------
      56 1752.eps     "",          "56",        "1752.eps"   key in 2
    1160 trace.exe    "1160",      "trace.exe"               key in 1
    123 foo bar.pl    "123",       "foo",       "bar.pl"     key in 1

    You need to normalize your fields so that they are the same for each row. I did it by removing the extraneous delimiter in the front of some lines.

    Line              Field 1      Field 2      Field 3
    ----------------  -----------  -----------  -----------
    56 1752.eps       "56",        "1752.eps"                key in 1
    1160 trace.exe    "1160",      "trace.exe"               key in 1
    123 foo bar.pl    "123",       "foo",       "bar.pl"     key in 1
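The normalization described above can be sketched in plain Perl. This is only an illustration: the sample lines are made up to resemble `ls -s` output, and the `@normalized` name is an assumption, not something from the thread. It strips the leading whitespace and then splits on /\s+/ to show that the size lands in field 1 for every row.

```perl
use strict;
use warnings;

# Made-up sample lines shaped like the `ls -s` output under discussion.
my @data = ('  56 1752.eps', '1160 trace.exe', '123 foo bar.pl');

# Remove the extraneous leading delimiter so the size is always field 1.
my @normalized = map { my $line = $_; $line =~ s/^\s+//; $line } @data;

# Show the fields each normalized line splits into.
for my $line (@normalized) {
    my @fields = split /\s+/, $line;
    print join(', ', map { qq("$_") } @fields), "\n";
}
```

The normalized list could then be handed to a field-based sort with the key in field 1 for every row.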

    You could also add an extraneous delimiter to the lines that don't have one.

    Line              Field 1      Field 2      Field 3      Field 4
    ----------------  -----------  -----------  -----------  -----------
      56 1752.eps     "",          "56",        "1752.eps"
     1160 trace.exe   "",          "1160",      "trace.exe"
     123 foo bar.pl   "",          "123",       "foo",       "bar.pl"

    By the way, why not just let ls do the sorting if you're going to use ls?

    Update: Improved visuals.

      Yes, I agree with you that the key cannot vary by row.

      Since this is what Sort::Fields is doing, are you agreeing with me that this is a bug in that module?

      As I implied earlier, the module should sort the exact same data, the same way, no matter where that data shows up (field 1, field 2, or field N).

      Thanks

      -Craig

      update: Ah, sorry, you were still editing, and added more to the reply while I was responding.

      I think my point is that Sort::Fields will "do the right thing" if the data containing initial spaces is in any place other than field 1. If the data is in field 1, then it does something different.

      I believe this is because the author specifically uses /\s+/ as the field delimiter in his code. This works fine for every field except field 1 (as shown by the discussion at the beginning of this post). I would like to see this module do the same thing on the same data, no matter what field it shows up in.

      I believe this should be easily fixable by replacing the /\s+/ with ' ' (the special split pattern that skips leading whitespace), as you showed me in an earlier post. I just can't figure out how to do that in his code. I also don't know what side effects that would have.
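The difference between the two delimiters can be seen with Perl's built-in split alone. This is a minimal sketch on a made-up line; it says nothing about how Sort::Fields is implemented internally or whether changing its delimiter would be safe.

```perl
use strict;
use warnings;

my $line = '  56 1752.eps';    # made-up line with leading spaces

# A regex delimiter keeps the empty field before the leading whitespace.
my @by_regex = split /\s+/, $line;

# The special ' ' pattern skips leading whitespace, awk-style.
my @by_space = split ' ', $line;

printf "%d fields with /\\s+/, %d fields with ' '\n",
    scalar @by_regex, scalar @by_space;
```

With /\s+/ the first field is the empty string and the size sits in field 2; with ' ' the size is field 1.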

        Yes, I agree with you that the key cannot vary by row.

        Then don't pass such data to Sort::Fields.

        are you agreeing with me that this is a bug in that module?

        Most definitely not. It's sorting using the specified field as the key. It's not its fault you didn't put the key in that field.

        I would like to see this module do the same thing on the same data, no matter what field it shows up in.

        That's exactly what it does.

        What's left of the first space group is the first field.
        What's left of the second space group is the second field.
        What's left of the third space group is the third field.

        You're the one who wants to treat the first field specially. That's not a problem, but you'll have to do it yourself.
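One way to "do it yourself" is to sort on the first non-blank token and leave the original lines untouched. The sketch below uses plain Perl (a Schwartzian transform) rather than Sort::Fields, on made-up sample data; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Made-up sample lines; some carry leading spaces, some do not.
my @data = ('  56 1752.eps', '1160 trace.exe', ' 123 foo bar.pl');

# Pair each line with its numeric key, sort on the key, then unpack.
# split ' ' ignores leading whitespace, so the key is the first token.
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] }
    map  { [ (split ' ', $_)[0], $_ ] }
    @data;

print "$_\n" for @sorted;
```

Because only the key is derived from the stripped form, the leading spaces in the original lines survive in the output.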
