### Sorting dates with the Schwartzian Transform

by Wobbel (Acolyte) on Aug 03, 2011 at 13:28 UTC
Wobbel has asked for the wisdom of the Perl Monks concerning the following question:

Sorting on 4 columns of a 40-column text file with the Schwartzian Transform works fine, except for the date column.

It's in DD-MM-YYYY format and alfabetically/numerically handled as a string.

Do I have to convert this column before the ST to a number, sort it during the ST as a number, and after the ST convert it back to the DD-MM-YYYY format?

Or is an elegant solution possible within the ST?

I think the first option will harm performance...

Any expert advice is very welcome.

Thanks.

Replies are listed 'Best First'.
Re: Sorting dates with the Schwartzian Transform
by moritz (Cardinal) on Aug 03, 2011 at 13:37 UTC
You can convert it all to YYYYMMDD or YYYY-MM-DD format, and then use a simple string-comparison-based sort on the results.
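To see why the conversion matters, here is a minimal sketch with made-up sample dates (the array names are mine, not from the question). It sorts the raw DD-MM-YYYY strings, then sorts the same dates rewritten as YYYY-MM-DD:

```perl
use strict;
use warnings;

my @dates = qw(15-03-2011 02-11-2010 28-01-2011);   # DD-MM-YYYY, sample data

# A plain string sort compares the day first, so the order is not chronological.
my @by_string = sort @dates;

# Rewrite each date as YYYY-MM-DD; these strings do sort chronologically.
my @iso = sort map { join('-', reverse split /-/) } @dates;

print "string sort: @by_string\n";   # 02-11-2010 15-03-2011 28-01-2011
print "ISO sort:    @iso\n";         # 2010-11-02 2011-01-28 2011-03-15
```

The string sort puts 15-03-2011 ahead of 28-01-2011 because "15" lt "28"; once the year leads, string order and date order coincide.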
Re: Sorting dates with the Schwartzian Transform
by ikegami (Pope) on Aug 03, 2011 at 18:21 UTC

A Schwartzian Transform when all you have to do is parse dates? That will make things *slower*. Creating all those arrays and references adds up.

This could speed up the sorting (because it creates few extra variables and it uses the specially optimised $a cmp $b callback):

```perl
my @sorted =
   map substr($_, 8),
   sort
   map join('', (/(..)-(..)-(....)/)[2,1,0], $_),
   @dates;  # DD-MM-YYYY
```

Naïve:

```perl
my @sorted =
   sort {
      join('', ($a =~ /(..)-(..)-(....)/)[2,1,0])
         cmp
      join('', ($b =~ /(..)-(..)-(....)/)[2,1,0])
   }
   @dates;  # DD-MM-YYYY
```

Schwartzian Transform:

```perl
my @sorted =
   map $_->[0],
   sort { $a->[1] cmp $b->[1] }
   map [ $_, join('', (/(..)-(..)-(....)/)[2,1,0]) ],
   @dates;  # DD-MM-YYYY
```
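For anyone who wants to check the speed claim on their own data, here is a rough sketch using the core Benchmark module. The generated test data, the sub names `packed_sort`, `naive_sort` and `st_sort`, and the iteration count are my own assumptions; no timings are claimed here, since they depend on the machine and the input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 500 pseudo-random DD-MM-YYYY strings.
srand(42);
my @dates = map {
    sprintf '%02d-%02d-%04d',
        1 + int(rand(28)), 1 + int(rand(12)), 1990 + int(rand(30))
} 1 .. 500;

# Packed key: prefix each date with YYYYMMDD, plain sort, strip the prefix.
sub packed_sort {
    return map  { substr($_, 8) }
           sort
           map  { join('', (/(..)-(..)-(....)/)[2,1,0], $_) }
           @_;
}

# Naive: reparse both dates on every comparison.
sub naive_sort {
    return sort {
        join('', ($a =~ /(..)-(..)-(....)/)[2,1,0])
            cmp
        join('', ($b =~ /(..)-(..)-(....)/)[2,1,0])
    } @_;
}

# Schwartzian Transform: [original, key] pairs, sort on the key, unwrap.
sub st_sort {
    return map  { $_->[0] }
           sort { $a->[1] cmp $b->[1] }
           map  { [ $_, join('', (/(..)-(..)-(....)/)[2,1,0]) ] }
           @_;
}

# Sanity check first: all three approaches must produce the same order.
my @packed = packed_sort(@dates);
my @naive  = naive_sort(@dates);
my @st     = st_sort(@dates);
my $agree  = "@packed" eq "@naive" && "@packed" eq "@st";
print $agree ? "all three orders agree\n" : "orders differ\n";

# Then compare speed; raise the count for steadier numbers.
cmpthese(200, {
    packed => sub { my @s = packed_sort(@dates) },
    naive  => sub { my @s = naive_sort(@dates) },
    st     => sub { my @s = st_sort(@dates) },
});
```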

Dear Perl experts, thanks for all the useful replies and different approaches! The PerlMonks website is a really good place to learn new things and why you choose a certain solution. Great.

The sorting is not only about dates: the columns include Patient_ID, Course_ID, Session_number, Session_date, Imaging_type, and a lot of measurements. The sorting question is the last part of a bigger project. "The doctor" needs the data in an Excel-friendly format :-( .

I can't wait to fine tune my code, but I'll have to wait till tomorrow.

I'm so close to the last step, but....

What if you sort on two or three special columns? In my case, date is column 11 and time is column 12. Is your original code limited to one column, or is it possible to "map" on more than one time/date format?

I've Googled and tried a lot last week, but I'm stuck (on the syntax).

```perl
my @sorted =
   map $_->[0],
   sort { $a->[11] cmp $b->[11] || #Date, original
#      $a->[12] cmp $b->[12]    #Time, to do list
   }
   map [ $_, join('', (/(..)-(..)-(....)/)[2,1,0]) ],
   # map [ $_, join('', (/(..):(..):(..)/)[2,1,0]) ], # Is it possible to map two columns date and time?

   @dates;  # DD-MM-YYYY
            # HOURS:MIN:SEC
```

I've Googled and tried a lot last week, but I'm stuck (on the syntax).

I hope you understand my message despite the wording.

Actually it seems more like you're stuck on syntax and arrays.

You need to read perlintro, Basic debugging checklist, How do I post a question effectively?, and References quick reference.

Also, when you have a program with real, named variables, talking about columns can get confusing; talk about your variables instead ;)

The code you pasted will never have a 12-element array, nor do you want one.

I would go back to

```perl
my @sorted =
   map substr($_, 8),
   sort
   map join('', (/(..)-(..)-(....)/)[2,1,0], $_),
   @dates;  # DD-MM-YYYY
```
Don't get it? To understand, you would write a program like this:

```perl
#!/usr/bin/perl --
use strict;
use warnings;
use Data::Dumper;

my @dates = qw[
    08-15-2011
    08-10-2011
    08-05-2011
];

print "\ndates ", Dumper( \@dates );

#~ my @firstTransform = map join('', (/(..)-(..)-(....)/)[2,1,0], $_), @dates;  # DD-MM-YYYY
my @firstTransform = map join('', ReorderForCmp($_), $_), @dates;  # DD-MM-YYYY
print "\nfirstTransform ", Dumper( \@firstTransform );

my @firstSorted = sort @firstTransform;
print "\nfirstSorted ", Dumper( \@firstSorted );

my @finalTransform = map substr($_, 8), @firstSorted;
print "\nfinalTransform ", Dumper( \@finalTransform );

# Split a DD-MM-YYYY string and return its parts as (YYYY, MM, DD).
sub ReorderForCmp {
    my( $one ) = @_;
    my @date = $one =~ /(..)-(..)-(....)/;

    #~     return @date[2,1,0];
    return $date[2], $date[1], $date[0];
}

__END__
```

which produces this output:

```
dates $VAR1 = [
          '08-15-2011',
          '08-10-2011',
          '08-05-2011'
        ];

firstTransform $VAR1 = [
          '2011150808-15-2011',
          '2011100808-10-2011',
          '2011050808-05-2011'
        ];

firstSorted $VAR1 = [
          '2011050808-05-2011',
          '2011100808-10-2011',
          '2011150808-15-2011'
        ];

finalTransform $VAR1 = [
          '08-05-2011',
          '08-10-2011',
          '08-15-2011'
        ];
```

So yes, it is possible to "map two columns date and time": just adjust sub ReorderForCmp to return an ISO 8601-style datetime (YYYYMMDDHHMMSS).
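A minimal sketch of that adjustment, under some loud assumptions: the records are tab-separated, the date and time sit in the second and third fields, and the helper `datetime_key` and the sample rows are invented for illustration (the real file has 40 columns, with date and time elsewhere).

```perl
use strict;
use warnings;

# Hypothetical records: ID, date (DD-MM-YYYY), time (HH:MM:SS), tab-separated.
my @rows = (
    "P2\t03-08-2011\t14:05:00",
    "P1\t03-08-2011\t09:30:15",
    "P3\t28-07-2011\t23:59:59",
);

# Build a fixed-width YYYYMMDDHHMMSS key from a date field and a time field.
sub datetime_key {
    my ($date, $time) = @_;
    my ($d, $m, $y) = $date =~ /^(\d\d)-(\d\d)-(\d{4})$/
        or die "bad date: $date";
    my ($hh, $mm, $ss) = $time =~ /^(\d\d):(\d\d):(\d\d)$/
        or die "bad time: $time";
    return "$y$m$d$hh$mm$ss";
}

# Prefix each row with its 14-character key, sort as plain strings, strip the key.
my @sorted =
    map  { substr($_, 14) }
    sort
    map  { my @f = split /\t/; datetime_key(@f[1, 2]) . $_ }
    @rows;

print "$_\n" for @sorted;   # P3 first, then P1, then P2
```

Note that only the date needs reordering. HH:MM:SS is already most-significant-first, so applying the `[2,1,0]` slice to the time would sort on seconds before hours.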
Re: Sorting dates with the Schwartzian Transform
by FunkyMonk (Chancellor) on Aug 03, 2011 at 16:20 UTC
"Do I have to convert this column before the ST to a number,"

A number or a string, it makes little difference.

"and after the ST convert it back to the DD-MM-YYYY format?"

No, you just throw the sortable number/string away. Something like this:

```perl
print for map  { $_->[0] }               # extract original date
          sort { $a->[1] cmp $b->[1] }   # sort, using sortable date
          map  {
              m/(\d\d)-(\d\d)-(\d{4})/;
              [$_, "$3$2$1"]             # [original date, sortable date]
          } <DATA>;

__DATA__
02-02-2007
01-01-2006
03-03-2009
02-02-2009
```

Output:

```
01-01-2006
02-02-2007
02-02-2009
03-03-2009
```

Unless I state otherwise, all my code runs with strict and warnings
Re: Sorting dates with the Schwartzian Transform
by Anonymous Monk on Aug 03, 2011 at 13:45 UTC

Do I have to convert this column before the ST to a number, sort it during the ST as a number, and after the ST convert it back to the DD-MM-YYYY format?

Um no, just do a schwartzian-transform ;)

```perl
@dates = map  { $_->[0] }
         sort { $a->[1] <=> $b->[1] }
         map  { my $f = $_; s/\D//g; [ $f, $_ ] }
         @dates;
```

Schwartzian, no doubt, but not enough transformation. The dates are originally in DD-MM-YYYY format and must be rearranged for effective sorting.

```perl
@dates = map  { $_->[0] }
         sort { $a->[1] <=> $b->[1] }   # numeric comparison of YYYYMMDD keys
         map  { my $f = $_; /(\d\d)-(\d\d)-(\d{4})/; [ $f, "$3$2$1" ] }
         @dates;
```

Also, incomplete. There are 3 other columns that should figure in the sort. I'd like to see the OP's code.

On a side note, I had never heard of the ST before. I looked briefly in perlsyn to see how I would know that the map {} sort {} map {} would be executed in reverse order, but I didn't find it. Any clues?

... how I would know that the map {} sort {} map {} would be executed in reverse order, but I didn't find it. Any clues?

Both map and sort take a list (which is on the RHS), do some transformation of it then return the transformed list (to the LHS).

```perl
my @mapped = map  { ... } @unmapped;   # "..." stands for some transform code
my @sorted = sort { ... } @unsorted;   # "..." stands for the sorting code
```

The ST code is just an extension of this right-to-left pattern:

• the first map extracts the sorted dates in the original DD-MM-YYYY format and assigns to the @dates array on the LHS of the assignment operator (=);

• but it can't do that before the sort has evaluated, sorting the items;

• which in turn can't do any sorting before the bottom (or rightmost) map has transformed some dates into something sort can work with, taking its raw material from the rightmost part of the expression which is the original, unsorted @dates array.
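The three steps above can also be written out one at a time, which may make the order of evaluation easier to follow. This is only a sketch: the intermediate array names and the sample dates are assumptions for illustration.

```perl
use strict;
use warnings;

my @dates = qw(02-02-2007 01-01-2006 03-03-2009 02-02-2009);   # sample DD-MM-YYYY data

# Step 1 (the bottom, rightmost map): pair each date with a sortable YYYYMMDD key.
my @pairs = map { [ $_, join('', (split /-/)[2, 1, 0]) ] } @dates;

# Step 2 (the sort in the middle): order the pairs by their keys.
my @ordered = sort { $a->[1] cmp $b->[1] } @pairs;

# Step 3 (the top, leftmost map): keep only the original strings.
my @sorted = map { $_->[0] } @ordered;

# The chained form runs the same three steps, read from the bottom up.
my @chained =
    map  { $_->[0] }
    sort { $a->[1] cmp $b->[1] }
    map  { [ $_, join('', (split /-/)[2, 1, 0]) ] }
    @dates;

print "@sorted\n";   # 01-01-2006 02-02-2007 02-02-2009 03-03-2009
```

Each function call needs its argument list before it can run, so the list furthest to the right is always evaluated first.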

I hope this makes things a bit clearer.

Cheers,

JohnGG

Re: Sorting dates with the Schwartzian Transform
by salva (Abbot) on Aug 04, 2011 at 09:06 UTC
Re: Sorting dates with the Schwartzian Transform
by osbosb (Monk) on Aug 03, 2011 at 14:02 UTC
alfabetically? Really? Come on.

Node Type: perlquestion [id://918248]
Approved by Marshall