
Sort files descending by date

by Scrat (Monk)
on Jun 19, 2007 at 07:30 UTC

Scrat has asked for the wisdom of the Perl Monks concerning the following question:

Hi Everyone

I have a list of XML files in a directory:

WAN_DX_ACD_ACD_2007_06_10_00_10_38_042.csv.xml
WAN_DX_ACH_ACH_2007_06_13_00_10_37_051.csv.xml
WAN_DX_ADY_ADY_2007_06_10_00_10_37_060.csv.xml
WAN_DX_ALD_ALD_2007_06_10_00_10_38_073.csv.xml
WAN_DX_ALE_ALE_2007_06_10_00_10_39_106.csv.xml
WAN_DX_BFN_BFP_2007_06_11_00_15_52_400.csv.xml
WAN_DX_BNA_BNA_2007_06_30_00_22_32_641.csv.xml
WAN_DX_BLV_BLV_2007_06_22_00_22_34_667.csv.xml

The above filenames contain the following (taking the first file as an example):
WAN_DX_ACD_ACD - This is the network device name
2007_06_10 - This is the date (yyyy-mm-dd)
00_10_38_042 - This is the time, including milliseconds (00:10:38:042 AM)

I need to sort these files descendingly according to date and then time (process the oldest files first, and the newest files last). This is my code, which doesn't work:

#!/usr/bin/perl
use strict;
use warnings;

my $dir = "D:\\scripts\\";
my (@sorted_list, @file_list);

if ( opendir(DIR, "$dir") ) {
    foreach my $file ( readdir(DIR) ) {
        next if ( $file =~ /^\./ );
        if ($file =~ /xml$/) {
            foreach ($file) {
                push (@file_list, $_);
            }
        }
    }

    @sorted_list = map {$_->[0]}
                   sort { $b->[1] <=> $a->[1] }
                   map { [ +$_,(split/\D/)[15..21]] }
                   @file_list;

    foreach (@sorted_list) {
        print "$_\n";
    }
}
closedir (DIR);

When I print @sorted_list, the contents are still not sorted. What am I doing wrong?

Update: Fixed typo.

Update 2: Just realised another mistake - I was printing the old unsorted @file_list in my initial 2nd foreach loop, and not the new, sorted @sorted_list.

Replies are listed 'Best First'.
Re: Sort files descending by date
by grinder (Bishop) on Jun 19, 2007 at 07:45 UTC

    oog! Looking at that split I have no idea what you're trying to extract. Well, not at first glance. All you want is to extract the digit-and-underscore component before the file extensions. How about the simpler:

    my @sorted_list =
        map  { $_->[0] }
        sort { $b->[1] cmp $a->[1] }
        map  { /([\d_]+)\.csv\.xml$/ ? [$_,$1] : [$_, 0] }
        @file_list;

    Also, you should avoid using

    next if ( $file =~ /^\./ );

    One of these days it will come back to haunt you. It is wiser to use the more prosaic form:

    next if $file eq '.' or $file eq '..'

    Even if it doesn't look as cool because it lacks regexps.
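To illustrate the difference between the two tests, here is a minimal sketch. The directory listing is made up for the example (in particular `.pending.csv.xml` is a hypothetical file name), and it is only a guess that this is the kind of surprise the warning above has in mind: the regex form silently skips every name that starts with a dot, not just the two directory entries.

```perl
use strict;
use warnings;

# Hypothetical directory listing, for illustration only.
my @entries = (
    '.',
    '..',
    '.pending.csv.xml',
    'WAN_DX_ACD_ACD_2007_06_10_00_10_38_042.csv.xml',
);

# Regex test: drops anything whose name begins with a dot.
my @by_regex = grep { !/^\./ } @entries;

# Explicit test: drops only the current and parent directory entries.
my @by_name = grep { $_ ne '.' and $_ ne '..' } @entries;

print "regex test keeps: @by_regex\n";
print "name test keeps:  @by_name\n";
```

With this listing the regex test keeps one entry and the explicit test keeps two, because the dot-prefixed XML file survives only the explicit comparison.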

    • another intruder with the mooring in the heart of the Perl

      Thanks for the reply Grinder.

      It worked like a charm - now prints out:
      WAN_DX_BNA_BNA_2007_06_30_00_22_32_641.csv.xml
      WAN_DX_BLV_BLV_2007_06_22_00_22_34_667.csv.xml
      WAN_DX_ACH_ACH_2007_06_13_00_10_37_051.csv.xml
      WAN_DX_BFN_BFP_2007_06_11_00_15_52_400.csv.xml
      WAN_DX_ALE_ALE_2007_06_10_00_10_39_106.csv.xml
      WAN_DX_ALD_ALD_2007_06_10_00_10_38_073.csv.xml
      WAN_DX_ACD_ACD_2007_06_10_00_10_38_042.csv.xml
      WAN_DX_ADY_ADY_2007_06_10_00_10_37_060.csv.xml
Re: Sort files descending by date
by salva (Canon) on Jun 19, 2007 at 07:54 UTC
    @sorted_list = sort { substr($a, 15) cmp substr($b, 15) } @file_list;
    ... probably as fast as the ST and much simpler!

      Using the default sort handler should be even faster:

      my @sorted_list = map { substr($_, 23) } sort map { substr($_, 15, 23) . $_ } @file_list;

      Same thing, but uses very little overhead memory:

      my @sorted_list = @file_list;
      $_ = substr($_, 15, 23) . $_ for @sorted_list;
      @sorted_list = sort @sorted_list;
      $_ = substr($_, 23) for @sorted_list;
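As a worked example of the prepend-sort-strip idea, here is a small sketch run against four of the file names from the question. It assumes, as the sample data suggests but the thread does not guarantee, that every name has a 15-character device prefix followed by a 23-character timestamp; the variable name `@oldest_first` is introduced here for illustration.

```perl
use strict;
use warnings;

my @file_list = qw(
    WAN_DX_ACD_ACD_2007_06_10_00_10_38_042.csv.xml
    WAN_DX_ACH_ACH_2007_06_13_00_10_37_051.csv.xml
    WAN_DX_ADY_ADY_2007_06_10_00_10_37_060.csv.xml
    WAN_DX_BNA_BNA_2007_06_30_00_22_32_641.csv.xml
);

# Assumption: fixed layout of 15 prefix characters plus a
# 23-character timestamp, as in the sample names.
my @oldest_first =
    map { substr( $_, 23 ) }             # strip the prepended key again
    sort                                 # default string sort on the key
    map { substr( $_, 15, 23 ) . $_ }    # prepend the timestamp as the key
    @file_list;

print "$_\n" for @oldest_first;
```

Because the timestamp fields are zero-padded and fixed-width, a plain string comparison orders them chronologically. Applying `reverse` to the result would give newest first instead.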
Re: Sort files descending by date
by Anonymous Monk on Jun 19, 2007 at 07:50 UTC
     $b->[1] is not what you think. It's the year, which is 2007 for all.
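A short sketch of what the split in the question actually produces may make this clearer. It uses the first file name from the question; the variable names `@fields`, `$record` and `$key` are introduced here for illustration only.

```perl
use strict;
use warnings;

my $file = 'WAN_DX_ACD_ACD_2007_06_10_00_10_38_042.csv.xml';

# Every non-digit character is a separator, so the 15-character
# device prefix yields 15 empty leading fields (indices 0..14).
my @fields = ( split /\D/, $file )[ 15 .. 21 ];
print join( ',', @fields ), "\n";

# The anonymous array has the file name at index 0 and the year at
# index 1, and index 1 is all the original sort block compared.
my $record = [ $file, @fields ];
print "sort key used: $record->[1]\n";

# Joining all seven fields gives a key that does distinguish the files.
my $key = join '', @fields;
print "full key: $key\n";
```

If the joined key is used, comparing it with `cmp` rather than `<=>` is probably the safer choice: the key is 17 digits long, and the fields are zero-padded, so string order already matches chronological order.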
Re: Sort files descending by date
by fenLisesi (Priest) on Jun 19, 2007 at 08:09 UTC
    use strict;
    use warnings;

    my $SOME_DIR = '/home/scrat/scratchpad';
    my $PATTERN = qr{
        \A
        \w+
        (\d\d\d\d_\d\d_\d\d_\d\d_\d\d_\d\d_\d\d\d)
        [\.\w]*
        \.xml
        \z
    }xms;

    opendir( DIR, $SOME_DIR ) || die "can't opendir $SOME_DIR: $!";
    my @sorted =
        map  { $_->[0] }
        sort { $a->[1] cmp $b->[1] }
        map  { ($_ =~ $PATTERN) && [$_, $1] }
        grep { ($_ =~ $PATTERN) && -f "$SOME_DIR/$_"}
        readdir( DIR )
    ;
    closedir DIR;
    printf "%s\n", join "\n", @sorted;

      (\d\d\d\d_\d\d_\d\d_\d\d_\d\d_\d\d_\d\d\d)

      Perhaps it's just me but I find all those \ds confusing and would prefer to use quantifiers.

      (\d{4}(?:_\d\d){5}_\d{3})

      Also, the requirement was for a descending sort so I think

      sort { $a->[1] cmp $b->[1] }

      should be

      sort { $b->[1] cmp $a->[1] }

      Cheers,

      JohnGG

      Update: I should have read the OP more carefully, the sort required is actually ascending as fenLisesi points out.

        process the oldest files first, and the newest files last
Re: Sort files descending by date
by FunkyMonk (Chancellor) on Jun 19, 2007 at 16:42 UTC
    If the file names are all the same length and format (as would appear from your example data), I'd use a substr approach such as the one posted by ikegami.

    If there is a possibility of the format of the filenames changing, I'd use something like:

    my @sorted_list =
        map  { $_->[1] }
        sort { $a->[0] cmp $b->[0] }
        map  { [ do { (my $x = $_) =~ tr/0-9//dc; $x }, $_ ] }
        @file_list;

    However, having written it, I'm not particularly proud of using a do{} within a map.

      I'm not particularly proud of using a do{} within a map.

      Then don't do it :)

      map { (my $ts = $_) =~ tr/0-9//dc; [ $ts, $_ ] } @file_list;
        Yes, that's much better. Thank you.
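Putting the pieces of this sub-thread together, a complete version might look like the sketch below, run against three of the names from the question. One caveat that the thread does not mention: `tr/0-9//dc` deletes every non-digit character, so the key is only a clean timestamp as long as the device name and extensions contain no digits, which holds for the sample data but is an assumption in general.

```perl
use strict;
use warnings;

my @file_list = qw(
    WAN_DX_BNA_BNA_2007_06_30_00_22_32_641.csv.xml
    WAN_DX_ACD_ACD_2007_06_10_00_10_38_042.csv.xml
    WAN_DX_BFN_BFP_2007_06_11_00_15_52_400.csv.xml
);

my @sorted_list =
    map  { $_->[1] }                                        # recover the file name
    sort { $a->[0] cmp $b->[0] }                            # oldest first, by digit string
    map  { ( my $ts = $_ ) =~ tr/0-9//dc; [ $ts, $_ ] }     # key: digits only
    @file_list;

print "$_\n" for @sorted_list;
```

Swapping `$a` and `$b` in the sort block would give newest first instead.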
