PerlMonks  

Re: Delete all but the most recent backup file

by Kenosis (Priest)
on Jan 24, 2013 at 05:54 UTC


in reply to Delete all but the most recent backup file

If I'm understanding you correctly, perhaps the following will be helpful:

use strict;
use warnings;

chomp( my @fileNames = <DATA> );

my @sortedFileNames =
  map $_->[0],                                            # 4. keep only the name
  sort { $b->[1] <=> $a->[1] }                            # 3. newest date first
  map { my ( $d, $m, $y ) = /(\d+)/g; [ $_, "$y$m$d" ] }  # 2. pair name with yyyymmdd
  grep /^backup_\d\d_\d\d_\d{4}\.bak$/, @fileNames;       # 1. backup files only

# Drop the most recent backup from the list, so it is not deleted
shift @sortedFileNames;

if (@sortedFileNames) {
    print "$_\n" for @sortedFileNames;
    #unlink @sortedFileNames;
}

__DATA__
backup_21_01_2013.bak
file.txt
backup_20_01_2013.bak
what_is_this.doc
backup_24_01_2013.bak
never_open_this.docx
backup_22_01_2013.bak
stuff.ini
backup_23_01_2013.bak
more_stuff.ini
deleteOldBackups.pl

Output (the files that would be deleted):

backup_23_01_2013.bak
backup_22_01_2013.bak
backup_21_01_2013.bak
backup_20_01_2013.bak

If you populate @fileNames with the file names in the directory where the backups live, the grep lets only backup-patterned names through. Then, using a Schwartzian transform, the code sorts the backup file names in descending order by date and shifts the first element (the most recent backup file name) off @sortedFileNames. As written, the names left in @sortedFileNames are only printed; uncomment the unlink line to actually delete all but the most recent backup file.
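The chained map/sort/map/grep can be easier to follow when each stage gets its own variable. What follows is only a sketch of the same idea, not a replacement for the code above: the sample file names and the intermediate names (@backups, @decorated, @sorted, $newest) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample list standing in for a directory listing.
my @fileNames = qw(
    backup_21_01_2013.bak file.txt backup_20_01_2013.bak
    backup_24_01_2013.bak stuff.ini backup_02_02_2012.bak
);

# Step 1 (the grep): keep only names that match the backup pattern.
my @backups = grep { /^backup_\d\d_\d\d_\d{4}\.bak$/ } @fileNames;

# Step 2 (the lower map): pair each name with a sortable key, yyyymmdd.
my @decorated = map {
    my ( $d, $m, $y ) = /(\d+)/g;
    [ $_, "$y$m$d" ];
} @backups;

# Step 3 (the sort): newest first, comparing only the keys.
my @sorted = sort { $b->[1] <=> $a->[1] } @decorated;

# Step 4 (the upper map): throw the keys away and keep the names.
my @sortedFileNames = map { $_->[0] } @sorted;

# Split off the most recent backup; the rest are deletion candidates.
my $newest = shift @sortedFileNames;

print "keep:   $newest\n";
print "delete: $_\n" for @sortedFileNames;
```

The chained form does exactly these steps without the temporary arrays, which is why it reads from the grep at the bottom up to the map at the top.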

** Please thoroughly test and verify this on a copy of the backup directory before going live with it. **

Replies are listed 'Best First'.
Re^2: Delete all but the most recent backup file
by jagexCoder (Novice) on Jan 29, 2013 at 04:34 UTC
    Hi there! So I did some reading and have some queries:

    (1) Why is the Schwartzian transform read from bottom to top? In procedural programming, code is usually read top to bottom. After reading the Wikipedia article it makes sense, but I'm curious about why this method behaves that way.

    (2) Could you please explain the expression used in map {}? { my $stat = stat $_; [ $_, $stat->mtime ] } I understand that a scalar variable called $stat has been defined from the 'default variable' $_. My understanding is that when map() is run (the map on the bottom), it evaluates the expression within {} for each element in @fileNames and stores that element in the default variable. So would this mean that the default variable changes each time an element is passed?

    (3) Further to (2), I understand that "$stat->mtime" gets the last-modified time (in seconds since the epoch) for each value of $stat. My understanding is that each time the next element from the array is passed, the mtime for that particular $stat is obtained. So what is the meaning of [ $_, $stat->mtime ], given that there's a comma separating the two?

    (4) The semicolon in { my $stat = stat $_; [ $_, $stat->mtime ] } separates the two statements within this single block. Is that correct?

    (5) I understand that sort { $b->[1] <=> $a->[1] } performs a descending numeric sort. However, what I don't get is the [1] that $b and $a both share. Also, what's the relationship between $b and $a? I found an example of this type of sort on the net, but it did not explain why $b and $a are used. Do they simply represent two different locations in a list?

    (6) The map $_->[0] does not appear to follow the format map({expression}, list). How is this different from the standard map function?

    Thanks for your help! Sorry for these questions just clarifying my doubts.
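Several of these points can be checked in a short script. The sketch below is not part of any reply in this thread; the %mtimeOf hash and its numbers are made up so that it runs without touching real files. On (1): each function takes the list on its right and hands its result to the function on its left, so the data flows from the last line of the chain up to the first.

```perl
use strict;
use warnings;

# Hypothetical name/timestamp pairs; in the thread the second value
# comes from stat or from the date in the file name.
my %mtimeOf = ( 'a.bak' => 300, 'b.bak' => 100, 'c.bak' => 200 );

# (2), (4): map runs its block once per list element, with $_ set to
# the current element. The semicolon separates two statements in the
# block; the value of the last one is what map collects.
my @pairs = map { my $t = $mtimeOf{$_}; [ $_, $t ] } sort keys %mtimeOf;

# (3): [ $_, $t ] builds an anonymous two-element array and returns a
# reference to it. Index 0 holds the name, index 1 holds the sort key.
print "$pairs[0][0] has key $pairs[0][1]\n";

# (5): sort sets $a and $b to the two elements being compared. Here each
# is one of the array references, so ->[1] reads its key. Putting $b on
# the left of <=> gives descending order.
my @sorted = sort { $b->[1] <=> $a->[1] } @pairs;

# (6): map has two forms, map BLOCK LIST and map EXPR, LIST.
my @namesFromExpr  = map $_->[0], @sorted;      # expression, then a comma
my @namesFromBlock = map { $_->[0] } @sorted;   # block, no comma

print "@namesFromExpr\n";
```

Both map forms produce the same list; the expression form simply saves the braces when the body is a single expression.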
Re^2: Delete all but the most recent backup file
by jagexCoder (Novice) on Jan 27, 2013 at 14:54 UTC
    Thanks it certainly is useful and does perform deletion. However I have two queries: (1) Komodo Edit reports:
    Name "main::DATA" used only once: possible typo at bk_remove.pl line 73.
    readline() on unopened filehandle DATA at bk_remove.pl line 73.
    This is referring to <DATA> in the code. (2) It appears the code does the deletion based on the date in the file name. While this is ideal, my previous code was based only on the modification time (using -M) and not on the date in the file name, so I would like to keep it that way. Any ideas on how I could modify this to work the way I did it? All answers are much appreciated. Apologies, I am not very good at Perl, but I do get the odd script done here and there when needed. Thanks again!

      You're most welcome!

      Yes, Komodo appears to just be alerting you about <DATA>, but should certainly know better, since there's a __DATA__ section.

      Try the following:

      use strict;
      use warnings;
      use File::stat;

      chomp( my @fileNames = <*.bak> );

      my @sortedFileNames =
        map $_->[0],
        sort { $b->[1] <=> $a->[1] }
        map { my $stat = stat $_; [ $_, $stat->mtime ] }   # pair name with modification time
        grep /^backup_\d\d_\d\d_\d{4}\.bak$/, @fileNames;

      # Drop the most recently modified backup from the list, so it is not deleted
      shift @sortedFileNames;

      if (@sortedFileNames) {
          print "$_\n" for @sortedFileNames;
          #unlink @sortedFileNames;
      }

      This stats each file for the modification time, using it in the sort. Also, note that a file glob's used to read directory files...
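One way to try the mtime-based sort safely is to run it against throwaway files. The sketch below assumes a temporary directory whose path contains no whitespace (core glob splits its pattern on whitespace), uses made-up file names, and adds a guard for the case where stat fails and returns undef, which the code above does not check for.

```perl
use strict;
use warnings;
use File::stat;
use File::Temp qw(tempdir);
use File::Spec;

# Build a throwaway directory with three empty files (hypothetical names).
my $dir = tempdir( CLEANUP => 1 );
my %ageInDays = (
    'backup_old.bak'    => 3,
    'backup_newest.bak' => 1,
    'backup_middle.bak' => 2,
);
my $now = time();
for my $name ( keys %ageInDays ) {
    my $path = File::Spec->catfile( $dir, $name );
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh;

    # Backdate the access and modification times.
    my $when = $now - 86400 * $ageInDays{$name};
    utime $when, $when, $path or die "Cannot set times on $path: $!";
}

# Glob with a full path, so the script does not depend on the current directory.
my @candidates = glob File::Spec->catfile( $dir, '*.bak' );

my @sorted =
    map  { $_->[0] }
    sort { $b->[1] <=> $a->[1] }       # largest mtime (newest) first
    map  {
        my $st = stat $_;              # File::stat object, or undef on failure
        $st ? [ $_, $st->mtime ] : (); # skip files that could not be stat'ed
    } @candidates;

my $newest = shift @sorted;
print "keep:   $newest\n";
print "delete: $_\n" for @sorted;
```

Because the directory is created with CLEANUP => 1, it is removed when the script exits, so nothing here touches real backups.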

        Hi thanks, that works great! I modified the regular expression to the file naming format that we use at work. I just noticed that the system of sorting and preserving the latest backup and deleting the rest is the most efficient - my supervisor and dad (he's a programmer) said the same thing. I don't know why I stuck with the idea of using flags, I guess it's the little mistakes the not-that-experienced programmers make. This is a good learning experience! I'll figure out the syntax related to sorting that has been implemented and understand it fully. Thanks again to both of you - take care!
      I guess you'd have to replace the line
      map { my ( $d, $m, $y ) = /(\d+)/g; [ $_, "$y$m$d" ] }
      by something like
      map {[$_, -M $_]}
      However, I don't use these functions often at the moment, so of course you should test it first, so that you don't end up keeping the oldest backup instead of the newest...
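That caution is worth taking literally. -M returns a file's age in days, measured from the time the script started, so the newest file has the smallest value, not the largest. With the descending sort used earlier in the thread, the oldest file would come first and be the one that shift spares. A minimal sketch of the -M variant with the comparison flipped to ascending, using made-up files in a temporary directory:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Throwaway files with known ages (hypothetical names).
my $dir = tempdir( CLEANUP => 1 );
my %daysOld = ( 'old.bak' => 5, 'newest.bak' => 1, 'middle.bak' => 3 );
my @paths;
for my $name ( sort keys %daysOld ) {
    my $path = File::Spec->catfile( $dir, $name );
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh;
    my $when = time() - 86400 * $daysOld{$name};
    utime $when, $when, $path or die "Cannot set times on $path: $!";
    push @paths, $path;
}

# -M is an age, so smaller means newer. Ascending order ($a on the
# left) puts the newest file first.
my @sortedFileNames =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, -M $_ ] }
    @paths;

my $newest = shift @sortedFileNames;   # the file to keep
print "keep:   $newest\n";
print "delete: $_\n" for @sortedFileNames;
```

Printing the list before uncommenting any unlink, as the original reply does, makes it easy to confirm that the kept file really is the newest one.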
