
Re: Re4: Super Find critic needed

by BrowserUk (Pope)
on Jun 30, 2003 at 17:54 UTC (#270243)

in reply to Re4: Super Find critic needed
in thread Super Find critic needed

If the process is interrupted after the new file has been created but before the old file has been deleted, then regardless of whether the new file was properly written and closed, when the system is restored and the script is re-run, the program will again find a file by the original name and rename it to "$filename.$$". If the new file was completely written and properly flushed, then no harm is done: it will just be processed as though it were the original file, no further changes will be made, and you're back on track.

However, if the new file was only partially written when the interruption occurred, then unless you add an explicit check for the existence of a file called "$filename.$$", Perl's rename function will silently blow the first backup away, overwriting it with the incomplete version.
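The explicit check might look something like the sketch below. The helper name backup_once and the idea of passing the backup name in as a parameter are my own assumptions for illustration, not code from the script under discussion.

```perl
use strict;
use warnings;

# Hypothetical helper: move $file aside to $backup, but refuse to
# clobber a backup left behind by an earlier, interrupted run.
sub backup_once {
    my ($file, $backup) = @_;

    if (-e $backup) {
        # An older backup is still here, so the last run never finished.
        warn "Backup '$backup' already exists; leaving it alone\n";
        return 0;
    }

    rename $file, $backup
        or die "Cannot rename '$file' to '$backup': $!\n";
    return 1;
}
```

A caller would skip or repair the file when this returns false. There is a small window between the -e test and the rename, which should not matter for a single script working alone but would if several processes shared the directory.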

This implies that Perl's rename is implemented as either a copy, or a delete followed by a rename, as the OS rename (whether the command or the underlying system call) will not allow you to rename a file if a file with the new name already exists. At least this is the case under Win32; I'm not sure of the situation with other OSs.
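Rather than guess, it is easy enough to probe what rename does on a given box. For what it's worth, perlfunc says an existing NEWNAME will be clobbered and warns that behaviour varies by system, and POSIX rename(2) is specified to replace an existing target atomically. The sub below is a throwaway sketch (rename_clobbers is a made-up name) that only reports whether the call succeeded; it says nothing about how many steps happen underneath.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical probe: does rename() replace an existing target here?
sub rename_clobbers {
    my $dir = tempdir(CLEANUP => 1);
    my ($source, $target) = ("$dir/source.txt", "$dir/target.txt");

    # Create both files so that the rename has something to collide with.
    for my $pair ([$source, "source contents\n"], [$target, "target contents\n"]) {
        open my $fh, '>', $pair->[0] or die "Cannot create $pair->[0]: $!\n";
        print {$fh} $pair->[1];
        close $fh or die "Cannot close $pair->[0]: $!\n";
    }

    # True if the call went through, false if it was refused.
    return rename($source, $target) ? 1 : 0;
}

print "rename() over an existing file: ",
      (rename_clobbers() ? "replaced it" : "was refused"), "\n";
```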

It therefore falls to the programmer using Perl's rename to check for and handle the situation where the new name already exists, using -e or similar. Once this check is in place, you still need to add code to handle the situation where the backup does exist, and arrange to delete the (potentially partial) new file created on the last pass and restore the backup. Perl's rename will ostensibly do this in one step but, as I just noted, in reality, at least on some systems, there are two steps involved: a delete, followed by a rename. If a second interruption occurs between these two steps, then you get the situation where you have a backup with no original. If the File::Find or globbing process used to build the file list uses anything other than a fully wildcarded match criterion, then a third pass won't even see the backup, as it will only be looking for the original, which no longer exists. So whilst no data has been lost, it will require manual intervention to restore it.
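A rough sketch of that recovery step, done as an explicit delete followed by a rename, is below. The name restore_from_backup is invented for the example, and it assumes the backup name is predictable from one run to the next. A "$filename.$$" suffix would not be, since $$ is the process ID and will normally differ on the re-run, so in that case the backup would have to be found by globbing.

```perl
use strict;
use warnings;

# Hypothetical recovery step, run before $file is processed.
# Assumes $backup is where the previous run moved the original.
sub restore_from_backup {
    my ($file, $backup) = @_;

    return 0 unless -e $backup;    # clean state, nothing to recover

    # A surviving backup means the last run did not finish,
    # so whatever sits under the original name is suspect.
    if (-e $file) {
        unlink $file or die "Cannot remove '$file': $!\n";
    }

    # An interruption here leaves only the backup. Because this sub
    # keys off the backup rather than the original, the next run
    # still ends up on this same path.
    rename $backup, $file
        or die "Cannot restore '$backup' to '$file': $!\n";
    return 1;
}
```

This only helps if the script looks for the backup itself; a file list built purely from the original names would still miss the backup-with-no-original case described above.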

Yes. This is a paranoid view. To arrive here we need three failures to occur at exactly the most inopportune moments. However, I was involved in a project where the whole issue of automating the updating of files in a production environment became the subject of a protracted investigation to determine a mechanism for ensuring that there were NO risks involved. The machines in question were used by the cargo division of a large international airline to control the loading of freight on their fleet of 747 cargo aircraft. Accurate information about what freight has been loaded on the aircraft is paramount, as the weight of the cargo and its distribution are critical to how much fuel is required and to the handling and take-off characteristics of the aircraft when taking off from airports at high altitudes and/or in hot conditions. To complicate matters, some of the servers in question were located in tin shacks on African and Russian airfields that were little more than dust strips, with mains systems that were subject to frequent power cuts that often lasted longer than the UPSs could maintain.

That was done using REXX, not Perl, but most of the same problems arise. The final conclusion of the investigation was that there is no 100% reliable way to completely automate the process. The risk can be reduced to a very low probability of occurrence, but the only way to get to 100% is to have a manual verification as the final step of the process, and to accept that the process has been completed in its entirety only if that verification runs from beginning to end without interruption.

In most real-life situations, 99% is probably good enough:)

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

Re6: Super Find critic needed
by bbfu (Curate) on Jun 30, 2003 at 20:34 UTC

    ...A third pass won't even see the backup as it will only be looking for the original, which no longer exists. So whilst no data has been lost, it will require a manual intervention to restore it.

    Why not simply have the script check for the existence of any backups (before renaming the "original") and assume the worst (i.e., even if there is an original, it must be corrupt)?

    It seems to me that this would eliminate the need for manual intervention without adding any risk. After all, that's exactly what the person intervening will do anyway, is it not? Then, you could have any number of power failures, all at exactly the wrong times, and the worst that will happen is the program will completely reprocess the file each time power is restored. No?

    Black flowers blossom
    Fearless on my breath
