
Writing to a file atomically without renaming

by nomis80 (Sexton)
on Jun 30, 2005 at 18:29 UTC (#471413=perlquestion)
nomis80 has asked for the wisdom of the Perl Monks concerning the following question:

Hello monk masters,

For the sake of exception safety, I want to write to a file atomically. While writing to a file, exceptions could be thrown if, for example, there is no more disk space. I know of the module IO::AtomicFile, which, it would seem, should make me happy.

However, this module uses the old trick of writing to a temporary file and then renaming it. This is nice for atomicity, but causes one problem: it clobbers the file attributes like owner and permissions. For example, I have this file:

-rw--w---- 1 user1 group1 ... filename

Let's assume I am running my program as "user2", which is a member of "group1". If I write to a temporary file and then move it, the owner of the file will have changed to "user2". Only root can change the owner of a file, so I'm stuck.
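
To make the problem concrete, here is a minimal sketch of the temp-file-and-rename trick. It is not IO::AtomicFile's actual code; the helper name `atomic_write_via_rename` and the temp-file template are assumptions made for illustration. It copies the permission bits from the original, but because rename() swaps in a new inode, the file ends up owned by whoever ran the program.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: replace $path atomically via a temp file and rename().
sub atomic_write_via_rename {
    my ($path, $data) = @_;

    # Create the temp file in the same directory so rename() stays on one filesystem.
    my ($tmp_fh, $tmp_name) = tempfile('.tmpXXXXXX', DIR => dirname($path));

    my $ok = eval {
        print {$tmp_fh} $data or die "write failed: $!";
        close $tmp_fh         or die "close failed: $!";   # a full disk can surface here

        # Copy the permission bits from the original, if it exists.
        if (my @st = stat $path) {
            chmod($st[2] & 07777, $tmp_name) or die "chmod failed: $!";
        }
        rename($tmp_name, $path) or die "rename failed: $!";
        1;
    };
    unless ($ok) {
        my $err = $@;
        unlink $tmp_name;    # the original file was never touched
        die $err;
    }
    return 1;
}
```

If the write fails, the original is untouched; if it succeeds, the mode survives but the inode (and therefore the owner) is new.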

Is there a way to write to a file atomically without using rename()?

Thanks!

Re: Writing to a file atomically without renaming
by dragonchild (Archbishop) on Jun 30, 2005 at 18:36 UTC
    Atomic writes (and reads) are a function of the filesystem. Certain filesystems will do atomic I/O and others won't (without the rename trick). It's about as cut'n'dry as that.

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Writing to a file atomically without renaming
by ikegami (Pope) on Jun 30, 2005 at 18:53 UTC
    flock can be used to prevent other people from using the file, but it's an elective (advisory) system. That means that if a program doesn't try to lock the file before using it, it'll be able to read and modify the file even if your application has it locked. I think that's the best you can do.
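
For reference, a minimal sketch of advisory locking with Perl's built-in flock. The wrapper name `with_exclusive_lock` is hypothetical. It only protects against other programs that also call flock on the same file.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: run $code while holding an exclusive advisory lock on $path.
sub with_exclusive_lock {
    my ($path, $code) = @_;
    open my $fh, '+<', $path or die "open $path: $!";
    flock($fh, LOCK_EX)      or die "flock $path: $!";

    my @result;
    my $ok  = eval { @result = $code->($fh); 1 };
    my $err = $@;
    close $fh;               # closing the handle releases the lock
    die $err unless $ok;
    return @result;
}
```

The callback receives the locked handle, so all reads and writes can go through it while the lock is held.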

      Thank you for your reply, but I don't want atomic writing to solve a mutual access problem. I already lock my files using flock. What I need is to maintain the integrity of the file at all times (that is, exception safety). I don't want to start overwriting the file and then in the middle run out of space. I want some kind of two-phase commit system. And I want it to preserve the ownership of the file.

      Maybe I could do it by mmapping the file, writing my new stuff at the end so that I can erase it if things go wrong. Then when I'm done I just move my stuff up to the beginning of the file, erasing what was already there. That can't fail, I guess, so I think it could work. What do you think of that?

        How about using a symbolic link?

        For instance, suppose you have foo-1 and foo.link pointing to foo-1. To update it, first create foo-2, then change its permissions and finally point foo.link to foo-2.

        The point is that foo.link can have more relaxed permissions, i.e., 644
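
A minimal sketch of the link swap, assuming a POSIX system. The helper name `repoint_symlink` and the temporary link name are hypothetical. Note that repointing a link atomically still relies on rename(), only on the link rather than on the data file.

```perl
use strict;
use warnings;

# Hypothetical helper: atomically repoint the symlink $link at $target.
sub repoint_symlink {
    my ($link, $target) = @_;
    my $tmp = "$link.tmp.$$";            # temporary link name is an assumption
    unlink $tmp;                         # clear a stale leftover, if any
    symlink($target, $tmp) or die "symlink failed: $!";
    unless (rename($tmp, $link)) {       # replaces the old link in one step
        my $err = $!;
        unlink $tmp;
        die "rename failed: $err";
    }
    return 1;
}
```

Readers that open foo.link see either the old target or the new one, never a half-written file.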

Re: Writing to a file atomically without renaming
by fmerges (Chaplain) on Jun 30, 2005 at 19:25 UTC

    Hi,

    You can use the Tie::File module and move across the file like an array. Before writing, save the last index of the array, then add data to the file; if that fails because of an exception, remove everything from the saved point onwards.

    If it's binary data you should use seek for moving, etc...

    Could be a solution for what you want, if I understand well your problem.
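
The same idea can be sketched without Tie::File, using plain seek and truncate. The helper name `append_with_rollback` and its callback interface are assumptions. It only covers appending; it does not help when existing content must be replaced.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_END);

# Hypothetical helper: let $writer append to $path; if it dies, cut the
# file back to the length it had before the append started.
sub append_with_rollback {
    my ($path, $writer) = @_;
    open my $fh, '+<', $path or die "open $path: $!";
    binmode $fh;

    my $saved_end = sysseek($fh, 0, SEEK_END);
    defined $saved_end or die "seek failed: $!";
    $saved_end += 0;                     # normalise the "0 but true" return value

    my $ok  = eval { $writer->($fh); 1 };
    my $err = $@;
    unless ($ok) {
        truncate($fh, $saved_end) or warn "rollback failed: $!";
        close $fh;
        die $err;
    }
    close $fh or die "close failed: $!";
    return $saved_end;                   # offset where the new data begins
}
```

The callback should write with syswrite so that no buffered data is left pending when the rollback runs.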

    On the other hand, must it be stored in the filesystem? Depending on your needs, with an RDBMS that supports transactions you don't get this "problem".

    Regards,

    |fire| at irc
      Yes, it must be stored in the filesystem. I'm starting to wonder how exactly RDBMS implement transactions. I don't think they use the rename() trick...
        They write blocks of data to holes in the file or at the end of the file, and if the transaction is committed they then change pointers in the tablespace to the new data, else if rollback they leave the pointers at the old data locations.


        -Waswas
Re: Writing to a file atomically without renaming
by waswas-fng (Curate) on Jun 30, 2005 at 19:33 UTC
    If this is on Unix, you can change the group of the file and its parent dir to a group that both user1 and user2 belong to. If you also chmod 770 and g+s (the setgid bit) the parent dir, any file created there is set to the group of the parent dir. This allows you to read and write the file as either user1 or user2.
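
A minimal sketch of that directory setup in Perl, assuming you own the directory and belong to the target group. The helper name `make_shared_dir` is hypothetical, and the numeric group id must be supplied by the caller.

```perl
use strict;
use warnings;

# Hypothetical helper: give $dir to group $gid and set mode 2770
# (rwx for owner and group, plus the setgid bit).
sub make_shared_dir {
    my ($dir, $gid) = @_;
    # -1 leaves the owner unchanged; only the group is set.
    chown(-1, $gid, $dir) or die "chown failed: $!";
    # chmod after chown, since changing ownership may clear the setgid bit.
    chmod(02770, $dir)    or die "chmod failed: $!";
    return 1;
}
```

With the setgid bit on the directory, a temp file created there and renamed into place keeps the shared group, even though its owner changes.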


    -Waswas
      Very nice pragmatic solution! However I'm not in a situation where I can presume that I can chmod the parent dir at will.
        Modify IO::AtomicFile's source --
        Change:
            ### Open the file! Returns filehandle on success, for use as a constructor:
            $self->SUPER::open($temp, $mode) ? $self : undef;
        To:
            ### Open the file! Returns filehandle on success, for use as a constructor:
            $self->SUPER::open($temp, $mode) ? $self : undef;
            chown $uid, $gid, $temp;
        Where uid and gid are static for this app or dynamically built somehow or passed to your version of IO::AtomicFile... chown $uid, $gid, "tempfile


        -Waswas
Re: Writing to a file atomically without renaming
by Transient (Hermit) on Jun 30, 2005 at 20:13 UTC
    I think this does what you want, but... I could be wrong...
        copy file_to_modify temp_file
        # modify temp_file
        cat temp_file > file_to_modify
    I ran a test on my system and it overwrites the contents without affecting the owner or permissions... although I don't know if that redirection is guaranteed to be atomic, but you could make a third copy of the original, just in case, and remove it when everything is successfully completed.

    HTH!
      If you run out of space on the filesystem you have just clobbered the file. I don't think he is just looking for atomic write -- but also rollback on error.


      -Waswas
        Which is why I suggested making the third copy, which could be "rolled back" in case of a problem. It's similar to DB transactions with redo logs. They don't have to worry about the file permissions problems, though.

        eval {
            copy file to temp_file
            copy file to backup_file
            modify temp_file
            cat temp_file > file
            verify file is OK
            remove temp_file and backup_file
        };
        if ( $@ ) {
            remove temp_file
            cat backup_file > file
            remove backup_file
        }
        Worst case scenario, the cat of the backup_file to file fails and you're left with a corrupt "file" and the extra "backup_file", which would have to be moved back manually.
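
A rough Perl rendering of the pseudocode above, assuming File::Copy is acceptable for the backup step. The helper name `overwrite_with_backup`, its callback interface, and the backup file name are hypothetical. The file is overwritten in place, so the inode, owner and permissions stay the same, but readers can see a partial file while the write is in progress.

```perl
use strict;
use warnings;
use File::Copy qw(copy);

# Hypothetical helper: overwrite $path in place, restoring its old content
# from a backup copy if $writer dies.
sub overwrite_with_backup {
    my ($path, $writer) = @_;
    my $backup = "$path.bak.$$";         # backup name is an assumption
    copy($path, $backup) or die "backup failed: $!";

    my $ok = eval {
        open my $fh, '>', $path or die "open $path: $!";   # truncates in place
        $writer->($fh);
        close $fh or die "close failed: $!";               # a full disk can surface here
        1;
    };
    unless ($ok) {
        my $err = $@;
        # Rollback can fail too; in that case the backup is kept for manual recovery.
        copy($backup, $path)
            or die "rollback failed ($!); backup kept at $backup; original error: $err";
        unlink $backup;
        die $err;
    }
    unlink $backup;
    return 1;
}
```

The backup is created with the caller's default permissions, so in a real setting its mode would need the same care as the original's.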
Re: Writing to a file atomically without renaming
by BrowserUk (Pope) on Jun 30, 2005 at 20:34 UTC

    Does File::Copy do the right thing with regard to permissions and ownership on your platform?

    If so,

  • Copy the original file to a tempfile.
  • Open and write to the tempfile.
  • Rename the tempfile to the original name.

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
Re: Writing to a file atomically without renaming
by ambrus (Abbot) on Jun 30, 2005 at 20:39 UTC

    No, not really. Not in UNIX systems at least. Of course, the word "atomic" is somewhat vague, you may want to define more precisely what you want.

    Here are the alternatives. You can use file locking, but that's atomic only if the other programs reading the file use locks too. You might hear of mandatory locking, but that doesn't really make a write atomic, as it could interrupt. You can use link instead of rename, but I doubt that would help. You can always write a single byte atomically to a file, which is enough if you just want to change a flag. You can write to datagram sockets atomically (with the maximum size possibly restricted), but that doesn't substitute for real files. You can somehow make sure that all other services that could read the file are stopped, like going into single user mode and running only the process that writes the file; this will make the write practically atomic, but that's probably not what you want either.

    You could create a mandatory lock or lease on the file, thus making sure that no-one can open the file while you are doing the read; but someone can still have the file opened before you do that, and there is no way to tell if this is the case if it's some other user. If you are sure that only (non-setid) programs running under your uid can have the file open, then this can be feasible.

    So, the situation is like this: you can only do an atomic write in one of these cases:

    1. If you can co-operate with other programs reading or writing the file at the same time. (This is the most frequent case.)
    2. It's enough for you that the write happens at one time, you don't care that someone reading the file might see a partially changed file; and you are sure that no-one else wants to write the file, only read. In this case, you can use mandatory locking or leases, or even real-time processes (but that requires root privilege at least).
    3. If you can get set-id privilege for the owner of the file. You could create a set-id program for that user that does nothing but check privilege and change the file. (There are examples of such programs, although not for atomic access: like crontab.)
    4. If you can arrange that it is not a problem that the file has a different owner. You can make the directory writable only by a certain group, thus making it secure to do this.
    5. If you use a database instead of a file.
    6. If this is on some exotic operating system variant that has such a feature.
    7. I think I've forgotten an option... I'm a bit disorganized now. If I remember it, I'll update the node.
Re: Writing to a file atomically without renaming
by graff (Chancellor) on Jul 03, 2005 at 06:00 UTC
    Let's assume I am running my program as "user2", which is a member of "group1". If I write to a temporary file and then move it, the owner of the file will have changed to "user2". Only root can change the owner of a file, so I'm stuck.

    So you're saying that at the very point where user2 has just written a brand new version of the file, the original ownership and permissions should be in effect immediately, and user2 should be barred from having read access to the data that he just wrote himself? This seems a bit odd.

    I can imagine situations where it's important to make sure that user1 maintains ownership of a given file. And since you obviously have a technique that allows user2 to assume ownership, one possibility would be to make sure that user1 applies the same technique at some later time in order to take back ownership.

    In effect, the last person to write the file is the current owner. When user1 needs to own the file, he just has to write his own copy (using the standard atomic technique).

    If user2 is running a program that produces output that user2 is never supposed to see with his own eyes, then you have the wrong design. The data to be written by (but hidden from) user2 must be passed to a daemon process that is being run by user1 -- you need IPC to handle this sort of ownership issue.
