http://www.perlmonks.org?node_id=86954

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am creating a classifieds script, written so that an administrator can moderate the postings. I am storing each posting as a file in a temporary directory, and would like to save it under the first filename that is available. I am naming the files 1.dat, 2.dat, 3.dat. How can I name each new file with the first available number, so that the numbers stay as low as possible?

Replies are listed 'Best First'.
Re: sequencial file naming
by bwana147 (Pilgrim) on Jun 08, 2001 at 20:16 UTC

    First, I suggest that you call them 0001.dat, 0002.dat, so that they appear sorted when you use ls.
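A minimal sketch of generating such zero-padded names with sprintf. The helper name padded_name and the default width of 4 are assumptions made for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Build a zero-padded file name such as 0007.dat.
# The default width of 4 is an assumption; pick one that fits
# the number of files you expect.
sub padded_name {
    my ($n, $width) = @_;
    $width = 4 unless defined $width;
    # '*' takes the field width from the argument list.
    return sprintf('%0*d.dat', $width, $n);
}

print padded_name($_), "\n" for 1, 2, 10;
```

With a fixed width, a plain string sort (and ls) orders the names the same way a numeric sort would.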

    One way would be to read the directory, sort the files, take the last one and increment its number by one. This might become quite slow if there are many files.

    Another way would be to store the last number somewhere (a DB, a file, whatever). Then, just out of mere paranoia, you can still check whether the proposed file exists, and increment the number as long as it does:

    $file_n = do_something_to_get_the_number();
    $file_n++ while -e "$file_n.dat";

    It is likely that you'll call -e only once, if your stored number doesn't go out of sync.
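A rough sketch of the stored-number approach, assuming the number lives in a small counter file that is locked with flock while it is read and updated. The function name next_number and both path arguments are hypothetical; the post does not say where the number is kept.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# Return the next number to use, keeping the last one in a counter file.
# $counter_file and $dir are hypothetical paths supplied by the caller.
sub next_number {
    my ($counter_file, $dir) = @_;

    # Open read/write, creating the counter file on first use.
    sysopen(my $fh, $counter_file, O_RDWR | O_CREAT)
        or die "Cannot open $counter_file: $!";
    flock($fh, LOCK_EX) or die "Cannot lock $counter_file: $!";

    my $n = <$fh>;
    chomp $n if defined $n;
    $n = 0 unless defined $n && $n =~ /^\d+\z/;

    $n++;
    # The paranoia check from the post: skip numbers already taken.
    $n++ while -e "$dir/$n.dat";

    # Rewrite the counter in place while still holding the lock.
    seek($fh, 0, 0)   or die "Cannot seek in $counter_file: $!";
    truncate($fh, 0)  or die "Cannot truncate $counter_file: $!";
    print {$fh} "$n\n";
    close($fh) or die "Cannot close $counter_file: $!";   # also releases the lock

    return $n;
}
```

Because the counter is advanced before the lock is released, two processes that both go through next_number should not be handed the same number, even if the first has not created its file yet.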

    --bwana147

Re: sequencial file naming
by Abigail (Deacon) on Jun 08, 2001 at 23:50 UTC
    Well, there are a few obvious ways of doing so. One is to keep track of the last used number in a file or database; the other is to use a loop, either over the files in the directory, keeping track of the highest number, or just counting upward and stopping as soon as the corresponding file cannot be found.

    However, the big question is: why? This looks like a typical XY problem. You want to do X, and you think Y is the best way of doing so. Instead of asking about X, you ask about Y.

    All of the methods I mentioned need some form of locking: lock the file, or make sure no two processes go searching for the "next" number at the same time. This might make your program more complex, and potentially slow. And that's in the probably relatively rare case of adding a new file. It looks like you will be getting a lot of files, and a single directory with a lot of files means accessing a file by name is going to be slow on filesystems that do a linear search through the directory data blocks.

    I don't know what you want to do with the files, but my gut feeling is shouting "shouldn't you use a database?".

    -- Abigail

Re: sequencial file naming
by marcink (Monk) on Jun 08, 2001 at 20:17 UTC
    If you're sure that you're the only one trying to find such a file (no race conditions), you can do it this way:

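    #!/usr/bin/perl -w
    use strict;

    my $i = 1;
    $i++ while -f "file_$i.dat";
    # Open for writing; without the ">" mode, open would try to read the file.
    open OUTFILE, ">file_$i.dat" or die "Cannot create file_$i.dat: $!";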


    The problem is that someone could create the file between this program's loop and open statements. To avoid it you could either use:

    $i++ while !sysopen( FILE, "file_$i.dat",O_RDWR|O_CREAT|O_EXCL);


    ...or (the recommended solution) use File::MkTemp (if you don't need the names to be small natural numbers).

    Update: I forgot to add that you need to 'use IO::File;' to get the O_* constants for sysopen.
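A slightly fuller sketch of the sysopen approach, under the assumption that only a "file exists" failure should advance the counter; any other error (a missing directory, no permission) would otherwise make the one-line loop spin forever. The function name create_next_file is hypothetical, and the constants are taken from Fcntl and Errno here rather than IO::File.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use Errno qw(EEXIST);

# Atomically create the lowest-numbered free file_N.dat in $dir.
# Returns the number and an open write handle.
sub create_next_file {
    my ($dir) = @_;
    my $i = 1;
    while (1) {
        my $path = "$dir/file_$i.dat";
        # O_EXCL makes the create fail if the file already exists,
        # so the existence check and the create are a single step.
        if (sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL)) {
            return ($i, $fh);
        }
        # Only "already exists" means "try the next number".
        die "Cannot create $path: $!" unless $! == EEXIST;
        $i++;
    }
}
```

A caller would use it as `my ($n, $fh) = create_next_file('/some/dir');` and then print to $fh.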

    -mk
Re: sequencial file naming
by mbond (Beadle) on Jun 08, 2001 at 21:14 UTC
    We do this sort of thing a lot at work for surveys that need to be in text format so that the various people who want to read them can.

    However, we occasionally delete files as well, either tests or smart-@$$ responses. So to keep track of the files that have been deleted, I load the directory, knock off the "." and ".." entries, and iterate through looking for the first available slot.

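    my $last = 0;
    foreach (@files) {            # @files: the file numbers, sorted ascending
        if ($_ != $last + 1) {    # gap found: slot $last + 1 is free
            open( ... );
            last;
        }
        $last = $_;
    }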


    That's minus error checking, etc., but it's the gist of it.

    Just make sure you check for race conditions (as mentioned above), and iterate to the next one if needed.
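A sketch of the "first available slot" search described above, assuming the files are named N.dat as in the original question. The function names used_numbers and first_free_number are hypothetical, and nothing here guards against the race conditions already mentioned.

```perl
use strict;
use warnings;

# Collect the numeric parts of all N.dat files in $dir.
sub used_numbers {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    my @numbers;
    while (defined(my $entry = readdir($dh))) {
        # "." and ".." never match, so they drop out automatically.
        push @numbers, $1 + 0 if $entry =~ /^(\d+)\.dat$/;
    }
    closedir($dh);
    return @numbers;
}

# Return the lowest positive integer that is not in the given list.
sub first_free_number {
    my %used = map { $_ => 1 } @_;
    my $n = 1;
    $n++ while $used{$n};
    return $n;
}
```

Using a hash lookup avoids having to sort the directory listing first; `first_free_number(used_numbers($dir))` gives the slot to try.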

    Mbond.
Re: sequencial file naming
by mrmick (Curate) on Jun 08, 2001 at 20:36 UTC
    One way is to get a list of the matching files in the directory, sort them, and add one to the highest number. This example prints the last number and the new filename to STDOUT to demonstrate:

    (remember to use strict when programming)

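    my $dir = 'path to directory';
    opendir D, $dir or die "Cannot open directory $dir: $!\n";
    my @files;
    while ( defined( my $file = readdir(D) ) ) {
        # Only accept names of the form NUMBER.dat.
        push @files, $file if $file =~ /^\d+\.dat$/;
    }
    closedir D;
    # Sort numerically; a plain sort would put 10.dat before 9.dat.
    @files = sort { ($a =~ /^(\d+)/)[0] <=> ($b =~ /^(\d+)/)[0] } @files;
    my @filename = split /\./, $files[$#files];
    print $filename[0], "\n";
    my $filename = ( $filename[0] + 1 ) . '.dat';
    print $filename, "\n";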

    Mick
(boo) Re: sequencial file naming
by boo_radley (Parson) on Jun 08, 2001 at 20:26 UTC
    Windows Me, Perl 5.6.
    Done primarily to see what glob does. This probably means that using this code under Unix will set the box on fire or something. I dunno.

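    # Sort numerically on the embedded number; a plain sort puts 10.txt before 9.txt.
    @list = sort { ($a =~ /(\d+)/)[0] <=> ($b =~ /(\d+)/)[0] } glob("*.txt");
    ($next) = $list[-1] =~ /(\d+)/;
    ++$next;
    print "opening $next.txt";
    open (FH, ">>$next.txt") or die "no open for you! $!"; #hi, vs!
    print FH "bueno";
    close FH;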
Use a timestamp as filename?
by hackmare (Pilgrim) on Jun 10, 2001 at 00:38 UTC

    Just a suggestion here: if you do not need the file name sequence to be gap-free, but merely want the names to be sequential, I suggest you use the epoch time (the number of seconds since some time in the long, long past; 1 January 1970 on Unix).

    Next, you check that the file does not exist, and use that ID. If it exists, increment by one, and check again.

    The upside is that unless you average one request per second or more, you're safe.

    The downside is that the filenames are ugly.

    On the command line, try it with this: perl -e 'print time(), "\n"'

    Hackmare

Re: sequencial file naming
by John M. Dlugosz (Monsignor) on Jun 09, 2001 at 00:36 UTC
    I did something like that the other day. It was at home, so I don't have the exact code handy, but basically I did an opendir, read the entries into an array, grepped for the file names of interest, sorted them, pulled off the last (highest) one, and parsed out the digits. It's a good one-liner (not including the open/close). It's indeed much simpler to use a fixed number of digits. If you put the digits last, you can use the magical increment operator on the whole string.
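A small sketch of the magical increment, assuming names made of letters followed by digits with no extension, since the string form of ++ only applies when the whole value matches /^[a-zA-Z]*[0-9]*\z/. The function name next_name and the sample name are hypothetical.

```perl
use strict;
use warnings;

# Return the name that follows $name, using Perl's magical string increment.
# The digits must come last, and there must be no ".dat" suffix, or ++
# would fall back to a numeric increment.
sub next_name {
    my ($name) = @_;
    die "Not incrementable as a string: $name"
        unless $name =~ /^[a-zA-Z]*[0-9]+\z/;
    $name++;    # increments the digits in place, keeping the zero padding
    return $name;
}

print next_name('ad0009'), "\n";   # prints ad0010
```

The extension, if one is wanted, would have to be appended after incrementing, for example `next_name($base) . '.dat'`.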

    —John