
Updating files

by Amoe (Friar)
on Oct 22, 2001 at 22:41 UTC ( #120611=perlquestion )
Amoe has asked for the wisdom of the Perl Monks concerning the following question:

I have a complicated data structure (an AoH, to be precise) that is dumped and loaded via Data::Dumper and eval. Now the loaded file has to be updated as it is processed, removing hashrefs that have already been processed. I know how to modify the structure, but the only way I can think of to update the file (clobbering the old file and overwriting it with the new one) seems rather inelegant. My question, oh berobed ones: in the spirit of Tim Toady, is there another way to do it? I don't have any code yet, as I felt that writing any would be redundant at the moment.

my one true love
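A minimal sketch of the clobber-and-rewrite cycle the question describes, assuming a hypothetical file name (`queue.dump`) and sample records. Writing to a temp file and renaming it over the original is an extra safety step, not something the question mandates:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

my $file = 'queue.dump';    # hypothetical file name

sub save_aoh {
    my ($path, $aoh) = @_;
    # Terse output ("[...]" with no "$VAR1 =") evals cleanly under strict.
    local $Data::Dumper::Terse = 1;
    # Write to a temp file, then rename over the original, so a crash
    # mid-write never leaves a half-written file behind.
    my ($tmp, $tmpname) = tempfile("$path.XXXX");
    print {$tmp} Dumper($aoh);
    close $tmp or die "close $tmpname: $!";
    rename $tmpname, $path or die "rename $tmpname -> $path: $!";
}

sub load_aoh {
    my ($path) = @_;
    open my $fh, '<', $path or die "open $path: $!";
    my $code = do { local $/; <$fh> };    # slurp the whole dump
    my $aoh  = eval $code;                # dump text evals back to the AoH
    die "eval of $path failed: $@" if $@;
    return $aoh;
}

save_aoh($file, [ { id => 1, done => 0 }, { id => 2, done => 0 } ]);

my $aoh = load_aoh($file);
shift @$aoh;               # drop the hashref we just processed
save_aoh($file, $aoh);     # clobber and rewrite
```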

Replies are listed 'Best First'.
Re: Updating files
by jeroenes (Priest) on Oct 22, 2001 at 22:56 UTC
    As long as you stick to a text file, there is nothing for it but overwriting that file. I suggest you flatten your structure by making new hash keys, each consisting of a zero-padded number plus the keys of the nested hash. You then use these newly generated keys to create a flat database (like BerkeleyDB). Such a database has a random-access mode, so you can update, add and delete at will.

    As for the rest of the code: the translation from AoH to flat access is rather simple. Just concatenate the numbers and keys.


    "We are not alone"(FZ)
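A sketch of the flattening scheme jeroenes describes, using core SDBM_File so it runs anywhere (DB_File ties the same way if BerkeleyDB is installed). The sample data and the file name `flatdb` are invented for illustration:

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;    # core module; DB_File (BerkeleyDB) ties the same way

# Sample AoH to be flattened.
my @aoh = (
    { name => 'alpha', done => 0 },
    { name => 'beta',  done => 0 },
);

my %db;
tie %db, 'SDBM_File', 'flatdb', O_RDWR | O_CREAT, 0666
    or die "tie: $!";

# Flatten: one DBM record per leaf value, keyed by a zero-padded
# array index, a dot, and the nested hash key.
for my $i (0 .. $#aoh) {
    for my $k (keys %{ $aoh[$i] }) {
        $db{ sprintf '%06d.%s', $i, $k } = $aoh[$i]{$k};
    }
}

# Random access: update or delete individual entries in place,
# without rewriting the rest of the file.
$db{'000001.done'} = 1;
delete $db{'000000.name'};

untie %db;
```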

Re: Updating files
by perrin (Chancellor) on Oct 23, 2001 at 00:21 UTC
    Your approach is fine, as long as there will only be one process working on this data at a time. If that ever changes, the suggestion above about using dbm files is probably the way to go.

    Incidentally, Storable is faster and might be worth switching to if you don't need the file to be human readable.
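The same clobber-and-rewrite cycle with Storable's binary format looks like this; the file name `queue.sto` and the records are made up for illustration (use `nstore` instead of `store` if the file must be portable across architectures):

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

my $file = 'queue.sto';    # hypothetical file name

# Binary, not human-readable, but fast to write and read back.
store [ { job => 'a' }, { job => 'b' } ], $file;

my $aoh = retrieve $file;
shift @$aoh;               # remove the processed entry
store $aoh, $file;         # one call replaces the whole file
```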

Re: Updating files
by fokat (Deacon) on Oct 23, 2001 at 00:00 UTC
    I once had to do something similar to what you describe, but with a number of producers and consumers. Also, I did not want to have to deal with databases.

    My solution involved creating a directory for each entry that was accepted by the system by one of the producers. Each one used a scheme that generated distinct names (and detected collisions with other instances, as mkdir will fail when attempting to create an existing directory).

    The consumers would lock an entry by creating a lock directory within the main directory. In my scenario, creating a directory was an atomic operation in the underlying FS where that application was running, and it is indeed still atomic on many filesystems today.

    After achieving a successful lock on a directory, the consumer simply processed the data in the various files within, unlink()ing them as it proceeded. When done, the parent directory and then the lock directory were deleted, in that order, to prevent a second consumer getting into the same request.

    This ran a few hundred producers and consumers for more than a year without a single race condition. The overhead of this solution was very small.
