Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

How to modify/delete records in a flat-file database?

by JoeJaz (Monk)
on Apr 28, 2004 at 08:47 UTC ( [id://348758]=perlquestion: print w/replies, xml ) Need Help??

JoeJaz has asked for the wisdom of the Perl Monks concerning the following question:

Hi, using a Perl CGI script, I am trying to manage a list of people and attributes about the people in a colon-delimited flat file and I am trying to generate a perl script that will add, change, and remove entries to this file. For example, the file might look like this:
John Smith:26:healthy:Linux Joe Jaz:21:cool:Irix ....
I can easily read the file and parse out the information. However, I am not sure how to remove an entry from the file. The way I see it, to remove an entry, I would have to read in the whole file into a variable or array, remove the content that I don't want and then write over the original file. However, would this process cause any problems if multiple people were writing or removing entries to that file at the same time or if the CGI script were interrupted before it finished processing? How might I ensure that my original data remains uncorrupted even with multiple users? If anyone has any thoughs on this subject, I would be very greatful. Thank you very much for reading this. Joe

Replies are listed 'Best First'.
Re: How to modify/delete records in a flat-file database?
by mce (Curate) on Apr 28, 2004 at 09:03 UTC
    Hi,

    Have a look at the DBD::CSV package. This will treat a cvs file as a database, and you can delete, modify and add entries, completely transparent.

    I use it frequently, and it is really cool.

    And, in CPAN, there are many other CSV packages to parse/handle these files.


    ---------------------------
    Dr. Mark Ceulemans
    Senior Consultant
    BMC, Belgium
      Thanks, That's an interesting looking module. I'll play around with it and see if it will work for my situation. I appreciate your help. Have a nice night, Joe
Re: How to modify/delete records in a flat-file database?
by tinita (Parson) on Apr 28, 2004 at 09:06 UTC
    you have to open the file for reading and writing (perlopentut) and lock it (flock) so that only one process can open the file at the same time.
      Thanks! That is definately something that I will look into. This looks exactly like what I need. Thank you for the tip. Joe
Re: How to modify/delete records in a flat-file database?
by nite_man (Deacon) on Apr 28, 2004 at 09:43 UTC

    Hi Joe

    In additional to the above-listed approaches, I'd like to suggect you to try use CPAN module Tie::File to simplify work with file rows. That modile allows you access to the file rows via Perl array. So you can use all array functions.

    ---
    Schiller

    It's only my opinion and it doesn't have pretensions of absoluteness!

      HI, That looks like an interesting module. I especially like that it doesn't load the enire file into memory to make changes to it. The cache feature is pretty neat also. Thanks for sending that link my way. I appreciate your help. Joe
Re: How to modify/delete records in a flat-file database?
by matija (Priest) on Apr 28, 2004 at 10:59 UTC
    Those are exactly the type of problems which should cause you to switch away from the CSV file for the underlying "database".

    A colon (or comma etc) separated file is fine as long as you have a relatively small number of records, and you will be mostly searching them, only occasionaly adding records.

    As soon as you start changing records, deleting them, or trying to manage hunderds or more, a database comes in really handy. A database can complete such operations faster because it isn't hobbled by the CSV format: it can create indices, trade space for speed, and use other strategies which are not available to a CSV using program.

    There are a number of very lightweight databases, such as DBD::SQLite which can be installed without fuss and aren't difficult to administer.

    If you need the CSV format for other programs, or because humans want to also edit by hand, you can allways emit the CSV from the database after every change.

      From looking at the Document pages, this looks like a nice little database. I guess I am caught between deciding if there are going to be enough file operations to warrent a database, of if I should use a flat file. The database looks like a nice solution, however, and I might just use it if it will make my life easier in the long run. Thank you for sharing your advice and experience with me. Joe
Re: How to modify/delete records in a flat-file database?
by TwistedGreen (Sexton) on Apr 28, 2004 at 12:34 UTC
    If you still need to use a CSV (which may be useful if it needs to be human-readable, for example) there is another method. Instead of thinking of changing one row in a file, you should think of it as changing the entire file. This may be a bit backwards compared to some of those fancy CPAN modules, but it's simple and it works.

    Basically, what you'd want to do is read the entire file into a series of hashes, using the username as the key. Then you would make whatever change needs to be made to the hash in memory, and write the entire file back to the disk. This would have the effect of changing one line but would really be overwriting the entire file.

    I recently wrote a CGI script that will allow changes to be made to a CSV remotely, so I'll borrow code from it to illustrate this solution, based on your given data set.
    my @names; my %nums; my %stats; my %oses; my $i=0; my $csv="data.csv"; # read data from $csv open CSV, $csv or die; flock(CSV, 1); while (<CSV>) { chomp; ($name,$num,$stat,$os) = split(':'); $nums{$name} = $num; $stats{$name} = $stat; $oses{$name} = $os; $names[$i] = $name; $i++; } close CSV; #make necessary changes to %nums, %stats, %oses as required, or delete + an entry entirely # write data over $csv open (CSV, '>', $csv) or die; flock(CSV, 2); foreach my $name(@names) { print CSV join(':',$name,$nums{$name},$stats{$name},$oses{$name}). +"\n"; } close CSV;
    There are more efficient ways, of course, but this will get the job done.
      > How might I ensure that my original data remains uncorrupted even with multiple users?
      Unless there are really multiple updates at a time ...

      pelagic
        Oops, I missed that. It certainly brings more complexity to the problem... Database would definitely be the way to go then, I'd think. Even though I lock the files when they are read and written in my example above, two concurrent users could overwrite each other's changes.
Re: How to modify/delete records in a flat-file database?
by Anonymous Monk on Apr 28, 2004 at 10:05 UTC
    Is it too much to ask why you aren't using a "real" database? It doesn't even need to be some complicated extra process, just install DBD::Sqlite which is a self contained database in one file. All this icky stuff will be handled transparently for you =].
      That is certainly an interesting approach. I had discarded using a database earlier since I knew that it would require root access to install and configure (something I do not have on this particular server), but this option might actually get around that. I had not idea that a self contained database like this existed. That is very cool. Thanks for pointing me that way. Joe
Re: How to modify/delete records in a flat-file database?
by OzzyOsbourne (Chaplain) on Apr 28, 2004 at 12:30 UTC
    Although simplistic, is there any reason not to use Tie::file? The docs say that it can be used with multiple users.
Re: How to modify/delete records in a flat-file database?
by paulbort (Hermit) on Apr 28, 2004 at 16:53 UTC
    Some sort of database with transactional integrity would also be my first thought, but if you need to support multiple users making changes, and you can't change the format of the file (because you're interacting with something else, etc.), there is another way to get pretty close to insuring integrity: use another set of files to allow only one 'user' to change the file at a time.

    For example, if a program wants to change the file, it checks that directory for glob( 'flat_file_lock_*' ), and if it finds any files, it adds one to the last file name, and touches that file. If it doesn't find any, it touches 'flat_file_lock_0'. (Yes I know there's a race condition here, but if that's an issue, use a database.)

    Then each program just needs to wait until it is the lowest-numbered lock file, make its change, and delete its lock file. Primitive but functional.

    Another way would be to write your updates to a pipe, and have a separate process listen on that pipe. (Assuming you're not on Windows, where pipes are scary.)

    --
    Spring: Forces, Coiled Again!
      That is an interesting model. That sounds kind of like a semaphore for operating system process scheduling. Very unique to use that in this way. The pipe idea is something that I will look into also. Thanks for your time and ideas. Joe
Re: How to modify/delete records in a flat-file database?
by wolfi (Scribe) on Apr 28, 2004 at 17:28 UTC
    barring any syntax errors:
    use Fcntl qw(:DEFAULT :flock); open (MYFILE1, "<", "$file") or sysopen (MYFILE, "$file", O_RDONLY) or die "$!"; flock (MYFILE1, LOCK_SH) or die "$!"; chomp (@input=<MYFILE>); close MYFILE1; # pretending you then did your work on the input, split at the ":" # and stuck the results back into @output... open (MYFILE2, ">", "$file") or sysopen (MYFILE, "$file", O_WRONLY) or die "$!"; flock (MYFILE2, LOCK_EX) or die "$!"; foreach $item(@output){ print MYFILE2 "$item"."\n";} close MYFILE2;

    doing the double open-sysopen technique helps avoid races for opening the file and having that content altered in those few miliseconds.

    it's rare, if you have a small site, that two users would access the file at the same time - but i wouldn't take the risk. Besides, w/CGI work - if someone is ~intentionally~ messing w/your system -> you want to ensure that you're opening the right data and not something that was swapped in by a nefarious user. (This is the reason for using 2 separate filehandles too -> MYFILE1 and MYFILE2).

    LOCK_EX (exclusive locks) are needed, when writing to files - otherwise, LOCK_SH (shared locks) are all that's needed, i believe, when reading.

    also: i'm not sure, if i'd actually use DIE like this in a cgi system, if the user is going to see the output. You probably want to 'die' gracefully -> using WARN, &subroutine, or some other action instead.

    for more info - see Camel -> pgs 419-422, 571-573, 712, 714-715, 808-810

      Thanks for your advice and ideas about preventing race conditions. I'll also have to look into that graceful ending of the program. There are a few places in my code where that would be advantageous. Thanks again for looking at my problem. Have a nice night, Joe

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://348758]
Approved by pelagic
Front-paged by broquaint
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others imbibing at the Monastery: (7)
As of 2024-04-18 06:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found