http://www.perlmonks.org?node_id=542862

hmbscully has asked for the wisdom of the Perl Monks concerning the following question:

I support an HTML form that I inherited six years ago. The form is around nine years old. At the time of its creation, the submissions were low enough (a few thousand a year) that the choice was made to have the form (with an insane amount of client-side JavaScript validation) use a simple Perl script and email each submission as a discrete message to be dealt with, essentially by hand, on the receiving end. The other alternative would have been to build a Java web application with an Oracle backend, but that was judged prohibitively expensive at the time given the volume of submissions.

Fast forward to now: submissions to the form for the first six months of this fiscal year alone are over half a million, still being processed in the same old way! The form itself remains essentially unchanged, except for a flatfile backup that I put in a year or so ago because we were losing some emails. The client has finally realized that this cannot continue. IT has been promising an Oracle-backed web application for several years now, but that solution still appears to be several years off.

I should mention that I am not part of the IT dept. I work in our web department as the web technician, and maintaining Perl scripts is only part of my job. Because I am not part of the IT bureaucracy, I am often asked by clients to come up with creative solutions, not involving massive IT projects, to help them fill the gaps until larger solutions come into place. This is one such issue.

What the client wants is for me to change the script so it doesn't send emails anymore but instead writes each request to a fixed-length text file that will accumulate the requests and then be passed to them once a day. They will import it into Access (or something else; I'm not entirely sure of, or responsible for, the import part) and do what they need to do with the data. I guess the key point is that I do not have access to a database and I'm working in flatfiles.
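To make the plain (unencrypted) version of this concrete, here is a rough sketch of what I have in mind for writing one fixed-length record per submission. The field names and widths in `@LAYOUT` and the sub names are made up for illustration; the real form has its own layout. It assumes a local filesystem where `flock` works.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical layout: [field name, width in characters].
my @LAYOUT = ( [ name => 30 ], [ email => 40 ], [ request => 50 ] );

# Build one fixed-length record. pack's "A" format pads each value with
# spaces and truncates it to the given width.
sub format_record {
    my (%field) = @_;
    my $template = join ' ', map { "A$_->[1]" } @LAYOUT;
    my @values = map {
        my $v = defined $field{ $_->[0] } ? $field{ $_->[0] } : '';
        $v =~ s/[\r\n]+/ /g;    # keep each record on a single line
        $v;
    } @LAYOUT;
    return pack( $template, @values ) . "\n";
}

# Append one record while holding an exclusive lock on the file.
sub append_record {
    my ( $path, $record ) = @_;
    open my $fh, '>>', $path or die "Cannot open $path: $!";
    flock( $fh, LOCK_EX ) or die "Cannot lock $path: $!";
    print {$fh} $record or die "Cannot write to $path: $!";
    close $fh or die "Cannot close $path: $!";    # close releases the lock
    return;
}
```

The CGI script would then call something like `append_record($file, format_record(%form_values))` once per submission.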

I'm not worried about writing the properly formatted file. My concerns arise when I consider the volume of requests this form will handle and a new requirement that the file, which has sensitive personal information in it, must be encrypted at all times.

I've started reading up on GPG and the modules that exist to support that need. But before I completely commit to this work (not surprisingly, this isn't the only project I'm working on), I'm trying to figure out if this is even a valid task that I'm attempting to do. It seems to me that if the form is getting several thousand requests a day, then locking the file, decrypting it, writing the new data to it, re-encrypting it, and unlocking it for every single submission may simply not work, and that I'm going to lose data.
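Spelled out, the per-submission cycle I'm worried about would look roughly like the sketch below. I haven't settled on a GPG module, so the decrypt and encrypt steps are passed in as code refs; the two `gpg_*` subs show one possible pair that shells out to the `gpg` command line, and the recipient `requests@example.com`, the `.tmp` and `.lock` file names, and the sub names are all invented for illustration. It is untested against real traffic.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# One locked decrypt/append/re-encrypt cycle for a single submission.
# Arguments: store (encrypted file), record (text to append), and
# decrypt/encrypt code refs that each take (source, destination).
sub append_encrypted {
    my (%arg) = @_;
    my ( $store, $record ) = @arg{qw(store record)};
    my $plain = "$store.tmp";

    # Lock a separate file for the whole cycle, because the encrypted
    # file itself gets replaced on every submission.
    open my $lock, '>>', "$store.lock" or die "Cannot open lock file: $!";
    flock( $lock, LOCK_EX ) or die "Cannot lock: $!";

    # Decrypt the existing store (if there is one) to a temporary file.
    $arg{decrypt}->( $store, $plain ) if -e $store;

    open my $out, '>>', $plain or die "Cannot open $plain: $!";
    print {$out} $record or die "Cannot write $plain: $!";
    close $out or die "Cannot close $plain: $!";

    # Re-encrypt over the old store, then remove the plaintext copy.
    $arg{encrypt}->( $plain, $store );
    unlink $plain or die "Cannot remove $plain: $!";

    close $lock;    # releases the lock
    return;
}

# One possible pair of steps using the gpg command line.
sub gpg_decrypt {
    my ( $from, $to ) = @_;
    system( 'gpg', '--batch', '--yes', '--output', $to, '--decrypt', $from ) == 0
        or die "gpg decrypt failed: $?";
}

sub gpg_encrypt {
    my ( $from, $to ) = @_;
    system( 'gpg', '--batch', '--yes', '--recipient', 'requests@example.com',
        '--output', $to, '--encrypt', $from ) == 0
        or die "gpg encrypt failed: $?";
}
```

Even in sketch form this shows what bothers me: every submission rewrites the whole file, a plaintext copy exists on disk for the length of each cycle (and is left behind if a step dies), and the decrypt step means the private key has to be usable on the web server.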

My personal level of Perl expertise I'd put somewhere around advanced beginner. I have a degree in comp sci, I'm entirely self-taught in Perl, and I'm the only person in my company who works in Perl. I love the language, but because it's not the only thing I do, I don't have the time to spend trying to advance my skills unless it's for a specific project that requires me to do so. I'm saying this just because I'm realizing that a lot of my code might not be written in the best way to optimize performance. I wonder if I'm hesitating on this project because of my lack of knowledge on how to write optimized code or if my spidey sense is tingling because this is a Bad. Idea., regardless of my level of expertise, and I need to tell the client that the HTML form/Perl script/flatfile/no database solution has finally come to the end of its usefulness and they have to deal with this another way.

My questions are:

  1. Is this a bad idea in the first place?
  2. If so, what's a good way to explain this to the client in layman's terms?
  3. Or is it only a bad idea when putting the GPG issue into the mix?
  4. If it's not a bad idea, what GPG modules are suggested for use? I've been looking and there seem to be a lot. How do I figure out which ones are the better ones?
  5. If it's not a bad idea, do I still need to flock the files before/after I encrypt/decrypt?

Many thanks!