http://www.perlmonks.org?node_id=188710

hacker has asked for the wisdom of the Perl Monks concerning the following question:

There's a couple of snags in this issue...

I have a script that is set up as an alias in /etc/mail/aliases. It intercepts incoming email addressed to it, parses information from the message, performs actions based on very specific criteria found there, creates a binary file, and sends that file as an attachment in a reply to the original sender.

That's the 40,000-foot view. I've run into a few snags in the design, and I'd like to clean them up a bit. Right now, the script works in all the situations I've thrown at it, including completely invalid data, ^C'ing the system process the script calls, and so on. (Many thanks also go to those in the Chatterbox for their patience and tolerance during phases where I wasn't understanding the use of a model or Perl function, especially tye for consistently pounding me over the head with the concrete clue-bat with nails in the end of it.)

The user sends an email to reflector@mydomain.com, and in the body of the email, they populate a small template of data, which looks like this:

    [template]
    url         = http://www.foo.com/   # URL to fetch
    AvantGo     = No                    # apply heuristics
    maxdepth    = 2                     # depth to traverse
    bpp         = 4                     # bits-per-pixel
    compression = zlib                  # or 'DOC' or 'None'
    title       = My Document           # title of output

This is then sent along to a script which parses it with Mail::Internet using the following:

    my $message  = new Mail::Internet ([<>]);
    my $from     = $message->get('From');
    my $subject  = $message->get('Subject');
    my $received = $message->get('Received');
    my @body     = @{$message->body()};

    print_help_message() if ($subject =~ /help/);

    my $line = "";
    my @unwrappeddata;

    foreach (@body) {
        chomp $_;
        next if /^#/;
        if (m/^[^\s=]+\s+=\s*/ || m/^\[.*\]$/) {
            $line =~ s/^#/\n#/m;
            $line .= "\n";
            push @unwrappeddata, $line;
            $line = $_;
        } else {
            $line .= $_;
        }
        last if /\[end_template\]/;
    }
    $line .= "\n";
    push @unwrappeddata, $line;

This now gives me a "corrected" body of the message in @unwrappeddata (thanks theorbtwo for the regex help).

The next step is to parse this with Config::IniFiles to get the key=value pairs out of the template. I was originally using Config::Simple, but found that it considered 'foo = ' a valid key and value pair. (Note the space after the = sign: it treated that space as the value, which is not ideal.) I then tried AppConfig, but it had a slightly more complicated syntax, in that I had to define each value before I could use it.

The first snag is that Config::IniFiles likes to have a file, or something that looks like a filehandle or stream. My content is in @unwrappeddata, and I'd like to pass this to Config::IniFiles. At the suggestion of ChemBoy and jeffa, I investigated IO::String, but didn't quite understand how to get the contents of @unwrappeddata into something Config::IniFiles could grok, so I went with an alternate approach:

    use Digest::MD5 qw(md5 md5_hex md5_base64);
    use Date::Manip;    # provides UnixDate()

    my $date     = UnixDate("today", "%b %e, %Y at %T");
    my $md5file  = md5_hex($date);
    my $workpath = "/path/to/workdir";

    # $! (not $?) holds the reason open() failed
    open(INFILE, ">$workpath/$md5file.msg") or die "$!";
    print INFILE @unwrappeddata;
    close(INFILE);

This puts the data in a file, which I can then pick up with Config::IniFiles as follows:

    my $ConfigFile = "$workpath/$md5file.msg";
    tie my %ini, 'Config::IniFiles', (-file => $ConfigFile);
    my %template = %{$ini{"template"}};

    $avantgo  = $template{'AvantGo'};
    $pl_url   = $template{'url'};
    $bpp      = $template{'bpp'};
    $maxd     = $template{'maxdepth'};
    $compr    = $template{'compression'};
    $filename = $template{'filename'};
    $title    = $template{'title'};

    $avantgo = "No" unless $template{'AvantGo'};

Now I have all the keys in scalars I can manipulate, and I can unlink() that file that I dropped @unwrappeddata into.

I'd like to get rid of this temporary file and the IO associated with it, but this part works so far, and functionality matters more than efficiency right now. Suggestions?
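One way the temporary file might be avoided is an in-memory filehandle, which core Perl (5.8 and later) provides by opening a reference to a scalar. The sketch below is only an illustration under assumptions: `handle_from_lines` is a hypothetical helper name, the sample lines stand in for @unwrappeddata, and passing the resulting handle through Config::IniFiles' -file parameter relies on that parameter accepting a filehandle as its documentation describes. The module is loaded at run time so the sketch still runs where it is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper: join the lines into one string and return a
# read-only filehandle that reads from that string instead of a disk file.
sub handle_from_lines {
    my @lines = @_;
    my $text  = join '', @lines;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    return $fh;
}

# Stand-in for the @unwrappeddata array built from the message body.
my @unwrappeddata = (
    "[template]\n",
    "url = http://www.foo.com/\n",
    "maxdepth = 2\n",
);

my $fh = handle_from_lines(@unwrappeddata);

my %template;
if (eval { require Config::IniFiles; 1 }) {
    # Assumption: -file is given the handle in place of a path.
    my $cfg = Config::IniFiles->new(-file => $fh);
    if ($cfg) {
        %template = map { $_ => $cfg->val('template', $_) }
                    $cfg->Parameters('template');
    }
    else {
        warn "Could not parse template: @Config::IniFiles::errors\n";
    }
}
else {
    # Without the module, just show that the handle reads like a file.
    print while <$fh>;
}

print "$_ = $template{$_}\n" for sort keys %template;
```

If this works as hoped, the md5-named .msg file and the later unlink() would no longer be needed for the parsing step, although the build step still uses $md5file for its output name.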

These values are then used to push into a system() command in list-mode (thanks again to tye for pushing me in that direction) like this:

    use File::stat;    # makes stat() return an object with a ->size method

    my $buildcmd = "/path/to/foo/binary";
    my @buildcmd;

    push @buildcmd, "-p";
    push @buildcmd, "$workpath";
    push @buildcmd, "-P";
    push @buildcmd, "$workpath";
    push @buildcmd, "-H";
    push @buildcmd, "$pl_url";
    $maxd < 3 ? $maxd : $maxd=2;    # values of 3 or more are reset to 2
    push @buildcmd, "--maxdepth=$maxd";
    push @buildcmd, "--bpp=$bpp" if ($bpp);
    push @buildcmd, "--$compr-compression" if ($compr);
    push @buildcmd, "-N";
    push @buildcmd, "$title";
    push @buildcmd, "-V3";
    push @buildcmd, "-f";
    push @buildcmd, "$workpath/$md5file";

    system($buildcmd, @buildcmd);

    # stat once; a missing output file means the build failed
    if (my $buildfile = stat("$workpath/$md5file.pdb")) {
        $buildsize = $buildfile->size;
        $success   = 1;
    } else {
        print_error_message();
    }

I realize this isn't efficient, but I don't know any easier way to do this without all of these push() commands in this order. Suggestions?
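A possible alternative to the run of push() calls is to build the argument list as a single list literal. An empty list `()` contributes no elements, so optional switches can be dropped inline with a ternary. This is only a sketch: `build_args` and its hash keys (`workpath`, `url`, `basename`, and so on) are hypothetical names chosen for illustration, and the sample values are placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper: return the argument list for the build command,
# in the same order as the original push() calls.
sub build_args {
    my (%opt) = @_;
    return (
        '-p', $opt{workpath},
        '-P', $opt{workpath},
        '-H', $opt{url},
        "--maxdepth=$opt{maxdepth}",
        # Optional switches: () adds nothing when the value is missing.
        ( $opt{bpp}         ? "--bpp=$opt{bpp}"                 : () ),
        ( $opt{compression} ? "--$opt{compression}-compression" : () ),
        '-N', $opt{title},
        '-V3',
        '-f', "$opt{workpath}/$opt{basename}",
    );
}

my @buildcmd = build_args(
    workpath    => '/path/to/workdir',
    url         => 'http://www.foo.com/',
    maxdepth    => 2,
    compression => 'zlib',
    title       => 'My Document',
    basename    => 'example',
);

# Show the list instead of running it; the real script would call
# system($buildcmd, @buildcmd) in list mode as before.
print join(' ', @buildcmd), "\n";
```

The list is still passed to system() element by element, so values such as the title never go through a shell.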

My second snag is that I need to validate the content in those keys. Basically 'maxdepth' can only contain a single digit, and can't be greater than 3 (bandwidth limitation for now). I've shoved that into a conditional like this:

    if ($maxd && $maxd =~ /^\d$/) {
        $maxd < 3 ? $maxd : $maxd=2;
        $maxd_msg = "$maxd level" . ($maxd > 1 ? "s" : "") . " down";
    }

What I want to do, ideally, is to detect missing values in the original email and send the user a useful message in return. If they forget to put in a value for 'maxdepth', a required value, I want to send them an email that tells them which value they forgot. The same goes for the other required options.

Is there a "waterfall" type of design approach to doing this? Fall through a validation loop and provide errors/successes depending on the values found/tested true?

I'm already testing the status_line value from the HEAD request to the URL as well as using URI to make sure the scheme and protocol are both valid (i.e. accept http:// reject file://). I don't quite know how to elegantly "fall through" an error/validation loop, testing for proper syntax in the template, and then test for acceptable proper values within that syntax.
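One shape such a "fall through" validation could take is a rule table: each rule names a key, says whether it is required, and carries a check plus the message to send back. Every rule is tried in order and all failures are collected, so the reply can list everything that is wrong at once. The sketch below is an assumption-laden illustration, not a tested design: `validate_template` and the rule fields are hypothetical names, and the checks only encode the limits mentioned above (http:// URLs, a single digit no greater than 3 for maxdepth, and the three compression names from the template comment).

```perl
use strict;
use warnings;

# Hypothetical rule table, checked in order.
my @rules = (
    {
        key      => 'url',
        required => 1,
        check    => sub { $_[0] =~ m{^http://}i },
        msg      => "'url' must start with http://",
    },
    {
        key      => 'maxdepth',
        required => 1,
        check    => sub { $_[0] =~ /^\d$/ && $_[0] >= 1 && $_[0] <= 3 },
        msg      => "'maxdepth' must be a single digit from 1 to 3",
    },
    {
        key      => 'compression',
        required => 0,
        check    => sub { $_[0] =~ /^(?:zlib|DOC|None)$/i },
        msg      => "'compression' must be zlib, DOC or None",
    },
);

# Return a list of error messages; an empty list means the template is fine.
sub validate_template {
    my ($template) = @_;
    my @errors;
    for my $rule (@rules) {
        my $value = $template->{ $rule->{key} };
        if (!defined $value || $value !~ /\S/) {
            # Missing or blank: only a problem for required keys.
            push @errors, "'$rule->{key}' is required but was left empty"
                if $rule->{required};
            next;
        }
        push @errors, $rule->{msg} unless $rule->{check}->($value);
    }
    return @errors;
}

my @errors = validate_template({ url => 'file:///etc/passwd', compression => 'zlib' });
print "$_\n" for @errors;
```

The returned messages could be joined into the body of the reply email, and the build step would run only when the list comes back empty.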

Did that make sense? I hope I provided enough code and explanation to be useful to the other monks.