Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Re: Please critique this script -- Read emails and write URLs into bookmarks

by talexb (Chancellor)
on Oct 06, 2016 at 18:39 UTC ( [id://1173428]=note: print w/replies, xml ) Need Help??


in reply to Please critique this script -- Read emails and write URLs into bookmarks

Hmm .. well, I don't see any error-checking, you're using a regex for something that isn't clear (re-formatting the content of the subject line?), and you're assuming there is only going to be a single URI in each message.

Alex / talexb / Toronto

Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

  • Comment on Re: Please critique this script -- Read emails and write URLs into bookmarks

Replies are listed 'Best First'.
Re^2: Please critique this script -- Read emails and write URLs into bookmarks
by Anonymous Monk on Oct 06, 2016 at 18:59 UTC

    There is only one valid URL per message. This is a situational constraint (also written in the spec). I know that because these are 'Emails-to-self' that contain at max two URLs (one in the plain text part, one in the HTML part if it exists)

    Maybe it's a bit difficult to read but the regex is transforming the subject into a suitable filename, escaping problematic characters.

    I tried to be sensible about the error checking and assumed the modules do error handling for their part, outside of that there's hardly any code in its own right.
    Now that I think of it though, input should have been checked. That was a mistake.
    What else do you think needs explicit error handling?

      I would check the following:

      • Input arguments -- if they are files, do they exist, and are they readable and the right type?
      • Object from Path::Tiny -- check that this is valid.
      • Perhaps check after the call to URI::Find->new so see that there is something in the @uris list?
      To me, that's basic defensive programming. Check everything -- you never know when Something Weird will happen; it's better that your script captures it rather than the script dying for some unknown reason.

      Alex / talexb / Toronto

      Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

        I light of my brainfart yesterday I felt compelled to show a revised script including proper error handling:

        I blame it on the meds!

        #!/usr/bin/perl # read emails I sent myself containing just a link # and make a HTML-redirect based bookmark out of them # # SPEC # read given files as internet messages # use subject as title and filename, escape as needed # use one URL in body as href (if there a two they are both the same) # and output one HTML file for each bookmark use strict; use warnings; use Carp qw(cluck confess); use Encode; use Email::Simple; use HTML::Entities; use Path::Tiny; use URI::Find; use version; our $VERSION = qv('0.0.1'); if ( @ARGV == 0 ) { confess 'Invalid number of arguments' } foreach my $in_filepath (@ARGV) { if ( not path($in_filepath)->exists ) { cluck "$in_filepath: Input file '$in_filepath' does not exist" +; next; } my $in_file = Path::Tiny->new($in_filepath)->slurp or confess "$in_filepath: Fatal error in module"; my $message = Email::Simple->new($in_file) or confess "$in_filepath: Fatal error in module"; if ( not length $message->as_string or $message->as_string =~ /\A\s*\z/ ) { cluck "$in_filepath: Invalid message"; next; } if ( not length $message->header('Subject') ) { cluck "$in_filepath: Invalid subject"; next; } my $subject = decode( 'MIME-Header', $message->header('Subject') ) or confess "$in_filepath: Fatal error in module"; my $title = HTML::Entities::encode($subject) or confess "$in_filepath: Fatal error in module"; my $out_filepath = $subject =~ s/[%\/\?<>\\:\*\|":]/_/gr . '.URL.H +TML'; # TODO IMPR URL::Search might be nicer than URI::Find for some cas +es my @uris; my $finder = URI::Find->new( sub { push @uris, $_[0] } ); $finder->find( \$message->body ); if ( @uris == 0 ) { cluck "$in_filepath: Invalid number of URIs"; +next; } if ( path($out_filepath)->exists ) { cluck "$in_filepath: Output file '$out_filepath' exists"; next; } path($out_filepath)->spew( "<!DOCTYPE html><title>$title</title><meta http-equiv=\"refresh\" cont +ent=\"0; url=@uris\">", ) or confess "$in_filepath: Fatal error in module"; }

        I forgot to check URI::Find->new as well. Thanks for that.

        About the others, maybe I should elaborate my ideology here:

        I usually prefer technical error messages to 'clean' ones. Especially when I'm not perfectly sure where the errors could come from or if I don't have the time to check all odds and ends. The rationale being that my own error messages could only be worse.

        The errors are handled. They produce descriptive (if not overly verbose) messages. But in a script of this nature developer and user are usually quite close. So this verbosity turns out to be very useful.

        However, checking the number of input arguments and checking if $uris is not empty are clearly omissions which I will fix.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1173428]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (6)
As of 2024-04-19 08:39 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found