Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation

A Little review for a little DBI and CGI?

by coolmichael (Deacon)
on Mar 28, 2001 at 05:19 UTC ( #67704=perlcraft: print w/replies, xml ) Need Help??

   1: #!/usr/bin/perl -w
   3: #
   4: # This is my first useful piece of code, and I would like
   5: # comments from people in the know, and anyone else.
   6: # Specifically, what have I don't wrong, what have I done
   7: # well? Is there a better way to do it, without using a
   8: # database? 
   9: # 
  10: # I know about some problems, like I should probably be 
  11: # using html::template, rather than having the first and 
  12: # second half of the pages sitting in different files
  13: # Also, it doesn't find all the results. I'll post 
  14: # some sample data below, for you to look at.
  15: # Thank you in advance.
  16: # Once all the changes have been made, should I update
  17: # this post to show the improvements you've suggested?
  18: # This was my first go with CGI, databases, and SQL.
  19: # I will be grateful for any suggestions.
  21: ####
  22: #Data Sample
  23: #Here are the first three lines of the CSV file
  24: #"H0001-12","0810827085",$40.00,"FUNCTIONAL SINGING VOICE",,"MUSI"
  25: #"H0001-13","0921513097",$5.00,"DIGNITY OF DUST - FREIBERG",,"ENGL"
  26: #"H0001-14","0919626726",$5.00,"HDOGRAM","PK PAGE","ENGL"
  28: #!/usr/bin/perl -w
  30: use strict;
  31: use DBI;
  32: use CGI;
  34: $|++;
  36: my @names;
  37: my $connectstr;
  38: my $dbh;
  39: my $sql;
  40: my $sth;
  41: my $count=0;
  42: my $q;
  43: my $search;
  44: my $criteria;
  46: $connectstr="DBI:CSV:f_dir=/home/httpd/data;" .
  47:                         "csv_eol=\n;" .
  48:                         "csv_sep_char=,;" .
  49:                         "csv_quote_char=\"";
  50: @names=qw(Consign ISBN Price Title Author Subject);
  52: $q=CGI->new;
  53: print $q->header(-expires=>"-1d");
  55: open HTML, "startpage" or die "opening startpage: $!\n";
  56: print while(<HTML>);
  57: close HTML or warn "closing startpage: $!\n";
  59: $search=$1 if ($q->param('search') =~ /^(Title|Author|ISBN|Subject)$/);
  60: die "from bad input!\n" unless ($search);
  62: $criteria=$1 if($q->param('criteria') =~ /(\w*)/);
  63: die "from bad input!\n" unless ($criteria);
  64: $criteria =~ tr/a-z/A-Z/;
  66: print $q->p("Searching for $search matching $criteria");
  68: $dbh=DBI->connect($connectstr)
  69: 	or die "opening connection: $DBI::errstr; stopped\n";
  70: $dbh->{'csv_tables'}->{'onshelf'} = {'col_names' => [@names]}; 
  72: $sql="SELECT * FROM onshelf WHERE $search like ?";
  74: $sth=$dbh->prepare($sql)
  75: 	or die "preparing $sql: $DBI::errstr stopped\n"; 
  77: $count=$sth->execute("%$criteria%")
  78: 	or die "executing $sql: $DBI::errstr stopped\n";
  80: $sth->bind_columns(\my ($consign, $isbn, $price, $title, $author, $subject));
  82: print $q->p("Found $count results");
  84: print $q->start_table({-border=>"1"});
  85: while($sth->fetch())
  86: {
  87:         print $q->start_Tr(),
  88:               $q->td({-width=>'90', -valign=>"top"}, $consign),
  89:               $q->td({-width=>'100', -valign=>"top"}, $isbn),
  90:               $q->td({-width=>'180', -valign=>"top"}, $title),
  91:               $q->td({-width=>'150', -valign=>"top"}, $author),
  92:               $q->td({-width=>'50', -valign=>"top"}, $subject),
  93:               $q->td({-width=>'60', -align=>"right", -valign=>"top"},$price),
  94:               $q->end_Tr();
  95: }
  96: print $q->end_table();
  98: $dbh->disconnect();
 100: open HTML, "endpage" or die "opening end page: $!\n";
 101: print while(<HTML>);
 102: close HTML or warn "closing HTML: $!\n";
 105: #
 106: #Updated March 27, as per tye's suggestions. Thanks Tye.
 107: #Updated March 27th, again, as per dkubb's suggestions.
 108: #

Replies are listed 'Best First'.
(dkubb) Re: (2) A Little review for a little DBI and CGI?
by dkubb (Deacon) on Mar 28, 2001 at 10:08 UTC

    coolmichael, if this is your first useful piece of code, as you say, you are doing quite well. I can see that the advice from more experienced monks is influencing your coding style. With that said, I have a few comments:

    • You should place a T beside the -w on your first line. This will turn on taint mode, which should be on inside a CGI, and for that matter, any perl script that accepts user supplied data. Read perlsec to see why this switch is so important.

    • On line 38, you use something called indirect object syntax. This is personal preference, but I try to always use direct object syntax, like:

      my $q = CGI->new;

      For a good explanation of why this could cause problems, check out this warning in perlobj.

    • You are doing alot of checking against the DBI calls, die'ing if there is a problem. You should look into using the RaiseError attribute when creating your database handle. In DBI::connect it is the 4th argument, but you can also embed it into your DSN definition on line 56, like so:

      my $connectstr="DBI:CSV(RaiseError=>1):f_dir=/home/httpd/data;" .

      It's your choice how to use this, but the net effect is a reduction of debugging code.

    • Have you thought of embedding the column names inside the CSV file? DBD::CSV will read the first line of the file, and figure out the column names for you.

    • I noticed that you had the column names in two places, once in the regex, and once in the @names initialization. If you wanted to, you could abstract this out and keep the names in a single place. For example, you could do something like this:

      use constant COLUMNS => [qw(Consign ISBN Price Title Author Subject)]; my $regex = join '|', @{COLUMNS()}; my ($search) = $q->param('search') =~ /^${regex}$/; die 'Bad search criteria' unless defined $search;

      The only thing about this technique, is it will open up your database to be searchable by the Consign column. This one is totally your preference, it's just that when I see the same data in two places red flags are raised, as there is the chance for that information to diverge.

    • I think some of the things in the DSN are unecessary. I believe with regular quote-comma format, like the one you are using, the only necessary attribute to define is csv_eol. The others you are defining are that module's documented defaults.

    • On line 65, you are placing a variable called $criteria right into the SQL statement. You are also getting the variable right from the user. If someone wanted to be malicious, imagine if they submitted something like the following for "criteria":

      %" AND something LIKE "sensitive data

      Your SQL query would then become:

      SELECT * FROM onshelf WHERE Title LIKE "%%" AND something LIKE "sensitive data%"

      Obviously, this isn't a real world example, but it illustrates my point, which is to always validate the user input AND try to use placeholders in your SQL query:

      my $statement = qq{ SELECT * FROM onshelf WHERE $search LIKE ? }; my $sth = $dbh->prepare($statement); $sth->execute("%$criteria%");

      The difference with this method is that the information passed to $sth->execute will be quoted. Combine this with checking the criteria parameter for validity, will make your code more, but not absolutely, secure. Never trust information you are getting from the user.

    • This one is more of a neat trick. One thing that always bugged me with bind_columns was that I'd need to define my lexically scoped variables, then bind them in two steps. That was until I figured out I could do this:

      $sth->bind_columns(\my($consign, $isbn, $price, $title, $author, $subject));
    • You may want to reconsider using a SELECT * in your SQL query. There was an excellent thread a few months ago regarding this: Topics in Perl Programming: Table-Mutation Tolerant Database Fetches with DBI. It's a node I would definately recommend reading, it was very educational for me.

    That's all the suggestions I have for now. All in all your code is quite good, please don't take the length of the review as an insult. I wanted to explain each point so that you, and others, understood the significance of each point I was trying to make.

      I am thrilled to have so many comments. Thank you dkubb.

      I've got taint checking on now, and I use $q=CGI->new. Eventually, I want to write a function that dies gracefully, printing an error to the web browser before it dies. I don't think I want to use CGI::Carp "fatalsToBrowser" as that gives too much information to the nasty people that might be using the stuff. I've changed the sql statement and untainted $criteria, so it has to be only letters and numbers. It was a bit of a pain getting the place holder to work, but eventually...

      I don't take such a long critique personally. I'm quite happy to recieve positive and constructive comments. Thank you again.

      Unfortunatly, now that it's working so well, I've discovered a bug and need some help. The data is comming from a paradox database. Paradox is able to export it to CSV but isn't smart enough to escape the quote in the titles. I've been looking for a regex on the monastery to add escapes, but haven't found one yet. Do you have any suggestions?

        Ugh. What if you have a title of 'Why "foo", "bar", and "baz"?' and it gets written to a CSV file as: ...,16,"Why "foo", "bar", and "baz"?",20,...
        then how do you expect to be able to tell which "s need to be escaped??

        Well, I'll try me best... Let's assume that no title contains a string matching /",\S/ and that there is never whitespace after a comma in your CSV file.

        s{ \G( [^",]+ | "(.*?)" )(,(?=\S)|$) }{ if( ! $2 ) { $1.$3; } else { my $f= $2; $f =~ s/"/""/g; '"'.$f.'"'.$2; } }gx;

        If you do have whitespace after commas, then an alternate solution would be to assume that all titles that contain "s always contain an even number of quotes and that the first character after the first quote of a pair isn't a comma:

        s{ \G( [^",]+ | "((?: [^"]+ | "" | "[^",][^"]*" ))*" )(,|$) }{ if( ! $2 ) { $1.$3; } else { my $f= $2; $f =~ s/"/""/g; '"'.$f.'"'.$2; } }gx;
        I hope one of those helps. (Sorry, they aren't tested. Just tell me which one matches your situation and I'll be happy to help if there are bugs.)

                - tye (but my friends call me "Tye")
        Add escapes? quotemeta() is your friend.

(tye)Re: A Little review for a little DBI and CGI?
by tye (Sage) on Mar 28, 2001 at 05:31 UTC

    48: open HTML, "startpage" || die "opening startpage: $!\n";

    That should be or instead of ||. Common mistake which means your die will never run. This is the only serious error I found (though repeated a few times).

    49: print while(<HTML>);

    I was going to suggest an alternative here but I decided I actually like your method best.

    50: close HTML || die "closing startpage: $!\n";

    I tend to use warn rather than die when reporting a failure of close. Usually close only fails when a buffer flush fails (usually due to lack of disk space) so there are cases where you want to take more drastic action, but I don't see this as one of them.

    All in all a pretty clean chunk of code.

            - tye (but my friends call me "Tye")
      It isn't that I doubt you, but I've been getting die error messages from the script while testing it. I tried a few things on the command line

      C:\WINDOWS\Desktop>perl -e "2==1 or die qw(dying)" dying at -e line 1. C:\WINDOWS\Desktop>perl -e "2==1 || die qw(dying)" dying at -e line 1. C:\WINDOWS\Desktop>
      I'm using ActiveState Perl v5.6.0 on win98 (above) and 5.005_03 linux below
      [michael@subtext michael]$ perl -e "2==1 or die qw(dying);" dying at -e line 1. [michael@subtext michael]$ perl -e "2==1 || die qw(dying);" dying at -e line 1.
      My copy of the camel is at work (second perl book I bought, the first one was the llama). I'll see what it says about || and or tomorrow.

      update:You're completly right Tye. I didn't know the brackets would change things. I've made the change as you suggested.

      Also, thank you for your comments. In the reply, I forgot to thank you. That'll teach me to post at 1:00 am.

      Edit: chipmunk 2001-03-30

        open HTML, "startpage" || die "opening startpage: $!\n";

        is parsed as: open HTML, ( "startpage" || die "opening startpage: $!\n" )
        but the expression "startpage" is always true so the die will never be executed. To test, just change "startpage" to "smartmage" or some other non-existant file name.

        The problem is a matter of precedence.

        An alternate solution is: open( HTML, "startpage" ) || die "opening startpage: $!\n";
        but I'd still use or for that for the sake of robustness (in case someone comes along and remove the parens).

                - tye (but my friends call me "Tye")
Re: A Little review for a little DBI and CGI?
by andye (Curate) on Mar 28, 2001 at 18:48 UTC

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlcraft [id://67704]
Approved by root
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others scrutinizing the Monastery: (6)
As of 2018-06-20 13:49 GMT
Find Nodes?
    Voting Booth?
    Should cpanminus be part of the standard Perl release?

    Results (116 votes). Check out past polls.