
(dkubb) Re: (2) A Little review for a little DBI and CGI?

by dkubb (Deacon)
on Mar 28, 2001 at 10:08 UTC

in reply to A Little review for a little DBI and CGI?

coolmichael, if this is your first useful piece of code, as you say, you are doing quite well. I can see that the advice from more experienced monks is influencing your coding style. With that said, I have a few comments:

  • You should place a T beside the -w on your first line (making it -wT). This will turn on taint mode, which should be on inside a CGI and, for that matter, any Perl script that accepts user-supplied data. Read perlsec to see why this switch is so important.

  • On line 38, you use something called indirect object syntax. This is personal preference, but I try to always use direct object syntax, like:

    my $q = CGI->new;

    For a good explanation of why this could cause problems, check out this warning in perlobj.

  • You are doing a lot of checking against the DBI calls, dying if there is a problem. You should look into using the RaiseError attribute when creating your database handle. In DBI->connect it goes in the attribute hash passed as the 4th argument, but you can also embed it into your DSN definition on line 56, like so:

    my $connectstr="DBI:CSV(RaiseError=>1):f_dir=/home/httpd/data;" .

    It's your choice how to use this, but the net effect is a reduction of debugging code.

  • Have you thought of embedding the column names inside the CSV file? DBD::CSV will read the first line of the file, and figure out the column names for you.

  • I noticed that you had the column names in two places, once in the regex, and once in the @names initialization. If you wanted to, you could abstract this out and keep the names in a single place. For example, you could do something like this:

    use constant COLUMNS => [qw(Consign ISBN Price Title Author Subject)];
    my $regex = join '|', @{COLUMNS()};
    # Group and capture the alternation so the anchors apply to every name
    my ($search) = $q->param('search') =~ /^($regex)$/;
    die 'Bad search criteria' unless defined $search;

    The one drawback of this technique is that it opens up your database to be searchable by the Consign column. This one is totally your preference; it's just that when I see the same data in two places, red flags are raised, as there is a chance for that information to diverge.
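
    If you would rather not expose Consign, here is a minimal sketch of one way to keep a single list and still derive a narrower whitelist from it. The names %SEARCHABLE and validate_column are mine, not from your script, and excluding Consign is only my guess at what you want:

    ```perl
    use strict;
    use warnings;

    # All column names live here, in one place.
    use constant COLUMNS => qw(Consign ISBN Price Title Author Subject);

    # Hypothetical: the subset a user may search on (everything but Consign).
    # The values are our own copies of the names, not user input.
    my %SEARCHABLE = map { $_ => $_ } grep { $_ ne 'Consign' } COLUMNS;

    # Returns our copy of the column name if it is whitelisted, undef otherwise.
    sub validate_column {
        my ($input) = @_;
        return undef unless defined $input;
        return $SEARCHABLE{$input};
    }

    for my $try (qw(Title Consign bogus)) {
        my $col = validate_column($try);
        print "$try: ", (defined $col ? 'ok' : 'rejected'), "\n";
    }
    ```

    Because the function hands back the name stored in the hash rather than the string the user sent, the value that ends up in the SQL is one you wrote yourself.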

  • I think some of the things in the DSN are unnecessary. I believe with regular quote-comma format, like the one you are using, the only necessary attribute to define is csv_eol. The others you are defining are that module's documented defaults.

  • On line 65, you are placing a variable called $criteria right into the SQL statement. You are also getting the variable right from the user. If someone wanted to be malicious, imagine if they submitted something like the following for "criteria":

    %" AND something LIKE "sensitive data

    Your SQL query would then become:

    SELECT * FROM onshelf WHERE Title LIKE "%%" AND something LIKE "sensitive data%"

    Obviously, this isn't a real world example, but it illustrates my point, which is to always validate the user input AND try to use placeholders in your SQL query:

    my $statement = qq{
        SELECT *
        FROM   onshelf
        WHERE  $search LIKE ?
    };
    my $sth = $dbh->prepare($statement);
    $sth->execute("%$criteria%");

    The difference with this method is that the value passed to $sth->execute will be quoted for you. Combining this with a validity check on the criteria parameter will make your code more, but not absolutely, secure. Never trust information you are getting from the user.
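
    As a rough sketch of that validity check, something like the following could sit in front of the execute call. The helper name clean_criteria, the allowed character set, and the 50-character limit are all assumptions of mine; pick whatever fits your titles:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: accept only letters, digits and spaces, 1 to 50 chars.
    # Returns the captured string (which is also what untaints it under -T),
    # or undef if the input does not match.
    sub clean_criteria {
        my ($raw) = @_;
        return undef unless defined $raw;
        my ($clean) = $raw =~ /\A([A-Za-z0-9 ]{1,50})\z/;
        return $clean;
    }

    my $criteria = clean_criteria('perl cookbook');
    die 'Bad search criteria' unless defined $criteria;
    print "searching for: $criteria\n";
    ```

    The malicious string from the example above contains % and " characters, so it fails the match and never reaches the database.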

  • This one is more of a neat trick. One thing that always bugged me with bind_columns was that I'd need to define my lexically scoped variables, then bind them in two steps. That was until I figured out I could do this:

    $sth->bind_columns(\my($consign, $isbn, $price, $title, $author, $subject));
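
    To see why this works without a database at hand, here is a small sketch: a backslash in front of my(...) declares the variables and returns a list of references to them in one step. The @refs array and the sample values are only for illustration:

    ```perl
    use strict;
    use warnings;

    # Declare two lexicals and take references to both in a single expression.
    my @refs = \my($title, $author);

    # Writing through the references fills the variables, which is
    # what a fetch does to bound columns.
    ${ $refs[0] } = 'Some Title';
    ${ $refs[1] } = 'Some Author';

    print "$title by $author\n";
    ```
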
  • You may want to reconsider using a SELECT * in your SQL query. There was an excellent thread a few months ago regarding this: Topics in Perl Programming: Table-Mutation Tolerant Database Fetches with DBI. It's a node I would definitely recommend reading; it was very educational for me.

That's all the suggestions I have for now. All in all, your code is quite good, so please don't take the length of the review as an insult. I wanted to explain each point so that you, and others, understood its significance.

Replies are listed 'Best First'.
Re: (dkubb) Re: (2) A Little review for a little DBI and CGI?
by coolmichael (Deacon) on Mar 28, 2001 at 13:38 UTC
    I am thrilled to have so many comments. Thank you dkubb.

    I've got taint checking on now, and I use $q=CGI->new. Eventually, I want to write a function that dies gracefully, printing an error to the web browser before it dies. I don't think I want to use CGI::Carp "fatalsToBrowser" as that gives too much information to the nasty people that might be using the stuff. I've changed the SQL statement and untainted $criteria, so it has to be only letters and numbers. It was a bit of a pain getting the placeholder to work, but eventually...

    I don't take such a long critique personally. I'm quite happy to receive positive and constructive comments. Thank you again.

    Unfortunately, now that it's working so well, I've discovered a bug and need some help. The data is coming from a Paradox database. Paradox is able to export it to CSV but isn't smart enough to escape the quotes in the titles. I've been looking for a regex on the monastery to add escapes, but haven't found one yet. Do you have any suggestions?

      Ugh. What if you have a title of 'Why "foo", "bar", and "baz"?' and it gets written to a CSV file as: ...,16,"Why "foo", "bar", and "baz"?",20,...
      then how do you expect to be able to tell which "s need to be escaped??

      Well, I'll try me best... Let's assume that no title contains a string matching /",\S/ and that there is never whitespace after a comma in your CSV file.

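      s{
          \G( [^",]+ | "(.*?)" )(,(?=\S)|$)
      }{
          # Save the captures first; the inner s/// below would clobber them
          my( $field, $inner, $sep )= ( $1, $2, $3 );
          if( ! defined $inner ) {
              $field.$sep;
          } else {
              $inner =~ s/"/""/g;
              '"'.$inner.'"'.$sep;
          }
      }gex;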

      If you do have whitespace after commas, then an alternate solution would be to assume that all titles that contain "s always contain an even number of quotes and that the first character after the first quote of a pair isn't a comma:

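      s{
          \G( [^",]+ | "((?: [^"]+ | "" | "[^",][^"]*" )*)" )(,|$)
      }{
          # Save the captures first; the inner s/// below would clobber them
          my( $field, $inner, $sep )= ( $1, $2, $3 );
          if( ! defined $inner ) {
              $field.$sep;
          } else {
              $inner =~ s/"/""/g;
              '"'.$inner.'"'.$sep;
          }
      }gex;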
      I hope one of those helps. (Sorry, they aren't tested. Just tell me which one matches your situation and I'll be happy to help if there are bugs.)
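
      For reference, this is a sketch of the output format both substitutions are aiming for, assuming the usual quote-comma convention of doubling each embedded quote. The helper name csv_quote_field is made up, and it only handles the easy half of the problem, where the raw title is already in hand:

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: quote one raw field for a quote-comma CSV file
      # by doubling every embedded double quote and wrapping the result.
      sub csv_quote_field {
          my ($raw) = @_;
          (my $escaped = $raw) =~ s/"/""/g;
          return qq{"$escaped"};
      }

      print csv_quote_field('Why "foo", "bar", and "baz"?'), "\n";
      ```

      The hard half, finding where each field ends in the already exported file, is what the guesses above are for.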

              - tye (but my friends call me "Tye")
        Unfortunately, neither situation applies. Some of the titles have only one quote in them, some have three or four. Some of the quotes have commas after them. The solution I've decided to go with is editing the Paradox database to get rid of the commas. Find and Replace, yeah, baby, yeah. Groovy.

        I think it'll be the only reliable way to do it, and it'll probably fix some of the errors the database has been having.

        /me crosses his fingers.

      Add escapes? quotemeta() is your friend.

