Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Re^2: Massive Memory Leak

by afoken (Chancellor)
on Dec 08, 2009 at 13:03 UTC ( [id://811732]=note: print w/replies, xml ) Need Help??


in reply to Re: Massive Memory Leak
in thread Massive Memory Leak

Not using placeholders is not only a security problem. When you use placeholders, you allow DBI, DBD::whatever, and the database to cache a parsed form of your query. This can speed up things dramatically, even with simple SQL statements.

And you can get completely rid of any quoting problems for values you want to pass to the database. Use a placeholder and pass the actual value to execute(), no matter what it contains. You don't even have to know what quoting rules apply to your database.

Background information:
For most databases, the DBD can pass SQL statement and values separately to the database, so even the DBD does not need to know quoting rules. The database can cache a precompiled version of the query, and needs to parse the query only once, no matter how often you use it. For those unlucky databases that do not support placeholders, the DBD provides all required quoting rules, and DBI and DBD take care of injecting properly quoted values into the query. At this point, at least DBI and DBD can cache a precompiled version of the query, so DBI and DBD are still more efficient in that worse case than your code. And because a lot of the DBI/DBD code is written in C / XS, it is usually much faster that everything you can code in perl.

Oh, and by the way: What happens if one of the values you want to insert contains a single quote? Right, your code dies, because you do not quote properly. If you still insist on quoting your values manually, at least use DBIs quote method to quote the values properly.

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Replies are listed 'Best First'.
Re^3: Massive Memory Leak
by martin_ldn (Initiate) on Dec 08, 2009 at 15:53 UTC

    There are some fair points there. To get around the ' (or \) problem I simply replaced the character with a - beforehand, which was a workaround. I have now looked at the DBI docs and have modified my program to be much better in terms or architecture and elegance. I have attached it in case it is of help to someone else :-) Check it out!

    #!/usr/bin/perl use HTML::TableContentParser; use HTML::Strip; use DBI; use strict; use warnings; # Connect to database and create parser object my $db = DBI->connect ("DBI:mysql:newsbms","newsbms", "newsbms", { RaiseError => 1, PrintError => 0}); for my $path( 'modified', 'deleted' ) { print "\nProcessing the '$path' entries...\n\n"; # Create counters my $counter = 0; my $query_counter = 0; # Open the directory my $dirname = "/home/martinn/monitoring/newsBMS/$path/"; opendir(DIR, $dirname) || die ("Could not open $dirname"); # Prepare the MySQL statement my $query = "INSERT INTO"; if ($path eq 'modified') { $query = $query . " modified (id, name, title, duration, library, modified, user, rev) VALUES ( ?, ?, ?, ?, ?, ?, ?, ? )"; } if ($path eq 'deleted') { $query = $query . " deleted (name, title, duration, deleted, library) VALUES ( ?, ?, ?, ?, ? )"; } $query = $query . " ON DUPLICATE KEY UPDATE duplicates=duplicates+ +1"; my $statement = $db->prepare($query); # Loop through all files in the directory while (defined(my $filename = readdir(DIR))) { # Skip special "files": '.' and '..' next if $filename =~ /^\.\.?$/; $counter++; # Open and read the html file into a single string open(HTMLFILE, $dirname.$filename) || die ("Couldn't open $fil +ename"); binmode HTMLFILE; my $html = join("", <HTMLFILE>); close(HTMLFILE); # Parse the html table my $tcp = HTML::TableContentParser->new; my $tables = $tcp->parse($html); # Issue the MySQL queries for my $t (@$tables) { for my $r (@{ $t->{rows} }) { my @values; for my $c (@{ $r->{cells} }) { # Remove the html tags from the cells my $stripper = HTML::Strip->new(); $c->{data} = $stripper->parse($c->{data}); # Add cell to the end of the array push(@values, $c->{data}); } $statement->execute(@values); $query_counter++; # Basic activity monitor if ($query_counter % 5000 == 0) { print "Issued $query_counter MySQL queries.\n"; } } } } # Close the directory closedir(DIR); # Finish the MySQL statement $statement->finish(); print "\nDone the '$path' table.\n"; print "Processed $counter files and issued $query_counter MySQL qu +eries.\n"; } # Disconnect from the database $db->disconnect(); print "\nProgram Finished.\n";

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://811732]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others pondering the Monastery: (7)
As of 2024-04-18 20:30 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found