Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

help deriving mysql query from perl script

by alison-g (Novice)
on Jul 30, 2012 at 06:43 UTC ( #984364=perlquestion: print w/ replies, xml ) Need Help??
alison-g has asked for the wisdom of the Perl Monks concerning the following question:

Greetings great and generous monks

I've inherited the task of extracting data from a mysql dbase.

My predecessor used this perl script called from a browser. It just keeps timing out - and I don't know if it's really a time out or a script error and I know nothing about perl and I can't get my cpanel shell access to work and they want the data yesterday and they are a non profit group recording marine creature sightings and have no money and AAARGH.

I thought the simplest way might just be to figure out what the mysql query equivalent would be then I can just run it directly on the dbase

<if that's even possible??

anywhere here is the code and insert relevant deity here bless youse all!!

#!/usr/bin/perl $whichmethod = $ENV{'REQUEST_METHOD'}; if($whichmethod eq "GET"){ $forminfo = $ENV{"QUERY_STRING"}; }else{ read(STDIN, $forminfo, $ENV{'CONTENT_LENGTH'}); } @key_value_pairs = split(/&/,$forminfo); foreach $pair (@key_value_pairs){ ($key,$value) = split(/=/,$pair); $value =~ s/\+/ /g; $value =~ s/%([0-9a-fA-F][0-9a-fA-F])/pack("C", hex($1))/eg +; $FORM_DATA{$key} = $value; } $where_clause = $FORM_DATA{'where_clause'}; $group_clause = $FORM_DATA{'group_clause'}; print "Content-type: text/html\n\n"; $cginame="http://$ENV{'SERVER_NAME'}$ENV{'SCRIPT_NAME'}"; print "<form action='$cginame' method=post>\n"; print "<p><b>Enter the where clause (boolean condition)</b>:<br>"; print "For example, event <> \"Training\""; print "<textarea rows='3' cols='80' name=where_clause></textarea>\n"; print "<p><b>Enter the group clause (list of fields separated by #).</ +b><br>"; print "Potential fields are any from the rwsurvey, rwlit, rworganism, +rwlocation, rwRHSites, rwSeasons.<br>"; print "Examples are:<br>"; print "&nbsp;&nbsp;rh_name [Reef Health site name]"; print "&nbsp;&nbsp;year(date); month(date);<br>"; print "&nbsp;&nbsp;season; if(month(date)<12,year(date),year(date)+1) +[the year but with December included in the following year]<br>"; print "Therefore seasonal data summaries would be:<br>"; print "&nbsp;&nbsp;rh_name#if(month(date)<12,year(date),year(date)+1)# +season<br>"; print "<textarea rows='3' cols='80' name=group_clause></textarea>\n"; print "<br><input type=submit></form>\n"; die unless length($where_clause) > 0; $where_clause = "rwsurvey.type = 'LIT' and ".$where_clause; @groups = split('#',$group_clause); $group_clause = join(",",@groups); use DBI; $dbh = DBI->connect("DBI:mysql:reefwatch:xxxxxxxxxxxxxx", "xxxxxxxx", +"xxxxxxx"); if (!$dbh) { $dbh = DBI->connect("DBI:mysql:reefwatch:localhost", "root", "mysq +l"); die "Cannot connect: $D: $DBI::errstr\n" unless $dbh; print "<p>Connected locally</p>"; } $linked_tables = "rwsurvey, rwlit, rworganism, rwRHSites, rwSeasons, r +wlocation where rwlit.organism_id = rworganism.org_id and rwsurvey.su +rvey_id = rwlit.survey_id and rwRHSites.site_id = rwsurvey.site_id an +d rwRHSites.exposure = rwsurvey.exposure and rwlocation.site_id = rws +urvey.site_id and season_id = if(month(date)>0,floor(month(date) % 12 +/3),-1) and "; $output_counter= 0; ############################################################ # # Get summary of surveys and metres for each group # ############################################################ $stmt = "select $group_clause, count(distinct rwlit.survey_id), count( +distinct CONCAT(rwsurvey.survey_id,'_',metre)) from $linked_tables $w +here_clause GROUP BY $group_clause"; #print "<p>$stmt"; #com $sth = $dbh->prepare($stmt) || print "Can't prepare $stmt: ".$dbh->err +str()."\n"; $sth->execute || print "can't execute the query: ".$sth->errstr()."\n" +; while (@res = $sth->fetchrow) { $line = join(',',@res); push (@group_val_list, $line); } #print @group_val_list; #com ############################################################ # # Extract the data for each group # ############################################################ foreach $line (@group_val_list) { $output_counter++; print "A,$output_counter,$line<br>"; @vals = split (/,/, $line); $last_id = $#vals - 2; @vals = @vals[0..$last_id]; $extra_clause = ""; for ($i=0; $i<=$#vals; $i++) { $extra_clause .= " and $groups[$i] = '$vals[$i]'"; } ############################################################ # # Type B: Raw data # ############################################################ $stmt = "SELECT rwsurvey.survey_id, metre, transition, code from $ +linked_tables $where_clause $extra_clause order by rwsurvey.survey_id +, metre, transition"; # print "<p>$stmt</p>"; $sth = $dbh->prepare($stmt) || print "Can't prepare $stmt: ".$dbh- +>errstr()."\n"; $sth->execute || print "can't execute the query: ".$sth->errstr(). +"\n"; $last_survey_id = -1; $total_metres = 0; $total_surveys = 0; undef %cover; while (($survey_id, $metre, $transition, $code) = $sth->fetchrow) { $output_counter++; print "B,$output_counter,$survey_id,$metre,$transition,$code<br>"; if ($survey_id != $last_survey_id) { $total_surveys++; if ($last_survey_id != -1) { $total_metres += ($last_metre - $min_metre + 1); } $last = ($metre-1)*100; $min_metre = $metre; if (($transition != 0) or ($code != 'START')) { $output_counter++; print "D,$output_counter,'Problem: $survey_id, $last_survey_id +, $transition, $code'<br>"; } } else { $current = ($metre-1) * 100 + $transition; $last_metre = $metre; $delta = $current - $last; #print "delta = $delta<br>"; $cover{$code} += $delta; if ($delta < 0) { $output_counter++; print "D,$output_counter,'Problem: negative $current, $last, $ +metre, $transition, $code'<br>"; } $last = $current; } $last_survey_id = $survey_id; } $total_metres += ($last_metre - $min_metre + 1); #print "<p>$total_surveys surveys (total of $total_metres metres)" +; ############################################################ # # Type C: Percentage cover # ############################################################ while (($lifeform, $total_length) = each(%cover)) { $percent_cover = sprintf("%0.1f",$total_length / $total_metres); $val_list = join(',', @vals); $output_counter++; print "C,$output_counter,$val_list,$lifeform,$total_length,$percen +t_cover<br>"; } }

Comment on help deriving mysql query from perl script
Download Code
Re: help deriving mysql query from perl script
by Corion (Pope) on Jul 30, 2012 at 07:03 UTC

    The easiest way to find out what SQL queries your script sends to the database to look at the MySQL query log.

    The other easy way is to turn on SQL tracing on your database handle in DBI.

    Before turning to such heavy handed approaches, especially when you're not even sure whether the script works or not, I would rip out the "web" part of the program.

    Luckily, your program is basically structured as two parts, the part above

    $where_clause = "rwsurvey.type = 'LIT' and ".$where_clause;

    ... which mostly deals with generating the web form, and the part below, which deals with talking to the database and output. So you can just take the lower part, fix up $where_clause with the appropriate value, and you've eliminated the web server as source of errors and can inspect the error messages on the console.

      Thanks Corion...

      Unfortunately "So you can just take the lower part, fix up $where_clause with the appropriate value" is just what I don't know how to do

      Sorry forgot to mention don't know MYSQL either - just enough to run other folks' queries

        Then maybe first use the other half of the program to fill/create $where_clause and then print $where_clause instead of running it against the database.

        If you don't know how to do that either, I don't know how to be of more help to you. You will have to show us the output and log entries you get when you run the modified/split program and show us the source of that split program.

Re: help deriving mysql query from perl script
by tospo (Hermit) on Jul 30, 2012 at 08:44 UTC

    Hmmm, the bad news is: if you don't know any Perl or SQL then I think you will really struggle to get this done - sorry for being a bit brutal here....

    Anyway, since this seems to be for a good cause, I think you can get some support here for sure but it may require a bit of work.

    So, first of all: when you run this in a browser, can you give us the exact "where clause" and "group clause" you enter into the web form? There are two statements for queries to run against the database in that script and they both take these free-form chunks of SQL code. Since you say yourself that you don't know SQ, my suspicion would be that you entered SQL chunks that don't do what you think they are doing and that the DB is choking on a large amount of data to process.

    To get tot he bottom of all this, you need to familiarise yourself with a tool for interacting with your database directly, i.e. someting to run queries against the DB and see the outcome. I would use the MySQL shell in a terminal and I also use this one here quite a bit if I want a GUI and graphical output: DbVisualizer - there is a free version that does all you need to do.
    There are of course plenty of other tools to choose from.

Re: help deriving mysql query from perl script
by patcat88 (Deacon) on Jul 30, 2012 at 22:32 UTC
    IANAL, but if you are a non-profit, you might be able to hire cough cough solicit people by giving them tax exempt receipts for their "donations" of services to your org.

      Thanks everyone!

      I figured out that I just needed to get the data that would be entered in the two user entry forms and incorporate them into the sql queries (and I do know how to run them in various remote sql thingies and phpmyadmin) But I couldn't really figure out 1. what exactly the queries would be and 2. the bits that perl was doing - all the whiles and pushes and the output etc. So I hard coded $where_clause and $group_clause which are: event <> "Training" and (year(date)-2011)*12+month(date) > 6 and rh_name#if(month(date)<12,year(date),year(date)+1)#season respectively. I also added $where_clause to $linked_tables rather than the queries "anding" them. Don't ask me why but it fixed syntax error barefoot doodah message number 63 trillion. Ran it from ssh and lo and behold after addressing 742 kershillion more error messages the darling actually ran and produced an output, eventually. Yeehah BUT, and that is a big but, unfortunately it produced 116,500 rows of data rather than the 1,000 to 1,500 expected! (and I suspect it hadn't finished - just filled up temp table) My perl learner badge somewhat tarnished I return to the monks in the hope that someone will prefer just love to solve it.

      Haven't I done all the hard work? ;-)

      PS here's the code I ended up running

      and if it helps anyone I could email them what last years output looked like

      #!/usr/bin/perl $where_clause = "event <> 'Training' and (year(date)-2011)*12+month(da +te) > 6"; $group_clause = "rh_name#if(month(date)<12,year(date),year(date)+1)#se +ason"; print "Content-type: text/html\n\n"; die unless length($where_clause) > 0; $where_clause = "rwsurvey.type = 'LIT' and ".$where_clause; @groups = split('#',$group_clause); $group_clause = join(",",@groups); use DBI; $dbh = DBI->connect("DBI:mysql:conserva_reefwatch:localhost", "conserv +a", "w399eWkWp1"); if (!$dbh) { $dbh = DBI->connect("DBI:mysql:reefwatch:localhost", "root", "mysq +l"); die "Cannot connect: $D: $DBI::errstr\n" unless $dbh; print "<p>Connected locally</p>"; } $linked_tables = "rwsurvey, rwlit, rworganism, rwRHSites, rwSeasons, r +wlocation where rwlit.organism_id = rworganism.org_id and rwsurvey.su +rvey_id = rwlit.survey_id and rwRHSites.site_id = rwsurvey.site_id an +d rwRHSites.exposure = rwsurvey.exposure and rwlocation.site_id = rws +urvey.site_id and season_id = if(month(date)>0,floor(month(date) % 12 +/3),-1) and ".$where_clause; $output_counter= 0; ############################################################ # # Get summary of surveys and metres for each group # ############################################################ $stmt = "select rh_name,if(month(date)<12,year(date),year(date)+1),sea +son, count(distinct rwlit.survey_id), count(distinct CONCAT(rwsurvey. +survey_id,'_',metre)) from $linked_tables GROUP BY rh_name,if(month(d +ate)<12,year(date),year(date)+1),season"; #print "<p>$stmt"; #com $sth = $dbh->prepare($stmt) || print "Can't prepare $stmt: ".$dbh->err +str()."\n"; $sth->execute || print "can't execute the query: ".$sth->errstr()."\n" +; while (@res = $sth->fetchrow) { $line = join(',',@res); push (@group_val_list, $line); } #print @group_val_list; #com ############################################################ # # Extract the data for each group # ############################################################ foreach $line (@group_val_list) { $output_counter++; print "A,$output_counter,$line<br>"; @vals = split (/,/, $line); $last_id = $#vals - 2; @vals = @vals[0..$last_id]; $extra_clause = ""; for ($i=0; $i<=$#vals; $i++) { $extra_clause .= " and $groups[$i] = '$vals[$i]'"; } ############################################################ # # Type B: Raw data # ############################################################ $stmt = "SELECT rwsurvey.survey_id, metre, transition, code from $ +linked_tables $extra_clause order by rwsurvey.survey_id, metre, tran +sition"; # print "<p>$stmt</p>"; $sth = $dbh->prepare($stmt) || print "Can't prepare $stmt: ".$dbh- +>errstr()."\n"; $sth->execute || print "can't execute the query: ".$sth->errstr(). +"\n"; $last_survey_id = -1; $total_metres = 0; $total_surveys = 0; undef %cover; while (($survey_id, $metre, $transition, $code) = $sth->fetchrow) { $output_counter++; print "B,$output_counter,$survey_id,$metre,$transition,$code<br>"; if ($survey_id != $last_survey_id) { $total_surveys++; if ($last_survey_id != -1) { $total_metres += ($last_metre - $min_metre + 1); } $last = ($metre-1)*100; $min_metre = $metre; if (($transition != 0) or ($code != 'START')) { $output_counter++; print "D,$output_counter,'Problem: $survey_id, $last_survey_id +, $transition, $code'<br>"; } } else { $current = ($metre-1) * 100 + $transition; $last_metre = $metre; $delta = $current - $last; #print "delta = $delta<br>"; $cover{$code} += $delta; if ($delta < 0) { $output_counter++; print "D,$output_counter,'Problem: negative $current, $last, $ +metre, $transition, $code'<br>"; } $last = $current; } $last_survey_id = $survey_id; } $total_metres += ($last_metre - $min_metre + 1); #print "<p>$total_surveys surveys (total of $total_metres metres)" +; ############################################################ # # Type C: Percentage cover # ############################################################ while (($lifeform, $total_length) = each(%cover)) { $percent_cover = sprintf("%0.1f",$total_length / $total_metres); $val_list = join(',', @vals); $output_counter++; print "C,$output_counter,$val_list,$lifeform,$total_length,$percen +t_cover<br>"; } } +

        Forget this Perl script for now and just run your queries through PhpMyAdmin. If the query produced 100k+ result rows than you either got the query wrong or that's just simply the amount of data in your database now. You need to look at the ouput in PhPMyAdmin to see if that makes sense. It's pretty much impossible to tell just from the script without knowing your database schema.
        There are two queries in that script: are both producing "the wrong amount" of data?
        Once you are satisfied that these queries extract the data you need and you go back to working with this script to process the outputs, you do need to work on those errors of course. There can't be that many in a short script like that. Don't be fooled by a screen-full of error messages. Often, this is just caused by one simple syntax error that could easily be identified if you posted your error messages here.
        But as I said: you need to figure out those queries first. Otherwise, the whole Perl script is pointless.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://984364]
Front-paged by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (8)
As of 2014-08-28 06:46 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (257 votes), past polls