PerlMonks  

Table data

by Anonymous Monk
on Oct 25, 2010 at 14:00 UTC ( #867239=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

How can I get the table data into an array? The HTML page code is given below.

<html>
<title>Hprices</title>
<form style="margin: 0px; padding: 0px;" id="selectHistoricalFormId" action="" method="post">
  <fieldset><legend>Search Historical prices</legend>
    <table style="margin-top: 4px;">
      <tbody>
        <tr>
          <td><label>From: </label></td>
          <td><input class="hasDatepicker embed" id="FromDate" type="text"><img title="..." alt="..." src="historical_prices_files/calendar.gif" class="ui-datepicker-trigger"></td>
          <td><label>To: </label></td>
          <td><input class="hasDatepicker embed" id="ToDate" type="text"><img title="..." alt="..." src="historical_prices_files/calendar.gif" class="ui-datepicker-trigger"></td>
          <td><input class="searchButton" onclick="historical.load();return false;" value="Search" type="button"></td>
          <td> <div class="linkIcon_xls" id="exportExcel">&nbsp;</div> </td>
        </tr>
      </tbody>
    </table>
  </fieldset>
</form>
<div id="historicalOutput"><table id="historicalTable" class="tablesorter" border="0" cellpadding="0" cellspacing="0">
  <thead>
    <tr>
      <th class="header headerSortUp" title="Date">Date</th>
      <th class="header" title="High price">High price</th>
      <th class="header" title="Low price">Low price</th>
      <th class="header" title="Closing price">Closing price</th>
      <th class="header" title="Average price">Average price</th>
      <th class="header" title="Total volume">Total volume</th>
      <th class="header" title="Turnover">Turnover</th>
    </tr>
  </thead>
  <tbody>
    <tr class="odd" id="historicalTable-"> <td>2010-08-03</td> <td>294.77</td> <td>294.77</td> <td>294.77</td> <td></td> <td>1</td> <td>13,000</td> </tr>
    <tr class="even" id="historicalTable-"> <td>2010-08-02</td> <td>294.77</td> <td>294.77</td> <td>294.77</td> <td></td> <td>1</td> <td>143,801</td> </tr>
    <tr class="odd" id="historicalTable-"> <td>2010-07-30</td> <td>294.77</td> <td>294.77</td> <td>294.77</td> <td></td> <td>1</td> <td>219,800</td> </tr>
    <tr class="even" id="historicalTable-"> <td>2010-07-29</td> <td>294.77</td> <td>294.77</td> <td>294.77</td> <td></td> <td>1</td> <td>70,800</td> </tr>
    <tr class="odd" id="historicalTable-"> <td>2010-07-28</td> <td>302.14</td> <td>302.14</td> <td>302.14</td> <td></td> <td>1</td> <td>1,924,345</td> </tr>
    <tr class="even" id="historicalTable-"> <td>2010-07-27</td> <td>302.14</td> <td>302.14</td> <td>302.14</td> <td></td> <td>1</td> <td>54,325</td> </tr>
    <tr class="odd" id="historicalTable-"> <td>2010-07-26</td> <td>302.14</td> <td>302.14</td> <td>302.14</td> <td></td> <td>1</td> <td>67,855</td> </tr>
  </tbody>
</table></div>
</html>

I tried the Perl code below to get the values, but it didn't work:

my $body_part="";
while(<DATA>)
{
    chomp($_);
    $body_part .= $_;
}
$body_part =~s/^.*?<div id=\"historicalOutput\" .*?>.*<tbody>(.*?)<\/tbody>/$1/sigm;
my @data = $body_part =~ m/<tr>(.*?)<\/tr>/sigm;
my @consolidated_data=();
foreach my $newdata (@data)
{
    @consolidated_data = $newdata =~ m/<td.*?>(.*?)<\/td>/sigm;
}

Thanks

Re: Table data
by jethro (Monsignor) on Oct 25, 2010 at 14:23 UTC
    foreach ... @consolidated_data = ... }

    This would overwrite previous results every time the foreach loop executes. You need

    foreach ... push @consolidated_data, ... }

    Hint: Use Data::Dumper to print out the variables in your script. Then you will find out exactly where the script starts to misbehave, and often why. You will also find out whether your regexes match what you want them to match.
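
The advice above can be sketched as follows. This is only an illustration, not the original poster's final code: the two-row `$body_part` string is a made-up, shortened sample, and storing each row as an array reference (rather than pushing onto one flat list) is an assumption about the desired result. Note also that the pattern `<tr>` in the question cannot match rows written as `<tr class="odd" ...>`, so the sketch uses `<tr[^>]*>` instead.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical two-row sample in the same shape as the question's table body.
my $body_part = join '',
    '<tr class="odd" id="historicalTable-"> <td>2010-08-03</td> <td>294.77</td> <td></td> <td>13,000</td> </tr>',
    '<tr class="even" id="historicalTable-"> <td>2010-08-02</td> <td>294.77</td> <td></td> <td>143,801</td> </tr>';

# <tr[^>]*> also matches rows that carry attributes such as class="odd".
my @data = $body_part =~ m{<tr[^>]*>(.*?)</tr>}sig;

my @consolidated_data;
foreach my $newdata (@data) {
    # push keeps the earlier rows; a plain assignment would replace them.
    push @consolidated_data, [ $newdata =~ m{<td[^>]*>(.*?)</td>}sig ];
}

# Dump the structure to check what the regexes actually captured.
print Dumper(\@consolidated_data);
```

Dumping `@data` right after the first match is a quick way to see whether the row regex matched anything at all before looking at the cell regex.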

Re: Table data
by sundialsvc4 (Monsignor) on Oct 25, 2010 at 14:51 UTC

    To troubleshoot things like these, you need to see them. Use an HTML dumper (possibly on the client side, e.g. FireBug) to see what the HTML data being sent actually consists of. Then, when you are preparing the HTML output, use Data::Dumper (as previously noted) so that you can see what the data structure you are trying to output actually contains. Then dump the HTML output so that you can see what you built.

    It isn’t enough to simply observe that the browser’s page output does not look as you expected it to. There are just too many possible reasons for that for you to “hazard a guess.” That is a waste of time.

Re: Table data
by Monkomatic (Sexton) on Oct 26, 2010 at 01:31 UTC

    This might help you out a bit.

    http://perldoc.perl.org/perlretut.html

    moritz was kind enough to suggest it for my problem, which was similar, and with it I was able to come up with a solution:

    http://www.perlmonks.org/?node_id=866173

    Especially the part on global matching

    Global matching

    Perl will assign the matches temporarily to $1, $2, $3 until the next match. You can keep them more permanently by building an array and moving on to the next match.

    I know the matching regex is a little complicated for this example, but you get the general idea.

    $x = "cat dog house"; # 3 words
    $x =~ /^\s*(\w+)\s+(\w+)\s+(\w+)\s*$/; # matches,
                                           # $1 = 'cat'
                                           # $2 = 'dog'
                                           # $3 = 'house'

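
As a small sketch of the "build an array" idea mentioned above, the same sample string can be matched globally. The variable names `@words` and `@collected` are made up for this illustration.

```perl
use strict;
use warnings;

my $x = "cat dog house";

# In list context, a /g match returns every captured group at once.
my @words = $x =~ /(\w+)/g;

# Equivalent loop form: $1 is only valid until the next match, so copy it out.
my @collected;
while ( $x =~ /(\w+)/g ) {
    push @collected, $1;
}

print "@words\n";      # cat dog house
print "@collected\n";  # cat dog house
```
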
Re: Table data
by mojotoad (Monsignor) on Oct 26, 2010 at 02:15 UTC
    Hi there, Perlbeginner1. This works:

    use HTML::TableExtract;
    my $te = HTML::TableExtract->new(
        attribs => { id => 'historicalTable' },
    );
    $te->parse_file(\*DATA);
    $te->first_table_found->dump(1);

    Cheers,
    Matt

    edit: some non-pm context is here. recent post history is here.
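
Since the question asks for the data in an array rather than a dump, here is a minimal sketch of one way to collect the rows with the same module. It assumes HTML::TableExtract is installed from CPAN; the `$html` string is a hypothetical, shortened copy of the question's table, and the `|` separator is chosen only for display.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical, shortened copy of the question's table.
my $html = <<'HTML';
<div id="historicalOutput"><table id="historicalTable" class="tablesorter">
<tbody>
<tr class="odd"> <td>2010-08-03</td> <td>294.77</td> <td></td> <td>13,000</td> </tr>
<tr class="even"> <td>2010-08-02</td> <td>294.77</td> <td></td> <td>143,801</td> </tr>
</tbody>
</table></div>
HTML

my $te = HTML::TableExtract->new( attribs => { id => 'historicalTable' } );
$te->parse($html);
$te->eof;

my $table = $te->first_table_found
    or die "No table with id 'historicalTable' found\n";

# rows() returns one array reference per table row.
my @rows = $table->rows;

foreach my $row (@rows) {
    # Empty cells may come back undefined, so substitute an empty string.
    print join( '|', map { defined $_ ? $_ : '' } @$row ), "\n";
}
```

Each element of `@rows` is an array reference holding the cell texts of one row, so `$rows[0][0]` would be the first cell of the first row.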
