PerlMonks  

Using HTML::TableExtract

by RayRay459 (Pilgrim)
on Sep 07, 2001 at 02:01 UTC

RayRay459 has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks
I am writing a script that will query a URL, get the HTML from the page, and then parse it with HTML::TableExtract. I am getting an error in my log file that says:
HTML::TableExtract=HASH(0x1df15b8)

Does anyone know how I can fix this, or maybe show me a better way to get my information? The information I need is in a table on this page.
Here's a copy of my code. Please, any help would be appreciated. THNX.
Ray
# Ray Espinoza
# GetAPC.pl
# This script will take a list of ip addresses of APCs and
# send the desired output to a textfile.
######################################################################
#!D:\perl\bin -w

use LWP::UserAgent;
use HTTP::Request;
use HTML::TableExtract;

######################################################################
# Asking for Output file names.
######################################################################
print "Enter the name of the output file:\n";
chomp($OutFile = <STDIN>);

######################################################################
# Opens the outfile for appending
######################################################################
open(OUT,">$OutFile") || die "Can't create $OutFile: $!";

######################################################################
# This will log into the apc and grab the html and stuff it into an
# array and then i will take that array and add a new line to every
# line and stuff that into a variable
######################################################################
$ua = new LWP::UserAgent;
$url = 'http://192.168.1.1/pdumaina';
#print $url;
$request = new HTTP::Request('GET',$url);
$request->authorization_basic('login', 'password');
$ua->timeout(10);
$response = $ua->request($request);
$responsecode = $response->code();
if ($responsecode != 200) {
    print "Failed Request: $responsecode\n";
} else {
    # login successful, let's get the html code into a variable
    @ARRAY_OF_LINES = (split "\n", $ua->request($request)->as_string);
    foreach $line (@ARRAY_OF_LINES) {
        $html_code .= $line . "\n";
    }
}

######################################################################
# This will attempt to use HTML::TableExtract to look for these specific
# headers in the html tables and print out the values in the table.
######################################################################
$te = new HTML::TableExtract( headers => [qw(Outlet Device Name)] );
$te->parse($html_code);
print OUT $te;
close(OUT) || warn "Couldn't close $OutFile";

Replies are listed 'Best First'.
Re: Using HTML::TableExtract
by Chmrr (Vicar) on Sep 07, 2001 at 02:13 UTC

    When you do a print OUT $te; towards the bottom, you are attempting to print out the HTML::TableExtract object itself, not the results that it has garnered! From the HTML::TableExtract documentation, the following in place of that line might produce something a little more useful:

    foreach $ts ($te->table_states) {
        print "Table found at ", join(',', $ts->coords), ":\n";
        foreach $row ($ts->rows) {
            print "   ", join(',', @$row), "\n";
        }
    }

    In summary, you probably have gotten the tables you wanted -- you just need to do something useful with them.
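To make the difference concrete, here is a minimal sketch that parses an inline HTML string instead of a live page. The table contents are made up for illustration, and the header list ['Outlet', 'Device Name'] is an assumption about what the APC page contains; note that qw(Outlet Device Name) in the question produces three separate patterns, not two. The sketch uses the tables method, which newer releases of HTML::TableExtract document as the replacement for table_states.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical stand-in for the page fetched from the APC unit.
my $html = <<'HTML';
<table>
<tr><th>Outlet</th><th>Device Name</th><th>Status</th></tr>
<tr><td>1</td><td>router</td><td>On</td></tr>
<tr><td>2</td><td>switch</td><td>Off</td></tr>
</table>
HTML

my $te = HTML::TableExtract->new( headers => [ 'Outlet', 'Device Name' ] );
$te->parse($html);
$te->eof;

# Printing the object itself only shows its class and memory address,
# which is what ended up in the log file in the question.
print "Object: $te\n";

# The extracted data lives in the rows of each matched table.
my @lines;
foreach my $ts ( $te->tables ) {
    foreach my $row ( $ts->rows ) {
        # Empty cells come back as undef, so map them to '' before joining.
        push @lines, join( ',', map { defined $_ ? $_ : '' } @$row );
    }
}
print "$_\n" for @lines;
```

Only the columns whose headers matched are returned, so the Status column in the sample table is dropped.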

     
    perl -e 'print "I love $^X$\"$]!$/"#$&V"+@( NO CARRIER'

Warnings Re: Using HTML::TableExtract
by Zaxo (Archbishop) on Sep 07, 2001 at 07:38 UTC

    A minor point having nothing to do with HTML::TableExtract.

    You should change your splatline to unix form:

    #!/usr/bin/perl -w
    and make it the first line of the script. Windows doesn't pay attention to the splatline, but perl for Windows does read options from it. You are probably not seeing warnings that would help you debug.
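As an aside that is not from the thread: since Perl 5.6 the warnings pragma turns warnings on from inside the script, so it does not depend on how the shebang line is read. A minimal sketch follows; the $html_code variable mirrors the name in the question, and the undefined value stands in for what happens when the request fails and nothing is ever assigned.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect warnings in an array instead of sending them to STDERR,
# so the effect of the pragma is easy to see.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

my $html_code;                        # never assigned, as after a failed request
my $text = "page: " . $html_code;     # triggers an "uninitialized value" warning

print "warnings seen: ", scalar(@seen), "\n";
print $seen[0] if @seen;
```

Without warnings enabled, the concatenation above would silently produce "page: " and the empty input would go unnoticed.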

    After Compline,
    Zaxo

      That is not the case if you are setting the -T switch for taint checking -- the shebang line must point to the Perl interpreter, otherwise your script will fail. So it's a good habit to get into. You can use forward slashes (#! c:/perl/bin/perl.exe -w), so it doesn't have to look too ugly.

      This does depend on how the script is called. For instance, if you say perl -T script it'll work fine, but if you use Win32 file associations (such as running from Win32 Apache, which is where I was bitten by this), the script will fail. It took me ages to track that down when it first happened. In fact, I didn't solve it. It was only much later, reading the ActiveState (AS) documentation, that I serendipitously found the solution.

      --
      g r i n d e r
Re: Using HTML::TableExtract
by tfrayner (Curate) on Aug 30, 2002 at 18:18 UTC
    Hi, Okay, coming rather late (about a year) to the discussion...

    I can't bear to see a thread go unfinished, especially when I've blatantly copied the original poster's code and edited it into working (the example web site used below is of personal interest to me and my loved ones):

    #!/usr/bin/perl -w

    use strict;
    use LWP::UserAgent;
    use HTTP::Request;
    use HTML::TableExtract;

    my $ua = LWP::UserAgent->new(timeout => 10);
    my $url = 'http://sourceforge.net/tracker/?atid=440764&group_id=36855&func=browse';
    my $request = HTTP::Request->new('GET',$url);
    my $response = $ua->request($request);

    if ($response->is_success){
        my $te = new HTML::TableExtract( headers => ['Request ID','Summary'] );
        $te->parse($response->content);
        foreach my $ts ($te->table_states) {
            foreach my $row ($ts->rows) {
                print (join("\t", @$row)."\n");
            }
        }
    } else {
        print "Error: ".$response->status_line."\n";
    }

    N.B. this node isn't worth ++ing; it would only encourage this kind of behaviour :-)
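The original question wanted the rows in a text file, not on STDOUT. A minimal sketch of that last step follows, assuming rows shaped like the array references that $ts->rows returns. The write_rows helper and the sample data are hypothetical, and a temporary file stands in for the name the user would type. Note that '>' truncates the file, while '>>' would be needed to really append.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write each row (an array reference) as one
# tab-separated line and return the number of rows written.
sub write_rows {
    my ( $path, @rows ) = @_;
    open( my $out, '>', $path ) or die "Can't create $path: $!";
    foreach my $row (@rows) {
        # Empty table cells arrive as undef; print them as empty fields.
        print {$out} join( "\t", map { defined $_ ? $_ : '' } @$row ), "\n";
    }
    close($out) or warn "Couldn't close $path: $!";
    return scalar @rows;
}

# Made-up rows standing in for extracted table data.
my @rows = ( [ 1, 'router' ], [ 2, undef ] );

# A temporary file is used here in place of a user-supplied name.
my ( undef, $file ) = tempfile( UNLINK => 1 );
my $count = write_rows( $file, @rows );
print "wrote $count rows to $file\n";
```

The three-argument open with a lexical filehandle avoids surprises if the file name contains characters such as > or |.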

    Tim

      beautiful! (a decade later!)
