http://www.perlmonks.org?node_id=347866

dannoura has asked for the wisdom of the Perl Monks concerning the following question:

hi,

I'm trying to print out the href attribute of all links in a document but not succeeding for some reason. I keep getting the error:

Can't use string ("</a>") as a HASH ref while "strict refs" in use

It's probably some stupid mistake, but I'm new to HTML::TokeParser and Perl, so bear with me. The code is:

#! c:\perl\bin -w
use strict;
use HTML::TokeParser;

my $p = HTML::TokeParser->new(shift||die);

while (my $token = $p->get_token) {
    print $token->[2]{'href'}, "\n" if ($token->[1] eq 'a');
}

And the relevant part of the html is:

<td nowrap>
<font size="-1"><a href="/home.cfm" target="_top" class="novisit">Home</a>
&nbsp;|&nbsp;<a href="/SpecimenSearch.cfm" target="_top" class="novisit">Specimen&nbsp;Search</a>

-----------------------------------

Any comments about coding style are welcome.

Replies are listed 'Best First'.
Re: HTML::Tokeparser and $attr problem
by Mr. Muskrat (Canon) on Apr 24, 2004 at 18:10 UTC

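    You are not checking the token type, so your test also matches the end tag </a>. For an end tag, $token->[2] is the literal tag text ("</a>") rather than an attribute hash, which is what the error is complaining about. You only want to check the starting 'a' tag.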

    #! c:\perl\bin -w
    use strict;
    use HTML::TokeParser;

    my $p = HTML::TokeParser->new(shift||die);

    while (my $token = $p->get_token) {
        print $token->[2]{'href'}, "\n"
            if ($token->[0] eq 'S' && $token->[1] eq 'a');
    }
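
    To make the token layout concrete, here is a minimal sketch that parses the HTML fragment from the question out of a string instead of a file. The inline $html and the @hrefs array are illustrative assumptions, not part of the original script; the token shapes in the comments follow the HTML::TokeParser documentation.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Assumption: the fragment from the question, inlined so the sketch is self-contained.
my $html = <<'HTML';
<td nowrap> <font size="-1"><a href="/home.cfm" target="_top" class="novisit">Home</a>
&nbsp;|&nbsp;<a href="/SpecimenSearch.cfm" target="_top" class="novisit">Specimen&nbsp;Search</a>
HTML

# Passing a reference to a scalar makes the parser read the string itself.
my $p = HTML::TokeParser->new(\$html) or die "Cannot create parser: $!";

my @hrefs;
while (my $token = $p->get_token) {
    my ($type, $tag) = @$token;

    # Start tag: ['S', $tag, \%attr, \@attrseq, $text]
    # End tag:   ['E', $tag, $text]   <- element [2] is a string such as "</a>"
    next unless $type eq 'S' && $tag eq 'a';

    # Only start tags reach this point, so element [2] is a hash ref.
    push @hrefs, $token->[2]{href} if defined $token->[2]{href};
}

print "$_\n" for @hrefs;
```

    Run against that fragment, this should print /home.cfm and /SpecimenSearch.cfm, one per line. Checking the type before touching element [2] is what keeps strict refs from complaining.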

      Thanks. That did it.

Re: HTML::Tokeparser and $attr problem
by Rex(Wrecks) (Curate) on Apr 24, 2004 at 18:29 UTC
    I see you got a solution. I just thought I would point out that I recently needed to learn HTML::TokeParser as well, and found the HTML::TokeParser Tutorial to be very valuable and accurate.

    I have also found the Tutorials page to be a nice place to go when RTFM has not helped.

    Heh, don't get me wrong, I am not trying to get you not to post here, just presenting a resource many here forget about.


    "Nothing is sure but death and taxes" I say combine the two and its death to all taxes!
Re: HTML::Tokeparser and $attr problem
by Ovid (Cardinal) on Apr 25, 2004 at 07:22 UTC

    You can make this problem easier to solve by switching to HTML::TokeParser::Simple. (Disclaimer: I wrote it)

    use strict;
    use HTML::TokeParser::Simple;

    my $p = HTML::TokeParser::Simple->new(shift||die);

    while (my $token = $p->get_token) {
        print $token->return_attr('href'), "\n" if $token->is_start_tag('a');
    }

    Note that the module is a drop-in replacement for HTML::TokeParser. If you switch to it, your code will function exactly the same and you can simply rewrite the messy bits.

    Cheers,
    Ovid
