PerlMonks  

Re^3: HTML stripper in WWW::Mechanize doesn't seem to work

by Nkuvu (Priest)
on Aug 01, 2005 at 02:19 UTC


in reply to Re^2: HTML stripper in WWW::Mechanize doesn't seem to work
in thread HTML stripper in WWW::Mechanize doesn't seem to work

You can do this, but you'll have to do something like a join first.

Consider the simpler example:

    my @mango = ('one', 'two', 'three', 'penguin');
    my $result = @mango;
    print "Result is $result\n"; # prints 4
    $result = join ' ', @mango;
    print "Result is $result\n"; # prints "one two three penguin"

If the content subroutine returns an array and you assign the result to a scalar, you get the number of elements in the array rather than its contents. For your particular code you'll want something like:

    $stripped_html = join ' ', $webcrawler->content( format => "text" );
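
To make the context point concrete, here is a minimal sketch. The sub returns_array is a hypothetical stand-in that really does return an array; it is not WWW::Mechanize's content method, whose return value I am only assuming here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method that returns an array.
sub returns_array {
    my @words = ('one', 'two', 'three');
    return @words;
}

my $count   = returns_array();            # scalar context: number of elements
my ($first) = returns_array();            # list context: first element
my $joined  = join ' ', returns_array();  # list context: all elements, joined

print "count:  $count\n";
print "first:  $first\n";
print "joined: $joined\n";
```

The parentheses around $first are what put the call in list context; without them you get the count again.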

Replies are listed 'Best First'.
Re^4: HTML stripper in WWW::Mechanize doesn't seem to work
by sk (Curate) on Aug 01, 2005 at 03:49 UTC
    I don't think content() returns an array. I checked the source code and this is what I see:

        sub content {
            my $self = shift;
            my $content = $self->{content};
            return $content unless $self->is_html;
            ### More stuff there...
            .....
            .....
            return $content;
        }

    It looks like it just returns a scalar, and I don't see a reference to an array anywhere either. So he should be able to store it in one element of his array.

      Good point. As mentioned previously, when I first replied I didn't have WWW::Mechanize installed, so I based my reply on the OP's comment. Later this evening I installed the module and, what do you know, the original script worked just fine. I also tested the script with my suggestion for using an array and join, and that worked too. Of course, joining a single scalar just returns it unchanged, so that obscured the success of the original script (my change worked despite the join, not because of it).
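
A quick sketch of why the join was harmless. The sub returns_scalar is a hypothetical stand-in for a method that returns a single scalar, as content() appears to do in the source quoted above; the page text is made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method that returns one scalar.
sub returns_scalar {
    my $text = "Some page text";
    return $text;
}

my $direct = returns_scalar();            # plain scalar assignment
my $joined = join ' ', returns_scalar();  # join over a one-element list

# With only one element there is nothing to put the separator between,
# so both variables hold the same string.
print "identical\n" if $direct eq $joined;
```

So with a scalar-returning method, the join version and the plain assignment give the same result.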
