PerlMonks  

Re^3: Any pure-perl html to text? (Or: missing a perl equivalent to 'lynx -dump')

by blazar (Canon)
on Oct 15, 2006 at 17:36 UTC ( [id://578405]=note )


in reply to Re^2: Any pure-perl html to text? (Or: missing a perl equivalent to 'lynx -dump')
in thread Any pure-perl html to text? (Or: missing a perl equivalent to 'lynx -dump')

Gosh! You didn't even take a look at what lynx -dump produces, did you?

He didn't claim it would produce the same output, nor even a comparable one. He just pointed out that it has a method for outputting plain text, which it does. Indeed, I think it more or less amounts to calling as_text() on the whole parse tree of the wanted page. Lynx and its variations are full-fledged browsers, so it is natural that they go beyond the capabilities of a simple parser and aim at presentation-friendly output. You may hack or roll your own by inserting horizontal and vertical whitespace suitably around individual elements before printing them with as_text(). Needless to say, doing this properly is going to be quite a lot of work, but maybe just inserting a newline after every single element would already make everything clearer. Oh, and at the very least take care of paragraphs and breaks. If you also want line wrapping, that's a whole other story. (A call for Text::Wrap, most probably.)
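To make the idea concrete, here is a rough sketch of the "insert whitespace, then as_text" approach. It assumes the module under discussion is HTML::TreeBuilder (whose nodes are HTML::Element objects providing as_text); the html_to_text name, the list of block-level tags, and the default of 72 columns are my own choices for illustration, not anything from the thread. It makes no attempt to match what lynx -dump produces.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use Text::Wrap qw(wrap);

# Assumption: these are the elements we want followed by a blank line.
# The list is illustrative, adjust it to taste.
my %block = map { $_ => 1 } qw(p div h1 h2 h3 h4 h5 h6 li tr blockquote pre);

# Hypothetical helper: parse HTML, pad elements with newlines, wrap the text.
sub html_to_text {
    my ($html, $columns) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Turn every <br> into a literal newline text node.
    $_->replace_with("\n") for $tree->look_down(_tag => 'br');

    # Append a blank line to the content of every block-level element.
    $_->push_content("\n\n") for $tree->look_down(sub { $block{ $_[0]->tag } });

    my $text = $tree->as_text;
    $tree->delete;    # free the parse tree explicitly

    # Nested blocks pile up newlines: squeeze them, then trim both ends.
    $text =~ s/\n{3,}/\n\n/g;
    $text =~ s/\A\s+|\s+\z//g;

    # Wrap each line separately so the newlines inserted above survive.
    local $Text::Wrap::columns = $columns || 72;
    return join("\n", map { length($_) ? wrap('', '', $_) : '' }
                      split /\n/, $text, -1) . "\n";
}

print html_to_text('<h1>Title</h1><p>First paragraph<br>second line</p>', 60);
```

Tables, lists with bullets, link footnotes and the like are exactly the "quite a lot of work" part and are left out here.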

OTOH did you look at the outcome of your post (as is recommended)?!? It screwed up the whole view for this thread. Use <code> tags around the stuff you pasted, although it's not strictly code. At least that has smart line wrap...

Update: the post has been fixed, hence the above comment does not apply any more.

Ciao
