PerlMonks  

Re^2: Convert HTML Email message to plain text

by bajangerry (Sexton)
on Oct 29, 2009 at 17:52 UTC ( [id://804004] )


in reply to Re: Convert HTML Email message to plain text
in thread Convert HTML Email message to plain text

OK, that is more than a little confusing for me, as there seem to be 50 ways to do this.

Re^3: Convert HTML Email message to plain text
by merlyn (Sage) on Oct 30, 2009 at 00:48 UTC
    There are at least 50 ways to do it because there's no "right" answer. You're losing semantic information when you go from HTML to plain text, so you have to be the judge of how lossy you want the conversion to be, and what proxies you want in the text form for things that cannot be represented.

    -- Randal L. Schwartz, Perl hacker

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

      There are people who write HTML emails and actually use the HTML for the semantic information it can provide? Amazing. I always thought 90% of such emails were written by people who may know HTML, but have no idea how to code in it, or how to configure their mail program to send anything else, and that the remaining 10% use it for BLINK and coloured fonts.

      "I write HTML email for the semantic information" sounds like "I read Playboy for the articles". It's possible, but no significant number of people actually do it.

        No. Imagine the semantic content differences of:
        • <b>He</b> had three pieces of pie.
        • He <b>had</b> three pieces of pie.
        • He had <b>three</b> pieces of pie.
        • He had three <b>pieces</b> of pie.
        • He had three pieces of <b>pie</b>.
        And now remove the bolding. You can't distinguish those. The semantic content is in the bolding. If you had only ASCII, you might add *markup* or _something_ to provide the right emphasis. But if you just rip out the HTML coding, you have indeed lost something.
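A minimal sketch of that trade-off, in Python rather than Perl so that it can be checked on its own. It assumes the only markup worth preserving is inline emphasis, and that `*bold*` and `_italic_` are acceptable plain-text proxies. The names `MARKERS`, `TextExtractor` and `html_to_text`, and the placement of the bold tags in the sample sentences, are hypothetical illustrations and not anything from the thread.

```python
from html.parser import HTMLParser

# Assumed plain-text proxies for inline emphasis (a convention, not a standard).
MARKERS = {"b": "*", "strong": "*", "i": "_", "em": "_"}


class TextExtractor(HTMLParser):
    """Collect the text of an HTML fragment, optionally keeping emphasis."""

    def __init__(self, keep_emphasis):
        super().__init__(convert_charrefs=True)
        self.keep_emphasis = keep_emphasis
        self.parts = []

    def _marker(self, tag):
        # Unknown tags, or any tag when emphasis is dropped, contribute nothing.
        return MARKERS.get(tag, "") if self.keep_emphasis else ""

    def handle_starttag(self, tag, attrs):
        self.parts.append(self._marker(tag))

    def handle_endtag(self, tag):
        self.parts.append(self._marker(tag))

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html, keep_emphasis=True):
    """Convert an HTML fragment to plain text."""
    parser = TextExtractor(keep_emphasis)
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


# Hypothetical sample input: the same sentence with different words in bold.
variants = [
    "<b>He</b> had three pieces of pie.",
    "He had <b>three</b> pieces of pie.",
]

# Ripping out the tags collapses both variants into one identical string.
stripped = {html_to_text(v, keep_emphasis=False) for v in variants}

# Keeping a proxy for bold leaves them distinguishable.
marked = [html_to_text(v) for v in variants]
```

Here `stripped` ends up holding a single string, while `marked` holds two different ones, which is the loss being described. Anything beyond inline emphasis (links, lists, tables) would need its own proxy, and choosing those is the judgment call.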

        This is quite common. I don't know why you think it's rare. :)

        -- Randal L. Schwartz, Perl hacker

