PerlMonks  

You're both right and wrong here. Let me first quote a bit of the HTML 4 specification:

4.3 The text/html content type

HTML documents are sent over the Internet as a sequence of bytes accompanied by encoding information (described in the section on character encodings). The structure of the transmission, termed a message entity, is defined by RFC2045 and RFC2616. A message entity with a content type of "text/html" represents an HTML document.

Here, the HTML specification explicitly indicates a reliance upon RFC2045 (and by association, RFC2046).

An HTML document is stored as a native plain-text document either on a user's PC or a web server. Newlines at this point are native to the OS.

Do not make the assumption that HTTP servers are filesystem-based! It just so happens that a filesystem-oriented document root, with URI contents represented as files, is the simplest and most prevalent way HTTP servers are implemented, but do not blindly assume that HTTP was written with the intent that a URI should map to a file on a filesystem. That's just not accurate.

But because most HTTP servers do this, they have to take certain steps to ensure that a requested resource is delivered in a fashion consistent with HTTP and MIME. This usually involves mapping a file extension to a MIME type and delivering the contents of the file accordingly. If an HTML document is stored on a filesystem with native newlines, an HTTP server that relies on filesystem-oriented content should take steps to ensure that "special cases" like newlines are addressed as well. Conversion should be performed by the web server as a consequence of the web server's filesystem-oriented implementation of an HTTP service.
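
To make that concrete, here is a rough sketch of the kind of step I mean, written in Python only because it keeps the example short. The function name to_wire_form is made up for illustration and is not taken from any real server; it assumes the file was read in binary mode and that its encoding is ASCII-compatible (UTF-8, Latin-1 and the like), so that newline bytes can be matched directly.

```python
import mimetypes
import re

# Matches any of the three common newline conventions.
# CRLF is listed first so it is consumed as a single unit.
_ANY_NEWLINE = re.compile(rb"\r\n|\r|\n")


def to_wire_form(filename, raw):
    """Hypothetical helper: return (content_type, body) for a file's bytes.

    The MIME type is guessed from the file extension. Bodies of type
    text/* have every newline rewritten as CRLF; anything else is
    passed through untouched.
    Assumption: `raw` is in an ASCII-compatible encoding (not UTF-16).
    """
    content_type, _encoding = mimetypes.guess_type(filename)
    if content_type is None:
        # Unknown extension: treat the content as opaque binary data.
        content_type = "application/octet-stream"
    if content_type.startswith("text/"):
        raw = _ANY_NEWLINE.sub(b"\r\n", raw)
    return content_type, raw
```

The point of the sketch is only that the conversion is keyed off the MIME type, not the file: the same bytes on disk stay native, and only the text/* entity that goes out over the wire is normalized.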

What is the alternative? Turn HTML files into what are effectively binary files due to their quirky (with respect to the native text format) line endings? If not, what else is supposed to be converting newlines here?

Think of this from the user agent's point of view. It's expecting content with the MIME type of text/html. MIME (RFC2046) explicitly states that text/* content, text/html included, must use CRLF line endings in its canonical form. How does it get that way? If the server isn't responsible for it, what is?

The HTTP servers have assumed this responsibility as a consequence of choosing a filesystem-oriented mechanism for storing content. They have to live with content stored with native line endings and should thus be responsible for getting that converted into something appropriate when delivering text/* content over the Internet.


In reply to Re: Re: Re: Re: Re: Handling Mac, Unix, Win/DOS newlines at readtime... by Fastolfe
in thread Handling Mac, Unix, Win/DOS newlines at readtime... by strredwolf
