
Re: How do I know what encoding was used for form input?

by itub (Priest)
on Aug 10, 2005 at 20:15 UTC (id://482747)


in reply to How do I know what encoding was used for form input?

You can never be 100% sure, but well-behaved user agents usually submit the data in the encoding that was used in the page containing the form. You can also specify which charsets you accept by using the "accept-charset" attribute of the form element. Some user agents might also specify the charset in the Content-Type header for POST requests.
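A minimal sketch of that order of preference, in Python: use the charset from the request's Content-Type header when the user agent sent one, otherwise assume the encoding of the page that contained the form. The function name `form_charset` and the `page_charset` default are hypothetical, chosen only for illustration.

```python
from email.message import Message


def form_charset(content_type, page_charset="utf-8"):
    """Return the charset named in a Content-Type header value, or fall
    back to the encoding of the page that served the form (an assumption,
    since the header often carries no charset parameter)."""
    if not content_type:
        return page_charset
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lower-cased charset parameter,
    # or None when the header has no charset parameter.
    return msg.get_content_charset() or page_charset
```

The fallback is only a guess, which is why the result should be treated as a hint and not as proof of how the data was encoded.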

Re^2: How do I know what encoding was used for form input?
by jhourcle (Prior) on Aug 11, 2005 at 13:11 UTC

    Although I'll second the suggestion of using the 'accept-charset' attribute, I'm not so sure about user agents responding in the same encoding as the page.

    From RFC 2616 (HTTP/1.1):

    I'm still not sure how to handle form data in the QUERY_STRING -- from section 2.1 of RFC 2396 (URI Syntax):

    (If anyone knows of a followup RFC, I'd love to know what the number is)

    And for the original poster, although Joel's article is a good start, it's intended as a quick overview -- I'd also suggest you take a look at "A tutorial on character code issues".

      That's right, there's no real standard way of telling a client how the URI for a GET should be encoded (and even if there is one for POST, it seems most clients don't comply). However, practical experience with mainstream browsers leads to this conclusion (from http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html):
      By now (2005) the robust way to deal with this issue is to send out forms pages encoded in utf-8, expecting the forms input to be submitted back using that encoding. This has been in practical use for a couple of years now (e.g. at Google) and can be expected to work with any current HTML4-compatible browser. However, there are other browsers still in use which don't fit this description, so it still seems relevant to look at the theory and compare it with observations.

      I've used this approach for several websites and it works with all the (reasonably recent) browsers I've tested.
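On the receiving side, that approach might look like the following Python sketch: decode the submitted data strictly as UTF-8, and only if that fails retry with a legacy encoding. The name `decode_form` and the ISO-8859-1 fallback are assumptions for illustration, not something the quoted page prescribes.

```python
from urllib.parse import parse_qsl


def decode_form(raw, fallback="iso-8859-1"):
    """Parse an urlencoded form body or query string into (name, value)
    pairs, expecting UTF-8 because the form page was served as UTF-8."""
    try:
        # errors="strict" makes invalid UTF-8 raise instead of being
        # silently replaced, so a non-compliant browser is noticed.
        return parse_qsl(raw, keep_blank_values=True,
                         encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        # Assumed fallback for browsers that ignored the page encoding.
        return parse_qsl(raw, keep_blank_values=True,
                         encoding=fallback, errors="strict")
```

Strict decoding is the point of the design: text that is valid UTF-8 is unlikely to have been produced by accident in a single-byte encoding, so a successful decode is reasonable evidence that the browser complied.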

      "In theory, theory and practice are the same, but in practice, they never are."

        Thanks for the reference -- I know sgifford had given it as well, but he seemed to just be quoting it, rather than mentioning the information it contained.

        I hadn't seen the 'buzzword' concept presented before, but it seems like a simple hack to validate what's being sent back to you.
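One reading of the 'buzzword' idea, sketched in Python: put a hidden field with a known non-ASCII value in the form, then check which candidate encoding turns the submitted bytes back into that value. The field name `enc_check`, the buzzword itself, and the candidate list are all hypothetical choices for this sketch.

```python
from urllib.parse import unquote_to_bytes

# Known value placed in a hidden form field: e-acute plus the euro sign.
# Its byte sequence differs in each candidate encoding below.
BUZZWORD = "\u00e9\u20ac"


def detect_charset(raw_query, field="enc_check",
                   candidates=("utf-8", "windows-1252", "iso-8859-15")):
    """Return the first candidate encoding under which the hidden field
    decodes to BUZZWORD, or None if the field is missing or unmatched."""
    for pair in raw_query.split("&"):
        name, _, value = pair.partition("=")
        if name != field:
            continue
        # Work on the raw bytes; decoding is what we are trying to decide.
        raw = unquote_to_bytes(value.replace("+", " "))
        for enc in candidates:
            try:
                if raw.decode(enc) == BUZZWORD:
                    return enc
            except UnicodeDecodeError:
                continue
    return None
```

The detected encoding can then be used to decode the remaining fields. A result of None means the check was inconclusive and the input should be treated with suspicion.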
