PerlMonks  

Re^5: HTML parsing module handles known and unknown encoding

by grantm (Parson)
on Nov 17, 2011 at 23:42 UTC


in reply to Re^4: HTML parsing module handles known and unknown encoding
in thread HTML parsing module handles known and unknown encoding

I was merely pointing out that if the HTML includes an encoding ("charset") declaration, then XML::LibXML's HTML parsing methods (parse_html_string and friends) will honour it. I realise that's not much use if the HTML doesn't include a declaration.
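For what it's worth, here is a minimal sketch of the two cases. The sample markup and variable names are made up for illustration, and I'm assuming the load_html interface with its documented "recover" and "encoding" options. The first parse relies on the charset declaration in the markup; the second has no declaration, so the encoding is passed explicitly, which only helps if you already know it from somewhere else (an HTTP header, say).

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample: raw ISO-8859-1 bytes, with the charset declared
# in a meta tag that appears before any non-ASCII content.
my $declared = qq{<html><head>}
    . qq{<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">}
    . qq{<title>caf\xE9</title></head><body><p>text</p></body></html>};

# recover => 1 asks libxml2 to carry on past sloppy real-world markup.
my $doc1   = XML::LibXML->load_html(string => $declared, recover => 1);
my $title1 = $doc1->findvalue('//title');    # decoded character string

# Same bytes but no declaration: supply the encoding ourselves.
# (Assumption: the caller knows it is ISO-8859-1.)
my $undeclared = qq{<html><head><title>caf\xE9</title></head>}
    . qq{<body><p>text</p></body></html>};

my $doc2 = XML::LibXML->load_html(
    string   => $undeclared,
    recover  => 1,
    encoding => 'iso-8859-1',
);
my $title2 = $doc2->findvalue('//title');

# Print as UTF-8 so the accented character survives the trip to the terminal.
binmode STDOUT, ':encoding(UTF-8)';
print "$title1\n$title2\n";
```

If neither a declaration nor outside knowledge is available, the encoding has to be guessed before parsing, and that is a separate problem from anything the parser itself can solve.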