Re^2: Convert strings with unknown encodings to html


by Pascal666 (Scribe)

on Jul 01, 2015 at 01:32 UTC (id://1132731)

in reply to Re: Convert strings with unknown encodings to html
in thread Convert strings with unknown encodings to html

I included examples of each in the above test program. Note that some of the examples span multiple bytes: #1 below, for example, is two characters, one encoded in three bytes and the other in two. As best I can tell, the formats are:

1. UTF-8: chr(226).chr(152).chr(134), chr(195).chr(161)
2. CP1252: chr(150), chr(153)
3. HTML: '&reg;', '&AElig;'
4. ASCII: '&'
5. Unicode codepoints: chr(63743), chr(991), chr(9760)
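One way the formats listed above might be told apart, as a rough sketch rather than a tested solution: treat any chunk containing only characters below 256 as raw bytes, try a strict UTF-8 decode, fall back to CP1252 when that fails, and then emit every non-ASCII character as a numeric entity. The name to_html_guess is made up for illustration, and the sketch assumes each chunk uses a single encoding. Existing entities such as '&reg;' and a bare '&' are passed through untouched, since the two cannot be distinguished reliably.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: guess the encoding of one chunk and return ASCII-only HTML.
sub to_html_guess {
    my ($s) = @_;

    # Characters above 255 cannot be raw bytes, so only byte-like
    # strings go through the decoding step.
    if ( $s !~ /[^\x00-\xFF]/ ) {
        # Strict UTF-8 first; decode() croaks on malformed input.
        my $chars = eval {
            decode( 'UTF-8', $s, Encode::FB_CROAK | Encode::LEAVE_SRC );
        };
        # Assumption: anything that is not valid UTF-8 is CP1252.
        $s = defined $chars ? $chars : decode( 'cp1252', $s );
    }

    # Replace every non-ASCII character with a numeric character reference.
    $s =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord $1)/ge;
    return $s;
}

print to_html_guess( chr(226) . chr(152) . chr(134) ), "\n";   # &#9734;
print to_html_guess( chr(150) ), "\n";                         # &#8211;
print to_html_guess( chr(9760) ), "\n";                        # &#9760;
```

The obvious weak spot is a chunk that mixes encodings: one stray CP1252 byte makes the whole chunk fail the UTF-8 check, so any genuine UTF-8 sequences in it get decoded as CP1252 too. Likewise, characters in the 128 to 255 range that share a string with wider characters are taken at face value as codepoints.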

Obviously the database is a bit 'special'. Unfortunately it is provided by a 3rd party, a very large company, and I have no control over their input sanitization.

Re^3: Convert strings with unknown encodings to html

by Anonymous Monk on Jul 01, 2015 at 01:49 UTC

"Obviously the database is a bit 'special'. Unfortunately it is provided by a 3rd party, a very large company, and I have no control over their input sanitization."

:) complain
