PerlMonks |
A problem with dash typography
by hsmyers (Canon)
on Sep 08, 2015 at 18:36 UTC ( [id://1141367]=perlquestion )
hsmyers has asked for the wisdom of the Perl Monks concerning the following question:
I have a great deal of text that has both endashes and emdashes (– and — respectively) within HTML files as plain text. Since my editor gladly converts these (nary a complaint), I usually don't pay any attention. However, I recently noticed a problem with the encode_entities function from HTML::Entities; i.e.
produces: rather than:

Now that I've spotted the problem, I can easily do the necessary regex massage and have it go away, but I was wondering if anyone knows the Unicode/UTF-8 incantation needed to avoid the problem in the first place (if in fact that is what it is)?

Note that the emdash is translated to — instead of „. I have not yet checked the other typical HTML typographical elements; these two are so common that the problem surfaced fairly quickly.

Note: I leave the typos as written, but I really meant — and – *sigh*

Note: https://stackoverflow.com/questions/631406/what-is-the-difference-between-em-dash-151-and-8212 seems pertinent...

--hsm

"Never try to teach a pig to sing...it wastes your time and it annoys the pig."
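For readers wondering what a "Unicode/UTF-8 incantation" might look like, here is a minimal sketch of one common cause of this kind of symptom. It is not necessarily what is happening in the post above. It assumes the text reaches encode_entities as undecoded UTF-8 bytes, and it uses the core Encode module to decode them first. The variable names and the sample string are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Hypothetical input: the raw UTF-8 bytes for "a", an en dash, "b", an em dash, "c",
# as they would arrive from a file read without a decoding layer.
my $bytes = "a \xE2\x80\x93 b \xE2\x80\x94 c";

# Without decoding, every byte of each dash is treated as its own character,
# so each dash turns into three unrelated entities.
my $from_bytes = encode_entities($bytes);

# Decoding first hands encode_entities the real U+2013 and U+2014 characters.
my $text       = decode('UTF-8', $bytes);
my $from_chars = encode_entities($text);

print "$from_bytes\n";
print "$from_chars\n";   # a &ndash; b &mdash; c
```

When the text comes from a file, opening the handle with an `:encoding(UTF-8)` layer should have the same effect as the explicit decode call, assuming the files really are UTF-8.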