PerlMonks
validating unicode chars in their smallest form
by damian45 (Novice) on Jun 13, 2010 at 01:18 UTC ( [id://844384]=perlquestion )
damian45 has asked for the wisdom of the Perl Monks concerning the following question:
hi monks
I'm validating some mixed English and Japanese UTF-8 input. It sometimes contains a-z, A-Z and 0-9 entered not only from the common ASCII-compatible Unicode range, but also from the fullwidth Unicode range xFF10 - xFF5E:
http://en.wikibooks.org/wiki/Unicode/Character_reference/F000-FFFF
for example
A (unicode x0041)
Ａ (unicode xFF21 http://www.decodeunicode.org/u+FF21)
I understand that, to be safe, I need to interpret the Unicode characters I accept only in their smallest Unicode representation. So the question is: can I use some function or module of Perl to do this, or do I have to convert them manually with a mapping?
All the experimenting I've done so far suggests that I'll have to do it manually. This surprises me if I am supposed to interpret them in their smallest representation.
Cheers for any feedback, and sorry for my English.
damian
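For illustration, here is a minimal sketch of two ways the folding asked about above could be done without a hand-written mapping table. It assumes the input has already been decoded from UTF-8 into a Perl character string. The helper names `fold_compat` and `fold_fullwidth_ascii` are hypothetical. The first uses `NFKC` from the core module Unicode::Normalize, which applies Unicode compatibility normalization. The second is a narrower `tr///` that relies on the fullwidth block U+FF01 to U+FF5E sitting at a fixed offset from ASCII U+0021 to U+007E.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFKC);

# Hypothetical helper: fold every compatibility character to its plain form.
# Expects a decoded character string, not raw UTF-8 bytes.
sub fold_compat {
    my ($text) = @_;
    return NFKC($text);
}

# Hypothetical helper: fold only the fullwidth ASCII variants and leave
# every other character untouched.
sub fold_fullwidth_ascii {
    my ($text) = @_;
    (my $copy = $text) =~ tr/\x{FF01}-\x{FF5E}/\x{21}-\x{7E}/;
    return $copy;
}

# Fullwidth "ABC123" written with escapes, so the script itself is pure ASCII.
my $input = "\x{FF21}\x{FF22}\x{FF23}\x{FF11}\x{FF12}\x{FF13}";

binmode(STDOUT, ':encoding(UTF-8)');
print fold_compat($input), "\n";            # ABC123
print fold_fullwidth_ascii($input), "\n";   # ABC123
```

Note that `NFKC` folds more than the fullwidth Latin letters and digits: it also rewrites other compatibility characters, for example halfwidth katakana become ordinary katakana. Whether that is wanted for mixed Japanese input is a judgment call. The `tr///` variant touches only the fullwidth ASCII block.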