PerlMonks  

Re^3: unicode normalization layer

by DrWhy (Chaplain)
on Sep 17, 2009 at 05:18 UTC


in reply to Re^2: unicode normalization layer
in thread unicode normalization layer

This is certainly the simplest approach I've seen so far, and I'll definitely keep it in mind for future use. However, I'm currently using something closer to graff's approach. I need a count of the invalid items encountered in the input stream, so I've defined a CHECK function for use with :encoding(utf8) that increments a counter each time it finds something bad and then returns the Unicode WTF?! character to replace it in the input stream.
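For anyone wanting to try the counting idea, here is a minimal sketch. It calls Encode::decode directly with a coderef as CHECK (which Encode documents) rather than going through the :encoding layer, so the hookup into the layer is not shown. The helper name decode_counting is made up for illustration, and the "WTF?!" character is assumed to mean U+FFFD REPLACEMENT CHARACTER.

```perl
use strict;
use warnings;
use Encode ();

# decode_counting is a hypothetical helper, not code from the original post.
# It decodes a byte string as UTF-8 and returns the decoded text together
# with the number of times the CHECK callback fired.
sub decode_counting {
    my ($octets) = @_;
    my $bad = 0;

    # Encode calls a coderef CHECK for input it cannot decode and splices
    # the returned string into the output in place of the bad byte.
    my $check = sub {
        $bad++;
        return "\x{FFFD}";    # assumed stand-in: U+FFFD REPLACEMENT CHARACTER
    };

    my $text = Encode::decode('UTF-8', $octets, $check);
    return ($text, $bad);
}

# "\x80" is a lone continuation byte, which is never valid on its own.
my ($text, $bad) = decode_counting("A\x80B");
printf "replaced %d invalid byte(s)\n", $bad;
```

When reading a stream in chunks, the caller would add each returned count to a running total. A multi-byte sequence split across two chunks would be miscounted by this sketch, so that case needs separate handling.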

As for the relative speed of getline (<>) versus reading in blocks, I was recently working with a system where benchmarking showed a substantial difference between the two approaches, roughly a factor of 7-8. That is why I wanted to avoid getline in this case, especially since my processing is not line-oriented.
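If you want to measure this on your own system, something along these lines is a starting point. It is only a sketch: the helper names, the sample data, and the 64 KB block size are all assumptions, the handles are opened with :raw rather than an encoding layer to keep the comparison simple, and the numbers it prints will depend on your perl, your I/O layers, and your data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Both helpers are hypothetical; each returns the number of bytes read so
# the two loops do comparable work.
sub count_by_line {
    my ($path) = @_;
    open my $in, '<:raw', $path or die "open $path: $!";
    my $total = 0;
    while (defined(my $line = <$in>)) {
        $total += length $line;
    }
    close $in;
    return $total;
}

sub count_by_block {
    my ($path, $size) = @_;
    open my $in, '<:raw', $path or die "open $path: $!";
    my ($total, $buf) = (0, '');
    while (1) {
        my $got = read($in, $buf, $size);
        die "read $path: $!" unless defined $got;
        last if $got == 0;    # end of file
        $total += $got;
    }
    close $in;
    return $total;
}

# Throwaway sample data; real input will behave differently.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "some sample line of text\n" x 20_000;
close $out or die "close: $!";

# Run each approach 50 times and print a comparison table.
cmpthese(50, {
    getline => sub { count_by_line($path) },
    block   => sub { count_by_block($path, 64 * 1024) },
});
```

Line length matters a lot here, so it is worth rerunning with data shaped like the real input before drawing conclusions.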

--DrWhy

"If God had meant for us to think for ourselves he would have given us brains. Oh, wait..."
