remiah: I studied your module and test script. You've done a very good job, and it works. Thank you for that.
But what this effectively does (as UNK noted in his answer here: http://stackoverflow.com/questions/13209474/ ) is re-encode the data before inserting it into the tied array and the tied file. So the array does not contain Unicode data in Perl's internal representation; it simply contains the imported UTF-8 byte strings.
Now, in my project I am doing regex comparisons and substitutions against the tied array, so if I go this route I'll have to decode each array element before any processing and re-encode it afterwards.
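To make that round trip concrete, here is a minimal sketch of what I mean. It uses a plain array of UTF-8 encoded byte strings as a stand-in for the tied array (no Tie::File involved), and `edit_element` is a hypothetical helper name of my own, not part of your module:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Stand-in for the tied array (assumption: each element holds UTF-8
# encoded bytes, as it would if the data is encoded before storing).
my @lines = map { encode('UTF-8', $_) } ("na\x{EF}ve caf\x{E9}", "plain ascii");

# Hypothetical helper: decode one element to a character string, run the
# caller's code against it (aliased to $_), then encode it back to bytes.
sub edit_element {
    my ($array, $index, $code) = @_;
    my $text = decode('UTF-8', $array->[$index]);
    $code->() for $text;    # "for" aliases $_ to $text
    $array->[$index] = encode('UTF-8', $text);
    return;
}

# Replace U+00E9 (e with acute accent) with a plain "e" in element 0.
edit_element(\@lines, 0, sub { s/\x{E9}/e/g });
```

Without the decode step, the same substitution run directly on the stored element would not match, because U+00E9 is stored there as the two bytes C3 A9. That is the extra work on every access that I would like to avoid.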
What do you think?
Many thanks for your well-thought-out answer.
In reply to Re^3: Tie::File failing with Unicode/UTF-8 encoding?