Re^2: Demarcate Regexes with Unicode

by toro (Beadle)
on Sep 16, 2011 at 08:57 UTC

in reply to Re: Demarcate Regexes with Unicode
in thread Demarcate Regexes with Unicode

it's a delimiter not a sigil

Well, it's a section sign, but lawyers sometimes call it Sigil. (I know that namespace is already occupied in this circle.)

Because it's not on the standard keyboard!

If you're on a Mac or on Ubuntu it's quite easy to type. On Windows too: I still remember

Alt + Num0141
from typing Spanish on a US keyboard.

Anyway, I admit this approach is not for everybody. I like your suggestions (but I don't have a key for ).


