Unicode Transliteration

by philkime
I am looking for a good transliteration module that can cope with a range of scripts. There is Unicode::Transliterate, which is old but still seems to compile and work with current ICU libraries. However, it's said to be alpha quality and produces a lot of compiler warnings. Lingua::Translit doesn't cover very many scripts (no Indic) but is extensible. I tried to write a mapping for Latin<->Devanagari, but it's entirely driven by data rules, which makes defining new mappings a pain - no coding, just XML rules with no control over NFC/NFD etc. Lingua::Deva seems to work, but it's just for Devanagari and I'd prefer something more general. So, does anyone know what happened to PICU - the "wrapper for ICU"? ICU is the way to go but, as far as I know, there has never been a decent Perl wrapper for it. I heard rumours that Perl 6 would use ICU internally but that was, like a lot of Perl 6 news, years and years ago ...

Re: Unicode Transliteration
by Corion

    I don't know if you're tied to ICU, but I've had good experience with Text::Unidecode, which turns Unicode strings (back to) Roman text data.

      Thanks for the recommendation, but I should have said that I need conversion between strict, standards-based schemes and scripts such as IAST and Devanagari, and that module just (quite well, apparently) gives you a helpful ASCII transliteration. Specifically, I need to be able to do things like IAST Sanskrit -> Devanagari, as this is the way to collate such languages.
Re: Unicode Transliteration
by zwon
    However, it's said to be alpha quality and has a lot of compiler warnings.
    But does it work for you? BTW, I see only three deprecation warnings when building it on Ubuntu with libicu52.
      I rebuilt and fixed the warnings and it does now appear to build cleanly - a real tribute to the backwards compat of ICU ... I have to wait until my Sanskrit source can verify if the transliteration looks ok ...
        Apparently not. It seems that ICU doesn't support IAST, only the more general (and different) ISO 15919. Ah well, perhaps I will have to fight with Lingua::Translit.
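
The gap between IAST and ISO 15919 mentioned above is small for Sanskrit, so one possible workaround is to rewrite IAST input into ISO 15919 before handing it to an ISO 15919-based transliterator. The following is only a minimal sketch in Python (not Perl, and not tested against ICU). The function name iast_to_iso15919 is hypothetical, and it covers only the differences I am sure of: vocalic r/l (dot below in IAST, ring below in ISO 15919) and anusvara (m with dot below in IAST, m with dot above in ISO 15919). It assumes that every "l with dot below" in the input is vocalic l rather than retroflex la. It also leaves e and o untouched, although strict ISO 15919 writes them with a macron; whether a given transliterator needs that is not checked here.

```python
import re
import unicodedata


def iast_to_iso15919(text: str) -> str:
    """Rewrite the IAST letters that differ in ISO 15919 (hypothetical helper)."""
    # Decompose first so that every diacritic is a separate combining mark,
    # whether the input arrived in NFC or NFD.
    s = unicodedata.normalize("NFD", text)
    # Vocalic r/l: dot below (U+0323) in IAST, ring below (U+0325) in ISO 15919.
    # Assumption: l + dot below is always vocalic l, never retroflex la.
    s = re.sub("([rRlL])\u0323", "\\1\u0325", s)
    # Anusvara: m + dot below in IAST, m + dot above (U+0307) in ISO 15919.
    s = re.sub("([mM])\u0323", "\\1\u0307", s)
    # Recompose so the caller gets NFC back.
    return unicodedata.normalize("NFC", s)


if __name__ == "__main__":
    print(iast_to_iso15919("saṃskṛtam"))
```

Working on the NFD form keeps the rules short, because a long vocalic r is then just "r, dot below, macron" and the same substitution handles it. The other dotted consonants (ṭ, ḍ, ṇ, ṣ, ḥ) are the same in both schemes and pass through unchanged.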
Re: Unicode Transliteration
by captainjames
    > So, does anyone know what happened to PICU - the "wrapper for ICU"?

    I was the co-author of PICU back in 2002. The source is still online, but I don't believe you will be able to build it 2 decades later without significant effort.
