
Re^3: Normalizing diacritics in (regex) search

by hippo (Archbishop)
on Nov 25, 2025 at 10:41 UTC ( [id://11166803]=note )


in reply to Re^2: Normalizing diacritics in (regex) search
in thread Normalizing diacritics in (regex) search

Last but not least, it doesn't provide me with equivalence classes for specific Latin characters, just a single function, unidecode, to "flatten" all input to Latin characters where possible.

Sorry, in that case I have misunderstood your requirements. I took it that this "flattening" was what you were after when you said "Of course I could do the normalization manually and map à á ä å ... -> a and so on." Never mind.


🦛

Re^4: Normalizing diacritics in (regex) search
by LanX (Saint) on Nov 25, 2025 at 14:00 UTC
    No! No need to apologize, I was asking for input.

    You just asked if I had tried that module, and I wanted to share my insights.*

    The unidecode mapping à á ä å ... -> a would force me to normalize all search data.
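A minimal sketch of what "normalize all search data" could look like. This is not unidecode itself: it swaps in the core Unicode::Normalize module, decomposes each string and strips the combining marks. The helper name `flatten` and the sample data are made up for illustration, and characters without a canonical decomposition (ø, ß, ...) pass through unchanged.

```perl
use strict;
use warnings;
use utf8;                          # string literals below contain non-ASCII characters
use Unicode::Normalize qw(NFD);

# Hypothetical helper: canonical decomposition, then drop the
# nonspacing combining marks (accents) that NFD split off.
sub flatten {
    my ($text) = @_;
    my $decomposed = NFD($text);
    $decomposed =~ s/\p{Mn}+//g;
    return $decomposed;
}

# Every record has to be flattened before a plain-ASCII search term can match it.
my @data = ("à la carte", "Ångström", "señor");
my @flat = map { flatten($_) } @data;

my @hits = grep { /angstrom/i } @flat;
```

The cost is the one named above: the flattened copy of the data has to be produced (and kept in sync) for everything that should be searchable.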

    The reverse, a -> à á ä å, lets me fix up the search term instead, by replacing every a with a character class such as [àáäå], and so on.
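A rough sketch of that reverse direction, under stated assumptions: the `%class` table is a tiny, deliberately incomplete example, `term_to_regex` is a hypothetical helper name, and upper case is not handled. The search data stays untouched; only the term is rewritten.

```perl
use strict;
use warnings;
use utf8;                          # the table below contains non-ASCII characters

# Illustrative equivalence classes only; a real table would be much larger.
my %class = (
    a => 'aàáäå',
    e => 'eèéêë',
    o => 'oòóôö',
);

# Hypothetical helper: replace each base letter with its character class
# and escape everything else so it is matched literally.
sub term_to_regex {
    my ($term) = @_;
    my $pattern = join '', map {
        exists $class{$_} ? "[$class{$_}]" : quotemeta($_)
    } split //, $term;
    return qr/$pattern/;
}

my $re   = term_to_regex('salon');     # s[aàáäå]l[oòóôö]n
my @hits = grep { $_ =~ $re } ('salon', 'sålon', 'sálón', 'saloon');
```

Here the first three candidates match and 'saloon' does not. The trade-off against the flattening approach is that the equivalence table has to be maintained by hand.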

    Both approaches have their pros and cons; I prefer to have the choice. :)

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

    *) reworded
