Regex for Surnames

by Anonymous Monk
on Aug 29, 2004 at 00:19 UTC

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm a Perl beginner and I'm looking for a regex to accept surnames and nothing else. Would you agree with me that a surname should only contain the 26 letters of the alphabet, dashes (-), and spaces?

Re: Regex for Surnames
by shenme (Priest) on Aug 29, 2004 at 00:25 UTC
    Some example code for Lingua::EN::NameParse
    $correct_case = &case_surname("DE SILVA-MACNAY",$lc_prefix); # De Silva-MacNay
    would seem to agree, but perhaps you should check out the complete description of that module?
Re: Regex for Surnames
by guha (Priest) on Aug 29, 2004 at 00:29 UTC

    I think it is a little bit more complicated than that.

    What about Carolina Klüft, our heptathlon champion from the Olympic Games? And what about IBM: is that a surname?

Re: Regex for Surnames
by fokat (Deacon) on Aug 29, 2004 at 02:06 UTC

    I hope that among the 26 letters of the alphabet you're referring to is the ñ. My surname is Muñoz, and frankly I'm quite tired of being Mu#oz, Munoz or even Mu_oz. Also, you have to think about other fancy characters.

      Remember not to put a minimum length > 1 on surnames. When I suggested a minimum length of 2, I was told that there is a Chinese colleague whose surname is X.
      Strange... I thought it would be Muñoz ;-)
Re: Regex for Surnames
by dave_the_m (Monsignor) on Aug 29, 2004 at 00:45 UTC
    Then there's O'Brian etc


Re: Regex for Surnames
by CountZero (Bishop) on Aug 29, 2004 at 12:09 UTC
    And we should not forget the Turks, Russians, Greeks, Arabs, Finns, Icelanders, Indians, Chinese, Japanese, Koreans, ...

    It is not without reason Unicode was invented!
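One practical wrinkle with names from some of these scripts: the letter property \pL does not cover combining marks, which Devanagari vowel signs and decomposed accents both are. A minimal sketch, assuming the input is already a decoded character string; letters_only and letters_and_marks are hypothetical helpers made up for this comparison, not anything from the thread:

```perl
use strict;
use warnings;

# Hypothetical helpers for comparison; neither comes from the thread.
# letters_only accepts Unicode letters only (general category L).
sub letters_only      { return $_[0] =~ /^\pL+$/ ? 1 : 0 }
# letters_and_marks also accepts combining marks (general category M).
sub letters_and_marks { return $_[0] =~ /^[\pL\pM]+$/ ? 1 : 0 }

# Devanagari spelling of "Singh": SA, VOWEL SIGN I, ANUSVARA, HA.
my $singh = "\x{938}\x{93F}\x{902}\x{939}";
# "Munoz" with a COMBINING TILDE (U+0303) after the n, i.e. decomposed form.
my $munoz = "Mun\x{303}oz";

binmode STDOUT, ':encoding(UTF-8)';
for my $name ($singh, $munoz) {
    printf "%s: letters only=%d, letters+marks=%d\n",
        $name, letters_only($name), letters_and_marks($name);
}
```

Whether to allow \pM at all is a judgment call; normalizing the input first (for example with Unicode::Normalize) would handle the decomposed accent case, but not scripts whose spelling inherently uses combining signs.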


Re: Regex for Surnames
by ambrus (Abbot) on Aug 29, 2004 at 11:48 UTC

    I'd think that a surname could contain an apostrophe (') too.

    Thus, you might want something like

    # the parentheses capture the match, so that $1 is actually set
    param("surname") =~ /^([\pL\- '.]+)$/ and $surname = $1;

    Update: there are surnames with dots too, so I've added the dot as well.
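For anyone who wants to try that character class against the names mentioned in this thread, here is a rough sketch. It wraps the pattern in a capture group so that $1 is populated, and it assumes the value has already been decoded into a Perl character string (raw CGI bytes would need decoding first). plausible_surname is a hypothetical helper name, not part of any module:

```perl
use strict;
use warnings;
use utf8;    # this source file contains non-ASCII string literals

# Hypothetical helper (not from the thread): returns the name if it consists
# only of Unicode letters, hyphens, spaces, apostrophes and dots, else undef.
sub plausible_surname {
    my ($name) = @_;
    return undef unless defined $name;
    return $name =~ /^([\pL\- '.]+)$/ ? $1 : undef;
}

binmode STDOUT, ':encoding(UTF-8)';
for my $candidate ("Muñoz", "O'Brian", "De Silva-MacNay", "X", "R2-D2") {
    my $verdict = defined(plausible_surname($candidate)) ? "accepted" : "rejected";
    print "$candidate: $verdict\n";
}
```

Note that this only checks which characters appear; it cannot tell whether a string such as IBM really is a surname, and $ still tolerates a trailing newline (use \z if that matters).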

