Re^2: Perl Module for identifying country name

by maheshkumar (Sexton)
on Aug 03, 2012 at 15:07 UTC

in reply to Re: Perl Module for identifying country name
in thread Perl Module for identifying country name

Actually, what I want is just to find which country names appear in a text file.

For grep, I think I would need to name the country I am looking for, such as United States or Germany, right? That way I could miss the country name Canada if it is in the file.

Re^3: Perl Module for identifying country name
by CountZero (Bishop) on Aug 03, 2012 at 15:54 UTC
    You can use a regular expression to find all (English) country names.
    (?-xism:(?:S(?:a(?:int (?:(?:Vincent and the Grenadine|Kitts and Nevi)s|Lucia)|o Tome and Principe|(?:udi Arabi|mo)a|n Marino)|o(?:uth (?:(?:Afric|Kore)a|Sudan)|lomon Islands|malia)|(?:(?:lov(?:ak|en)|yr)i|ri Lank)a|w(?:(?:itzer|azi)land|eden)|e(?:ychelles|negal|rbia)|i(?:erra Leon|ngapor)e|u(?:riname|dan)|pain)|B(?:o(?:(?:snia and Herzegovi|tswa)n|livi)a|a(?:h(?:amas|rain)|ngladesh|rbados)|u(?:r(?:kina Faso|undi|ma)|lgaria)|e(?:l(?:arus|gium|ize)|nin)|r(?:azil|unei)|hutan)|M(?:a(?:l(?:a(?:ysia|wi)|dives|ta|i)|urit(?:ania|ius)|c(?:edonia|au)|rshall Islands|dagascar)|o(?:n(?:(?:tenegr|ac)o|golia)|zambique|ldova|rocco)|icronesia|exico)|C(?:o(?:(?:sta Ric|lombi)a|te d'Ivoire|moros)|a(?:m(?:bodia|eroon)|pe Verde|nada)|(?:entral African|zech) Republic|h(?:i(?:le|na)|ad)|(?:roati|ub)a|yprus)|T(?:u(?:rk(?:menistan|ey)|nisia|valu)|a(?:(?:jikist|iw)an|nzania)|rinidad and Tobago|o(?:nga|go)|imor-Leste|hailand)|A(?:(?:n(?:tigua and Barbud|dorr|gol)|(?:l(?:ban|ger)|ustr(?:al)?)i|r(?:gentin|meni))a|(?:fghanist|zerbaij)an)|P(?:a(?:l(?:estinian Territories|au)|(?:pua New Guine|nam)a|kistan|raguay)|o(?:rtugal|land)|hilippines|eru)|N(?:e(?:therland(?:s Antille)?s|w Zealand|pal)|i(?:ger(?:ia)?|caragua)|or(?:th Korea|way)|a(?:mibia|uru))|G(?:u(?:inea(?:-Bissau)?|(?:atemal|yan)a)|e(?:orgia|rmany)|re(?:nada|ece)|a(?:mbia|bon)|hana)|E(?:(?:(?:quatorial Guin|ritr)e|(?:thiop|ston)i)a|(?:(?:l Salv|cu)ad|ast Tim)or|gypt)|L(?:i(?:(?:b(?:eri|y)|thuani)a|echtenstein)|e(?:banon|sotho)|a(?:tvia|os)|uxembourg)|U(?:nited (?:States of America|Arab Emirates|Kingdom)|zbekistan|kraine|ruguay|ganda)|D(?:e(?:mocratic Republic of the Congo|nmark)|ominica(?:n Republic)?|jibouti)|I(?:r(?:a[nq]|eland)|nd(?:ones)?ia|celand|srael|taly)|K(?:(?:azakh|yrgyz)stan|iribati|osovo|uwait|enya)|R(?:(?:(?:oman|uss)i|wand)a|epublic of the Congo)|H(?:o(?:n(?:g Kong|duras)|ly See)|ungary|aiti)|V(?:enezuela|anuatu|ietnam)|J(?:a(?:maica|pan)|ordan)|F(?:i(?:nland|ji)|rance)|Z(?:imbabwe|ambia)|(?:Yeme|Oma)n|Qatar))

    BTW, you will not find "United States" with this regex since the official name is "United States of America".
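
    For illustration, here is a minimal sketch of how such a regex might be applied to a piece of text. It is not the code that produced the regex above: the short `@countries` list, the `find_countries` helper and the sample string are all assumptions made up for this example, and the pattern is built with a plain `join` rather than an optimized alternation.

    ```perl
    use strict;
    use warnings;

    # Hypothetical, deliberately short list; the regex in this reply covers far more names.
    my @countries = ('United States of America', 'Germany', 'Canada', 'Niger', 'Nigeria');

    # Try longer names first so "Nigeria" is preferred over "Niger".
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) } @countries;
    my $country_re = qr/\b(?:$alternation)\b/;

    # Return each distinct country name found in $text, in order of first appearance.
    sub find_countries {
        my ($text) = @_;
        my %seen;
        return grep { !$seen{$_}++ } $text =~ /($country_re)/g;
    }

    my $sample = "Offices in Germany and Canada; Germany ships to the United States.";
    print "$_\n" for find_countries($sample);
    ```

    This prints Germany and Canada once each. "United States" is not reported because only the full name "United States of America" is in the list, which is the same caveat as above. To scan a file, read its contents into a string first and pass that string to `find_countries`.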


    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
