in reply to Perl Module for identifying country name
It seems like the biggest hurdle is getting a list of countries. To overcome this, you could use Locale::Country:
use Locale::Country;
my @country_names = all_country_names();
Then you could put the country names in one hash and the words from the file in another and look for overlapping keys, or let List::Compare compute the intersection for you:
use List::Compare;
# Exact, case-sensitive string comparison between the two lists
my $lc = List::Compare->new( \@country_names, \@words_in_file );
my @countries_in_file = $lc->get_intersection;
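If you would rather not pull in List::Compare, the hash idea can be sketched with core Perl only. The hard-coded @country_names and @words_in_file below are stand-ins I made up so the snippet runs on its own; in practice the first would come from all_country_names() and the second from splitting your file on whitespace.

```perl
use strict;
use warnings;

# Stand-in data (assumed for illustration, not taken from the original post)
my @country_names = ('France', 'Chad', 'Peru');
my @words_in_file = qw(We visited Peru and France last year);

# Build a lookup hash keyed by country name
my %is_country = map { $_ => 1 } @country_names;

# Keep each word that is a known country, skipping repeats
my %seen;
my @countries_in_file = grep { $is_country{$_} && !$seen{$_}++ } @words_in_file;

print "@countries_in_file\n";    # prints "Peru France"
```

Like the List::Compare version, this only finds single-word names spelled with exactly the same capitalization.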
Edit: Now that I think about it a little more, it might be better to grep through the file contents with your array of country names (after replacing newlines with spaces). Splitting the file into single words means multi-word country names such as "New Zealand" can never match.
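A rough sketch of that grep approach follows. The sub name countries_in_text, the sample text, and the short hard-coded name list are all made up here so the snippet runs without CPAN modules; in real use you would pass in the list from all_country_names(). The case-insensitive match and the word-boundary lookarounds are my own choices, not something the modules require.

```perl
use strict;
use warnings;

# Return the names from @$names that occur in $text as whole phrases.
sub countries_in_text {
    my ($text, $names) = @_;
    $text =~ s/\s+/ /g;    # collapse newlines and runs of whitespace into single spaces
    return grep {
        my $name = $_;
        # \Q..\E quotes any regex metacharacters in the name; the lookarounds
        # stop "Niger" from matching inside "Nigeria"
        $text =~ /(?<!\w)\Q$name\E(?!\w)/i
    } @$names;
}

# Stand-in for all_country_names() (assumed sample data)
my @country_names = ('France', 'New Zealand', 'United States', 'Niger', 'Nigeria');
my $text = "She flew from New\nZealand to Nigeria, then on to france.";

my @found = countries_in_text($text, \@country_names);
print join(', ', @found), "\n";    # prints "France, New Zealand, Nigeria"
```

This runs one regex per country name over the whole text, which should be fine for a list of a few hundred names and a modest file.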