Re: Determining gender based on first name

I'd never heard of Text::GenderFromName until reading your post, but skimming the docs shows that it has two very clearly documented "American" biases:

The raw data is based on US SSA sampling
It uses Text::DoubleMetaphone as a fallback in some cases

You can solve the first bias by supplying your own list of names. I'm sure someone somewhere online has a (free) list of common Spanish names. You can just assume any name that appears in only one list has a weight of "1", and if a name is in both lists, eyeball it and guess a weight based on your own judgment.

The second bias may not actually be that bad (I don't know how well the Double Metaphone algorithm does with Spanish names), but it can easily be turned off (the perldocs even have an example of doing this), giving you just the simple weighted comparison.

(Of course, if you are providing your own name data and not using metaphones, you are basically just using the module to do two hash lookups and pick the one with the higher value, which is about 2 lines of code.)
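For what it's worth, here is a rough sketch of that "two hash lookups" approach without the module at all. The hash names, the sample names, the weights, and the guess_gender sub are all made up for illustration; you would fill the hashes from whatever name list you end up finding. It assumes Perl 5.10 or later for the // operator, and it lowercases with plain lc, so accented names would need proper Unicode handling on top of this.

```perl
use strict;
use warnings;

# Hypothetical weight tables: a name found in only one list gets a weight
# of 1; a name found in both gets hand-picked weights (pure guesses here).
my %male   = ( jose  => 1, carlos => 1, guadalupe => 0.1 );
my %female = ( maria => 1, lucia  => 1, guadalupe => 0.9 );

# Returns 'm' or 'f', or undef when the name is unknown or the weights tie.
sub guess_gender {
    my ($name) = @_;
    my $key = lc $name;

    # The two hash lookups; a missing name counts as weight 0.
    my $m = $male{$key}   // 0;
    my $f = $female{$key} // 0;

    return if $m == $f;            # unknown or undecidable
    return $m > $f ? 'm' : 'f';    # pick the higher weight
}

print guess_gender('Maria') // 'unknown', "\n";
```

Returning undef for ties and unknown names, instead of forcing a guess, lets the caller decide what to do with the names the data can't settle.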


Re^2: Determining gender based on first name
by bobdole (Beadle) on Jan 02, 2008 at 23:02 UTC
