http://www.perlmonks.org?node_id=154575


in reply to Estimating Vocabulary

Well, I suppose that depends on your definition of "word". Are am, are, is, and was each separate words? Also, IIRC, the English language is purported to have a lexicon on the order of 320,000 words*. The average American vocabulary has been in steady decline since the early twentieth century, at which point I believe it was on the order of several thousand words*. A few things to consider:
  • dictionaries may contain archaic forms
  • does your dictionary contain proper nouns? do you care?
  • the content of the language is not evenly distributed across the lexicon, e.g. a single word (sans modifiers) for "love" and a plethora for shades of blue.

* I shall attempt to find evidence to support this. An enlightening thread, but then again it is Usenet... Apparently this is a pretty hotly contested topic.
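
To make the "definition of word" point concrete, here is a rough sketch of how much a count can move depending on those choices. Everything in it is an assumption for illustration: the sample text, the hand-made table of "to be" forms, the hypothetical vocab_size() helper and its fold_forms / skip_proper options, and the crude proper-noun guess (capitalized and not first in its sentence). A real estimate would want a proper lemmatizer and a much larger corpus.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample text; swap in your own corpus.
my $text = "The cat is here. The dogs are here. I am here and so was Perl.";

# Assumption: fold a few forms of "to be" into one entry.
# This table is hand-made and far from complete.
my %lemma = map { $_ => 'be' } qw(am are is was were been being);

# Count distinct words under a chosen definition of "word".
#   fold_forms  => 1  treats am/are/is/was as the single word "be"
#   skip_proper => 1  drops capitalized words that are not sentence-initial
sub vocab_size {
    my ($text, %opt) = @_;
    my %seen;
    for my $sentence (split /[.!?]+\s*/, $text) {
        my @words = $sentence =~ /([A-Za-z']+)/g;
        for my $i (0 .. $#words) {
            my $w = $words[$i];
            # Crude proper-noun guess, not a real classifier.
            next if $opt{skip_proper} && $i > 0 && $w =~ /^[A-Z]/;
            $w = lc $w;
            $w = $lemma{$w} if $opt{fold_forms} && exists $lemma{$w};
            $seen{$w}++;
        }
    }
    return scalar keys %seen;
}

printf "every form counts:       %d\n", vocab_size($text);
printf "forms of 'be' folded:    %d\n", vocab_size($text, fold_forms => 1);
printf "folded, no proper nouns: %d\n",
    vocab_size($text, fold_forms => 1, skip_proper => 1);
```

On this sample the three counts come out as 12, 9, and 8, so even a thirteen-word toy text shifts by a third depending on what you decide a word is.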

    --
    perl -pe "s/\b;([st])/'\1/mg"