The Grammarphobia Blog

Lex appeal: Does size matter?

Q: How many words do most native English speakers know? Do Brits know more than Americans? How many do language mavens know? How about Shakespeare, Samuel Johnson, etc.? And what about age or educational level?

A: We’re afraid this will disappoint you. Many of your questions are impossible to answer. And even if we could contrive numbers for you, they wouldn’t be very meaningful.

We had a post on our blog a few years ago about the difficulty of counting words and comparing the lexicons of different languages.

The linguists Robert P. Stockwell and Donka Minkova, who discuss this in their book English Words: History and Structure (2001), write:

“A question which everyone wonders about, and often asks of instructors, is ‘How many words does English have?’ And even more commonly, ‘How many words does the typical educated person know, approximately?’ There are no verifiable answers to these questions.”

They do say that Shakespeare is known to have used about 30,000 different words in his plays, and that “a really well-educated adult” may have a vocabulary of up to 100,000 words—“but this is a wildly unverifiable estimate.”

As for the size of the lexicon, they conclude: “Nobody knows how many words English has.”

The linguist David Crystal said more or less the same thing in a 1987 article in the journal English Today:

“How many words are there in English? And how many of these words does a native speaker know? These apparently simple little questions turn out to be surprisingly complicated. In answer to the first, estimates have been given ranging from half a million to over 2 million. In answer to the second, the estimates have been as low as 10,000 and over ten times that number.”

We can tell you that the biggest English dictionaries have about half a million words, but that’s no help because dictionaries are selective.

The editors at Oxford Dictionaries Online and Merriam-Webster’s Online discuss the difficulties of counting the number of words in English.

The principal problem in coming up with a number is which words to count. Are “do” and “does” two separate words? How about “doing,” “doer,” “don’t,” and “undo”? What about “cat” and “cats,” not to mention “catlike,” “catty,” and “anti-cat”?

That 30,000-word estimate for Shakespeare, as Stockwell and Minkova say, would drop to “about 21,000 if you count play, plays, playing, played as a single word,” and do the same in similar cases.
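To see how much the counting rule matters, here is a rough Python sketch of our own (not the method Stockwell and Minkova used). It counts the distinct word forms in a text, then counts again after folding simple inflections together. The function names and the short suffix list are assumptions made up for illustration, and the stripping rule is deliberately crude; a real study would use a proper lemmatizer.

```python
import re

# Hypothetical, deliberately crude rule: strip a few common inflectional endings.
# A real count would rely on a proper lemmatizer, not a suffix list.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and pull out runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


def crude_base(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def count_vocab(text):
    """Return (distinct word forms, distinct forms after folding inflections)."""
    forms = set(tokenize(text))
    bases = {crude_base(word) for word in forms}
    return len(forms), len(bases)


sample = "Play, plays, playing, played: the cat and the cats."
print(count_vocab(sample))  # (8, 4)
```

The same sentence yields eight "words" under one rule and four under the other, which is the Shakespeare problem in miniature.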

Do features like prefixes (“anti-,” “re-,” “un-,” etc.) and suffixes (“-ly,” “-er,” “-ing”) swell the number of possible words we count? Is a word with two meanings (say, “cleave”) counted as one word or two? Should we count symbols, acronyms, initialisms, spelled-out numbers? The questions go on and on.

We’ve also found varying statements about the number of words the average person knows or uses.

In their book Theory of Language (1999), the linguists Steven Weisler and Slavoljub P. Milekic estimate that “an average-educated English-speaking adult knows more than 50,000 words.”

But they say a person’s “lexical capacity” is larger. As current events and new technology create the need for new language, the authors write, “English-speakers are free to make up new words and to create new uses of existing words at the spur of the moment.”

You ask about age and educational level and how they affect vocabulary. Here’s what the British language writer Michael Quinion says on his website World Wide Words:

“It’s common to see figures for vocabulary quoted such as 10,000-12,000 words for a 16-year-old, and 20,000-25,000 for a college graduate. These seem not to have much research to back them up.”

So much for vocabulary size. But how many of those tens of thousands of words do we actually use? According to the Collins Corpus, an analytical database of English, “around 90% of English speech and writing is made up of approximately 3,500 words.”
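A coverage figure like that one can be computed, in principle, by ranking words from most to least frequent and adding up their share of the running text until the target is reached. The sketch below is our own illustration on a toy word list, not the procedure Collins used; the function name and the sample data are made up.

```python
from collections import Counter


def types_needed_for_coverage(tokens, target=0.9):
    """Return how many of the most frequent words cover `target` of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    covered = 0
    # Walk the words from most to least frequent, accumulating their share.
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered / total >= target:
            return rank
    return len(counts)


# Toy data: ten running words, four distinct ones.
toy_tokens = ["the"] * 5 + ["of"] * 3 + ["cat", "lexicon"]
print(types_needed_for_coverage(toy_tokens))  # 3
```

In this toy sample, three of the four distinct words account for 90 percent of the running text. A handful of very common words doing most of the work is the same pattern the Collins figure describes on a much larger scale.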

That doesn’t sound like a lot, but let’s call it a day. We’ve run out of our daily quota of words.
