Is Google a linguistic hit?

Q: You repeatedly use Google frequency as an indicator of word value. On July 11, for example, you wrote that “insoluble” is more popular than “indissoluble” because it gets umpteen times as many hits. Why is Google now the language arbiter? Because it’s of linguistic value? Or because you (like most of us) find it’s easy?

A: By no means is Google a “language arbiter” or “an indicator of word value.” And we have never said as much.

We consult Google for only one reason: it can provide evidence of written usage. If a particular usage gets a million-plus hits, this is certainly evidence that it’s in common use.

It’s possible for a language researcher to fine-tune a Google search and get some very interesting data.

For example, using the Google Ngram Viewer, you can research the use of a particular word or phrase in books (which, unlike the Internet as a whole, are edited).

You can then compare those results with the phrase’s frequency on the Internet as a whole. This would give you a rough idea of the phrase’s frequency in common usage as opposed to edited publications.

And Google Books is a very handy repository of searchable older (and sometimes newer) books.

We use it almost every day to find early examples of a usage or to expand on citations from the Oxford English Dictionary.

Google Timeline is another useful tool for finding early examples of a usage, though it’s a bit clunky and you have to verify each citation before using it.

A recent search for the earliest appearance of the word “television,” for example, produced a hit dated 1880, an inconceivable result. Why?

It turns out that the word “television” and the date “1880” both appeared in a 1992 article in the Baltimore Sun about an upcoming biography of H. L. Mencken:

“He was born in 1880, when Geronimo was at his height, and he died after we had television and radio.”
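We don’t know how the tool actually assigns dates to its results, but the false hit is easy to reproduce. Here is a minimal sketch in Python, assuming a hypothetical rule that dates a passage by the earliest year mentioned in it. The function names and the rule itself are our own illustration, not Google’s method.

```python
import re

# Quotation from the 1992 Baltimore Sun article cited above.
PASSAGE = (
    "He was born in 1880, when Geronimo was at his height, "
    "and he died after we had television and radio."
)


def naive_date(text):
    """Hypothetical rule: date a passage by the earliest year it mentions."""
    years = [int(y) for y in re.findall(r"\b(1[5-9]\d\d|20\d\d)\b", text)]
    return min(years) if years else None


def naive_hit(text, word):
    """Return the naive date if the word occurs in the text, else None."""
    if re.search(rf"\b{re.escape(word)}\b", text, re.IGNORECASE):
        return naive_date(text)
    return None


# The 1992 passage is reported as an 1880 use of "television".
print(naive_hit(PASSAGE, "television"))
```

Under that assumption, a sentence written in 1992 turns up as an 1880 citation, which is why each result has to be checked against the original publication.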

Getting back to your question, does Google have linguistic value? In our opinion, it does.

