



SPELLING CHECKER




James L. Peterson




This paper has been published. It should be cited as

James L. Peterson, ``Spelling Programs'', Concise Encyclopedia of Computer Science, Edwin D. Reilly (Editor), John Wiley & Sons, West Sussex, England 2004, pages 716-717.




In addition to supporting conventional word processing, a computer can provide more advanced assistance such as spelling checking, spelling correction, and grammar checking. The general operation of a spelling checker is simple: it checks each word in a document or file for correct spelling. Allegedly incorrect spellings are reported to the human user, who can then correct the errors. A spelling corrector checks each word (just like a spelling checker) and, in addition, tries to suggest the correct spelling for each misspelled word that it finds. A spelling checker can guarantee only that a word is some correctly spelled word, not necessarily the one you meant. If "or" is mistyped as "of," or "affect" is misused for "effect," a spelling checker will not report an error. Detecting these kinds of errors requires a much more complicated program called a grammar checker.

An interactive spelling checker reads a document and presents each spelling error to the user as it is found. The user can see each alleged error in context and either change it immediately or leave it alone. A word processor (or email formatter or Web page compositor) may incorporate spelling checking (or correction) directly into its processing, either in response to a menu option or continually as text is entered.

Most checkers use a word list to define the set of correctly spelled words. The word list may be the list of all words in a dictionary (without the definitions) or may be accumulated from existing documents. As long as its word list has no incorrectly spelled words, a checker will never "miss" an incorrectly spelled word that is not, coincidentally, some other legal word. On the other hand, checkers often report correctly spelled words as possible errors. These may be proper names, technical terms, or uncommon words that are not in the system word list. Most systems allow a user to augment the main word list with local auxiliary word lists for special subjects, authors, or documents. Doctors and lawyers, for example, generally use extensive auxiliary word lists designed for the specialized vocabularies of their fields.
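The word-list approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny word lists, the names MAIN_WORDS, MEDICAL_WORDS, and find_unknown_words, and the rule that a "word" is a run of letters and apostrophes are all hypothetical choices, not details taken from any particular checker.

```python
import re

# Hypothetical miniature word lists; a real checker would load many
# thousands of words from a dictionary or from existing documents.
MAIN_WORDS = {"the", "patient", "has", "a", "mild", "fever", "and"}
MEDICAL_WORDS = {"tachycardia"}  # auxiliary list for a special subject


def find_unknown_words(text, *word_lists):
    """Return the words in text that appear in none of the given word lists."""
    known = set().union(*word_lists)
    unknown = []
    # Assumption: a word is a run of ASCII letters and apostrophes.
    for word in re.findall(r"[A-Za-z']+", text):
        if word.lower() not in known:
            unknown.append(word)
    return unknown
```

With only the main list, find_unknown_words("The patient has a mild fever and tachycardia", MAIN_WORDS) reports the correctly spelled technical term "tachycardia"; passing MEDICAL_WORDS as an auxiliary list removes that false alarm, while a genuine misspelling such as "fevr" is still reported.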

A spelling corrector is invoked when an incorrectly spelled word is found. Its task is to produce a list of possible correct spellings for the misspelled word. For correction, the set of correctly spelled words is thought of as a set of points in a multidimensional space. The corrector tries to find the nearest neighbor or neighbors of the spelling error in that space. If an error produces one candidate correction that is much closer to the error than other possible corrections, the corrector may suggest an automatic correction.
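One way to make the nearest-neighbor idea concrete is to measure the distance between two words as the number of single-character insertions, deletions, and substitutions needed to turn one into the other (Levenshtein distance). The article does not say which distance measure a corrector uses, so that choice is an assumption here, as are the function names and the max_distance and margin thresholds. A minimal sketch:

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def suggest(word, word_list, max_distance=2):
    """Return words within max_distance of word, nearest first."""
    scored = sorted((edit_distance(word, w), w) for w in word_list)
    return [w for distance, w in scored if distance <= max_distance]


def automatic_correction(word, word_list, max_distance=2, margin=2):
    """Return the nearest word only if it is clearly closer than the runner-up."""
    scored = sorted((edit_distance(word, w), w) for w in word_list)
    if not scored or scored[0][0] > max_distance:
        return None
    if len(scored) > 1 and scored[1][0] - scored[0][0] < margin:
        return None  # two candidates are about equally close; ask the user
    return scored[0][1]
```

For the misspelling "speling" and the hypothetical word list {"spelling", "spilling", "checker"}, suggest offers "spelling" (one edit away) ahead of "spilling" (two edits away), and automatic_correction declines to choose because the two candidates are nearly equidistant. Comparing against every word in the list is fine for a sketch but would be slow for a full dictionary.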

A common typing error is to repeat an entire word, especially at the end of one line and the beginning of the next, creating such obvious errors as "a a" and "the the." Some checkers look for duplicate adjacent words, but they then report spurious errors in sentences with validly repeated words, such as "I knew that that boy had had the measles."
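A duplicate-word check of this kind can be sketched as a comparison of each word with the one before it. The function name find_repeated_words and the definition of a word as a run of letters and apostrophes are assumptions made for the illustration; because line breaks are ignored, a repeat that straddles two lines is caught as well.

```python
import re


def find_repeated_words(text):
    """Return each word that immediately repeats the word before it."""
    # Assumption: a word is a run of ASCII letters and apostrophes;
    # whitespace (including line breaks) and punctuation are skipped.
    words = re.findall(r"[A-Za-z']+", text)
    repeats = []
    for previous, current in zip(words, words[1:]):
        if previous.lower() == current.lower():
            repeats.append(current)
    return repeats
```

Applied to "Paris in the\nthe spring", the sketch reports "the". Applied to "I knew that that boy had had the measles.", it reports both "that" and "had", which are exactly the spurious errors described above; the check has no way to tell a valid repetition from a typing slip.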

Grammar checkers try to find errors in sentences or phrases rather than separate words. While it is possible to look for certain simple errors (incorrect use of "a" or "an," capitalization errors, or use of certain incorrect word combinations), the general problem of detecting true errors of grammar is still a research problem. The output of current grammar checkers that attempt to criticize writing style or to assess intrinsic readability is often laughable.


James L. Peterson