WO2000057291A1

WO2000057291A1 - Spelling correction method using improved minimum edit distance algorithm

Info

Publication number: WO2000057291A1
Application number: PCT/US2000/000260
Authority: WO
Inventors: Mark Kantrowitz
Original assignee: Justsystem Corporation
Priority date: 1999-03-24
Filing date: 2000-01-06
Publication date: 2000-09-28
Also published as: AU2492200A

Abstract

A computer method of spelling correction comprises the steps of: a) storing a dictionary of valid words, b) for each input string to be checked comparing the input string to words in the stored dictionary to identify input strings not in the dictionary, c) for each input string not found in the preceding step, generating test words by a restricted set of edit operations which correct the most common errors comprising insertion, deletion, transposition and/or substitution, d) comparing the edited input string generated in the preceding step with words stored in the dictionary and e) generating a candidate word or candidate list of the words.

Description

SPELLING CORRECTION METHOD USING IMPROVED MINIMUM EDIT DISTANCE ALGORITHM

BACKGROUND

Current spelling correction software detects nonword spelling errors by checking whether the word or text string appears in a dictionary of valid words. Once a misspelled word is detected, it is either automatically corrected or a candidate list of possible corrections is displayed. Algorithms for selecting the correction or displaying a candidate list of possible corrections use a word similarity metric to measure the distance from the misspelled word to words in the dictionary. The closest matches are treated as candidates. The most popular word similarity metric is minimum edit distance, that is, the minimum number of insertions, deletions, transpositions and substitutions required to transform the misspelled word into a valid word.

Computing the edit distance to every word in the dictionary is time consuming. To reduce the number of required comparisons, candidate generation algorithms typically partition the dictionary according to word length and the first two letters of the word. Edit distances are only calculated for selected dictionary partitions. Stepping through the dictionary partition, each word is compared to the misspelled word and the edit distance therebetween is calculated. However, the dictionary partitioning used with standard edit distance leads to a reduction in accuracy. For example, the partitioning on the first letter means that it cannot correct errors that occur in the first letter (about 7% of all spelling errors).

There is another approach. Reverse minimum edit distance is a candidate generation algorithm which applies possible edits to the misspelled word and then compares the edited word to words in the dictionary to discover which words are within a given number of edits from the misspelled word. For an n-letter nonword, there are 25n possible substitutions, 26(n+1) possible insertions, n possible deletions, and n-1 possible transpositions, for a total of 53n + 25 possible edits. For a seven-letter nonword, that means a total of 396 possible words just for an edit distance of one. For an edit distance of two, the number of possible words goes up by the square, yielding 156,816 possible words (not counting the edit distance one possibilities). This is a much more time consuming algorithm than the candidate generation algorithms based on word similarity metrics described in the preceding paragraph. Hence, modern word processing programs do not use reverse minimum edit distance algorithms.

The standard minimum edit distance algorithm is generally preferred over the reverse minimum edit distance algorithm. The standard minimum edit distance algorithm computes the edit distance between the misspelled word and every word in the applicable dictionary partition. The number of minimum edit distance calculations is equal to the number of words in the partition. The cost of computing edit distances is only manageable because the set of potential corrections is limited. The reverse minimum edit distance algorithm applies all possible edits at distances 1 or 2 and so on to a misspelled word, blindly generating a large list of candidates, each of which must then be tested against the valid dictionary. The number of candidates generated and the dictionary references required are normally considered prohibitive.

Reverse minimum edit distance was described by Ralph E. Gorin in SPELL: Spell check and correction program, Stanford University, 1971. His implementation was limited to edit distance one. He applied all possible single errors (insertions, deletions, substitutions and transpositions) to the input string and proposed as candidate corrections the results that yielded valid words.
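By way of illustration only, the unrestricted approach described above can be sketched in Python as follows. This is not Gorin's program nor the listing of this disclosure; the function names and the 26-letter lowercase alphabet are assumptions made for the example. It generates every single edit of a word and confirms the 53n + 25 count given above.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: 26 lowercase letters only


def all_single_edits(word):
    """Return every string produced by one edit, duplicates included."""
    results = []
    n = len(word)
    # Substitutions: 25 alternatives at each of the n positions.
    for i in range(n):
        for c in ALPHABET:
            if c != word[i]:
                results.append(word[:i] + c + word[i + 1:])
    # Insertions: 26 letters at each of the n + 1 gaps.
    for i in range(n + 1):
        for c in ALPHABET:
            results.append(word[:i] + c + word[i:])
    # Deletions: one per position.
    for i in range(n):
        results.append(word[:i] + word[i + 1:])
    # Transpositions: one per adjacent pair of letters.
    for i in range(n - 1):
        results.append(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    return results


def naive_candidates(word, dictionary):
    """Keep only the edited strings that are valid dictionary words."""
    return sorted(set(all_single_edits(word)) & set(dictionary))
```

For a seven-letter lowercase nonword such as "recieve", `all_single_edits` produces 396 strings, each of which costs one dictionary lookup.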

Mor and Fraenkel in "A hash code method for detecting and correcting spelling errors", Communications of the ACM 25(12), pp. 935-938 (1982) disclose a hashing method for efficiently retrieving all words within edit distance one of the misspelled word. The hash table is too big to be practical, however.

Mays, Damerau, and Mercer in "Context based spelling correction", Information Processing & Management 27(5), pp. 517-522 (1991) disclose reverse minimum edit distance with an edit distance of one to test an algorithm for valid word spelling correction.

Kernighan, Church and Gale in "A spelling correction program based on error frequencies", Proceedings of COLING-90, 2, pp. 205-210 (1990) used a corpus of spelling errors for which only a single correction existed to compute the frequency of occurrence for every correction and used these to rank candidate corrections.

It is an object of the present invention to implement spelling correction with a reverse minimum edit distance algorithm which is not nearly as time consuming as prior algorithms of this type, which allows for more complex edits than those used in previous word similarity metrics and reverse minimum edit distance algorithms, for example, long-distance transpositions and larger substitutions, and which actually yields more accurate results.

SUMMARY OF THE INVENTION

Briefly, according to this invention, there is provided a computer method of spelling correction which comprises a step for calculating minimum edit distances using a restricted set of edit operations which correct the most common errors comprising insertion, deletion, transposition and/or substitution. The restricted set of edit operations consists of only the most common edits (generally at distance 1 or 2) required to correct errors based upon a training corpus of documents with uncorrected spelling errors. However, the set of edits may also include common complex edits such as long-distance transpositions, multiple letter corrections and missing space errors.

According to one embodiment of this invention, a computer method of spelling correction comprises the steps of: a) storing a dictionary of valid words; b) for each input string to be checked, comparing the input string to words in the stored dictionary to identify input strings not in the dictionary; c) for each input string not found in the preceding step, generating test words by a restricted set of edit operations which correct the most common errors comprising insertion, deletion, transposition and/or substitution; d) comparing the edited input string generated in the preceding step with words stored in the dictionary; and e) generating a candidate word or list of candidate words from edited input strings that are found in the dictionary. The members of the restricted set of edit operations are selected based upon a training set of the most common spelling errors. The members of the restricted set of edit operations may be selected based on the letter n-grams containing more than the letter or letters to be edited. A unique feature according to this invention is the use of edit operations that consist of only the most common edits to correct errors and at the same time allow more complex edits than used in prior algorithms, although these more complex edits relate to common errors.

According to another embodiment, the edit operations are restricted to distance one and, if no valid edited input strings are found at edit distance one, edits at distance two are allowed. According to another embodiment, the edit operations are restricted to distances one and two. According to yet another embodiment, all possible edits are allowed if no valid edited input strings are found at edit distances one or two. Preferably, the edit operations include long-distance transpositions, multiple letter insertions, multiple letter substitutions, multiple letter deletions and missing space errors at edit distance one. The substitution edits may include non-alphabetic characters.

The dictionary may be stored in a data structure selected from hash tables, binary trees, or tries, for example. The candidate list may be sorted by combinations of word length, word frequency or error frequency. According to one preferred embodiment, a search is made for missing space errors by testing complementary portions of a nonword for being valid words with a frequency above a given threshold. Particularly useful applications of the computer methods disclosed herein are spelling correction in text files (documents), command lines and query statements.
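By way of illustration only, the missing space search mentioned above can be sketched in Python as follows. The function name, the `freq` mapping and the parameter names are assumptions made for the example. The minimum part length of three characters follows the detailed description, and the default threshold of 10 mirrors the `$minsplitfreq` setting in the appended listing; treating the threshold as inclusive is also an assumption.

```python
def split_candidates(nonword, freq, min_freq=10, min_len=3):
    """Return (left, right) splits in which both parts are frequent words.

    freq maps a word to its corpus frequency (hypothetical data structure).
    """
    splits = []
    # Try every split point that leaves at least min_len letters on each side.
    for i in range(min_len, len(nonword) - min_len + 1):
        left, right = nonword[:i], nonword[i:]
        if freq.get(left, 0) >= min_freq and freq.get(right, 0) >= min_freq:
            splits.append((left, right))
    return splits
```

The frequency threshold is what keeps rare dictionary entries from producing spurious splits; setting `min_freq` to 1 reduces the test to simple dictionary membership of both parts.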

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The preferred computer method according to this invention comprises testing an input string against a dictionary to determine if it is a valid word. If the input string is a nonword because it is not found in the dictionary, for example, because it is misspelled or two words run together, a reverse minimum edit algorithm is implemented to find every word that is edit distance one away from the input string where the possible edits are limited to only those that are common spelling errors.

Spelling errors are considered common, for example, based upon experience and/or a statistical study of errors found in a corpus of documents that have not had spellings corrected. The corpus of documents used to identify common spelling errors is preferably selected from documents relating to the specific academic or business field in which this reverse minimum edit algorithm is used. Moreover, the corpus of documents may be typist specific.

If a valid word is not found at edit distance one, the next step is to look for valid words at edit distance two. If a valid word still has not been found, a search is made for missing space errors (two words run together). The final step is to return a correct word or a list of possible correct words. This method has a number of applications ranging from correcting words provided in the command line to correcting errors in a text document.

More specifically, the computer method according to this invention involves a number of substeps. It first classifies the case of an input string as uppercase, lowercase, initial-caps or "McDonald" style and then converts the string to all lowercase letters. The original case is later restored to the corrected word. The lowercase string is then tested for membership in a dictionary. If the string is found in the dictionary, but only in non-lowercase, the case of the input string is changed to match that in the dictionary. If the string matches a word in the dictionary, it is accepted as correct. If the string is not present in the dictionary, its case will be applied to corrections, except if the input string is lowercase and the correction is not lowercase.
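By way of illustration only, the case handling described above might be sketched in Python as follows. The function names and the style labels are assumptions made for the example, and the text does not say how a "McDonald" style input is restored, so this sketch simply leaves such corrections in their dictionary form.

```python
def classify_case(word):
    """Classify a string as 'lower', 'upper', 'initial' or 'mixed'."""
    if not word or word.islower():
        return "lower"
    if word.isupper():
        return "upper"
    if word[0].isupper() and word[1:].islower():
        return "initial"
    return "mixed"  # e.g. "McDonald" style


def restore_case(correction, style):
    """Apply the input string's case style to a dictionary correction."""
    if style == "upper":
        return correction.upper()
    if style == "initial":
        return correction.capitalize()
    # Lowercase input keeps the dictionary's own capitalization (the stated
    # exception); mixed-case input is left unchanged (assumption).
    return correction
```

For example, the input "Recieve" is classified as initial-caps, corrected in lowercase to "receive", and returned as "Receive".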

The reverse minimum edit distance algorithm for edit distance one then iterates over the letters of the input string, attempting at each position to find a correction at edit distance one away. It does this by applying each allowable edit to the input string at that position and checking whether the result is a word in the dictionary. Allowable edits are a subset of all possible edits chosen to correct common spelling errors. If a valid word is found it is put in a candidate list.
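By way of illustration only, the position-by-position search described above can be sketched in Python as follows. The edit tables shown are small hypothetical samples, not the tables of the appended listing, and the function names are assumptions made for the example.

```python
# Hypothetical samples of allowable edits (assumptions for this sketch).
ALLOWED_SUBSTITUTIONS = {"a": "eiosu", "e": "aio"}  # typed letter -> letters to try
ALLOWED_INSERTIONS = "eilst"                        # letters to try inserting
ALLOWED_DELETIONS = "eilst"                         # typed letters to try deleting
ALLOWED_TRANSPOSITIONS = {"ie", "ei", "le", "re"}   # typed pairs to try swapping


def restricted_edits_at(word, i):
    """Yield each string obtained by one allowable edit at position i."""
    for c in ALLOWED_INSERTIONS:
        yield word[:i] + c + word[i:]
    if i < len(word):
        for c in ALLOWED_SUBSTITUTIONS.get(word[i], ""):
            yield word[:i] + c + word[i + 1:]
        if word[i] in ALLOWED_DELETIONS:
            yield word[:i] + word[i + 1:]
    if i + 1 < len(word) and word[i:i + 2] in ALLOWED_TRANSPOSITIONS:
        yield word[:i] + word[i + 1] + word[i] + word[i + 2:]


def correct_distance_one(word, dictionary):
    """Collect dictionary words reachable by one allowable edit."""
    candidates = []
    for i in range(len(word) + 1):
        for edited in restricted_edits_at(word, i):
            if edited in dictionary and edited not in candidates:
                candidates.append(edited)
    return candidates
```

With these sample tables, "recieve" yields only "receive": the word "relieve" is also one edit away, but the c-to-l substitution is not among the allowable edits and so is never generated.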

The edit distance two reverse minimum edit distance algorithm is similar, but after making the first edit to the string, it repeats the process on the resulting string looking for another possible edit starting after the current position. This is more efficient than the naïve method, which would apply every possible edit to the resulting string using the code implemented for edit distance one. There is no need to check for edits before the current position because they will have been checked in previous iterations. There is no need to check for edits at the current position since they would undo or replace edits just completed.
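By way of illustration only, the positional restriction on the second edit can be sketched in Python as follows. Edits are represented here as hypothetical n-gram rewrite rules (an empty source denotes an insertion, an empty target a deletion); the rule list, the function names and the exact choice of the position at which the second pass resumes are assumptions made for the example.

```python
# Hypothetical rewrite rules (typed n-gram, replacement n-gram).
RULES = [("ie", "ei"), ("e", "a"), ("", "e"), ("l", "")]


def apply_rules_from(word, start):
    """Yield (edited_word, next_position) for each rule at each position >= start."""
    for i in range(start, len(word) + 1):
        for src, dst in RULES:
            if word.startswith(src, i):
                edited = word[:i] + dst + word[i + len(src):]
                # Resume strictly after the edited region (assumption).
                yield edited, max(i + len(dst), i + 1)


def correct_distance_two(word, dictionary):
    """Collect dictionary words reachable by exactly two allowable edits."""
    found = []
    for first, next_pos in apply_rules_from(word, 0):
        # The second edit only looks at positions after the first edit.
        for second, _ in apply_rules_from(first, next_pos):
            if second in dictionary and second not in found:
                found.append(second)
    return found
```

Because the second pass never revisits earlier positions, each pair of edits is generated once, in left-to-right order, rather than once per ordering.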

When checking for missing space errors, the input string is split into words with no less than three characters and each of the words is tested for a frequency of occurrence above a certain threshold. The frequency of occurrence information is computed using a training corpus. Essentially, a large collection of documents is assembled and the frequency of occurrence of every word in the collection is computed. The collection of documents could be a set of documents from the user's academic or business field, a generic set such as a large collection of newswire articles, or even generated from the user's own past writings. The frequency information may be used in several places, including when sorting candidate corrections.

The set of allowable edits may be selected using a program that analyzes a corpus of spelling errors and their corrections to identify the frequency of all single edits present in the corpus. The source code for testing the analysis program is included immediately before the claims. For example, the analysis program for a particular corpus of documents tested found the following substitutions for the letter a.

    A 1      c 1      e 344    h 1      i 195
    l 2      n 1      o 90     r 2      s 22
    t 1      u 17     v 1      y 1      z 1

From this we see that the letter a is most often improperly replaced with the letters e, i, o, s and u. If we eliminate all substitutions with a frequency count of 1 or 0, we are left with 7 transformations instead of 25, a three-fold reduction. In some cases, we allowed low frequency substitutions if they involved adjacent keys on the keyboard. Overall, this resulted in 181 substitutions for all letters, instead of the original 650. Weighting the letters by frequency of occurrence yields a weighted average of 8.65 substitutions per letter. This should result in a three-fold speedup.
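By way of illustration only, the frequency cut described above can be sketched in Python as follows. The counts are those reported for the letter a; reading the two damaged entries of the table as l = 2 and y = 1 is an assumption, as are the function and variable names.

```python
# Substitution counts observed for the letter "a" (from the table above;
# the "l" and "y" entries are assumed readings of damaged text).
COUNTS_FOR_A = {
    "A": 1, "c": 1, "e": 344, "h": 1, "i": 195,
    "l": 2, "n": 1, "o": 90, "r": 2, "s": 22,
    "t": 1, "u": 17, "v": 1, "y": 1, "z": 1,
}


def frequent_substitutions(counts, min_count=2):
    """Drop substitutions whose frequency count is below min_count."""
    return {letter: n for letter, n in counts.items() if n >= min_count}


kept = frequent_substitutions(COUNTS_FOR_A)
print(sorted(kept))   # the letters that remain allowable for "a"
print(25 / len(kept)) # reduction relative to all 25 possible substitutions
```

Seven substitutions survive the cut (e, i, l, o, r, s and u), which is the roughly three-fold reduction from 25 stated above.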

The algorithm disclosed includes substitutions for non-alphabetic characters, such as replacing a semicolon with the letter l, the digit 3 with the letter e, or the digit 5 with the letter s.

Of course there are other edits. The most frequent deletions were: e, i, l, s, t, r, n, a, o, u, c and m. The most frequent insertions were e, s, i, n, r, t, l, p, g, a, c and space. The most frequent transpositions were ei, ie, le, re, ne, el, ro, er, al, na, it and si. The most frequent larger substitutions were as follows:

    y for ie
    te for ght
    f for ph
    ie for y
    ums for a
    e for ia
    al for le

Larger substitutions were also found to be useful in improving accuracy of the edit distance one algorithm.

All told, the algorithm includes 65 common spelling patterns, such as the prefix un becoming im before p. Others capture the confusions in words with double letters. For example, misspelling beginning as beggining is equivalent to substituting gi for in.

The restrictions on permitted edits can be based not just on the letters affected by the edits, but also on zero or more letters of context on either side of the edit. For example, the ie --> ei transposition is a common edit. In a simple implementation, transpose("ie") would be an allowed edit and the computation would proceed, but we could, if we wished, restrict whether this edit was allowed based on the context in which it appears. For example, we might only allow it if the previous letter was a "c". Thus, instead of including ie --> ei as an allowed edit, we would include cie --> cei as an allowed edit. Similarly, the transposition ne --> en could be restricted, if desired, to mnet --> ment.
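By way of illustration only, context-restricted edits of this kind can be sketched in Python as n-gram rewrite rules in which the context letters form part of both the typed pattern and its replacement. The two rules are those named above; the function name and the list representation are assumptions made for the example.

```python
# Context-restricted edits from the text: the surrounding letters are part
# of the rule, so "ie" is only transposed after "c", and "ne" only in "mnet".
CONTEXT_RULES = [("cie", "cei"), ("mnet", "ment")]


def apply_context_rules(word, rules=CONTEXT_RULES):
    """Return every string obtained by applying one rule at one position."""
    results = []
    for src, dst in rules:
        start = word.find(src)
        while start != -1:
            results.append(word[:start] + dst + word[start + len(src):])
            start = word.find(src, start + 1)
    return results
```

With only these two rules, "recieve" is rewritten to "receive", while "wierd" produces no candidates because its "ie" does not follow a "c"; an unrestricted ie --> ei rule would be needed to reach "weird".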

Since long-distance transpositions are much less likely, these edits were limited to the exchange of vowels around the consonants d, g, l, n, r, s, t, v and the exchange of the consonants l, m, n around a vowel. The set of possible edits is restricted to the most common edits, and more complex edits are added, in order to improve the efficiency of reverse minimum edit distance by limiting the number of generated candidates. The limited set of edits could be applied to other spelling correction algorithms.

For example, if the limited set of edits and the more complex edits are used with standard edit distance algorithms, it would have the following consequences. The cost of computing the distance between two words would be reduced. Standard minimum edit distance algorithms are driven by the letters in the words being compared, so when they consider an edit, they know exactly what letters are involved. They do need to consider different possible edits. For example, if the current position in the misspelled word starts with an R and the dictionary word starts with a P, it could be that the R is an insertion (e.g., if the next letter after the R is a P), or it could be a transposition, a substitution or a deletion. Each possibility leads to a branch in the minimum edit distance computation. (The computation increase is quadratic, not exponential, due to the use of dynamic programming, but there is still extra computation for each such branching point.) Some of the branches may be pruned by considering only the most common edits, as with the reverse minimum edit distance algorithm. For example, since a P/R substitution is not very common, that possibility can be skipped. The same kind of restricted set of edits can be used with standard edit distance algorithms. Moreover, if the number of edits is cut by a factor of three, that leads to a significant speedup in computing the distance between two words. Since the edits are limited to only the most common, there is a reduction in the number of words that will be considered as close, but the set of close words will be the same as the set generated with the reverse minimum edit distance algorithm disclosed herein, so the same increase in accuracy applies here as well.

Moreover, if no close candidates are found, it is not necessary to recompute everything from scratch in order to allow all possible edits to find a close match. The method simply backtracks to the point where edits were disallowed by saving the partial computation, thereby avoiding the need to start from scratch. Statistics on the number of times each uncommon edit was disallowed could be kept and used to prioritize which uncommon edits to allow first.

Various data structures may be used to store the valid word dictionary including binary trees, hash tables and "tries". The latter is a data structure described in detail by Donald Knuth in The Art of Computer Programming, Vol. 3 (Addison Wesley).
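By way of illustration only, a trie used as the valid word dictionary might be sketched in Python as follows. This is a minimal generic trie rather than the structure used by the appended listing (which, according to its own comments, uses a hash table); the class and method names are assumptions made for the example.

```python
class Trie:
    """Minimal trie supporting insertion and whole-word membership tests."""

    def __init__(self):
        self.children = {}    # letter -> child Trie node
        self.is_word = False  # True if a dictionary word ends at this node

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def __contains__(self, word):
        node = self
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word


dictionary = Trie()
dictionary.insert("receive")
print("receive" in dictionary, "recieve" in dictionary)
```

A lookup costs one step per letter of the tested string regardless of dictionary size, which matters here because every generated edit requires a dictionary test.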

The source code immediately before the claims contains the complete listing of two programs written in the Perl language, which is described, for example, in Learning Perl, 2nd Edition, by Schwartz and Christiansen (O'Reilly & Associates Inc., 1997). One listing implements the reverse minimum edit algorithm as disclosed herein and the other permits statistical testing of a corpus of documents to identify common spelling errors.

The method according to this invention demonstrates a speed increase of 13 to 26% for edit distance one and a speed increase of 44 to 50% for edit distance two. The edit distance one method is fast enough to be useful for correcting the spelling of documents and queries in an information retrieval system. The method according to this invention increases the number of cases in which there is only one correction in the candidate list and the percentage of those for which this unique candidate is the correct correction. If there is more than one candidate, sorting the list by word length, word frequency and the frequency of the edit tends to move the correction to the top of the candidate list. The method recognizes all of the nonword errors by checking whether the word is present in a valid dictionary.

The method according to this invention demonstrates a first guess accuracy of about 75%, far beyond the state of the art. When only one candidate correction was proposed by the algorithm, the first guess accuracy improved to about 95%. The speed and accuracy of the algorithm when there is only one candidate correction makes it possible to use it for automatic substitution of corrections as the user types.

The edit distance metric is usually implemented using dynamic programming (bottom-up) or memoization (top-down) with the following recursion:

    edist(word1, i1, j1, word2, i2, j2) =
      if (word1(i1,j1) = word2(i2,j2)) {
        return 0
      } else {
        return min {
          edist(word1, i1, k1, word2, i2, k2)
            + cost(subst, word1, k1, k1+1, word2, k2, k2+1)
            + edist(word1, k1+1, j1, word2, k2+1, j2),

          edist(word1, i1, k1, word2, i2, k2)
            + cost(delete, word1, k1, k1+1)
            + edist(word1, k1+1, j1, word2, k2, j2),

          edist(word1, i1, k1, word2, i2, k2)
            + cost(insert, word1, k1, word2, k2, k2+1)
            + edist(word1, k1, j1, word2, k2+1, j2),

          edist(word1, i1, k1, word2, i2, k2)
            + cost(transpose, word1, k1, k1+2, word2, k2, k2+2)
            + edist(word1, k1+2, j1, word2, k2+2, j2)
        }
        for all k1 such that i1 <= k1 <= j1
        and for all k2 such that i2 <= k2 <= j2
      }

where the above is computing the minimum edit distance between the portions of word1 and word2 designated by indices i1 to j1 and i2 to j2, respectively. The simplest edit distance implementation has the costs set to 1 for nontrivial edits (e.g., substituting P for R) and 0 for trivial edits (e.g., substituting P for itself). More complex edit distance algorithms will use other cost figures to reflect the frequency of a given edit, for example.

The overall structure of the algorithm above is to split the input and target words each into three parts: the part containing the potential edit, the part before the edit and the part after the edit. The parts before the edit are compared recursively using the same algorithm, and likewise for the parts after the edit. The resulting scores are added to the score for the current edit to compute an overall score for that edit, and the minimum score over all possible types of edits at all possible positions is returned as the result. Although this may seem computation intensive, efficiencies are gained because much of the computation overlaps. Saving partial computations makes the resulting algorithm quadratic instead of exponential.
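By way of illustration only, a conventional bottom-up table for edit distance with insertions, deletions, substitutions and adjacent transpositions can be sketched in Python as follows. This is the common prefix-based formulation (sometimes called optimal string alignment), with unit costs, and not a transcription of the substring recursion given above; the function name is an assumption.

```python
def edit_distance(a, b):
    """Edit distance with unit-cost insert, delete, substitute and transpose."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i letters
    for j in range(cols):
        d[0][j] = j  # insert all j letters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition of the last two letters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

Each table cell is filled once from previously computed cells, which is the saving of partial computations that makes the algorithm quadratic rather than exponential.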

The restricted set of edits may be applied to this algorithm as follows. First, additional clauses are added to the min list corresponding to the more complex edits. The form of the clauses is similar. In fact, all edits may be treated as just different complex substitutions. For example, transposing "i" and "e" in "wierd" could be thought of as substituting "ei" for "ie". All insertions, deletions, substitutions and transpositions, as well as our more complex edits, are nothing more than substitutions of one n-gram for another. Thus, we minimize

    edist(word1, i1, k1, word2, i2, k2)
      + cost(subst, word1, k1, k1+m, word2, k2, k2+n)
      + edist(word1, k1+m, j1, word2, k2+n, j2)

over the range of values of m and n. Standard edit distance allows m and n to each be 0 or 1 or 2 (with 2 restricted to cases where the two substrings are transpositions of each other). More complex edits might allow m and n to be 3 or even 4.

Second, the summand is only computed when the substitution described by the cost line is one of the restricted set of common edits. Thus, if m and n are both 1 (a substitution), we check whether the substitution is one of the list of common substitutions before doing the recursive edist computations for the parts before and after the edit. (The recursive computations are the expensive part.) A control statement is added into the minimization process, doing a test for the edit corresponding to each of the summands before executing the recursive sums. By restricting the allowed edits to the most common (about 1/3 of the possible edits), the test succeeds only about 1/3 of the time. This will lead to a speedup of better than a factor of three.
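By way of illustration only, the test-before-recursion idea can be sketched in Python as follows. Every edit is treated as the substitution of one n-gram for another, and the recursive computation is skipped unless the pair appears in an allowed set. The sketch recurses over suffixes with memoization rather than over the three-part split given above, uses unit costs, and the names `restricted_edist`, `allowed` and `max_len` are assumptions made for the example.

```python
import math
from functools import lru_cache


def restricted_edist(word1, word2, allowed, max_len=2):
    """Edit distance in which only (typed n-gram, target n-gram) pairs in
    `allowed` may be used; returns math.inf if word2 cannot be reached."""

    @lru_cache(maxsize=None)
    def dist(i, j):
        # Distance between word1[i:] and word2[j:].
        if i == len(word1) and j == len(word2):
            return 0
        best = math.inf
        # Trivial edit: identical letters cost nothing.
        if i < len(word1) and j < len(word2) and word1[i] == word2[j]:
            best = dist(i + 1, j + 1)
        for m in range(max_len + 1):
            for n in range(max_len + 1):
                if m == 0 and n == 0:
                    continue
                if i + m > len(word1) or j + n > len(word2):
                    continue
                # Test the edit first; recurse only if it is allowed.
                if (word1[i:i + m], word2[j:j + n]) not in allowed:
                    continue
                best = min(best, 1 + dist(i + m, j + n))
        return best

    return dist(0, 0)
```

With `allowed = {("ie", "ei"), ("a", "i"), ("", "e")}`, the pair "recieve" and "receive" is at distance one, whereas "cat" and "cot" are unreachable because the a-to-o substitution is not in the set.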

The way one calculates the speed of standard edit distance is to realize that the recursive process is essentially filling a table based on all possible values of the indices i1, j1, i2 and j2. The running time of the algorithm is the size of the table. Other common optimizations can avoid the need to fill the entire table (e.g., if only words expected to be within edit distance 3 are compared, the words can be processed iteratively instead of recursively, leading to a semi-linear algorithm). When the possible edits are restricted, the amount of computation is cut down by a factor of 3 to 4. (If all misspelled words involved only one error, the reduction would be a factor of three speedup, but since some misspelled words involve multiple errors, the speedup in those cases is greater. Assuming that 80% of all spelling errors involve a single edit, this means an estimated speedup of a factor of .)

When computing the minimum, a variable is maintained with the current minimum value. Call it minval. The first time a recursive computation is performed, minval is set to the result. Every subsequent time, the result is compared to minval. If it is lower than minval, minval is set to it. All of the possible ways of decomposing the computation are iterated and, at the end, the then current value of minval is returned as the result of the edist computation.

Before each summand is computed for possible comparison with minval (that is, on each iteration), the edit under consideration is first compared with the set of allowed possible edits. (There are many possible ways to compare an edit with a list of possible edits. Representing the list of possible edits as a linear list would have an average running time proportional to half the number of possible edits. A much better representation would be a binary tree, which would have a running time equal to the length of the longest edit, effectively a constant.) If the edit is allowed, the summand is computed, including the recursive edist computations. If the edit is not allowed, the program skips to the next iteration.

Having thus defined the invention in the detail and particularity required by the patent laws, what is desired protected by Letters Patent is set forth in the following claims.

spell.pl 1/35 p:/SpellGrams/revminedist/ 98/04/21

#!/usr/local/bin/perl
#
# Written by Mark Kantrowitz, Research Scientist, mkant@jprc.com.
# Start date: April 7, 1998.
# Last modification: April 16, 1998.
#
# This is a variation on reverse minimum edit distance that limits
# the collection of edits to only the most frequent. The intention
# is to significantly speed up the algorithm. However, it also
# winds up improving the accuracy as well, a counter-intuitive result.
#
# Average # of substitutions is 6.96 (average) or 8.65 (freq weighted).
# This is compared to 26 for full minimum edit distance. So a factor of
# 3-4 speedup.
#

# 3994 substitutions 24.8%

# 3674 deletions 22.8%

# 6191 insertions 38.5%

# 2225 transpositions 13.8%

# 16084 TOTAL

# So 83.3% of errors in the corpus are single errors.
#
# 19298 total errors in corpus.
#

# Assumes 5 seconds of startup time (to load dictionaries).
#
#          edist==1          edist==2
# NAIVE    57.0% (19.7 ms)   59.5% (702 ms)
# SMART    59.8% (14.7 ms)   62.7% (372 ms)
# SMART1   60.6% (15.1 ms)   63.5% (348 ms)
# SMART2   61.1% (17.0 ms)   64.0% (374 ms)
# SMART3   61.6% (17.3 ms)   64.2% (373 ms)
# SMART4   61.6% (17.3 ms)   64.2% (388 ms)
#
# SMART2 fixed a few slight bugs in SMART1 (e.g., insertions as last letter)
# SMART3 added long distance transpositions and longer substitutions.
# SMART4 is like SMART3, but adds them to edist=2 as well.
#
# The following table shows the overall accuracy for errors for
# which the algorithm provided a unique correction.
#
#          edist=1   edist=2
# SMART1   94.1%     92.4%
# SMART2   94.4%     93.0%
# SMART3   94.9%     93.5%

# Set this variable to 1, 2, 3, or 4 to choose the appropriate version.
$version = 3;

# Set loose_count to 1 to count membership, firstguess to 1 to count
# membership where the correction is the first in the list.
# (For example, SMART1 with edist=1 contains the correct correction 82%
# of the time, and has an overall first-guess accuracy of 73.9%. The 60.5%
# accuracy figure is the percentage of errors for which SMART1 comes up
# with the correction as the only answer. Need to rerun the overall
# first-guess stats with other sorting orders, to see if we can improve
# the accuracy to the theoretical maximum of 82%. Also get first-guess
# accuracy scores for edist=2 and the other versions.)
$loose_count = 0;
$firstguess = 0;

# If this variable is set to 1, it forces the edit distance 2 computation,
# even if it finds a correction at edit distance 1.
$force_e2 = 0;

# TODO:
# 0. Integrate into ps2ascii.
#
# 1. When there is more than one correction, they are currently
#    sorted by frequency, with word length disambiguating between
#    equal frequency. Perhaps they should be sorted first by length,
#    second by frequency.
# 2. Replace hash table with a trie?
# 3. Review code for possible efficiency hacks.
# 4. Test sensitivity of algorithm to dictionary size.
# 5. Review the list of the unique corrections that it gets wrong.

# USAGE:
#   spell.pl -d -v4 -stats -e 2 -split

# -d Turns on debug mode. When measuring performance on the

# error corpus, prints a list of the misses. When doing

# generic spellcor, returns all matches when there's more

# than one.

# -stats Measures performance on the error corpus. #

# -e #

# If # is 2, does edist of 2, otherwise edist of 1. #

# -split If present, also looks for word boundary errors.

# -f file Checks spelling of every word in a file.

# -v # Version of code to use. Default = 4.

# Typical invocations :

# spell.pl -d -e2 -split recieve

# spell.pl -stats

# spell.pl -stats -e2

# spell.pl -f tmp.txt

$debug = 0;
$stats = 0;
$statsfile = "errors.txt";
$edist2 = 0;
$splitw = 0;
$finderrs = 0;
$finderrfile = "";
$cmdline = 0;

# This variable controls whether the frequencies are updated
# using the frequencies from the document. If a docword is
# in the dictionary, it increments the count. If a docword is
# not in the dictionary and is not found to be an error, it is
# added to the dictionary. This will enable the program to
# correct misspelled versions of the document's words.
#
# Note that we're using a greedy approximation. What we should
# be doing is compiling frequency statistics on the document
# words which don't appear in the dictionary, cross-compare the
# words in the list, and for any pair pick the more frequent as
# the correct spelling. Then do the spellcor. But this approximation
# is easier to implement.
#
# Also, we should be increasing the dictionary frequency of the
# corrections, but that isn't important for this application.
$frequpdt = 1;

# This controls whether file spell checking is conservative in
# the treatment of words with an initial capital or not.
$conservcase = 1;

# This controls whether it counts the number of unique corrections
# that were not the correct correction.
$unique_accuracy = 1;

# This controls whether it prints out the unique corrections that
# were incorrect.

$print_unique_fail = 0;

while (@ARGV) {
    $arg = shift @ARGV;
    # print "$arg\n";
    if ($arg =~ /^\W/) {
        if ($arg =~ /^-d/i) {
            $debug = 1;
        } elsif ($arg =~ /^-stats/i) {
            $stats = 1;
        } elsif ($arg =~ /^-split/i) {
            $splitw = 1;
        } elsif ($arg =~ /^-e/i) {
            if (length($arg) == 2) {
                $arg2 = shift @ARGV;
            } else {
                $arg2 = substr($arg, 2);
            }
            if ($arg2 eq "2") {
                $edist2 = 1;
            } else {
                $edist2 = 0;
            }
        } elsif ($arg =~ /^-v/i) {
            if (length($arg) == 2) {
                $arg2 = shift @ARGV;
            } else {
                $arg2 = substr($arg, 2);
            }
            if ($arg2 eq "1") {
                $version = 1;
            } elsif ($arg2 eq "2") {
                $version = 2;
            } elsif ($arg2 eq "3") {
                $version = 3;
            } elsif ($arg2 eq "4") {
                $version = 4;
            } else {
                $version = 4;
            }
            print "Running version $version.\n";
        } elsif ($arg =~ /^-f/i) {
            $arg2 = shift @ARGV;
            $finderrs = 1;
            $finderrfile = $arg2;
        }
    } else {
        unshift(@ARGV, ($arg));
        last;
    }
}
$cmdline = 1 if (!$finderrs && !$stats);

# print "$cmdline $debug $finderrs $stats $edist $splitw @ARGV\n";

# Load in the spelling dictionary.
open(DICT, "words.txt");
while (<DICT>) {
    chomp;
    $word = $_;
    $word =~ tr/A-Z/a-z/;
    $dict{$word} = 1;
}
close(DICT);

# The following variable is the minimum frequency required for
# each of the two parts of a split. This eliminates many spurious splits.
# To replace this constraint with a simple requirement that the two
# parts be words, set it to 1.
$minsplitfreq = 10;
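The effect of the minimum split frequency can be sketched as follows. The `split_word` routine itself is not shown in this part of the listing, so `split_candidate`, its arguments and the sample frequencies are hypothetical, and returning the first acceptable split is an assumption made only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sketch (not the listing's split_word): return "left right"
# for the first split point where both parts reach the minimum frequency,
# or the word unchanged when no split qualifies.
sub split_candidate {
    my ($dict, $word, $minfreq) = @_;
    for my $k (1 .. length($word) - 1) {
        my $l = substr($word, 0, $k);
        my $r = substr($word, $k);
        my $lf = defined $dict->{$l} ? $dict->{$l} : 0;
        my $rf = defined $dict->{$r} ? $dict->{$r} : 0;
        return "$l $r" if ($lf >= $minfreq && $rf >= $minfreq);
    }
    return $word;
}

# Made-up frequencies for illustration only.
my %freq = (in => 500, fact => 120, i => 3);
my $split = split_candidate(\%freq, "infact", 10);
my $nosplit = split_candidate(\%freq, "xyz", 10);
```

With a threshold of 10 the rare one-letter part "i" cannot anchor a split, which is the kind of spurious split the threshold is meant to suppress.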

# Load in the frequency statistics. These are based on the Tipster
# article word frequency statistics, culled of Tipster-specific
# aspects. But any frequency stats should do as well.
open(DICT, "tfreq.txt");
while (<DICT>) {
    chomp;
    ($count, $word) = split(/\t/);
    $word =~ tr/A-Z/a-z/;
    $dict{$word} += $count;
}
close(DICT);

if ($cmdline) {
    foreach $word (@ARGV) {
        ($corr, $numcor) = &spellcor($word);
        print "$word --> $corr\n";
    }
}

if ($finderrs) {
    open(ERRFILE, "$finderrfile");
    $lasteos = 0;
    $prev = "";
    while (<ERRFILE>) {
        chomp;
        foreach $word (split(/\s+/)) {
            if ($prev =~ /[\.\?\!]$/) {
                $lasteos = 1;
            } else {
                $lasteos = 0;
            }
            $word =~ s/\W+$//;
            $word =~ s/^\W+//;
            $tmp = $word;
            $tmp =~ tr/A-Z/a-z/;
            if ($tmp eq $word || $lasteos || !$conservcase) {
                if ($dict{$tmp} == 0 && length($tmp) > 4) {
                    ($corr, $numcor) = &spellcor($word);
                    if ($word ne $corr && $corr ne "") {
                        printf "%s --> %s\n", $word, $corr;
                    } elsif ($frequpdt) {
                        $dict{$tmp}++;
                    }
                } elsif ($frequpdt && length($tmp) > 4) {
                    $dict{$tmp}++;
                }
            }
            $prev = $word;
        }
    }
    close(ERRFILE);
}

if ($stats) {
    $errcount = 0;
    $success = 0;
    $failure = 0;
    open(ERRS, "$statsfile");
    while (<ERRS>) {
        chomp;
        ($l, $r) = split(/ --> /);
        # $tmp = $l; $tmp =~ tr/A-Z/a-z/;
        $errcount++;
        ($correction, $numcor) = &spellcor($l);
        if ($correction eq $r ||
            ($loose_count && (($firstguess && $correction =~ /^$r/i) ||
                              (!$firstguess && $correction =~ /$r/i)))) {
            $success++;
        } else {
            $failure++ if ($unique_accuracy && $numcor == 1);
            if ($print_unique_fail && $numcor == 1) {
                printf "$l --> $r (%s)\n", $correction;
            } elsif ($debug) {
                printf "$l --> $r (%s)\n", $correction;
            }
        }
    }
    close(ERRS);
    printf "Accuracy: %.3f ($success/$errcount)\n", $success/$errcount;
    if ($unique_accuracy) {
        printf "Unique Correction Accuracy: %.3f (%d/%d)\n",
            $success/($success+$failure), $success, ($success+$failure);
    }

}

sub spellcor {
    local ($word) = @_;
    local ($length, $i, $unmod_word, $tmp);
    local ($correction, @corrections, $numcor, $case);
    # The punctuation in the character classes is partly illegible in the
    # source listing; the backquote is a best-guess reading.
    if (($word =~ /^[a-z\-\_\;\'\`\"\,\.\?\!\=\/]+$/i ||
         $word =~ /^[a-z\-\_\;\'\`\"\,\.\?\!\=\/]+\d[a-z\-\_\;\'\`\"\,\.\?\!\=\/]+$/i ||
         $word =~ /^[a-z\-\_\;\'\`\"\,\.\?\!\=\/]+\d$/i ||
         $word =~ /^\d[a-z\-\_\;\'\`\"\,\.\?\!\=\/]+$/i) &&
        $word !~ /\'s$/i && $word !~ /\'$/i) {
        $numcor = 0;
        $correction = "";
        @corrections = ();
        $length = length($word);
        $case = &id_case($word);
        $unmod_word = $word;
        $word =~ tr/A-Z/a-z/;
        for ($i = 0; $i < $length; $i++) {
            &spellcor_pos($word, $i, $length);
        }
        if ($force_e2 || ($numcor == 0 && $edist2)) {
            for ($i = 0; $i < $length; $i++) {
                &spellcor_pos2($word, $i, $length);
            }
        }
        if ($splitw && $numcor != 1) {
            $tmp = &split_word($word);
            if ($tmp ne $word) {
                $correction = $tmp;
                push(@corrections, $correction);
                $numcor++;
            }
        }
    } else {
        # print "$word slipped through...\n";
    }
    if ($numcor == 1) {
        return ($correction, $numcor);
    } else {
        if ($debug == 1 || ($loose_count && $stats)) {
            return (join(", ", sort(by_freq @corrections)), $numcor);
        } else {
            return ($unmod_word, $numcor);
        }
    }

}

sub spellcor_pos {
    local ($word, $i, $length) = @_;
    local ($left, $middle, $right, $m2, $r2);

    $left = substr($word, 0, $i);
    $middle = substr($word, $i, 1);
    $right = substr($word, $i+1);

    # substitutions
    # (Several punctuation characters in this group are partly illegible in
    # the source listing; the double quote, backquote and hyphen readings
    # are best guesses.)
    if ($middle eq ";") {
        &check($left, "\'", $right) if ($i < $length - 1);
        &check($left, "l", $right);
    } elsif ($middle eq "\"") {
        &check($left, "\'", $right) if ($i < $length - 1);
    } elsif ($middle eq "\_") {
        &check($left, "\-", $right) if ($i < $length - 1);
    } elsif ($middle eq "\$") {
        &check($left, "s", $right) if ($i < $length - 1 && $i > 0);
    } elsif ($middle eq "\=") {
        &check($left, "\-", $right) if ($i < $length - 1 && $i > 0);
    } elsif ($middle eq "\`") {
        &check($left, "\'", $right) if ($i < $length - 1 && $i > 0);
    } elsif ($middle =~ /^\d$/) {
        if ($middle eq "0") {
            &check($left, "o", $right);
        } elsif ($middle eq "1") {
            &check($left, "l", $right);
            &check($left, "i", $right);
            &check($left, "e", $right);
        } elsif ($middle eq "3") {
            &check($left, "e", $right);
        } elsif ($middle eq "9") {
            &check($left, "o", $right);

        }
    } elsif ($middle eq "a") {
        &check($left, "e", $right);
        &check($left, "i", $right);
        &check($left, "o", $right);
        &check($left, "s", $right);
        &check($left, "u", $right);
        &check($left, "z", $right);
    } elsif ($middle eq "b") {
        &check($left, "d", $right);
        &check($left, "g", $right);
        &check($left, "h", $right);
        &check($left, "l", $right);
        &check($left, "n", $right);
        &check($left, "p", $right);
        &check($left, "t", $right);
        &check($left, "v", $right);
    } elsif ($middle eq "c") {
        &check($left, "d", $right);
        &check($left, "e", $right);
        &check($left, "g", $right);
        &check($left, "k", $right);
        &check($left, "n", $right);
        &check($left, "s", $right);
        &check($left, "t", $right);
        &check($left, "v", $right);
        &check($left, "x", $right);

    } elsif ($middle eq "d") {
        &check($left, "b", $right);
        &check($left, "c", $right);
        &check($left, "e", $right);
        &check($left, "f", $right);
        &check($left, "g", $right);
        &check($left, "n", $right);
        &check($left, "r", $right);
        &check($left, "s", $right);
        &check($left, "t", $right);
    } elsif ($middle eq "e") {
        &check($left, "a", $right);
        &check($left, "c", $right);
        &check($left, "d", $right);
        &check($left, "g", $right);
        &check($left, "i", $right);
        &check($left, "l", $right);
        &check($left, "o", $right);
        &check($left, "r", $right);
        &check($left, "s", $right);
        &check($left, "t", $right);
        &check($left, "u", $right);
        &check($left, "w", $right);
        &check($left, "y", $right);
    } elsif ($middle eq "f") {
        &check($left, "d", $right);
        &check($left, "g", $right);
        &check($left, "o", $right);
        &check($left, "p", $right);
        &check($left, "r", $right);
        &check($left, "t", $right);
        &check($left, "v", $right);
    } elsif ($middle eq "g") {
        &check($left, "b", $right);
        &check($left, "c", $right);
        &check($left, "d", $right);
        &check($left, "e", $right);
        &check($left, "f", $right);
        &check($left, "h", $right);
        &check($left, "j", $right);
        &check($left, "n", $right);
        &check($left, "q", $right);
        &check($left, "t", $right);
        &check($left, "v", $right);
    } elsif ($middle eq "h") {
        &check($left, "c", $right);
        &check($left, "g", $right);
        &check($left, "j", $right);
        &check($left, "k", $right);
        &check($left, "l", $right);
        &check($left, "n", $right);
        &check($left, "s", $right);
    } elsif ($middle eq "i") {
        &check($left, "a", $right);
        &check($left, "e", $right);
        &check($left, "l", $right);
        &check($left, "n", $right);
        &check($left, "o", $right);
        &check($left, "s", $right);
        &check($left, "u", $right);
        &check($left, "y", $right);
    } elsif ($middle eq "j") {
        &check($left, "g", $right);
        &check($left, "h", $right);
        &check($left, "n", $right);
    } elsif ($middle eq "k") {
        &check($left, "c", $right);
        &check($left, "g", $right);
        &check($left, "i", $right);
        &check($left, "l", $right);
        &check($left, "n", $right);
        &check($left, "o", $right);
        &check($left, "t", $right);
    } elsif ($middle eq "l") {
        &check($left, "d", $right);
        &check($left, "i", $right);
        &check($left, "k", $right);
        &check($left, "n", $right);
        &check($left, "o", $right);
        &check($left, "p", $right);
        # (one further substitution for "l", between "p" and "t", is
        # illegible in the source listing)
        &check($left, "t", $right);

    } elsif ($middle eq "m") {
        # m is the last character
        if ($i == $length - 1 && $dict{$left} != 0) {
            $correction = $left . "\,";
            push(@corrections, $left . "\,");
            $numcor++;
        }
        &check($left, "b", $right);
        &check($left, "l", $right);
        &check($left, "n", $right);
        &check($left, "o", $right);
        &check($left, "t", $right);
    } elsif ($middle eq "n") {
        &check($left, "b", $right);
        &check($left, "c", $right);
        &check($left, "d", $right);
        &check($left, "g", $right);
        &check($left, "h", $right);
        &check($left, "l", $right);
        &check($left, "m", $right);
        &check($left, "r", $right);
        &check($left, "t", $right);
        &check($left, "u", $right);
    } elsif ($middle eq "o") {
        &check($left, "a", $right);
        &check($left, "e", $right);
        &check($left, "i", $right);
        &check($left, "l", $right);
        &check($left, "n", $right);
        &check($left, "p", $right);
        &check($left, "r", $right);
        &check($left, "u", $right);
    } elsif ($middle eq "p") {
        &check($left, "b", $right);
        &check($left, "e", $right);
        &check($left, "o", $right);
        &check($left, "r", $right);
    } elsif ($middle eq "q") {
        &check($left, "a", $right);
        &check($left, "c", $right);
        &check($left, "g", $right);
        &check($left, "w", $right);
    } elsif ($middle eq "r") {
        &check($left, "b", $right);
        &check($left, "c", $right);
        &check($left, "d", $right);
        &check($left, "e", $right);
        &check($left, "g", $right);
        &check($left, "l", $right);
        &check($left, "n", $right);
        &check($left, "o", $right);
        &check($left, "p", $right);
        &check($left, "t", $right);
    } elsif ($middle eq "s") {
        &check($left, "a", $right);
        &check($left, "c", $right);
        &check($left, "d", $right);
        &check($left, "e", $right);
        &check($left, "l", $right);
        &check($left, "m", $right);
        &check($left, "n", $right);
        &check($left, "t", $right);
        &check($left, "w", $right);
        &check($left, "x", $right);
        &check($left, "z", $right);
    } elsif ($middle eq "t") {
        &check($left, "\'", $right);
        &check($left, "b", $right);
        &check($left, "c", $right);
        &check($left, "d", $right);
        &check($left, "e", $right);
        &check($left, "f", $right);
        &check($left, "g", $right);
        &check($left, "n", $right);
        &check($left, "r", $right);
        &check($left, "s", $right);
        &check($left, "y", $right);
    } elsif ($middle eq "u") {
        &check($left, "a", $right);
        &check($left, "e", $right);
        &check($left, "i", $right);
        &check($left, "n", $right);
        &check($left, "o", $right);
        &check($left, "r", $right);
        &check($left, "w", $right);
        &check($left, "y", $right);
    } elsif ($middle eq "v") {
        &check($left, "b", $right);
        &check($left, "c", $right);
        &check($left, "f", $right);
        &check($left, "g", $right);
        &check($left, "n", $right);
        &check($left, "w", $right);
    } elsif ($middle eq "w") {
        &check($left, "a", $right);
        &check($left, "e", $right);
        &check($left, "q", $right);
        &check($left, "r", $right);
        &check($left, "s", $right);
        &check($left, "t", $right);
        &check($left, "u", $right);
    } elsif ($middle eq "x") {
        &check($left, "c", $right);
        &check($left, "d", $right);
        &check($left, "s", $right);
    } elsif ($middle eq "y") {
        &check($left, "a", $right);
        &check($left, "e", $right);
        # (one further substitution for "y", between "e" and "i", is
        # illegible in the source listing)
        &check($left, "i", $right);
        &check($left, "o", $right);
        &check($left, "t", $right);
        &check($left, "u", $right);
    } elsif ($middle eq "z") {
        &check($left, "a", $right);
        &check($left, "c", $right);
        &check($left, "s", $right);
        &check($left, "x", $right);
    }

    # deletions
    &check($left, "", $right);

    # insertions
    &check($left, "a" . $middle, $right);
    &check($left, "b" . $middle, $right);
    &check($left, "c" . $middle, $right);
    &check($left, "d" . $middle, $right);
    &check($left, "e" . $middle, $right);
    &check($left, "f" . $middle, $right);
    &check($left, "g" . $middle, $right);
    &check($left, "h" . $middle, $right);
    &check($left, "i" . $middle, $right);
    &check($left, "j" . $middle, $right);
    &check($left, "k" . $middle, $right);
    &check($left, "l" . $middle, $right);
    &check($left, "m" . $middle, $right);
    &check($left, "n" . $middle, $right);
    &check($left, "o" . $middle, $right);
    &check($left, "p" . $middle, $right);
    &check($left, "q" . $middle, $right);
    &check($left, "r" . $middle, $right);
    &check($left, "s" . $middle, $right);
    &check($left, "t" . $middle, $right);
    &check($left, "u" . $middle, $right);
    &check($left, "v" . $middle, $right);
    &check($left, "w" . $middle, $right);
    &check($left, "x" . $middle, $right);
    &check($left, "y" . $middle, $right);
    &check($left, "z" . $middle, $right);
    &check($left, "\'" . $middle, $right);

    # special case for last letter insertions
    if ($version > 1) {
        if ($i == $length - 1) {
            &check($left, $middle . "a", $right);
            &check($left, $middle . "b", $right);
            &check($left, $middle . "c", $right);
            &check($left, $middle . "d", $right);
            &check($left, $middle . "e", $right);
            &check($left, $middle . "f", $right);
            &check($left, $middle . "g", $right);
            &check($left, $middle . "h", $right);
            &check($left, $middle . "i", $right);
            &check($left, $middle . "j", $right);
            &check($left, $middle . "k", $right);
            &check($left, $middle . "l", $right);
            &check($left, $middle . "m", $right);
            &check($left, $middle . "n", $right);
            &check($left, $middle . "o", $right);
            &check($left, $middle . "p", $right);
            &check($left, $middle . "q", $right);
            &check($left, $middle . "r", $right);
            &check($left, $middle . "s", $right);
            &check($left, $middle . "t", $right);
            &check($left, $middle . "u", $right);
            &check($left, $middle . "v", $right);
            &check($left, $middle . "w", $right);
            &check($left, $middle . "x", $right);
            &check($left, $middle . "y", $right);
            &check($left, $middle . "z", $right);

        }
    }

    # transpositions
    if ($i != $length - 1) {
        $m2 = substr($right, 0, 1);
        $r2 = substr($right, 1);
        if ($middle eq "\'") {
            &check($left, "n\'", $r2) if ($m2 eq "n");
        } elsif ($middle eq "a") {
            &check($left, "ca", $r2) if ($m2 eq "c");
            &check($left, "ea", $r2) if ($m2 eq "e");
            &check($left, "ga", $r2) if ($m2 eq "g");
            &check($left, "ha", $r2) if ($m2 eq "h");
            &check($left, "ia", $r2) if ($m2 eq "i");
            &check($left, "ka", $r2) if ($m2 eq "k");
            &check($left, "la", $r2) if ($m2 eq "l");
            &check($left, "ma", $r2) if ($m2 eq "m");
            &check($left, "na", $r2) if ($m2 eq "n");
            &check($left, "oa", $r2) if ($m2 eq "o");
            &check($left, "pa", $r2) if ($m2 eq "p");
            &check($left, "ra", $r2) if ($m2 eq "r");
            &check($left, "sa", $r2) if ($m2 eq "s");
            &check($left, "ta", $r2) if ($m2 eq "t");
            &check($left, "ua", $r2) if ($m2 eq "u");
        } elsif ($middle eq "b") {
            &check($left, "ab", $r2) if ($m2 eq "a");
            &check($left, "ib", $r2) if ($m2 eq "i");
            &check($left, "mb", $r2) if ($m2 eq "m");
        } elsif ($middle eq "c") {
            &check($left, "ac", $r2) if ($m2 eq "a");
            &check($left, "ec", $r2) if ($m2 eq "e");
            &check($left, "ic", $r2) if ($m2 eq "i");
            &check($left, "nc", $r2) if ($m2 eq "n");
            &check($left, "oc", $r2) if ($m2 eq "o");
            &check($left, "rc", $r2) if ($m2 eq "r");
            &check($left, "sc", $r2) if ($m2 eq "s");
            &check($left, "tc", $r2) if ($m2 eq "t");
            &check($left, "uc", $r2) if ($m2 eq "u");
            &check($left, "yc", $r2) if ($m2 eq "y");
        } elsif ($middle eq "d") {
            &check($left, "ad", $r2) if ($m2 eq "a");
            &check($left, "ed", $r2) if ($m2 eq "e");
            &check($left, "id", $r2) if ($m2 eq "i");
            &check($left, "ld", $r2) if ($m2 eq "l");
            &check($left, "nd", $r2) if ($m2 eq "n");
            &check($left, "od", $r2) if ($m2 eq "o");
        } elsif ($middle eq "e") {
            &check($left, "ae", $r2) if ($m2 eq "a");
            &check($left, "be", $r2) if ($m2 eq "b");
            &check($left, "ce", $r2) if ($m2 eq "c");
            &check($left, "de", $r2) if ($m2 eq "d");
            &check($left, "fe", $r2) if ($m2 eq "f");
            &check($left, "ge", $r2) if ($m2 eq "g");
            &check($left, "he", $r2) if ($m2 eq "h");
            &check($left, "ie", $r2) if ($m2 eq "i");
            &check($left, "ke", $r2) if ($m2 eq "k");
            &check($left, "le", $r2) if ($m2 eq "l");
            &check($left, "me", $r2) if ($m2 eq "m");
            &check($left, "ne", $r2) if ($m2 eq "n");
            &check($left, "oe", $r2) if ($m2 eq "o");
            &check($left, "pe", $r2) if ($m2 eq "p");
            &check($left, "re", $r2) if ($m2 eq "r");
            &check($left, "se", $r2) if ($m2 eq "s");
            &check($left, "te", $r2) if ($m2 eq "t");
            &check($left, "ue", $r2) if ($m2 eq "u");
            &check($left, "ve", $r2) if ($m2 eq "v");
            &check($left, "ye", $r2) if ($m2 eq "y");
        } elsif ($middle eq "f") {
            &check($left, "ef", $r2) if ($m2 eq "e");
            &check($left, "lf", $r2) if ($m2 eq "l");
            &check($left, "nf", $r2) if ($m2 eq "n");
            &check($left, "of", $r2) if ($m2 eq "o");
            &check($left, "rf", $r2) if ($m2 eq "r");
        } elsif ($middle eq "g") {
            &check($left, "ag", $r2) if ($m2 eq "a");
            &check($left, "ig", $r2) if ($m2 eq "i");
            &check($left, "ng", $r2) if ($m2 eq "n");
            &check($left, "og", $r2) if ($m2 eq "o");
            &check($left, "rg", $r2) if ($m2 eq "r");
            &check($left, "ug", $r2) if ($m2 eq "u");

        } elsif ($middle eq "h") {
            &check($left, "ch", $r2) if ($m2 eq "c");
            &check($left, "gh", $r2) if ($m2 eq "g");
            &check($left, "ph", $r2) if ($m2 eq "p");
            &check($left, "rh", $r2) if ($m2 eq "r");
            &check($left, "sh", $r2) if ($m2 eq "s");
            &check($left, "th", $r2) if ($m2 eq "t");
            &check($left, "wh", $r2) if ($m2 eq "w");
        } elsif ($middle eq "i") {
            &check($left, "ai", $r2) if ($m2 eq "a");
            &check($left, "ci", $r2) if ($m2 eq "c");
            &check($left, "di", $r2) if ($m2 eq "d");
            &check($left, "ei", $r2) if ($m2 eq "e");
            &check($left, "gi", $r2) if ($m2 eq "g");
            &check($left, "hi", $r2) if ($m2 eq "h");
            &check($left, "li", $r2) if ($m2 eq "l");
            &check($left, "ni", $r2) if ($m2 eq "n");
            &check($left, "oi", $r2) if ($m2 eq "o");
            &check($left, "ri", $r2) if ($m2 eq "r");
            &check($left, "si", $r2) if ($m2 eq "s");
            &check($left, "ti", $r2) if ($m2 eq "t");
            &check($left, "ui", $r2) if ($m2 eq "u");
            &check($left, "vi", $r2) if ($m2 eq "v");
            &check($left, "wi", $r2) if ($m2 eq "w");
        } elsif ($middle eq "j") {
            # &check($left, "g", $r2) if ($m2 eq "n");
        } elsif ($middle eq "k") {
            &check($left, "ak", $r2) if ($m2 eq "a");
            &check($left, "ck", $r2) if ($m2 eq "c");
            &check($left, "nk", $r2) if ($m2 eq "n");
            &check($left, "rk", $r2) if ($m2 eq "r");
            &check($left, "sk", $r2) if ($m2 eq "s");
        } elsif ($middle eq "l") {
            &check($left, "al", $r2) if ($m2 eq "a");
            &check($left, "bl", $r2) if ($m2 eq "b");
            &check($left, "cl", $r2) if ($m2 eq "c");
            &check($left, "el", $r2) if ($m2 eq "e");
            &check($left, "il", $r2) if ($m2 eq "i");
            &check($left, "ol", $r2) if ($m2 eq "o");
            &check($left, "pl", $r2) if ($m2 eq "p");
            &check($left, "rl", $r2) if ($m2 eq "r");
            &check($left, "sl", $r2) if ($m2 eq "s");
            &check($left, "tl", $r2) if ($m2 eq "t");
            &check($left, "ul", $r2) if ($m2 eq "u");
            &check($left, "yl", $r2) if ($m2 eq "y");
        } elsif ($middle eq "m") {
            &check($left, "am", $r2) if ($m2 eq "a");
            &check($left, "em", $r2) if ($m2 eq "e");
            &check($left, "nm", $r2) if ($m2 eq "n");
            &check($left, "om", $r2) if ($m2 eq "o");
            &check($left, "rm", $r2) if ($m2 eq "r");
            &check($left, "sm", $r2) if ($m2 eq "s");
        } elsif ($middle eq "n") {
            &check($left, "an", $r2) if ($m2 eq "a");
            &check($left, "en", $r2) if ($m2 eq "e");
            &check($left, "gn", $r2) if ($m2 eq "g");
            &check($left, "in", $r2) if ($m2 eq "i");
            &check($left, "kn", $r2) if ($m2 eq "k");
            &check($left, "on", $r2) if ($m2 eq "o");
            &check($left, "rn", $r2) if ($m2 eq "r");
            &check($left, "sn", $r2) if ($m2 eq "s");
            &check($left, "un", $r2) if ($m2 eq "u");
            &check($left, "wn", $r2) if ($m2 eq "w");
            &check($left, "yn", $r2) if ($m2 eq "y");

        } elsif ($middle eq "o") {
            &check($left, "ao", $r2) if ($m2 eq "a");
            &check($left, "co", $r2) if ($m2 eq "c");
            &check($left, "eo", $r2) if ($m2 eq "e");
            &check($left, "fo", $r2) if ($m2 eq "f");
            &check($left, "go", $r2) if ($m2 eq "g");
            &check($left, "ho", $r2) if ($m2 eq "h");
            &check($left, "io", $r2) if ($m2 eq "i");
            &check($left, "lo", $r2) if ($m2 eq "l");
            &check($left, "mo", $r2) if ($m2 eq "m");
            &check($left, "no", $r2) if ($m2 eq "n");
            &check($left, "ro", $r2) if ($m2 eq "r");
            &check($left, "so", $r2) if ($m2 eq "s");
            &check($left, "to", $r2) if ($m2 eq "t");
            &check($left, "uo", $r2) if ($m2 eq "u");
            &check($left, "wo", $r2) if ($m2 eq "w");
        } elsif ($middle eq "p") {
            &check($left, "ap", $r2) if ($m2 eq "a");
            &check($left, "ep", $r2) if ($m2 eq "e");
            &check($left, "ip", $r2) if ($m2 eq "i");
            &check($left, "lp", $r2) if ($m2 eq "l");
            &check($left, "mp", $r2) if ($m2 eq "m");
            &check($left, "op", $r2) if ($m2 eq "o");
            &check($left, "rp", $r2) if ($m2 eq "r");
            &check($left, "sp", $r2) if ($m2 eq "s");
            &check($left, "up", $r2) if ($m2 eq "u");
        } elsif ($middle eq "q") {
            &check($left, "cq", $r2) if ($m2 eq "c");
        } elsif ($middle eq "r") {
            &check($left, "ar", $r2) if ($m2 eq "a");
            &check($left, "cr", $r2) if ($m2 eq "c");
            &check($left, "er", $r2) if ($m2 eq "e");
            &check($left, "gr", $r2) if ($m2 eq "g");
            &check($left, "hr", $r2) if ($m2 eq "h");
            &check($left, "ir", $r2) if ($m2 eq "i");
            &check($left, "or", $r2) if ($m2 eq "o");
            &check($left, "pr", $r2) if ($m2 eq "p");
            &check($left, "tr", $r2) if ($m2 eq "t");
            &check($left, "ur", $r2) if ($m2 eq "u");
            &check($left, "yr", $r2) if ($m2 eq "y");
        } elsif ($middle eq "s") {
            &check($left, "as", $r2) if ($m2 eq "a");
            &check($left, "bs", $r2) if ($m2 eq "b");
            &check($left, "es", $r2) if ($m2 eq "e");
            &check($left, "is", $r2) if ($m2 eq "i");
            &check($left, "ks", $r2) if ($m2 eq "k");
            &check($left, "ls", $r2) if ($m2 eq "l");
            &check($left, "ms", $r2) if ($m2 eq "m");
            &check($left, "ns", $r2) if ($m2 eq "n");
            &check($left, "os", $r2) if ($m2 eq "o");
            &check($left, "ps", $r2) if ($m2 eq "p");
            &check($left, "rs", $r2) if ($m2 eq "r");
            &check($left, "ts", $r2) if ($m2 eq "t");
            &check($left, "us", $r2) if ($m2 eq "u");
            &check($left, "ys", $r2) if ($m2 eq "y");

        } elsif ($middle eq "t") {
            &check($left, "at", $r2) if ($m2 eq "a");
            &check($left, "ct", $r2) if ($m2 eq "c");
            &check($left, "et", $r2) if ($m2 eq "e");
            &check($left, "ht", $r2) if ($m2 eq "h");
            &check($left, "it", $r2) if ($m2 eq "i");
            &check($left, "lt", $r2) if ($m2 eq "l");
            &check($left, "nt", $r2) if ($m2 eq "n");
            &check($left, "ot", $r2) if ($m2 eq "o");
            &check($left, "pt", $r2) if ($m2 eq "p");
            &check($left, "rt", $r2) if ($m2 eq "r");
            &check($left, "st", $r2) if ($m2 eq "s");
            &check($left, "ut", $r2) if ($m2 eq "u");
        } elsif ($middle eq "u") {
            &check($left, "au", $r2) if ($m2 eq "a");
            &check($left, "bu", $r2) if ($m2 eq "b");
            &check($left, "cu", $r2) if ($m2 eq "c");
            &check($left, "eu", $r2) if ($m2 eq "e");
            &check($left, "gu", $r2) if ($m2 eq "g");
            &check($left, "lu", $r2) if ($m2 eq "l");
            &check($left, "nu", $r2) if ($m2 eq "n");
            &check($left, "ou", $r2) if ($m2 eq "o");
            &check($left, "pu", $r2) if ($m2 eq "p");
            &check($left, "ru", $r2) if ($m2 eq "r");
            &check($left, "su", $r2) if ($m2 eq "s");
            &check($left, "tu", $r2) if ($m2 eq "t");
        } elsif ($middle eq "v") {
            &check($left, "av", $r2) if ($m2 eq "a");
            &check($left, "ev", $r2) if ($m2 eq "e");
            &check($left, "iv", $r2) if ($m2 eq "i");
            &check($left, "lv", $r2) if ($m2 eq "l");
        } elsif ($middle eq "w") {
            &check($left, "ew", $r2) if ($m2 eq "e");
            &check($left, "ow", $r2) if ($m2 eq "o");
            &check($left, "sw", $r2) if ($m2 eq "s");
        } elsif ($middle eq "x") {
            # &check($left, "s", $r2) if ($m2 eq "n");
        } elsif ($middle eq "y") {
            &check($left, "ay", $r2) if ($m2 eq "a");
            &check($left, "hy", $r2) if ($m2 eq "h");
            &check($left, "ly", $r2) if ($m2 eq "l");
            &check($left, "ry", $r2) if ($m2 eq "r");
            &check($left, "sy", $r2) if ($m2 eq "s");
        } elsif ($middle eq "z") {
            &check($left, "iz", $r2) if ($m2 eq "i");
            &check($left, "yz", $r2) if ($m2 eq "y");
        }
    }

    # Long Distance Transpositions
    if ($i <= $length - 2) {
        $m2 = substr($right, 0, 1);
        $m3 = substr($right, 1, 1);
        $r2 = substr($right, 2);
        $mid = $middle . $m2 . $m3;
        # Not including
        if ($mid =~ /^[aeiouy][dglnrstv][aeiouy]$/ && $middle ne $m3) {
            &check($left, $m3 . $m2 . $middle, $r2);
        } elsif ($mid =~ /^[lmn][aeiou][lmn]$/ && $middle ne $m3) {
            &check($left, $m3 . $m2 . $middle, $r2);
        } elsif ($mid eq "vel" || $mid eq "lev" || $mid eq "ton" || $mid eq "not") {
            &check($left, $m3 . $m2 . $middle, $r2);
        } elsif ($mid =~ /^[cdfgrst][aieu][cdfgrst]$/ && $middle ne $m3) {
            &check($left, $m3 . $m2 . $middle, $r2);
        }
    }

    if ($version >= 3) {
        # Multiple Character Substitutions
        #
        # y/ie, f/ph, al/le,
        # not doing pneu/ne, ant/ent, ance/ence, aly/ally, eu/ea, oe/ow,
        # pre/pro, ious/uous, pre/per, ceed/sede, ament/ement, eous/ious,
        # sh/sc, ghth/ght, all/al, c/sc/s/ss, m/gm, ss/s, eorg/orge, ene/ean
        # uf/ough, mce/cem, eat/ate, tui/uit, al/ile, ash/has, fea/afe,
        # rau/ura
        if ($middle eq "e") {
            &check($left, "ia", $right);
            &check($left, "ai", $right);
        } elsif ($middle eq "f") {
            &check($left, "ph", $right);
            &check($left, "ve", $right);
        } elsif ($middle eq "i") {
            &check($left, "ea", $right);
        } elsif ($middle eq "y") {
            &check($left, "ie", $right);
        }
        if ($i == $length - 1) {
            if ($middle eq "t") {
                &check($left, "ed", $right);
                &check($left, "led", $right);
            }
        }
        if ($i <= $length - 2) {
            $middle2 = substr($word, $i, 2);
            $right2 = substr($word, $i+2);
            if ($middle2 eq "al") {
                &check($left, "le", $right2);
            } elsif ($middle2 eq "as") {
                # from cassi --> ccasi
                &check($left, "ca", $right2);
            # (the next branch tests a two-letter string beginning with "a"
            # whose second letter is illegible in the source listing; its
            # replacement is "o")
            } elsif ($middle2 eq "ce") {
                &check($left, "es", $right2);
            } elsif ($middle2 eq "co") {
                &check($left, "om", $right2);
            } elsif ($middle2 eq "de" && $i == 0) {
                &check($left, "un", $right2);
            } elsif ($middle2 eq "ea") {
                &check($left, "i", $right2);
                &check($left, "ie", $right2);

            } elsif ($middle2 eq "el") {
                # pell --> ppel
                &check($left, "pe", $right2);
                &check($left, "al", $right2);
                # &check($left, "le", $right2);
            } elsif ($middle2 eq "en") {
                &check($left, "ine", $right2);
            } elsif ($middle2 eq "ey") {
                &check($left, "ie", $right2);
            } elsif ($middle2 eq "fe") {
                # from ffering --> ferring (rest of this comment is illegible
                # in the source listing)
                &check($left, "er", $right2);
                # from ffes --> fess
                &check($left, "es", $right2);
            } elsif ($middle2 eq "gi") {
                # ggin --> ginn
                &check($left, "in", $right2);
            } elsif ($middle2 eq "ia") {
                &check($left, "e", $right2);
            } elsif ($middle2 eq "ie") {
                &check($left, "y", $right2);
                &check($left, "ey", $right2);
            } elsif ($middle2 eq "in") {
                # from cinn --> ccin
                &check($left, "ci", $right2);
            } elsif ($middle2 eq "it") {
                &check($left, "ate", $right2);
                &check($left, "ute", $right2);
                &check($left, "mi", $right2);
                &check($left, "te", $right2);
            } elsif ($middle2 eq "le") {
                &check($left, "al", $right2);
                # &check($left, "el", $right2);
            } elsif ($middle2 eq "lo") {
                &check($left, "os", $right2);
            } elsif ($middle2 eq "mn") {
                &check($left, "um", $right2);
            } elsif ($middle2 eq "oo") {
                &check($left, "u", $right2);
            } elsif ($middle2 eq "ph") {
                &check($left, "f", $right2);
            } elsif ($middle2 eq "qu") {
                &check($left, "ck", $right2);
            } elsif ($middle2 eq "ra") {
                &check($left, "al", $right2);
                &check($left, "as", $right2);
            } elsif ($middle2 eq "ri") {
                &check($left, "ib", $right2);
                &check($left, "if", $right2);
            } elsif ($middle2 eq "ro") {
                &check($left, "er", $right2);
            # (the next branch tests a two-letter string that is illegible in
            # the source listing; its replacement is "ic")
            } elsif ($middle2 eq "te") {
                &check($left, "ght", $right2);
            } elsif ($middle2 eq "ye") {
                &check($left, "i", $right2);
            }
        }

        if ($i <= $length - 3) {
            $middle3 = substr($word, $i, 3);
            $right3 = substr($word, $i+3);
            if ($middle3 eq "age") {
                &check($left, "edge", $right3);
            } elsif ($middle3 eq "acy") {
                &check($left, "isy", $right3);
            } elsif ($middle3 eq "ase") {
                &check($left, "sea", $right3);
            } elsif ($middle3 eq "ded") {
                &check($left, "t", $right3);
            } elsif ($middle3 eq "ear") {
                &check($left, "ere", $right3);
            } elsif ($middle3 eq "evi") {
                &check($left, "iev", $right3);
            } elsif ($middle3 eq "exi") {
                # from exion --> ection
                &check($left, "ecti", $right3);
            } elsif ($middle3 eq "gin") {
                &check($left, "ing", $right3);
            } elsif ($middle3 eq "ine") {
                &check($left, "ein", $right3);
            } elsif ($middle3 eq "isy") {
                &check($left, "acy", $right3);
            } elsif ($middle3 eq "nts") {
                &check($left, "nce", $right3);
            } elsif ($middle3 eq "ons") {
                &check($left, "a", $right3);
            } elsif ($middle3 eq "que") {
                &check($left, "ck", $right3);
            } elsif ($middle3 eq "sci") {
                &check($left, "cil", $right3);
            } elsif ($middle3 eq "tio") {
                &check($left, "cea", $right3);
            } elsif ($middle3 eq "unp" && $i == 0) {
                # un --> im before a p
                &check($left, "imp", $right3);
            } elsif ($middle3 eq "ums") {
                &check($left, "a", $right3);
            } elsif ($middle3 eq "ure") {
                &check($left, "eur", $right3);
                &check($left, "er", $right3);
            }
        }
    }

}

sub spellcor_pos2 {
    local ($word, $i, $length) = @_;
    local ($left, $middle, $right, $m2, $r2);

    $left = substr($word, 0, $i);
    $middle = substr($word, $i, 1);
    $right = substr($word, $i+1);

    # substitutions
    # (Several punctuation characters in this group are partly illegible in
    # the source listing; the double quote, backquote and hyphen readings
    # are best guesses.)
    if ($middle eq ";") {
        &check2($left, "\'", $right) if ($i < $length - 1);
        &check2($left, "l", $right);
    } elsif ($middle eq "\"") {
        &check2($left, "\'", $right) if ($i < $length - 1);
    } elsif ($middle eq "\_") {
        &check2($left, "\-", $right) if ($i < $length - 1);
    } elsif ($middle eq "\$") {
        &check2($left, "s", $right) if ($i < $length - 1 && $i > 0);
    } elsif ($middle eq "\=") {
        &check2($left, "\-", $right) if ($i < $length - 1 && $i > 0);
    } elsif ($middle eq "\`") {
        &check2($left, "\'", $right) if ($i < $length - 1 && $i > 0);
    } elsif ($middle =~ /^\d$/) {
        if ($middle eq "0") {
            &check2($left, "o", $right);
        } elsif ($middle eq "1") {
            &check2($left, "l", $right);
            &check2($left, "i", $right);
            &check2($left, "e", $right);
        } elsif ($middle eq "3") {
            &check2($left, "e", $right);
        } elsif ($middle eq "9") {
            &check2($left, "o", $right);
        }
    } elsif ($middle eq "a") {
        &check2($left, "e", $right);
        &check2($left, "i", $right);
        &check2($left, "o", $right);
        &check2($left, "s", $right);
        &check2($left, "u", $right);
        &check2($left, "z", $right);
    } elsif ($middle eq "b") {
        &check2($left, "d", $right);
        &check2($left, "g", $right);
        &check2($left, "h", $right);
        &check2($left, "l", $right);
        &check2($left, "n", $right);
        &check2($left, "p", $right);
        &check2($left, "t", $right);
        &check2($left, "v", $right);
    } elsif ($middle eq "c") {
        &check2($left, "d", $right);
        &check2($left, "e", $right);
        &check2($left, "g", $right);
        &check2($left, "k", $right);
        &check2($left, "n", $right);
        &check2($left, "s", $right);
        &check2($left, "t", $right);
        &check2($left, "v", $right);
        &check2($left, "x", $right);
    } elsif ($middle eq "d") {
        &check2($left, "b", $right);
        &check2($left, "c", $right);
        &check2($left, "e", $right);
        &check2($left, "f", $right);
        &check2($left, "g", $right);

icheck2 (Sleft "n", Sright) icheck2 (Sleft "r" , Sright) icheck2 (Sleft "s" .Sright) icheck (Sleft "t", Sright) elsif (Smiddle eq "e") { icheck2 (Sleft "a".Sright) icheck2 (Sleft "c", Sright) icheck2 (Sleft "d", Sright) icheck2 (Sleft "g", Sright) icheck2( Sleft " i " , Srigh ) icheck2 (Sleft "1", Sright) icheckZ(Sleft "o", Sright) &check2 (Sleft "r", Sright) icheck2(Sleft "s", Sright) icheck2 (Sleft "t". Sright) icheck2 (Sleft "u" .Sright) icheck2 (Sleft "w" .Sright) icheck2 (Sleft "y", Sright) elsif (Smiddle eq "f") { icheck2 (Sleft "d", Sright) icheck2 (Sleft "g", Sright) icheck2 (Sleft "o" , Sright) icheck2 (Sleft "p", Sright) icheck2 (Sleft "r" .Sright) icheck2 (Sleft "t", Sright) icheck2 (Sleft "v", Sright) elsif (Smiddle eq "g") { icheck2 (Sleft "b", Sright) icheck2 (Sleft "c", Sright) icheck2 (Sleft ^•d", Sright) icheck2 (Sleft "e", Sright) icheck (Sleft "f", Sright) icheck2 (Sleft "h", Sright) icheck2( Sleft "j", Sright) icheck2 (Sleft "n" .Sright) icheck2 (Sleft "q", Sright) icheck2 (Sleft ^• t" , Sright) icheck2 (Sleft ^■v", Sright) elsif (Smiddle eq "h") { icheck2 (Sleft "c", Sright) icheck2 (Sleft ^■g", Sright) icheck2 (Sleft ^•j", Sright) icheck2 (Sleft ^•k", Sright) icheck2 (Sleft ^•1", Sright) icheck (Sleft "n", Sright) icheck2 (Sleft "s", Sright) elsif (Smiddle eq "i") { icheck2 (Sleft "a", Sright) icheck2 (Sleft "e", Sright) icheck2 (Sleft "1", Sright) icheck2 (Sleft "n", Sright) icheck2 (Sleft "o", Sright) icheck2 (Sleft "s", Sright) icheck (Sleft "u", Sright) icheck2(Sleft "y" , Sright) elsif (Smiddle eq " j " ) { icheck2 (Sleft "g" , Sright) icheck2 (Sleft "h", Sright) 22 35 speli.pl


    &check2($left, "n", $right);
} elsif ($middle eq "k") {
    &check2($left, "c", $right);
    &check2($left, "g", $right);
    &check2($left, "i", $right);
    &check2($left, "l", $right);
    &check2($left, "n", $right);
    &check2($left, "o", $right);
    &check2($left, "t", $right);
} elsif ($middle eq "l") {
    &check2($left, "d", $right);
    &check2($left, "i", $right);
    &check2($left, "k", $right);
    &check2($left, "n", $right);
    &check2($left, "o", $right);
    &check2($left, "D¹, $right);
    &check2($left, , $right);
    &check2($left, "t", $right);
} elsif ($middle eq "m") {

# m is the last character

    if ($i == $length - 1 && $dict{$left} != 0) {
        $correction = $left."\,";
        push(@corrections, $left."\,");
        $numcor++;
    }
    &check2($left, "b", $right);
    &check2($left, "l", $right);
    &check2($left, "n", $right);
    &check2($left, "o", $right);
    &check2($left, "t", $right);
} elsif ($middle eq "n") {
    &check2($left, "b", $right);
    &check2($left, , $right);
    &check2($left, , $right);
    &check2($left, , $right);
    &check2($left, , $right);
    &check2($left, , $right);
    &check2($left, "m", $right);
    &check2($left, "r", $right);
    &check2($left, "t", $right);
    &check2($left, "u", $right);
} elsif ($middle eq "o") {
    &check2($left, "a", $right);
    &check2($left, "e", $right);
    &check2($left, "i", $right);
    &check2($left, "l", $right);
    &check2($left, "n", $right);
    &check2($left, "p", $right);
    &check2($left, "r", $right);
    &check2($left, "u", $right);
} elsif ($middle eq "p") {
    &check2($left, "b", $right);
    &check2($left, "a", $right);
    &check2($left, "o", $right);
    &check2($left, "r", $right);
} elsif ($middle eq "q") {
    &check2($left, "a", $right);
    &check2($left, "c", $right);

    &check2($left, "g", $right);
    &check2($left, "w", $right);
} elsif ($middle eq "r") {
    &check2($left, "b", $right);
    &check2($left, "c", $right);
    &check2($left, "d", $right);
    &check2($left, "e", $right);
    &check2($left, "g", $right);
    &check2($left, "l", $right);
    &check2($left, "n", $right);
    &check2($left, "o", $right);
    &check2($left, "p", $right);
    &check2($left, "t", $right);
} elsif ($middle eq "s") {
    &check2($left, "a", $right);
    &check2($left, "c", $right);
    &check2($left, "d", $right);
    &check2($left, "e", $right);
    &check2($left, "l", $right);
    &check2($left, "m", $right);
    &check2($left, "n", $right);
    &check2($left, "t", $right);
    &check2($left, "w", $right);
    &check2($left, "x", $right);
    &check2($left, "z", $right);
} elsif ($middle eq "t") {
    &check2($left, "\'", $right);
    &check2($left, "b", $right);
    &check2($left, "c", $right);
    &check2($left, "d", $right);
    &check2($left, "e", $right);
    &check2($left, "f", $right);
    &check2($left, "g", $right);
    &check2($left, "n", $right);
    &check2($left, "r", $right);
    &check2($left, "s", $right);
    &check2($left, "y", $right);
} elsif ($middle eq "u") {
    &check2($left, "a", $right);
    &check2($left, "e", $right);
    &check2($left, "i", $right);
    &check2($left, "n", $right);
    &check2($left, "o", $right);
    &check2($left, "r", $right);
    &check2($left, "w", $right);
    &check2($left, "y", $right);
} elsif ($middle eq "v") {
    &check2($left, "b", $right);
    &check2($left, "c", $right);
    &check2($left, "f", $right);
    &check2($left, "g", $right);
    &check2($left, "n", $right);
    &check2($left, "w", $right);
} elsif ($middle eq "w") {
    &check2($left, "a", $right);
    &check2($left, "e", $right);
    &check2($left, "q", $right);
    &check2($left, "r", $right);

    &check2($left, "s", $right);
    &check2($left, "t", $right);
    &check2($left, "u", $right);
} elsif ($middle eq "x") {
    &check2($left, "c", $right);
    &check2($left, "d", $right);
    &check2($left, "s", $right);
} elsif ($middle eq "y") {
    &check2($left, "a", $right);
    &check2($left, "e", $right);
    &check2($left, "h", $right);
    &check2($left, "i", $right);
    &check2($left, "o", $right);
    &check2($left, "t", $right);
    &check2($left, "u", $right);
} elsif ($middle eq "z") {
    &check2($left, "a", $right);
    &check2($left, "c", $right);
    &check2($left, "s", $right);
    &check2($left, "x", $right);

}
# deletions
&check2($left, "", $right);

# insertions
&check2($left, "a".$middle, $right);
&check2($left, "b".$middle, $right);
&check2($left, "c".$middle, $right);
&check2($left, "d".$middle, $right);
&check2($left, "e".$middle, $right);
&check2($left, "f".$middle, $right);
&check2($left, "g".$middle, $right);
&check2($left, "h".$middle, $right);
&check2($left, "i".$middle, $right);
&check2($left, "j".$middle, $right);
&check2($left, "k".$middle, $right);
&check2($left, "l".$middle, $right);
&check2($left, "m".$middle, $right);
&check2($left, "n".$middle, $right);
&check2($left, "o".$middle, $right);
&check2($left, "p".$middle, $right);
&check2($left, "q".$middle, $right);
&check2($left, "r".$middle, $right);
&check2($left, "s".$middle, $right);
&check2($left, "t".$middle, $right);
&check2($left, "u".$middle, $right);
&check2($left, "v".$middle, $right);
&check2($left, "w".$middle, $right);
&check2($left, "x".$middle, $right);
&check2($left, "y".$middle, $right);
&check2($left, "z".$middle, $right);
&check2($left, "\'".$middle, $right);

# special case for last letter insertion
if ($version > 1) {
if ($i == $length - 1) {
    &check2($left, $middle."a", $right);
    &check2($left, $middle."b", $right);
    &check2($left, $middle."c", $right);
    &check2($left, $middle."d", $right);
    &check2($left, $middle."e", $right);
    &check2($left, $middle."f", $right);
    &check2($left, $middle."g", $right);
    &check2($left, $middle."h", $right);
    &check2($left, $middle."i", $right);
    &check2($left, $middle."j", $right);
    &check2($left, $middle."k", $right);
    &check2($left, $middle."l", $right);
    &check2($left, $middle."m", $right);
    &check2($left, $middle."n", $right);
    &check2($left, $middle."o", $right);
    &check2($left, $middle."p", $right);
    &check2($left, $middle."q", $right);
    &check2($left, $middle."r", $right);
    &check2($left, $middle."s", $right);
    &check2($left, $middle."t", $right);
    &check2($left, $middle."u", $right);
    &check2($left, $middle."v", $right);
    &check2($left, $middle."w", $right);
    &check2($left, $middle."x", $right);
    &check2($left, $middle."y", $right);
    &check2($left, $middle."z", $right);
}

}

# transpositions
if ($i != $length - 1) {
$m2 = substr($right, 0, 1);
$r2 = substr($right, 1);
if ($middle eq "\'") {
    &check2($left, "n\'", $r2) if ($m2 eq "n");

} elsif ($middle eq "a") {
    &check2($left, "ca", $r2) if ($m2 eq "c");
    &check2($left, "ea", $r2) if ($m2 eq "e");
    &check2($left, "ga", $r2) if ($m2 eq "g");
    &check2($left, "ha", $r2) if ($m2 eq "h");
    &check2($left, "ia", $r2) if ($m2 eq "i");
    &check2($left, "ka", $r2) if ($m2 eq "k");
    &check2($left, "la", $r2) if ($m2 eq "l");
    &check2($left, "ma", $r2) if ($m2 eq "m");
    &check2($left, "na", $r2) if ($m2 eq "n");
    &check2($left, "oa", $r2) if ($m2 eq "o");
    &check2($left, "pa", $r2) if ($m2 eq "p");
    &check2($left, "ra", $r2) if ($m2 eq "r");
    &check2($left, "sa", $r2) if ($m2 eq "s");
    &check2($left, "ta", $r2) if ($m2 eq "t");
    &check2($left, "ua", $r2) if ($m2 eq "u");

} elsif ($middle eq "b") {
    &check2($left, "ab", $r2) if ($m2 eq "a");
    &check2($left, "ib", $r2) if ($m2 eq "i");
    &check2($left, "mb", $r2) if ($m2 eq "m");
} elsif ($middle eq "c") {
    &check2($left, "ac", $r2) if ($m2 eq "a");
    &check2($left, "ec", $r2) if ($m2 eq "e");
    &check2($left, "ic", $r2) if ($m2 eq "i");
    &check2($left, "nc", $r2) if ($m2 eq "n");
    &check2($left, "oc", $r2) if ($m2 eq "o");
    &check2($left, "rc", $r2) if ($m2 eq "r");
    &check2($left, "sc", $r2) if ($m2 eq "s");
    &check2($left, "tc", $r2) if ($m2 eq "t");
    &check2($left, "uc", $r2) if ($m2 eq "u");
    &check2($left, "yc", $r2) if ($m2 eq "y");
} elsif ($middle eq "d") {
    &check2($left, "ad", $r2) if ($m2 eq "a");
    &check2($left, "ed", $r2) if ($m2 eq "e");
    &check2($left, "id", $r2) if ($m2 eq "i");
    &check2($left, "ld", $r2) if ($m2 eq "l");
    &check2($left, "nd", $r2) if ($m2 eq "n");
    &check2($left, "od", $r2) if ($m2 eq "o");

} elsif ($middle eq "e") {
    &check2($left, "ae", $r2) if ($m2 eq "a");
    &check2($left, "be", $r2) if ($m2 eq "b");
    &check2($left, "ce", $r2) if ($m2 eq "c");
    &check2($left, "de", $r2) if ($m2 eq "d");
    &check2($left, "fe", $r2) if ($m2 eq "f");
    &check2($left, "ge", $r2) if ($m2 eq "g");
    &check2($left, "he", $r2) if ($m2 eq "h");
    &check2($left, "ie", $r2) if ($m2 eq "i");
    &check2($left, "ke", $r2) if ($m2 eq "k");
    &check2($left, "le", $r2) if ($m2 eq "l");
    &check2($left, "me", $r2) if ($m2 eq "m");
    &check2($left, "ne", $r2) if ($m2 eq "n");
    &check2($left, "oe", $r2) if ($m2 eq "o");
    &check2($left, "pe", $r2) if ($m2 eq "p");
    &check2($left, "re", $r2) if ($m2 eq "r");
    &check2($left, "se", $r2) if ($m2 eq "s");
    &check2($left, "te", $r2) if ($m2 eq "t");
    &check2($left, "ue", $r2) if ($m2 eq "u");
    &check2($left, "ve", $r2) if ($m2 eq "v");
    &check2($left, "ye", $r2) if ($m2 eq "y");

} elsif ($middle eq "f") {
    &check2($left, "ef", $r2) if ($m2 eq "e");
    &check2($left, "lf", $r2) if ($m2 eq "l");
    &check2($left, "nf", $r2) if ($m2 eq "n");
    &check2($left, "of", $r2) if ($m2 eq "o");
    &check2($left, "rf", $r2) if ($m2 eq "r");
} elsif ($middle eq "g") {
    &check2($left, "ag", $r2) if ($m2 eq "a");
    &check2($left, "ig", $r2) if ($m2 eq "i");
    &check2($left, "ng", $r2) if ($m2 eq "n");
    &check2($left, "og", $r2) if ($m2 eq "o");
    &check2($left, "rg", $r2) if ($m2 eq "r");
    &check2($left, "ug", $r2) if ($m2 eq "u");

} elsif ($middle eq "h") {
    &check2($left, "ch", $r2) if ($m2 eq "c");
    &check2($left, "gh", $r2) if ($m2 eq "g");
    &check2($left, "ph", $r2) if ($m2 eq "p");
    &check2($left, "rh", $r2) if ($m2 eq "r");
    &check2($left, "sh", $r2) if ($m2 eq "s");
    &check2($left, "th", $r2) if ($m2 eq "t");
    &check2($left, "wh", $r2) if ($m2 eq "w");
} elsif ($middle eq "i") {
    &check2($left, "ai", $r2) if ($m2 eq "a");
    &check2($left, "ci", $r2) if ($m2 eq "c");
    &check2($left, "di", $r2) if ($m2 eq "d");
    &check2($left, "ei", $r2) if ($m2 eq "e");
    &check2($left, "gi", $r2) if ($m2 eq "g");
    &check2($left, "hi", $r2) if ($m2 eq "h");
    &check2($left, "li", $r2) if ($m2 eq "l");
    &check2($left, "ni", $r2) if ($m2 eq "n");
    &check2($left, "oi", $r2) if ($m2 eq "o");
    &check2($left, "ri", $r2) if ($m2 eq "r");
    &check2($left, "si", $r2) if ($m2 eq "s");
    &check2($left, "ti", $r2) if ($m2 eq "t");
    &check2($left, "ui", $r2) if ($m2 eq "u");
    &check2($left, "vi", $r2) if ($m2 eq "v");
    &check2($left, "wi", $r2) if ($m2 eq "w");
} elsif ($middle eq "j") {
    # &check2($left, "g", $r2) if ($m2 eq "n");
} elsif ($middle eq "k") {
    &check2($left, "ak", $r2) if ($m2 eq "a");
    &check2($left, "ck", $r2) if ($m2 eq "c");
    &check2($left, "nk", $r2) if ($m2 eq "n");
    &check2($left, "rk", $r2) if ($m2 eq "r");
    &check2($left, "sk", $r2) if ($m2 eq "s");

} elsif ($middle eq "l") {
    &check2($left, "al", $r2) if ($m2 eq "a");
    &check2($left, "bl", $r2) if ($m2 eq "b");
    &check2($left, "cl", $r2) if ($m2 eq "c");
    &check2($left, "el", $r2) if ($m2 eq "e");
    &check2($left, "il", $r2) if ($m2 eq "i");
    &check2($left, "ol", $r2) if ($m2 eq "o");
    &check2($left, "pl", $r2) if ($m2 eq "p");
    &check2($left, "rl", $r2) if ($m2 eq "r");
    &check2($left, "sl", $r2) if ($m2 eq "s");
    &check2($left, "tl", $r2) if ($m2 eq "t");
    &check2($left, "ul", $r2) if ($m2 eq "u");
    &check2($left, "yl", $r2) if ($m2 eq "y");
} elsif ($middle eq "m") {
    &check2($left, "am", $r2) if ($m2 eq "a");
    &check2($left, "em", $r2) if ($m2 eq "e");
    &check2($left, "nm", $r2) if ($m2 eq "n");
    &check2($left, "om", $r2) if ($m2 eq "o");
    &check2($left, "rm", $r2) if ($m2 eq "r");
    &check2($left, "sm", $r2) if ($m2 eq "s");
} elsif ($middle eq "n") {
    &check2($left, "an", $r2) if ($m2 eq "a");
    &check2($left, "en", $r2) if ($m2 eq "e");
    &check2($left, "gn", $r2) if ($m2 eq "g");
    &check2($left, "in", $r2) if ($m2 eq "i");
    &check2($left, "kn", $r2) if ($m2 eq "k");
    &check2($left, "on", $r2) if ($m2 eq "o");
    &check2($left, "rn", $r2) if ($m2 eq "r");
    &check2($left, "sn", $r2) if ($m2 eq "s");
    &check2($left, "un", $r2) if ($m2 eq "u");
    &check2($left, "wn", $r2) if ($m2 eq "w");
    &check2($left, "yn", $r2) if ($m2 eq "y");

} elsif ($middle eq "o") {
    &check2($left, "ao", $r2) if ($m2 eq "a");
    &check2($left, "co", $r2) if ($m2 eq "c");
    &check2($left, "eo", $r2) if ($m2 eq "e");
    &check2($left, "fo", $r2) if ($m2 eq "f");
    &check2($left, "go", $r2) if ($m2 eq "g");
    &check2($left, "ho", $r2) if ($m2 eq "h");
    &check2($left, "io", $r2) if ($m2 eq "i");
    &check2($left, "lo", $r2) if ($m2 eq "l");
    &check2($left, "mo", $r2) if ($m2 eq "m");
    &check2($left, "no", $r2) if ($m2 eq "n");
    &check2($left, "ro", $r2) if ($m2 eq "r");
    &check2($left, "so", $r2) if ($m2 eq "s");
    &check2($left, "to", $r2) if ($m2 eq "t");
    &check2($left, "uo", $r2) if ($m2 eq "u");
    &check2($left, "wo", $r2) if ($m2 eq "w");
} elsif ($middle eq "p") {
    &check2($left, "ap", $r2) if ($m2 eq "a");
    &check2($left, "ep", $r2) if ($m2 eq "e");
    &check2($left, "ip", $r2) if ($m2 eq "i");
    &check2($left, "lp", $r2) if ($m2 eq "l");
    &check2($left, "mp", $r2) if ($m2 eq "m");
    &check2($left, "op", $r2) if ($m2 eq "o");
    &check2($left, "rp", $r2) if ($m2 eq "r");
    &check2($left, "sp", $r2) if ($m2 eq "s");
    &check2($left, "up", $r2) if ($m2 eq "u");
} elsif ($middle eq "q") {
    &check2($left, "cq", $r2) if ($m2 eq "c");
} elsif ($middle eq "r") {
    &check2($left, "ar", $r2) if ($m2 eq "a");
    &check2($left, "cr", $r2) if ($m2 eq "c");
    &check2($left, "er", $r2) if ($m2 eq "e");
    &check2($left, "gr", $r2) if ($m2 eq "g");
    &check2($left, "hr", $r2) if ($m2 eq "h");
    &check2($left, "ir", $r2) if ($m2 eq "i");
    &check2($left, "or", $r2) if ($m2 eq "o");
    &check2($left, "pr", $r2) if ($m2 eq "p");
    &check2($left, "tr", $r2) if ($m2 eq "t");
    &check2($left, "ur", $r2) if ($m2 eq "u");
    &check2($left, "yr", $r2) if ($m2 eq "y");
} elsif ($middle eq "s") {
    &check2($left, "as", $r2) if ($m2 eq "a");
    &check2($left, "bs", $r2) if ($m2 eq "b");
    &check2($left, "es", $r2) if ($m2 eq "e");
    &check2($left, "is", $r2) if ($m2 eq "i");
    &check2($left, "ks", $r2) if ($m2 eq "k");
    &check2($left, "ls", $r2) if ($m2 eq "l");
    &check2($left, "ms", $r2) if ($m2 eq "m");
    &check2($left, "ns", $r2) if ($m2 eq "n");
    &check2($left, "os", $r2) if ($m2 eq "o");
    &check2($left, "ps", $r2) if ($m2 eq "p");
    &check2($left, "rs", $r2) if ($m2 eq "r");
    &check2($left, "ts", $r2) if ($m2 eq "t");
    &check2($left, "us", $r2) if ($m2 eq "u");
    &check2($left, "ys", $r2) if ($m2 eq "y");
} elsif ($middle eq "t") {
    &check2($left, "at", $r2) if ($m2 eq "a");
    &check2($left, "ct", $r2) if ($m2 eq "c");
    &check2($left, "et", $r2) if ($m2 eq "e");
    &check2($left, "ht", $r2) if ($m2 eq "h");
    &check2($left, "it", $r2) if ($m2 eq "i");
    &check2($left, "lt", $r2) if ($m2 eq "l");
    &check2($left, "nt", $r2) if ($m2 eq "n");
    &check2($left, "ot", $r2) if ($m2 eq "o");
    &check2($left, "pt", $r2) if ($m2 eq "p");
    &check2($left, "rt", $r2) if ($m2 eq "r");
    &check2($left, "st", $r2) if ($m2 eq "s");
    &check2($left, "ut", $r2) if ($m2 eq "u");
} elsif ($middle eq "u") {
    &check2($left, "au", $r2) if ($m2 eq "a");
    &check2($left, "bu", $r2) if ($m2 eq "b");
    &check2($left, "cu", $r2) if ($m2 eq "c");
    &check2($left, "eu", $r2) if ($m2 eq "e");
    &check2($left, "gu", $r2) if ($m2 eq "g");
    &check2($left, "lu", $r2) if ($m2 eq "l");
    &check2($left, "nu", $r2) if ($m2 eq "n");
    &check2($left, "ou", $r2) if ($m2 eq "o");
    &check2($left, "pu", $r2) if ($m2 eq "p");
    &check2($left, "ru", $r2) if ($m2 eq "r");
    &check2($left, "su", $r2) if ($m2 eq "s");
    &check2($left, "tu", $r2) if ($m2 eq "t");
} elsif ($middle eq "v") {
    &check2($left, "av", $r2) if ($m2 eq "a");
    &check2($left, "ev", $r2) if ($m2 eq "e");
    &check2($left, "iv", $r2) if ($m2 eq "i");
    &check2($left, "lv", $r2) if ($m2 eq "l");
} elsif ($middle eq "w") {
    &check2($left, "ew", $r2) if ($m2 eq "e");
    &check2($left, "ow", $r2) if ($m2 eq "o");
    &check2($left, "sw", $r2) if ($m2 eq "s");
} elsif ($middle eq "x") {
    # &check2($left, "s", $r2) if ($m2 eq "n");
} elsif ($middle eq "y") {
    &check2($left, "ay", $r2) if ($m2 eq "a");
    &check2($left, "hy", $r2) if ($m2 eq "h");
    &check2($left, "ly", $r2) if ($m2 eq "l");
    &check2($left, "ry", $r2) if ($m2 eq "r");
    &check2($left, "sy", $r2) if ($m2 eq "s");
} elsif ($middle eq "z") {
    &check2($left, "iz", $r2) if ($m2 eq "i");
    &check2($left, "yz", $r2) if ($m2 eq "y");
}

}

# Long Distance Transpositions
if ($i <= $length - 2) {
$m2 = substr($right, 0, 1);
$m3 = substr($right, 1, 1);
$r2 = substr($right, 2);
$mid = $middle.$m2.$m3;
# Not including
if ($mid =~ /^[aeiouy][dglnrstv][aeiouy]$/ && $middle ne $m3) {
    &check2($left, $m3.$m2.$middle, $r2);
} elsif ($mid =~ /^[lmn][aeiou][lmn]$/ && $middle ne $m3) {
    &check2($left, $m3.$m2.$middle, $r2);
} elsif ($mid eq "vel" || $mid eq "lev" || $mid eq "ton" || $mid eq "net") {
    &check2($left, $m3.$m2.$middle, $r2);
} elsif ($mid =~ /^[cdfrst][aieu][cdfgrs- 5 cccx
    &check2($left, $m3.$m2.$middle, $r2);
}
}

if ($version >= 4) {
# Multiple Character Substitutions ...
# y/ia, f/ph, al/le,
# not doing pneu/ne, ant/ent, ance/ence, aly/ally, eu/ea, oe/ow,
# pre/pro, ious/uous, pre/per, caed/sede, ament/ement, eous/ious,
# sh/sc, ghth/ght, all/al, c/sc/s/ss, m/gm, ss/s, eorg/orge, ene/e=
# uf/ough, mce/cem, eat/ate, tui/uit, ai/ile, ash/has, fea/afe, rau/ura
if ($middle eq "e") {
    &check2($left, "ia", $right);
    &check2($left, "ai", $right);
} elsif ($middle eq "f") {
    &check2($left, "ph", $right);
    &check2($left, "ve", $right);
} elsif ($middle eq "i") {
    &check2($left, "ea", $right);
} elsif ($middle eq "y") {
    &check2($left, "ie", $right);
}
if ($i == $length - 1) {
if ($middle eq "t") {
    &check2($left, "ed", $right);
    &check2($left, "led", $right);

}
}
if ($i <= $length - 2) {
$middle2 = substr($word, $i, 2);
$right2 = substr($word, $i+2);
if ($middle2 eq "al") {
    &check2($left, "le", $right2);
} elsif ($middle2 eq "as") {
    # from cassi --> ccasi
    &check2($left, "ca", $right2);
} elsif ($middle2 eq "aw") {
    &check2($left, "o", $right2);
} elsif ($middle2 eq "ce") {
    &check2($left, "es", $right2);
} elsif ($middle2 eq "co") {
    &check2($left, "om", $right2);
} elsif ($middle2 eq "de" && $i == 0) {
    &check2($left, "un", $right2);
} elsif ($middle2 eq "ea") {
    &check2($left, "i", $right2);
    &check2($left, "ie", $right2);
} elsif ($middle2 eq "el") {
    # pell --> ppel
    &check2($left, "pe", $right2);
    &check2($left, "al", $right2);
    # &check2($left, "ia", $right2)
} elsif ($middle2 eq "an") {
    &check2($left, "ine", $right2);
} elsif ($middle2 eq "ey") {
    &check2($left, "ie", $right2);
} elsif ($middle2 eq "fe") {

    # from fering --> ferring and fered --> ferred
    &check2($left, "er", $right2);
    # from ffes --> fess
    &check2($left, "es", $right2);
} elsif ($middle2 eq "gi") {
    # ggin --> ginn
    &check2($left, "in", $right2);
} elsif ($middle2 eq "ia") {
    &check2($left, "e", $right2);
} elsif ($middle2 eq "ie") {
    &check2($left, "y", $right2);
    &check2($left, "ey", $right2);
} elsif ($middle2 eq "in") {
    # from cinn --> ccin
    &check2($left, "ci", $right2);
} elsif ($middle2 eq "it") {
    &check2($left, "ate", $right2);
    &check2($left, "ute", $right2);
    &check2($left, "mi", $right2);
    &check2($left, "te", $right2);
} elsif ($middle2 eq "le") {
    &check2($left, "al", $right2);
    # &check2($left, "el", $right2);
} elsif ($middle2 eq "lo") {
    &check2($left, "os", $right2);
} elsif ($middle2 eq "mn") {
    &check2($left, "um", $right2);
} elsif ($middle2 eq "oo") {
    &check2($left, "u", $right2);
} elsif ($middle2 eq "ph") {
    &check2($left, "f", $right2);
} elsif ($middle2 eq "qu") {
    &check2($left, "ck", $right2);
} elsif ($middle2 eq "ra") {
    &check2($left, "al", $right2);
    &check2($left, "as", $right2);
} elsif ($middle2 eq "ri") {
    &check2($left, "ib", $right2);
    &check2($left, "if", $right2);
} elsif ($middle2 eq "rq") {
    &check2($left, "er", $right2);
} elsif ($middle2 eq "si") {
    # from ssic --> sice
    &check2($left, "ic", $right2);
} elsif ($middle2 eq "te") {
    &check2($left, "ght", $right2);
} elsif ($middle2 eq "ye") {
    &check2($left, "i", $right2);
}
}

if ($i <= $length - 3) {
$middle3 = substr($word, $i, 3);
$right3 = substr($word, $i+3);
if ($middle3 eq "aga") {
    &check2($left, "adge", $right3);
} elsif ($middle3 eq "acy") {
    &check2($left, "isy", $right3);
} elsif ($middle3 eq "asa") {
    &check2($left, "sea", $right2);
} elsif ($middle3 eq "dec") {
    &check2($left, "t", $right3);
} elsif ($middle3 eq "ear") {
    &check2($left, "ere", $right3);
} elsif ($middle3 eq "evi") {
    &check2($left, "iev", $right3);
} elsif ($middle3 eq "exi") {
    # from exion --> ection
    &check2($left, "ecti", $right3);
} elsif ($middle3 eq "gin") {
    &check2($left, "ing", $right3);
} elsif ($middle3 eq "ine") {
    &check2($left, "ein", $right3);
} elsif ($middle3 eq "isy") {
    &check2($left, "acy", $right3);
} elsif ($middle3 eq "nts") {
    &check2($left, "nce", $right3);
} elsif ($middle3 eq "ons") {
    &check2($left, "a", $right3);
} elsif ($middle3 eq "que") {
    &check2($left, "ck", $right3);
} elsif ($middle3 eq "sci") {
    &check2($left, "cil", $right3);
} elsif ($middle3 eq "tio") {
    &check2($left, "cea", $right3);
} elsif ($middle3 eq "unp" && $i == 0) {
    # un --> im before a p
    &check2($left, "imp", $right3);
} elsif ($middle3 eq "ums") {
    &check2($left, "a", $right3);
} elsif ($middle3 eq "ure") {
    &check2($left, "eur", $right3);
    &check2($left, "er", $right3);
}
}
}

}
sub check {
    local ($left, $middle, $right) = @_;
    local ($word);
    $word = $left.$middle.$right;
    if ($dict{$word} != 0) {
        $correction = &apply_case($word, $case);
        if (!grep(/^$correction$/, @corrections)) {
            push(@corrections, $correction);
            $numcor++;
        }
    }
}
sub check2 {
    local ($left, $middle, $right) = @_;
    local ($word, $length, $i);
    $word = $left.$middle.$right;
    $length = length($word);
    for ($i = length($left.$middle); $i < $length; $i++) {
        &spellcor_pos($word, $i, $length);
    }
}
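The two helpers above can be read as a two-level search: check accepts an edited string only when it is found in the dictionary, while check2 passes the edited string back through the single-edit routine so that a second edit can be tried. The following is a minimal sketch of that idea, written in Python rather than the Perl of the listing. The names single_edits and corrections are hypothetical, the edit set is deliberately reduced to deletions and adjacent transpositions, and the sketch re-scans the whole string on the second pass, whereas check2 only continues from the position after the first edit.

```python
def single_edits(word):
    """Yield strings one restricted edit away from word.

    Hypothetical and simplified: only deletions and adjacent
    transpositions, not the full tables of the listing.
    """
    for i in range(len(word)):
        yield word[:i] + word[i + 1:]  # delete the character at position i
        if i + 1 < len(word):
            # swap the characters at positions i and i + 1
            yield word[:i] + word[i + 1] + word[i] + word[i + 2:]


def corrections(word, dictionary, max_depth=2):
    """Collect dictionary words reachable within max_depth restricted edits."""
    found = []
    frontier = [word]
    for _ in range(max_depth):
        next_frontier = []
        for current in frontier:
            for candidate in single_edits(current):
                # Same role as check: keep the candidate if it is a valid word.
                if dictionary.get(candidate, 0) != 0 and candidate not in found:
                    found.append(candidate)
                # Same role as check2: the candidate is edited again later.
                next_frontier.append(candidate)
        frontier = next_frontier
    return found
```

Because candidates are generated from the misspelling and then looked up, the cost depends on the number of permitted edits rather than on the size of the dictionary.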

# Code for Sorting
sub by_freq {
    if ($dict{$b} == $dict{$a}) {
        # this gives a preference to insertions over subst/trans over deletions
        length($b) <=> length($a);
    } else {
        $dict{$b} <=> $dict{$a};
    }
}
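The comparison routine above orders candidates by descending dictionary frequency and, on a tie, by descending length, which is why corrections produced by insertion are listed before those produced by deletion. A minimal Python sketch of the same ordering follows; the function name rank_candidates and the sample frequencies are illustrative assumptions, not part of the listing.

```python
def rank_candidates(candidates, freq):
    """Sort candidates: most frequent first, longer word first on a tie."""
    return sorted(candidates, key=lambda w: (-freq.get(w, 0), -len(w)))


# Hypothetical frequencies, for illustration only.
sample_freq = {"the": 100, "then": 100, "ten": 5}
ranked = rank_candidates(["ten", "the", "then"], sample_freq)
```

Here "then" and "the" share a frequency, so the longer word is placed first, and "ten" comes last.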

# The following code relates to the case (UPPER, lower, Initial, etc.)
# of the word.
sub id_case {
    local ($word) = @_;
    local ($wfirst, $wrest, $lcfirst, $lcrest);
    $wfirst = substr($word, 0, 1);
    $wrest = substr($word, 1);
    $lcfirst = $wfirst; $lcfirst =~ tr/A-Z/a-z/;
    $lcrest = $wrest; $lcrest =~ tr/A-Z/a-z/;
    if ($wfirst eq $lcfirst) {
        # First Letter is Lowercase
        if ($wrest eq $lcrest) {
            # Rest of Word is Lowercase
            return(0);
        } elsif (length($wrest) >= 2 &&
                 substr($wrest, 0, 1) ne substr($lcrest, 0, 1) &&
                 substr($wrest, 1) eq substr($lcrest, 1)) {
            # mistake style
            return(0);
        } elsif (length($wrest) >= 2 &&
                 substr($wrest, 0, 1) ne substr($lcrest, 0, 1) &&
                 substr($wrest, 1, 1) ne substr($lcrest, 1, 1) &&
                 substr($wrest, 2) ne substr($lcrest, 2)) {
            # mISTAKE style (inverted caps lock)
            return(2);
        } else {
            # Rest of Word is Uppercase
            return(1);
        }
    } else {
        # First Letter is Uppercase
        if ($wrest eq $lcrest) {
            # Rest of Word is Lowercase
            return(2);
        } elsif (length($wrest) >= 2 &&
                 substr($wrest, 0, 1) eq substr($lcrest, 0, 1) &&
                 substr($wrest, 1, 1) ne substr($lcrest, 1, 1) &&
                 substr($wrest, 2) eq substr($lcrest, 2)) {
            # McDonald style
            return(4);
        } elsif (length($wrest) >= 2 &&
                 substr($wrest, 0, 1) ne substr($lcrest, 0, 1) &&
                 substr($wrest, 1) eq substr($lcrest, 1)) {
            # Mistake style
            return(2);
        } elsif ($wrest =~ /-/) {
            # Foo-Bar style, ignore
            return(-1);
        } else {
            # Rest of Word is Uppercase
            return(3);
        }

    }
}
sub apply_case {
    local ($word, $case) = @_;
    local ($wfirst, $wrest);
    if ($case == 0) {
        # word
        $word =~ tr/A-Z/a-z/;
        return($word);
    } elsif ($case == 1) {
        # wORD
        $wfirst = substr($word, 0, 1);
        $wrest = substr($word, 1);
        $wfirst =~ tr/A-Z/a-z/;
        $wrest =~ tr/a-z/A-Z/;
        return($wfirst.$wrest);
    } elsif ($case == 2) {
        # Word
        $wfirst = substr($word, 0, 1);
        $wrest = substr($word, 1);
        $wfirst =~ tr/a-z/A-Z/;
        $wrest =~ tr/A-Z/a-z/;
        return($wfirst.$wrest);
    } elsif ($case == 3) {
        # WORD
        $word =~ tr/a-z/A-Z/;
        return($word);
    } elsif ($case == 4) {
        # McDonald
        $wfirst = substr($word, 0, 1);
        $wsecond = substr($word, 1, 1);
        $wthird = substr($word, 2, 1);
        $wrest = substr($word, 3);
        $wfirst =~ tr/a-z/A-Z/;
        $wsecond =~ tr/A-Z/a-z/;
        $wthird =~ tr/a-z/A-Z/;
        $wrest =~ tr/A-Z/a-z/;
        return($wfirst.$wsecond.$wthird.$wrest);
    }
}
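The case routines above let the corrector look words up in a lowercase dictionary and then restore the capitalization pattern of the input. The sketch below ports them to Python for illustration only. It assumes ASCII input, and returning the word unchanged for an unrecognized case code (such as -1) is an assumption of the sketch; the listing returns nothing in that situation.

```python
def id_case(word):
    """Return the listing's case code for word (0 to 4, or -1 for Foo-Bar)."""
    first, rest = word[:1], word[1:]
    lrest = rest.lower()
    if first == first.lower():
        if rest == lrest:
            return 0  # word
        if len(rest) >= 2 and rest[0] != lrest[0] and rest[1:] == lrest[1:]:
            return 0  # only the second letter is capitalized
        if (len(rest) >= 2 and rest[0] != lrest[0]
                and rest[1] != lrest[1] and rest[2:] != lrest[2:]):
            return 2  # inverted caps lock, restore as Word
        return 1
    if rest == lrest:
        return 2  # Word
    if (len(rest) >= 2 and rest[0] == lrest[0]
            and rest[1] != lrest[1] and rest[2:] == lrest[2:]):
        return 4  # McDonald
    if len(rest) >= 2 and rest[0] != lrest[0] and rest[1:] == lrest[1:]:
        return 2  # first two letters capitalized, restore as Word
    if "-" in rest:
        return -1  # Foo-Bar style, ignored
    return 3  # WORD


def apply_case(word, case):
    """Re-apply a case code produced by id_case to a corrected word."""
    if case == 0:
        return word.lower()
    if case == 1:
        return word[:1].lower() + word[1:].upper()
    if case == 2:
        return word[:1].upper() + word[1:].lower()
    if case == 3:
        return word.upper()
    if case == 4:
        return (word[:1].upper() + word[1:2].lower()
                + word[2:3].upper() + word[3:].lower())
    return word  # assumption: unknown codes leave the word untouched
```

A typical use is apply_case(correction, id_case(misspelling)), so that a misspelled capitalized word yields a capitalized correction.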

# Correcting word boundary errors.

sub split_word {
    local ($word) = @_;
    local ($found, $i, $length, $spoint, $left, $right);
    if ($word =~ /\w+,\w+/) {
        if ($word !~ /\d+,\d+/ && $word !~ /\(.+\)/) {
            $word =~ s/,/, /g;
        } else {
            $word = $word;
        }
    } elsif (length($word) >= 6) {
        # search for potential split point
        $found = 0;
        $spoint = 0;
        $length = length($word);
        for ($i = 3; $i <= $length - 3; $i++) {
            $left = substr($word, 0, $i);
            $right = substr($word, $i);
            # note that we're not changing the case,
            # so FooBar will not be split
            if ($dict{$left} >= $minsplitfreq && $dict{$right} >= $minsplitfreq) {
                $found++;
                $spoint = $i;
            }
        }
        if ($found == 1) {
            $left = substr($word, 0, $spoint);
            $right = substr($word, $spoint);
            $word = "$left $right";
        } elsif ($found > 1) {
            # if more than one, return word unchanged for now
            $word = $word;
        } else {
            # if no splits found, return word unchanged
            $word = $word;
        }
    } else {
        # leave the word alone
        $word = $word;
    }
    return($word);
}
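The routine above handles missing-space errors: a string of six or more characters is split only when exactly one split point yields two sufficiently frequent dictionary words, each at least three characters long. The following Python sketch illustrates that rule under stated assumptions. The default threshold of 1 for min_split_freq is hypothetical, since the listing does not show the value of its threshold variable, and the comma handling of the original is omitted.

```python
def split_word(word, freq, min_split_freq=1):
    """Insert one space into word if exactly one valid split point exists."""
    if len(word) < 6:
        return word
    # Both halves must be at least three characters long.
    points = [
        i for i in range(3, len(word) - 2)
        if freq.get(word[:i], 0) >= min_split_freq
        and freq.get(word[i:], 0) >= min_split_freq
    ]
    if len(points) == 1:
        i = points[0]
        return word[:i] + " " + word[i:]
    return word  # no split point, or an ambiguous one: leave unchanged
```

Refusing to split when more than one split point qualifies keeps the correction conservative, at the cost of leaving some run-together words uncorrected.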

# *EOF*

errstat.pl 1/3 p:/SpellGrams/revminedist/ 98/04/21

#!/usr/local/bin/perl

$misses = 0;
while (<>) {
    chomp;
    s/\s+$//;
    ($left, $right) = split(/ --> /);
    $mismatch = &find_mismatch($left, $right);
    if (substr($left, $mismatch+1) eq substr($right, $mismatch+1) &&
        length($left) > $mismatch && length($right) > $mismatch) {
        $sub{substr($left, $mismatch, 1).substr($right, $mismatch, 1)}++;
    } elsif (substr($left, $mismatch+1) eq substr($right, $mismatch)) {
        $del{substr($left, $mismatch, 1)}++;
    } elsif (substr($left, $mismatch) eq substr($right, $mismatch+1)) {
        $ins{substr($right, $mismatch, 1)}++;
    } elsif (substr($left, $mismatch+2) eq substr($right, $mismatch+2) &&
             substr($left, $mismatch, 1) eq substr($right, $mismatch+1, 1) &&
             substr($left, $mismatch+1, 1) eq substr($right, $mismatch, 1)) {
        $trans{substr($left, $mismatch, 2)}++;
    } elsif (substr($left, $mismatch+3) eq substr($right, $mismatch+3) &&
             substr($left, $mismatch, 1) eq substr($right, $mismatch+2, 1) &&
             substr($left, $mismatch+1, 1) eq substr($right, $mismatch+1, 1) &&
             substr($left, $mismatch+2, 1) eq substr($right, $mismatch, 1)) {
        $trans2{substr($left, $mismatch, 3)}++;
    } elsif (substr($left, $mismatch+4) eq substr($right, $mismatch+4) &&
             substr($left, $mismatch, 1) eq substr($right, $mismatch+3, 1) &&
             substr($left, $mismatch+1, 2) eq substr($right, $mismatch+1, 2) &&
             substr($left, $mismatch+3, 1) eq substr($right, $mismatch, 1)) {
        $trans2{substr($left, $mismatch, 4)}++;
    } elsif (substr($left, $mismatch+2) eq substr($right, $mismatch) &&
             length($left) > $mismatch+1) {
        $del2{substr($left, $mismatch, 2)}++;
    } elsif (substr($left, $mismatch+3) eq substr($right, $mismatch) &&
             length($left) > $mismatch+2) {
        $del2{substr($left, $mismatch, 3)}++;
    } elsif (substr($left, $mismatch+4) eq substr($right, $mismatch) &&
             length($left) > $mismatch+3) {
        $del2{substr($left, $mismatch, 4)}++;
    } elsif (substr($left, $mismatch) eq substr($right, $mismatch+2) &&
             length($right) > $mismatch+1) {
        $ins2{substr($right, $mismatch, 2)}++;
    } elsif (substr($left, $mismatch) eq substr($right, $mismatch+3) &&
             length($right) > $mismatch+2) {
        $ins2{substr($right, $mismatch, 3)}++;
    } elsif (substr($left, $mismatch) eq substr($right, $mismatch+4) &&
             length($right) > $mismatch+3) {
        $ins2{substr($right, $mismatch, 4)}++;
    } elsif (substr($left, $mismatch+2) eq substr($right, $mismatch+1) &&
             length($left) > $mismatch && length($right) > $mismatch) {
        $sub2{substr($left, $mismatch, 2)." --> ".substr($right, $mismatch, 1)}++;
    } elsif (substr($left, $mismatch+1) eq substr($right, $mismatch+2) &&
             length($left) > $mismatch && length($right) > $mismatch) {
        $sub2{substr($left, $mismatch, 1)." --> ".substr($right, $mismatch, 2)}++;
    } elsif (substr($left, $mismatch+2) eq substr($right, $mismatch+2) &&
             length($left) > $mismatch && length($right) > $mismatch) {
        $sub2{substr($left, $mismatch, 2)." --> ".substr($right, $mismatch, 2)}++;
    } elsif (substr($left, $mismatch+1) eq substr($right, $mismatch+3) &&
             length($left) > $mismatch && length($right) > $mismatch+2) {
        $sub2{substr($left, $mismatch, 1)." --> ".substr($right, $mismatch, 3)}++;
    } elsif (substr($left, $mismatch+2) eq substr($right, $mismatch+3) &&
             length($left) > $mismatch+1 && length($right) > $mismatch+2) {
        $sub2{substr($left, $mismatch, 2)." --> ".substr($right, $mismatch, 3)}++;
    } elsif (substr($left, $mismatch+3) eq substr($right, $mismatch+1) &&
             length($left) > $mismatch+2 && length($right) > $mismatch) {
        $sub2{substr($left, $mismatch, 3)." --> ".substr($right, $mismatch, 1)}++;
    } elsif (substr($left, $mismatch+3) eq substr($right, $mismatch+2) &&
             length($left) > $mismatch+2 && length($right) > $mismatch+1) {
        $sub2{substr($left, $mismatch, 3)." --> ".substr($right, $mismatch, 2)}++;
    } elsif (substr($left, $mismatch+3) eq substr($right, $mismatch+3) &&
             length($left) > $mismatch+2 && length($right) > $mismatch+2) {
        $sub2{substr($left, $mismatch, 3)." --> ".substr($right, $mismatch, 3)}++;
    } else {
        print "$left --> $right\n";
        $misses++;
    }
}
print "\nMisses: $misses\n";
print "\nSubstitutions:\n";
foreach $corr (keys(%sub)) {
    printf "%6d\t%s --> %s\n", $sub{$corr}, substr($corr, 0, 1), substr($corr, 1);
}
print "\nDeletions:\n";
foreach $corr (keys(%del)) {
    printf "%6d\t%s\n", $del{$corr}, $corr;
}
print "\nInsertions:\n";
foreach $corr (keys(%ins)) {
    printf "%6d\t%s\n", $ins{$corr}, $corr;
}
print "\nTranspositions:\n";
foreach $corr (keys(%trans)) {
    printf "%6d\t%s\n", $trans{$corr}, $corr;
}
print "\nLong Distance Transpositions:\n";
foreach $corr (keys(%trans2)) {
    printf "%6d\t%s\n", $trans2{$corr}, $corr;
}
print "\nLarger Substitutions:\n";
foreach $corr (keys(%sub2)) {
    printf "%6d\t%s\n", $sub2{$corr}, $corr;
}
print "\nLarger Deletions:\n";
foreach $corr (keys(%del2)) {
    printf "%6d\t%s\n", $del2{$corr}, $corr;
}

print "\nLarger Insertions:\n";
foreach $corr (keys(%ins2)) {
    printf "%6d\t%s\n", $ins2{$corr}, $corr;
}
sub find_mismatch {
    local ($word1, $word2) = @_;
    local ($w1, $w2, @word1, @word2, $count);
    @word1 = split(//, $word1);
    @word2 = split(//, $word2);
    for ($count = 0; $count < &min(length($word1), length($word2)); $count++) {
        $w1 = shift(@word1);
        $w2 = shift(@word2);
        if ($w1 ne $w2) {
            last;
        }
    }
    return($count);
}
sub min {
    local ($val1, $val2) = @_;
    if ($val1 <= $val2) {
        return($val1);
    } else {
        return($val2);
    }
}
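The statistics script above locates the first position at which a misspelling and its correction differ, then tests the remaining suffixes to decide which edit explains the difference. A minimal Python sketch of the first four tests follows. The function classify and its string labels are illustrative names, the labels describe the edit applied to the misspelling, and the long-distance transposition and multi-character categories of the script are collapsed into "other".

```python
def find_mismatch(a, b):
    """Return the index of the first differing character of a and b."""
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            return i
    return n


def classify(wrong, right):
    """Name the single edit that turns wrong into right, if there is one."""
    m = find_mismatch(wrong, right)
    if wrong[m + 1:] == right[m + 1:] and len(wrong) > m and len(right) > m:
        return "substitution"
    if wrong[m + 1:] == right[m:]:
        return "deletion"  # wrong has an extra character at position m
    if wrong[m:] == right[m + 1:]:
        return "insertion"  # wrong is missing a character at position m
    if (wrong[m + 2:] == right[m + 2:]
            and wrong[m:m + 1] == right[m + 1:m + 2]
            and wrong[m + 1:m + 2] == right[m:m + 1]):
        return "transposition"
    return "other"
```

Counting these labels over a corpus of observed misspellings is one way to obtain the frequencies from which a restricted set of edit operations can be chosen.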

Claims

I CLAIM:

1. A computer method of spelling correction which comprises a step of testing a word against a valid dictionary and, if the word is not in the dictionary, calculating the edit distance to at least one valid word using a restricted set of edit operations that correct the most common errors comprising insertion, deletion, transposition and/or substitution errors and displaying at least one valid word.

2. The computer method of spelling correction according to claim 1 wherein statistical techniques are used to identify the most common spelling errors and/or the restricted set of edit operations.

3. The computer method of spelling correction according to claim 1 wherein the calculation of edit distance is a step in a standard minimum edit distance algorithm.

4. The computer method of spelling correction according to claim 1 wherein the calculation of edit distance is a step in a reverse minimum edit distance algorithm.

5. The computer method of spelling correction according to claim 1 wherein the valid word is displayed in a list of candidate words.

6. The computer method of spelling correction according to claim 1 wherein the valid word is displayed by substituting it for the misspelled word.

7. The computer method of spelling correction according to claim 1 wherein the restricted set of edit operations includes the most common edits at distance one to correct errors based upon a training corpus of documents with uncorrected spelling errors.

8. The computer method of spelling correction according to claim 1 wherein the restricted set of edit operations includes common complex edits selected from the group comprising long-distance transpositions, multiple letter corrections and missing space errors.

9. The computer method of spelling correction according to claim 1 implemented in the automatic spelling correction function of a word processing program.

10. The computer method of spelling correction according to claim 1 implemented in the batch spelling correction function of a word processing program.

11. The computer method of spelling correction according to claim 1 implemented to correct the spelling on an input line in a computer interface such as a command line for an operating system, a data base query or the like.

12. A computer method of spelling correction comprising the steps of:
a) storing a dictionary of valid words;
b) for each input string to be checked comparing the input string to words in the stored dictionary to identify input strings not in the dictionary;
c) for each input string not found in the preceding step, generating test words by a restricted set of edit operations which correct the most common errors comprising insertion, deletion, transposition and/or substitution;
d) comparing the edited input string generated in the preceding step with words stored in the dictionary; and
e) generating a candidate word or list of candidate words from edited input strings that are found in the dictionary.
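By way of a non-limiting illustration, steps a) to e) of claim 12 might be sketched in Python as follows. The function names restricted_edits and candidate_list, and the small tables of permitted substitutions, insertions and deletions, are hypothetical; in practice the tables would come from error statistics rather than being written by hand.

```python
def restricted_edits(word, substitutions, insertable, deletable):
    """Step c): yield test words exactly one restricted edit away from `word`."""
    for i in range(len(word) + 1):
        for ch in insertable:                          # insert a commonly omitted letter
            yield word[:i] + ch + word[i:]
        if i < len(word):
            if word[i] in deletable:                   # delete a commonly added letter
                yield word[:i] + word[i + 1:]
            for ch in substitutions.get(word[i], ""):  # only listed substitutions
                yield word[:i] + ch + word[i + 1:]
        if i + 1 < len(word):                          # adjacent transposition
            yield word[:i] + word[i + 1] + word[i] + word[i + 2:]


def candidate_list(input_string, dictionary, substitutions, insertable, deletable):
    """Steps b), d) and e): look up the input, then keep test words found in the dictionary."""
    if input_string in dictionary:                     # step b): valid word, nothing to correct
        return []
    tests = set(restricted_edits(input_string, substitutions, insertable, deletable))
    return sorted(t for t in tests if t in dictionary)  # steps d) and e)


# Step a): a toy dictionary stored as a hash-based set (assumed data).
dictionary = {"the", "then", "ten", "tea"}
print(candidate_list("teh", dictionary, {"h": "an"}, "n", "h"))
```

Because the test words are generated from the input string and then looked up, the cost depends on the size of the restricted edit set and not on the size of the dictionary.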

13. A computer method according to claim 12 wherein the members of the restricted set of edit operations are selected based upon the most common spelling errors.

14. A computer method according to claim 13 wherein statistical techniques are used to identify the most common spelling errors and/or the restricted set of edit operations.

15. A computer method according to claim 14 wherein a corpus of documents used to determine the most common spelling errors is selected from the academic or business field with which the method will be used.

16. A computer method according to claim 15 wherein the corpus of documents used to determine the most common spelling errors is selected from documents generated by the individual who will use the method.

17. A computer method according to claim 12 wherein the members of the restricted set of edit operations are selected based on the letter n-grams containing the letter or letters to be edited.

18. A computer method according to claim 12 wherein the edit operations are restricted to distance one.

19. A computer method according to claim 12 wherein, if no valid edited input strings are found at edit distance one, edits of distance two are allowed.

20. A computer method according to claim 12 wherein the edit operations are restricted to distances one and two.

21. A computer method according to claim 12 wherein all possible edits are allowed if no valid edited input strings are found at edit distances one or two.
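The escalation recited in claims 18 to 21 might be sketched as below. This is one reading only: the restricted set is reduced to adjacent transpositions to keep the example short, and "all possible edits" is interpreted here as unrestricted single edits over the lowercase alphabet. The names transpositions, all_single_edits and correct are hypothetical.

```python
import string


def transpositions(word):
    """Restricted edit set for this sketch: adjacent transpositions only."""
    return {word[:i] + word[i + 1] + word[i] + word[i + 2:]
            for i in range(len(word) - 1)}


def all_single_edits(word):
    """Unrestricted single edits over the lowercase alphabet."""
    letters = string.ascii_lowercase
    edits = set(transpositions(word))
    for i in range(len(word) + 1):
        head, tail = word[:i], word[i:]
        edits.update(head + c + tail for c in letters)          # insertions
        if tail:
            edits.add(head + tail[1:])                          # deletion
            edits.update(head + c + tail[1:] for c in letters)  # substitutions
    edits.discard(word)
    return edits


def correct(word, dictionary, restricted=transpositions, unrestricted=all_single_edits):
    """Return (stage, candidates), escalating only when a stage finds nothing."""
    level1 = restricted(word)
    hits = level1 & dictionary
    if hits:
        return "distance one", sorted(hits)
    level2 = set()
    for test_word in level1:            # apply the restricted edits a second time
        level2 |= restricted(test_word)
    hits = level2 & dictionary
    if hits:
        return "distance two", sorted(hits)
    return "all edits", sorted(unrestricted(word) & dictionary)


dictionary = {"the", "cat", "dog"}      # assumed toy dictionary
for nonword in ("teh", "eth", "cot"):
    print(nonword, correct(nonword, dictionary))
```

Each later stage is reached only when the cheaper stage before it returns no valid word, so the common case stays at distance one.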

22. A computer method according to claim 12 wherein the edit operations include long-distance transpositions, multiple letter substitutions, multiple letter deletions and missing space errors at edit distance one.

23. A computer method according to claim 12 wherein the dictionary is stored in a data structure selected from the group hash tables, binary trees and tries.
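As an illustration of one of the data structures named in claim 23, a minimal trie supporting only insertion and membership tests might look like the following. The class name Trie and the use of an empty-string key as the end-of-word marker are choices made for this sketch.

```python
class Trie:
    """Minimal trie of nested dicts; the key "" marks the end of a stored word."""

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.add(word)

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})   # create the child node if missing
        node[""] = True                      # end-of-word marker

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.get(ch)
            if node is None:                 # no stored word continues with this letter
                return False
        return "" in node                    # a prefix alone does not count as a word


dictionary = Trie(["the", "then", "ten"])
print("the" in dictionary, "th" in dictionary)
```

Because the class implements the `in` operator, it can stand in for a hash-based set wherever the method only needs to test whether a generated test word is valid.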

24. A computer method according to claim 12 wherein the substitution edits may include non-alphabetic characters.

25. A computer method according to claim 12 wherein a candidate list is sorted by combinations of word length, typed edit, word frequency or edit frequency.
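One possible ordering under claim 25 is sketched below: candidates are sorted by edit frequency, then word frequency, then closeness in length to the misspelling. The particular order of the sort keys, the function name rank_candidates and all frequency figures are assumptions made for the example.

```python
def rank_candidates(misspelling, candidates, word_freq, edit_freq):
    """Order (word, edit) pairs: frequent edits first, then frequent words,
    then words closest in length to the misspelling, then alphabetically."""
    def sort_key(item):
        word, edit = item
        return (-edit_freq.get(edit, 0),
                -word_freq.get(word, 0),
                abs(len(word) - len(misspelling)),
                word)
    return [word for word, _ in sorted(candidates, key=sort_key)]


# Hypothetical candidates for the nonword "teh", each tagged with the edit that produced it.
candidates = [("tea", "substitution h>a"),
              ("ten", "substitution h>n"),
              ("the", "transposition eh")]
edit_freq = {"transposition eh": 50, "substitution h>n": 5, "substitution h>a": 5}
word_freq = {"the": 1000, "ten": 40, "tea": 30}
print(rank_candidates("teh", candidates, word_freq, edit_freq))
```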

26. A computer method according to claim 12 further comprising searching for missing space errors by testing complementary portions of a nonword for being valid words with a frequency above a given threshold.
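The missing space test of claim 26 might be sketched as follows: the nonword is split at every interior position, and a split is kept only when both complementary portions are words whose frequency exceeds the threshold. The function name missing_space_splits and the frequency table are hypothetical.

```python
def missing_space_splits(nonword, word_freq, threshold):
    """Return (left, right) pairs where both halves are sufficiently frequent words."""
    splits = []
    for i in range(1, len(nonword)):             # every interior split point
        left, right = nonword[:i], nonword[i:]
        if word_freq.get(left, 0) > threshold and word_freq.get(right, 0) > threshold:
            splits.append((left, right))
    return splits


# Assumed frequencies; "int" is a valid but rare word and is rejected by the threshold.
word_freq = {"in": 900, "the": 1000, "int": 2, "he": 500, "i": 800}
print(missing_space_splits("inthe", word_freq, 10))
```

The threshold keeps rare short words from producing implausible splits.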

27. A computer method of spelling correction in input lines comprising the steps of:
a) storing a dictionary of valid words;
b) for each word in the command line comparing the word to words in the stored dictionary to identify misspelled words;
c) for each misspelled word found in the preceding step, generating test words by a restricted set of edit operations which correct the most common errors comprising insertion, deletion, transposition and/or substitution;
d) comparing the edited words generated in the preceding step with words stored in the dictionary; and
e) substituting a candidate word in the command line.

28. A computer method of spelling correction of words in a computer document comprising the steps of:
a) storing a dictionary of valid words;
b) for each word in the computer document comparing the word to words in the stored dictionary to identify misspelled words;
c) for each misspelled word found in the preceding step, generating test words by a restricted set of edit operations which correct the most common errors comprising insertion, deletion, transposition and/or substitution;
d) comparing the edited words generated in the preceding step with words stored in the dictionary; and
e) substituting a unique candidate word for misspelled words in the computer document.