GB2136612A - Word checking system - Google Patents
Word checking system
- Publication number
- GB2136612A
- Authority
- GB
- United Kingdom
- Prior art keywords
- word
- code
- store
- stored
- responsive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9017—Indexing; Data structures therefor; Storage structures using directory or table look-up
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/232—Orthographic correction, e.g. spell checking or vowelisation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Document Processing Apparatus (AREA)
- Machine Translation (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
A word checking system includes first and second encoders EN1, EN2 responsive to a word to encode it in accordance with two different algorithms to produce first and second codes. A directory store DS is included in which the first code of any word may be stored at a location determined by the second code. Checking means CM are provided to check for the presence in the store DS of a first code relating to a word applied to the encoders at a location defined by the second code. If the applied word is present then another word may be checked. The absence of an applied word causes the checking means to give an indication so that an operator may decide on the action to be taken.
Description
SPECIFICATION
Word checking system
This invention relates to a word checking system, that is to a system which will carry out a simple check on spelling when text is written or encoded.
Systems are known which check spelling, usually comprising word processing or computer systems.
One of the main disadvantages is the need for the system to store a very large vocabulary so that any word used may be checked. The presence of a large vocabulary also leads to slow and complex search routines. It is, of course, possible to use smaller vocabularies and to define rules for their use. For example only words having more than a certain number of characters may be checked, or only technical terms, and so on. These rules are, however, complex to define and again result in complex search routines. In addition, words vary considerably in length, and their distribution throughout a conventional dictionary is very uneven. For example there are many more words starting with the letter "e" than with the letter "q". This results in very uneconomic use of storage since it is difficult to decide in advance how much space should be provided for words beginning with any particular letter.
It is an object of the invention to provide a word checking system having a vocabulary of any desirable size which operates with simple and clearly defined rules.
According to the present invention there is provided a word checking system which includes first encoding means responsive to the application of a word to encode the word in accordance with a first algorithm so as to produce a first code representative of that word, a store in which said code may be stored, second encoding means responsive to the application of said word to encode the word in accordance with a second algorithm different from the first algorithm so as to produce a second code defining a location in said store in which a number of said first codes may be stored, and checking means operable on receipt of a first code and a second code both corresponding to a word to compare the first code with any first code already stored in the store location defined by the second code, the checking means being responsive to the presence of said first code in said location to cause the first and second encoding means to accept another word, and responsive to the absence of said code to give an indication of such absence.
The invention will now be described with reference to the accompanying drawings, in which:
Figure 1 is a schematic block diagram of one embodiment of the invention;
Figure 2 is a flow diagram illustrating the operation of the invention; and
Figure 3 is a flow diagram illustrating the operation of one encoder.
Referring now to Figure 1, this shows a schematic block diagram of one embodiment of the invention.
The invention is applied to a word processor or to a computer having word-processing facilities, and including a text store TS and a display DP on which selected parts of the text may be displayed. The word processor is assumed to have the ability to apply successive words of the text displayed on the display to some other device. This is a conventional facility, used for example when printing a screen of displayed information. In the invention each successive word of the displayed text is applied in parallel to two encoders, shown in Figure 1 as EN1 and EN2.
Each of the two encoders produces a code in respect of the word applied to it as determined by a "hash" or algorithm, the two algorithms being different so that the two codes together identify the word in question. The code output from encoder EN2 defines an address in the dictionary store DS in which all words forming the dictionary are stored in a form determined by the output of encoder EN1.
The output from encoder EN2 may be used to cause each word in the defined location to be read out to a comparator CM, in which it is compared with the output from encoder EN1. The same encoder output may also be applied to the store.
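The arrangement of Figure 1 can be illustrated with a minimal Python sketch. The two hash functions, the class name `DirectoryStore` and the constant `NUM_LOCATIONS` are hypothetical stand-ins chosen for illustration; the specification only requires that the two algorithms differ and does not prescribe these particular ones.

```python
NUM_LOCATIONS = 1024  # assumed number of locations in the store


def first_code(word: str) -> int:
    """Stand-in for encoder EN1: a 16-bit polynomial hash (an assumption)."""
    h = 0
    for ch in word:
        h = (h * 31 + ord(ch)) & 0xFFFF
    return h


def second_code(word: str) -> int:
    """Stand-in for encoder EN2: character sum reduced to a location number."""
    return sum(ord(ch) for ch in word) % NUM_LOCATIONS


class DirectoryStore:
    """Each location, addressed by the second code, holds a set of first codes."""

    def __init__(self) -> None:
        self.locations = [set() for _ in range(NUM_LOCATIONS)]

    def add(self, word: str) -> None:
        # Store the first code at the location defined by the second code.
        self.locations[second_code(word)].add(first_code(word))

    def contains(self, word: str) -> bool:
        # Comparator CM: look for the first code among those at that location.
        return first_code(word) in self.locations[second_code(word)]


store = DirectoryStore()
store.add("word")
print(store.contains("word"))  # True
print(store.contains("wrod"))  # False
```

Because only codes are stored, two different words that happened to share both codes would be indistinguishable to such a store; how likely that is depends on the code lengths and the algorithms chosen.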
"Hashing" techniques are well known, and a survey of such techniques may be found, for example in chapter 4 of "Compiling Techniques" by F.R.A. Hopgood, published by Macdonald in 1969.
Referring now to Figure 2, this shows a flow chart for the operation of the invention. The blocks having a double outline are those denoting a decision made or an operation carried out by the system operator.
It is assumed initially that a page or pages of text have been typed and that the text is stored in the text store TS of Figure 1. The first operation of the system, as shown by block 10 is therefore to withdraw the first word of text from the text store.
The word is then subjected to the two hash operations simultaneously to provide the two code outputs required, as shown at block 11. As defined by block 12, the dictionary store DS is then searched at an address defined by the second code to check whether the word defined by the first code is present or not. The first decision is made at block 13, depending upon whether or not the required word is present in the store at the address searched. If the word is present, then the next word is selected from the text store and the operation is repeated. If, however, the word is not present, block 14 shows the next decision which has to be made. This determines whether or not the spelling of the word in question is correct, and requires action by an operator. If the spelling is not correct, then a correction has to be made, as shown by block 15. In this case the corrected word is passed back for the two hash operations to be repeated, followed by the search of the dictionary store and decisions 13 and 14.
If the spelling of the word was in fact correct, and the word is not already in the dictionary store, then a further decision, shown by block 16, has to be made, again by the operator. This questions whether the word should be entered into the dictionary store as a new word. In practice, words which are names or rarely-used words may not need to be stored, whereas words which are likely to be used again may be stored. If the word is not to be stored, then the next word in the text store is selected, and the procedure set out above is repeated.
If the word is to be stored, then this is done as indicated by block 17. Finally, the last decision block 18 asks if there are any more words in the text store. If so, then the next word is selected. If there are no further words, then the operation is stopped, with some appropriate indication being given.
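The flow of Figure 2 can be sketched in Python as follows. This is an illustration only: the operator decisions of blocks 14, 15 and 16 are modelled as three hypothetical callback functions, and `SetStore` is a plain-set stand-in for the dictionary store DS rather than the hashed store described above.

```python
from typing import Callable, Iterable, List


class SetStore:
    """Stand-in for the dictionary store DS (a plain set, for brevity)."""

    def __init__(self, words: Iterable[str] = ()) -> None:
        self._words = set(words)

    def contains(self, word: str) -> bool:
        return word in self._words

    def add(self, word: str) -> None:
        self._words.add(word)


def check_text(
    words: Iterable[str],
    store: SetStore,
    spelling_ok: Callable[[str], bool],   # operator decision, block 14
    correct: Callable[[str], str],        # operator correction, block 15
    should_store: Callable[[str], bool],  # operator decision, block 16
) -> List[str]:
    checked = []
    for word in words:                    # blocks 10 and 18: next word, if any
        while not store.contains(word):   # blocks 11 to 13: encode and search
            if spelling_ok(word):
                if should_store(word):
                    store.add(word)       # block 17: enter the new word
                break
            word = correct(word)          # corrected word is checked again
        checked.append(word)
    return checked


store = SetStore({"the", "cat"})
result = check_text(
    ["the", "cta", "Tibbles"],
    store,
    spelling_ok=lambda w: w == "Tibbles",
    correct=lambda w: "cat",
    should_store=lambda w: False,
)
print(result)  # ['the', 'cat', 'Tibbles']
```

In an interactive system the callbacks would prompt the operator, for instance by highlighting the word in question.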
The manner in which operator decisions are requested or the end of the operation indicated will depend upon the particular word processing or computer system to which the spelling check system is attached. By way of example only, the need for an operator decision may be indicated by highlighting or flashing the word in question, and the end of the operation may be indicated by the cursor flashing at the bottom of a page of displayed text. There are, of course, other ways of indicating these functions of the system.
Hash operations are well-known, and may take a wide variety of forms. As already stated, the system described above uses two different hash operations which together define any particular word. By way of example only, Figure 3 is a flow chart for one particular hash operation which may be used. This operation requires that any character may be defined by a five-bit number, the final hash being a sixteen-bit number.
The first operation, indicated by block 20, is to clear a 16-bit register of the results of any previous hash operation. Block 21 requires the first letter of the word to be converted into a 5-bit number. The simplest way of doing this is to use the position of the letter in the alphabet, so that for example the letter 'w' becomes 10111. Other conversions may be used. The 5-bit number is then put through an 'exclusive-OR' operation with any number already occupying the five least significant bits of the register, and the result is stored in that same position in the register, as indicated by blocks 22 and 23. Decision block 24 asks whether there is another letter in the word, and if not the entire 16-bit contents of the register are read out as the required 'hash' or code, as at block 25.
If there is another letter in the word, then the contents of the register are shifted cyclically 5 bits to the left. Hence the previous five most significant bits will become the five least significant bits, and all other bits will increase their significance by five (see block 26). The cycle then repeats, with the next letter being converted into a five-bit number, the exclusive-OR operation, and so on. When the complete word has been processed, then the 16-bit number in the register represents the final encoded output.
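The hash operation of Figure 3 can be sketched in Python as below. The function names are hypothetical, and the handling of characters outside a to z (they are simply skipped, and upper case is folded to lower case) is an assumption, since the specification does not say how such characters are treated.

```python
def letter_code(ch: str) -> int:
    """Block 21: position of the letter in the alphabet (a=1 ... z=26)."""
    return ord(ch) - ord("a") + 1


def rotate_left_16(value: int, count: int) -> int:
    """Block 26: cyclic left shift of a 16-bit register by `count` bits."""
    return ((value << count) | (value >> (16 - count))) & 0xFFFF


def figure3_hash(word: str) -> int:
    register = 0  # block 20: clear the 16-bit register
    # Assumption: ignore anything that is not a letter a to z.
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    for index, ch in enumerate(letters):
        if index > 0:
            register = rotate_left_16(register, 5)
        # Blocks 22 and 23: exclusive-OR into the five least significant bits.
        register ^= letter_code(ch)
    return register  # block 25: read out the 16-bit code


print(format(figure3_hash("w"), "05b"))  # 10111
print(figure3_hash("ab"))                # 34
```

For the two-letter word "ab", the register holds 1 after the first letter, 32 after the cyclic shift, and 34 after the exclusive-OR with 2.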
Many other hash techniques are known, possibly involving purely mathematical operations.
The technique described above overcomes one of the main problems of known systems in that all codes to be stored in the dictionary store are of the same length. This results in considerable improvement in the use of the storage available. In addition the stored codes will tend to be shorter than the average word.
All of the operations described above are capable of being performed by any general-purpose computer.
Claims (5)
1. A word checking system which includes first encoding means responsive to the application of a word to encode the word in accordance with a first algorithm so as to produce a first code representative of that word, a store in which said code may be stored, second encoding means responsive to the application of said word to encode the word in accordance with a second algorithm different from the first algorithm so as to produce a second code defining a location in said store in which a number of said first codes may be stored, and checking means operable on receipt of a first code and a second code both corresponding to a word to compare the first code with any first code already stored in the store location defined by the second code, the checking means being responsive to the presence of said first code in said location to cause the first and second encoding means to accept another word, and responsive to the absence of said code to give an indication of such absence.
2. A system as claimed in Claim 1 which includes means operable to store the first code relating to a word in the store at a location determined by the second code relating to that word.
3. A system as claimed in either of Claims 1 or 2 in which the first and second codes uniquely identify any word.
4. A system as claimed in any of Claims 1 to 3 in which the first and second codes are generated simultaneously.
5. A word checking system substantially as herein described with reference to the accompanying drawings.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB08306665A GB2136612B (en) | 1983-03-10 | 1983-03-10 | Word checking system |
DE19843407831 DE3407831A1 (en) | 1983-03-10 | 1984-03-02 | WORD CHECK ARRANGEMENT |
NL8400712A NL8400712A (en) | 1983-03-10 | 1984-03-05 | WORD CONTROL SYSTEM. |
AU25418/84A AU559290B2 (en) | 1983-03-10 | 1984-03-08 | Word checking system |
BR8401177A BR8401177A (en) | 1983-03-10 | 1984-03-12 | WORD VERIFICATION SYSTEM |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB08306665A GB2136612B (en) | 1983-03-10 | 1983-03-10 | Word checking system |
Publications (3)
Publication Number | Publication Date |
---|---|
GB8306665D0 GB8306665D0 (en) | 1983-04-13 |
GB2136612A GB2136612A (en) | 1984-09-19 |
GB2136612B GB2136612B (en) | 1986-04-09 |
Family
ID=10539349
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB08306665A Expired GB2136612B (en) | 1983-03-10 | 1983-03-10 | Word checking system |
Country Status (5)
Country | Link |
---|---|
AU (1) | AU559290B2 (en) |
BR (1) | BR8401177A (en) |
DE (1) | DE3407831A1 (en) |
GB (1) | GB2136612B (en) |
NL (1) | NL8400712A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0287713A1 (en) * | 1987-04-23 | 1988-10-26 | Océ-Nederland B.V. | A text processing system and methods for checking in a text processing system the correct and consistent use of units or chemical formulae |
WO1998039715A1 (en) * | 1997-03-07 | 1998-09-11 | Apple Computer, Inc. | System and method for rapidly identifying the existence and location of an item in a file |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS57172471A (en) * | 1981-04-17 | 1982-10-23 | Casio Comput Co Ltd | Searching system for electronic dictionary having extended memory |
US4588985A (en) * | 1983-12-30 | 1986-05-13 | International Business Machines Corporation | Polynomial hashing |
-
1983
- 1983-03-10 GB GB08306665A patent/GB2136612B/en not_active Expired
-
1984
- 1984-03-02 DE DE19843407831 patent/DE3407831A1/en active Granted
- 1984-03-05 NL NL8400712A patent/NL8400712A/en not_active Application Discontinuation
- 1984-03-08 AU AU25418/84A patent/AU559290B2/en not_active Ceased
- 1984-03-12 BR BR8401177A patent/BR8401177A/en unknown
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0287713A1 (en) * | 1987-04-23 | 1988-10-26 | Océ-Nederland B.V. | A text processing system and methods for checking in a text processing system the correct and consistent use of units or chemical formulae |
US5159552A (en) * | 1987-04-23 | 1992-10-27 | Oce-Nederland B.V. | Method for checking the correct and consistent use of units or chemical formulae in a text processing system |
WO1998039715A1 (en) * | 1997-03-07 | 1998-09-11 | Apple Computer, Inc. | System and method for rapidly identifying the existence and location of an item in a file |
US5897637A (en) * | 1997-03-07 | 1999-04-27 | Apple Computer, Inc. | System and method for rapidly identifying the existence and location of an item in a file |
Also Published As
Publication number | Publication date |
---|---|
GB8306665D0 (en) | 1983-04-13 |
BR8401177A (en) | 1984-10-23 |
AU559290B2 (en) | 1987-03-05 |
DE3407831C2 (en) | 1988-10-13 |
NL8400712A (en) | 1984-10-01 |
AU2541884A (en) | 1984-09-13 |
DE3407831A1 (en) | 1984-09-13 |
GB2136612B (en) | 1986-04-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5224038A (en) | Token editor architecture | |
US4689768A (en) | Spelling verification system with immediate operator alerts to non-matches between inputted words and words stored in plural dictionary memories | |
US3995254A (en) | Digital reference matrix for word verification | |
EP0054667A1 (en) | Method of generating a list of expressions semantically related to an input linguistic expression | |
US4092729A (en) | Apparatus for automatically forming hyphenated words | |
KR950012251A (en) | Hanja conversion correction processing method | |
GB2136612A (en) | Word checking system | |
JPH056398A (en) | Document register and document retrieving device | |
JPS621062A (en) | Documentation supporting device | |
JPH0575143B2 (en) | ||
JPH0731315Y2 (en) | Electronics | |
EP0391706B1 (en) | A method encoding text | |
JPH0685169B2 (en) | Document processing method | |
JP2889431B2 (en) | Character processor | |
JP2761606B2 (en) | Document data processing device | |
JPS6315360A (en) | Kana-kanji converting system | |
JPS5925268B2 (en) | A device that generates a vector representation of an input word | |
JPH05289845A (en) | Code converter | |
JPS62271051A (en) | Producing device for document in japanese language | |
Rolfe | Generations of permutations with non-unique elements | |
JPS5852719A (en) | Character data inputting method | |
JPH0395668A (en) | Character data processor | |
JPH0556553B2 (en) | ||
JPH05128105A (en) | Kanji/kana converter | |
JPS62117064A (en) | Kanji-to-kana converting device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PCNP | Patent ceased through non-payment of renewal fee |