WO1982000442A1 - Ideographic word selection system - Google Patents
Ideographic word selection system
- Publication number
- WO1982000442A1 (PCT/US1981/001017)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- word
- ideographic
- words
- character
- phonetic
- Prior art date
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B41—PRINTING; LINING MACHINES; TYPEWRITERS; STAMPS
- B41J—TYPEWRITERS; SELECTIVE PRINTING MECHANISMS, i.e. MECHANISMS PRINTING OTHERWISE THAN FROM A FORME; CORRECTION OF TYPOGRAPHICAL ERRORS
- B41J3/00—Typewriters or selective printing or marking mechanisms characterised by the purpose for which they are constructed
- B41J3/01—Typewriters or selective printing or marking mechanisms characterised by the purpose for which they are constructed for special character, e.g. for Chinese characters or barcodes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/018—Input/output arrangements for oriental characters
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
Definitions
- The present invention is directed to an ideographic word selection system, and specifically to a character processor which can rapidly enter Chinese, Japanese or Korean characters into a computer system, for example for printing purposes, the foregoing being done from a keyboard having a limited number of keys.
- Word or character processing for Oriental languages such as Japanese, Chinese, Korean, etc., has been difficult because of the structure of the written language; that is, there is no limited alphabet, but rather thousands of different ideographic words and characters.
- Other languages, such as Arabic or Farsi, have a written alphabet but also have numerous different ways of writing each letter; the resulting written language is difficult to process using a keyboard for entry because of the number of different characters which could be used.
- If the pronunciation of a character or a word is used to access that character, a large set of homonyms will be produced because of the similar pronunciations of other characters or words.
- U.S. patent 4,096,934, granted June 27, 1978 to Kirmser et al., discloses a method and apparatus for reproducing desired ideographs, where a phonetic spelling of the desired ideograph, along with a characteristic identification of this desired ideograph, is used by a computer to identify the desired ideograph.
- Such characteristic identification is based on an operator making a judgment about the geometric shape of the character. This is a slow procedure and one which is liable to errors when operators try to increase their speed.
- A suggested alternative of Kirmser, which is mentioned only briefly as opposed to the geometry method described at length, is to use the suggested meaning of the character described.
- An ideographic word selection system for selecting a desired word of a language having a relatively unlimited number of ideographic words by use of a keyboard having a limited number of keys.
- This system comprises a method for electronically storing information representing the ideographic words of a selected number of words of a language, where each ideographic word has associated with it its phonetic spelling, as well as the phonetic spelling of several words related in one or more ways to the ideographic word and the computerized graphic representation of the word.
- Other information, such as English equivalent words, may also be stored with the word in electronic storage.
- The method includes the use of the above-mentioned keyboard for inputting information, such as the phonetic spelling of a desired ideographic word, and also inputting the phonetic spelling of at least one of the words related to the ideographic word.
- The stored and input information are then compared to select the desired ideographic word.
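The store-and-compare idea described here can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patent's implementation: the `Entry` record, the `select` function, and the three sample rows (romanized Mandarin spellings picked for illustration, not taken from the patent's figures) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    number: int      # sequential character number
    spelling: str    # phonetic spelling of the ideographic word
    related: tuple   # phonetic spellings of related words
    gloss: str = ""  # optional English equivalent


# Hypothetical sample rows; the real dictionary contents are not given here.
DICTIONARY = [
    Entry(1, "shi", ("shiqing", "gongzuo"), "matter, affair"),
    Entry(2, "shi", ("shuzi", "jiu"), "ten"),
    Entry(3, "ma", ("niu", "qi"), "horse"),
]


def select(dictionary, spelling, related_word):
    """Compare the typed pair against the stored data and return the matches."""
    return [
        entry for entry in dictionary
        if entry.spelling == spelling and related_word in entry.related
    ]


matches = select(DICTIONARY, "shi", "shuzi")
print([entry.gloss for entry in matches])
```

In this sketch the pronunciation alone would leave two homonyms for "shi"; the related word narrows the result to a single entry.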
- The selection of one word from among homonyms can be used to preselect the same word by subsequent entry of the pronunciation alone.
- This optional method is especially suited to the data entry of specific text, in which repeated words are very likely and for which this method of entry is especially advantageous.
- Use of a word or character having the same pronunciation as a previous different word or character would then require only that the related word be entered after the pronunciation.
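One way to picture this optional preselection is a small cache keyed by pronunciation. The sketch below is an assumption about how such a memory might behave; the `Preselector` class, its `enter` method, and the sample table are hypothetical names and data, not part of the patent text.

```python
# Hypothetical lookup table: (spelling, related word) -> character number.
TABLE = {
    ("shi", "shiqing"): 1,
    ("shi", "shuzi"): 2,
    ("ma", "niu"): 3,
}


class Preselector:
    """Remembers the last character chosen for each pronunciation."""

    def __init__(self, table):
        self._table = table
        self._last = {}  # spelling -> most recently selected character number

    def enter(self, spelling, related=None):
        if related is None:
            # Pronunciation alone: reuse the earlier choice, if there is one.
            return self._last.get(spelling)
        number = self._table.get((spelling, related))
        if number is not None:
            self._last[spelling] = number
        return number


p = Preselector(TABLE)
p.enter("shi", "shuzi")          # full entry selects character 2
print(p.enter("shi"))            # pronunciation alone repeats that choice
print(p.enter("shi", "shiqing")) # a different homonym needs its related word
```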
- A method of selecting a desired word by use of a keyboard from an electronically stored list of a selected number of ideographic words of a language. This is done by the following steps: 1) storing in association with each ideographic word the phonetic spelling of the ideographic word, along with the phonetic spelling of several words related to the ideographic word; 2) inputting, by the use of the keyboard, the phonetic spelling of a desired ideographic word and also the phonetic spelling of at least one word related to the desired ideographic word; and 3) comparing the stored and input information to uniquely select the desired ideographic word.
- A word selection system which includes a table of electronically retrievable data.
- This table includes data representing the ideographic words of a large number of desired words from a language, data representing the phonetic spelling of each ideographic word, and data representing the phonetic spelling of several words related to each ideographic word.
- FIG. 1 is a block diagram embodying the system of the present invention.
- Figure 2 illustrates a table of electronically retrievable data which is used in the memory of the present invention.
- Figure 3 is a drawing in tabular form, illustrating an example of how characters would be selected in a
- Figure 4 is a drawing in tabular form, illustrating how ideographic characters would be selected in Chinese.
- The present invention is built around a linguistic and mnemonic feature of the relations between a spoken and a written language; there is often more than one way to pronounce a character, and it is easy for anyone who knows a language to think of synonyms or other related words when
- An operator first enters on a keyboard the phonetic representation (Roman or kana) of a character, and then the phonetic representation of at least one other word: either a second pronunciation of the character, or a related word of some kind.
- The phrase "related word," which is used for the second entry, is defined as a word which an ordinary person skilled in the use of a language would think of when encountering the ideographic character. Although most people would think of a variety of different responses, the total number of different responses to a request to think of a related word would be at least finite, and probably small.
- The present invention makes use of the mnemonic device of asking the operator to enter a second word related to the first, or an alternative pronunciation, which will vary according to the operator and may vary on different occasions with the same operator.
- The system then responds by accessing the desired character, distinguishing from among homonyms by having already provided a data base containing all of the common related words from which an operator might be expected to choose, or which might occur to an operator to enter.
- character when the second word is entered.
- The entry of a phonetic representation of a character (the first word) does not uniquely select from among homonyms. But the array which is constructed in electronic memory (which will be described below) makes a selection from among homonyms quickly and easily.
- The desired word will be accessed uniquely by either: 1) entering a second related word, or 2) picking visually from a cathode ray tube display which of the homonyms displayed is the desired character to be entered.
- This case of more than one of the homonyms having the same related word is anticipated to be relatively rare. It would not materially lower the average speed at which data are entered.
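The two resolution paths (unique selection by related word, or visual choice from the displayed homonyms) might be sketched as follows. The function name `resolve`, the status strings, and the sample rows are assumptions made for illustration; in particular, falling back to all homonyms when the related word matches none of them is a design guess rather than something the text specifies.

```python
# Hypothetical rows; numbers 2 and 4 deliberately share a related word.
TABLE = [
    {"number": 1, "spelling": "shi", "related": {"shiqing", "gongzuo"}},
    {"number": 2, "spelling": "shi", "related": {"shuzi", "jiu"}},
    {"number": 4, "spelling": "shi", "related": {"shuzi", "ge"}},
]


def resolve(table, spelling, related):
    """Return ("selected", [n]) when unique, otherwise candidates to display."""
    homonyms = [row for row in table if row["spelling"] == spelling]
    if not homonyms:
        # The pronunciation is not in the dictionary at all.
        return ("unknown", [])
    matches = [row["number"] for row in homonyms if related in row["related"]]
    if len(matches) == 1:
        return ("selected", matches)
    # Ambiguous or unmatched: hand the candidates to the display for a pick.
    return ("display", matches or [row["number"] for row in homonyms])


print(resolve(TABLE, "shi", "shiqing"))
print(resolve(TABLE, "shi", "shuzi"))
```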
- A "compound word" is composed of (a) the phonetic spelling in Roman or other phonetic symbols, such as kana, of the desired character, and (b) the similar phonetic spelling of a synonym or related word.
- FIG. 1 illustrates the block diagram of the system and embodies the present invention as it might be applied to a Chinese or Japanese language character processor.
- A keyboard 10 contains a limited alphabet or a limited number of keys of either Roman or katakana. That is, in the case of Japanese, it would be katakana, and in the case of Chinese, Pin-yin Roman.
- Associated with the keyboard are a cathode ray tube (CRT) display screen 11 and a hard copy printer 12.
- CRT: cathode ray tube
- A computer and storage device 13 interrelates and controls all of the units of the system, which has as a last unit a key element: a dictionary which is related to the language being processed. Details of the dictionary are shown in Figure 2 and further explained in Figures 3 and 4.
- An operator who is skilled in the particular language which is being used types in a phonetic spelling in Roman or katakana of a desired ideographic word. The operator next inputs the phonetic spelling of at least one word related to the desired ideographic word. Thereupon, this compound word input is compared to dictionary words, and printer 12 will type that particular word. Alternatively, it is stored for future use.
- The CRT display screen 11 may be used where there is not a unique solution, and where perhaps an additional related word must be entered, or for further instructions to the operator.
- Each ideographic character stored in the dictionary is given a sequential character number. These are associated with a character dot matrix graphics data set 17 containing sufficient binary control words to print or display the ideographic character.
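As a rough illustration of a character number pointing at dot matrix data, the sketch below stores each glyph as a list of row bitmasks and renders it as text. The 5x5 size, the `GLYPHS` mapping, and the `render` function are assumptions made for the example; the patent does not state the matrix dimensions or the control word format here.

```python
# Hypothetical glyph store: character number -> one integer bitmask per row.
# Character 1 is a simple cross shape, used only as a stand-in glyph.
GLYPHS = {
    1: [0b00100, 0b00100, 0b11111, 0b00100, 0b00100],
}


def render(number, width=5):
    """Turn the stored row bitmasks into lines of '#' (dot) and '.' (blank)."""
    lines = []
    for row in GLYPHS[number]:
        # Read bits from the most significant (leftmost) column to the right.
        bits = [(row >> (width - 1 - col)) & 1 for col in range(width)]
        lines.append("".join("#" if bit else "." for bit in bits))
    return "\n".join(lines)


print(render(1))
```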
- The Japanese language features a double system of usual pronunciation: the character may be pronounced according to a Chinese fashion (the On-Yomi) or the Japanese manner (Kun-Yomi).
- The character which is pronounced in the Chinese manner will have homonyms very different from those it would have in the Japanese style of pronunciation.
- The present invention thus allows either pronunciation to be used for the first word entered, and the remaining pronunciation may be used as though it were a related word. Or, the operator may choose one of the pronunciations and then use a related word.
- In columns 20A through 20F there are entered phonetic spellings of up to six related words.
- The system is structured so that the keyboard entry of the first part of the compound word/character accesses identical character phonetic spellings of either the Chinese or Japanese style spellings of the pronunciations of the desired Kanji character (in the case of a Japanese processor). If, for example, the first half of the compound word/character is an On-Yomi, a unique selection can be accomplished by entering the alternate spelling in Kun-Yomi or one of the associated related words. As discussed above, in either case, a unique solution is provided, and a word will be selected.
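The Japanese case, where either reading may come first and the other reading then acts like a related word, could look roughly like this. The row layout, the `select_kanji` function, and the romanized sample readings are hypothetical and are not copied from Figure 3.

```python
# Hypothetical rows: On-Yomi, Kun-Yomi, and up to six related words each.
TABLE = [
    {"number": 1, "on": "san", "kun": "yama", "related": ("oka", "kawa")},
    {"number": 2, "on": "san", "kun": "mitsu", "related": ("kazu", "ni")},
]


def select_kanji(table, first, second):
    """Match the first word on either reading, then the second on the rest."""
    hits = []
    for row in table:
        readings = {row["on"], row["kun"]}
        if first not in readings:
            continue
        # The remaining reading counts as if it were a related word.
        if second in readings - {first} or second in row["related"]:
            hits.append(row["number"])
    return hits


print(select_kanji(TABLE, "san", "yama"))  # On-Yomi first, Kun-Yomi second
print(select_kanji(TABLE, "yama", "san"))  # the same character, other order
print(select_kanji(TABLE, "san", "kazu"))  # On-Yomi plus a related word
```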
- The dictionary is constructed so that additional entries of characters, their numbers, graphics, On-Yomi and Kun-Yomi, and related words and synonyms can be made, limited only by the extent of available memory.
- The Chinese systems then would be expressed as Mandarin-Wade-Giles, Cantonese-Wade-Giles, Mandarin-Pin-yin and Cantonese-Pin-yin.
- The keyboard can be either Kana or Romaji.
- In the Japanese version, the operator must be able to translate a character into its pronunciation. Most Japanese would read a character in either its On-Yomi or Kun-Yomi pronunciation; a word would be entered with its customary pronunciation. A related word would come quickly to mind in either case and could be entered through the keyboard as its Kana or Romaji pronunciation.
- The advantage of the present system is that a context is provided for a character independent of the text in which the character is used.
- The context (the second half of the compound word/character, that is, the related word) enables machine selection of a unique character in almost every instance. And, when it does not, either the character is not in the repertoire of the dictionary and must be added, or all possible characters from the dictionary are displayed on the CRT for further selection by the operator. This last will be very rare.
- The table of Figure 3 illustrates a dictionary in the form of Figure 2 for a Japanese character processor. Twenty characters are listed. The English translation is listed for each. As discussed above, in Japanese a character may be pronounced according to the Chinese fashion (the On-Yomi) or the Japanese manner (the Kun-Yomi). Thus, a character which is pronounced in the Chinese manner will have homonyms very different from those it would have in the Japanese style of pronunciation.
- The present invention allows either pronunciation to be used for the first word entered, and the remaining pronunciation may be used as though it were a related word. Or the operator may choose one of the related pronunciations and then use a related word.
- An example of the many different paths used to access the same character is given for characters numbered 5 and 6, namely "KA" and "KE".
- This illustrates a Chinese character processor, where the initial ideographic shape of the character is illustrated in 1-20.
- A new Latin Romanized phonetic spelling (Pin-Yin) in the Mandarin dialect is illustrated, along with three related words, and an English translation.
- The characters themselves are stored as numbers as discussed above, specifically 1-20 in this simplified format, which can be routed to a printer and to the associated dot matrix graphics, for display on the CRT.
- The specific selection of a Chinese character by the present invention is as follows:
- 2. The operator enters a space from the space bar.
- The operator has performed a total of eight keystrokes, rapidly entering the pronunciations of both the desired character/word and a related word.
- The related word from the table is a compound word containing "bi."
- Most of the related words in the table will contain elements similar to the word/character to be selected, and greater speed in typing can be secured by using a memory key to contain the pronunciation of the word/character being accessed and then releasing it in a single keystroke.
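The memory key idea can be pictured as a simple text expansion. In the sketch below the `*` character stands in for the memory key, and the related word "biji" is a made-up example; neither the key symbol nor the word comes from the patent's table, and the keystroke counts apply only to this example.

```python
MEMORY_KEY = "*"  # hypothetical stand-in for the dedicated memory key


def expand(keystrokes):
    """Split 'pronunciation<space>related' and expand any memory key press."""
    spelling, _, rest = keystrokes.partition(" ")
    # Each memory key press releases the stored pronunciation in one stroke.
    related = rest.replace(MEMORY_KEY, spelling)
    return spelling, related


typed_in_full = "bi biji"
typed_with_key = "bi *ji"
print(expand(typed_in_full), len(typed_in_full))
print(expand(typed_with_key), len(typed_with_key))
```

Both inputs expand to the same pair, and the saving grows with the length of the stored pronunciation.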
- Another example is:
- The operator could also, by specifying the character through pronunciation and related words, use the graphics capability of an associated computer terminal entry system to draw the character and add it to the repertoire of the system.
- The present invention supplies an improved ideographic word selection system.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
- Input From Keyboards Or The Like (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17468480A | 1980-08-01 | 1980-08-01 | |
US174684800801 | 1980-08-01 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1982000442A1 (en) | 1982-02-18 |
Family
ID=22637113
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1981/001017 WO1982000442A1 (en) | 1980-08-01 | 1981-07-30 | Ideographic word selection system |
Country Status (2)
Country | Link |
---|---|
JP (1) | JPS57501254A |
WO (1) | WO1982000442A1 |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4544276A (en) * | 1983-03-21 | 1985-10-01 | Cornell Research Foundation, Inc. | Method and apparatus for typing Japanese text using multiple systems |
EP0265573A1 (fr) * | 1986-10-30 | 1988-05-04 | International Business Machines Corporation | Procédé de transcription automatique sténotypie-français |
EP0271619A1 (en) * | 1986-12-15 | 1988-06-22 | Yeh, Victor Chang-ming | Phonetic encoding method for Chinese ideograms, and apparatus therefor |
US4787038A (en) * | 1985-03-25 | 1988-11-22 | Kabushiki Kaisha Toshiba | Machine translation system |
US4859091A (en) * | 1986-06-20 | 1989-08-22 | Canon Kabushiki Kaisha | Word processor including spelling verifier and corrector |
US5175803A (en) * | 1985-06-14 | 1992-12-29 | Yeh Victor C | Method and apparatus for data processing and word processing in Chinese using a phonetic Chinese language |
WO1998020429A1 (en) * | 1996-11-05 | 1998-05-14 | Kanji Software, Inc. | Method for converting non-phonetic characters into surrogate words for inputting into a computer |
WO1998033111A1 (en) * | 1997-01-24 | 1998-07-30 | Tegic Communications, Inc. | Reduced keyboard disambiguating system |
FR2775858A1 (fr) * | 1998-03-03 | 1999-09-10 | Koninkl Philips Electronics Nv | Caracteres chinois dans un appareil electronique |
US6011554A (en) * | 1995-07-26 | 2000-01-04 | Tegic Communications, Inc. | Reduced keyboard disambiguating system |
US6231252B1 (en) * | 1998-10-05 | 2001-05-15 | Nec Corporation | Character input system and method using keyboard |
US6636162B1 (en) | 1998-12-04 | 2003-10-21 | America Online, Incorporated | Reduced keyboard text input system for the Japanese language |
US6646573B1 (en) | 1998-12-04 | 2003-11-11 | America Online, Inc. | Reduced keyboard text input system for the Japanese language |
US6999915B2 (en) * | 2001-06-22 | 2006-02-14 | Pierre Mestre | Process and device for translation expressed in two different phonetic forms |
WO2006043988A1 (en) * | 2004-10-20 | 2006-04-27 | Oracle International Corporation | Computer-implemented methods and systems for entering and searching for non-roman-alphabet characters and related search systems |
FR2955952A1 (fr) * | 2010-01-29 | 2011-08-05 | Delta Process | Procede de transcription instantanee de la parole |
US8938688B2 (en) | 1998-12-04 | 2015-01-20 | Nuance Communications, Inc. | Contextual prediction of user words and user actions |
US8972905B2 (en) | 1999-12-03 | 2015-03-03 | Nuance Communications, Inc. | Explicit character filtering of ambiguous text entry |
US9786273B2 (en) | 2004-06-02 | 2017-10-10 | Nuance Communications, Inc. | Multimodal disambiguation of speech recognition |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4096934A (en) * | 1975-10-15 | 1978-06-27 | Philip George Kirmser | Method and apparatus for reproducing desired ideographs |
US4136395A (en) * | 1976-12-28 | 1979-01-23 | International Business Machines Corporation | System for automatically proofreading a document |
WO1980000105A1 (en) * | 1978-06-14 | 1980-01-24 | Logan Corp | System for selecting graphic characters phonetically |
US4270022A (en) * | 1978-06-22 | 1981-05-26 | Loh Shiu C | Ideographic character selection |
-
1981
- 1981-07-30 JP JP56502778A patent/JPS57501254A/ja active Pending
- 1981-07-30 WO PCT/US1981/001017 patent/WO1982000442A1/en unknown
Non-Patent Citations (1)
Title |
---|
IBM Technical Disclosure Bulletin, vol. 17, no. 8, published January 1975, A. Rellano et al., "Word Generation System For Typist", pp. 2422-2423 * |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4544276A (en) * | 1983-03-21 | 1985-10-01 | Cornell Research Foundation, Inc. | Method and apparatus for typing Japanese text using multiple systems |
US4787038A (en) * | 1985-03-25 | 1988-11-22 | Kabushiki Kaisha Toshiba | Machine translation system |
US5175803A (en) * | 1985-06-14 | 1992-12-29 | Yeh Victor C | Method and apparatus for data processing and word processing in Chinese using a phonetic Chinese language |
US4859091A (en) * | 1986-06-20 | 1989-08-22 | Canon Kabushiki Kaisha | Word processor including spelling verifier and corrector |
EP0265573A1 (fr) * | 1986-10-30 | 1988-05-04 | International Business Machines Corporation | Procédé de transcription automatique sténotypie-français |
EP0271619A1 (en) * | 1986-12-15 | 1988-06-22 | Yeh, Victor Chang-ming | Phonetic encoding method for Chinese ideograms, and apparatus therefor |
US6011554A (en) * | 1995-07-26 | 2000-01-04 | Tegic Communications, Inc. | Reduced keyboard disambiguating system |
US6307549B1 (en) | 1995-07-26 | 2001-10-23 | Tegic Communications, Inc. | Reduced keyboard disambiguating system |
WO1998020429A1 (en) * | 1996-11-05 | 1998-05-14 | Kanji Software, Inc. | Method for converting non-phonetic characters into surrogate words for inputting into a computer |
CN1296806C (zh) * | 1997-01-24 | 2007-01-24 | 蒂吉通信系统公司 | 去多义性的简化键盘系统 |
WO1998033111A1 (en) * | 1997-01-24 | 1998-07-30 | Tegic Communications, Inc. | Reduced keyboard disambiguating system |
US5953541A (en) * | 1997-01-24 | 1999-09-14 | Tegic Communications, Inc. | Disambiguating system for disambiguating ambiguous input sequences by displaying objects associated with the generated input sequences in the order of decreasing frequency of use |
FR2775858A1 (fr) * | 1998-03-03 | 1999-09-10 | Koninkl Philips Electronics Nv | Caracteres chinois dans un appareil electronique |
US6231252B1 (en) * | 1998-10-05 | 2001-05-15 | Nec Corporation | Character input system and method using keyboard |
US6636162B1 (en) | 1998-12-04 | 2003-10-21 | America Online, Incorporated | Reduced keyboard text input system for the Japanese language |
US6646573B1 (en) | 1998-12-04 | 2003-11-11 | America Online, Inc. | Reduced keyboard text input system for the Japanese language |
US8938688B2 (en) | 1998-12-04 | 2015-01-20 | Nuance Communications, Inc. | Contextual prediction of user words and user actions |
US9626355B2 (en) | 1998-12-04 | 2017-04-18 | Nuance Communications, Inc. | Contextual prediction of user words and user actions |
US8972905B2 (en) | 1999-12-03 | 2015-03-03 | Nuance Communications, Inc. | Explicit character filtering of ambiguous text entry |
US8990738B2 (en) | 1999-12-03 | 2015-03-24 | Nuance Communications, Inc. | Explicit character filtering of ambiguous text entry |
US6999915B2 (en) * | 2001-06-22 | 2006-02-14 | Pierre Mestre | Process and device for translation expressed in two different phonetic forms |
US9786273B2 (en) | 2004-06-02 | 2017-10-10 | Nuance Communications, Inc. | Multimodal disambiguation of speech recognition |
WO2006043988A1 (en) * | 2004-10-20 | 2006-04-27 | Oracle International Corporation | Computer-implemented methods and systems for entering and searching for non-roman-alphabet characters and related search systems |
US7376648B2 (en) | 2004-10-20 | 2008-05-20 | Oracle International Corporation | Computer-implemented methods and systems for entering and searching for non-Roman-alphabet characters and related search systems |
FR2955952A1 (fr) * | 2010-01-29 | 2011-08-05 | Delta Process | Procede de transcription instantanee de la parole |
Also Published As
Publication number | Publication date |
---|---|
JPS57501254A | 1982-07-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7257528B1 (en) | Method and apparatus for Chinese character text input | |
WO1982000442A1 (en) | Ideographic word selection system | |
US5187480A (en) | Symbol definition apparatus | |
US4544276A (en) | Method and apparatus for typing Japanese text using multiple systems | |
US4559598A (en) | Method of creating text using a computer | |
US7414616B2 (en) | User-friendly Brahmi-derived Hindi keyboard | |
US4498143A (en) | Method of and apparatus for forming ideograms | |
US5119296A (en) | Method and apparatus for inputting radical-encoded chinese characters | |
US5903861A (en) | Method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer | |
US5360343A (en) | Chinese character coding method using five stroke codes and double phonetic alphabets | |
US4566065A (en) | Computer aided stenographic system | |
USRE32773E (en) | Method of creating text using a computer | |
US4868913A (en) | System of encoding chinese characters according to their patterns and accompanying keyboard for electronic computer | |
US4173753A (en) | Input system for sino-computer | |
GB2221780A (en) | System for encoding a collection of ideographic characters | |
US5410306A (en) | Chinese phrasal stepcode | |
US5378068A (en) | Word processor for generating Chinese characters | |
Huang | The input and output of Chinese and Japanese characters | |
WO2000043861A1 (en) | Method and apparatus for chinese character text input | |
WO1990002992A1 (en) | Symbol definition apparatus | |
KR940007932B1 (ko) | 표의문자 식별장치 및 처리방법 | |
KR20010067827A (ko) | 다국어 한자 데이터 베이스 구조 | |
JP2592020B2 (ja) | 記録または表示用仮名キーを備えた入力装置 | |
JPH0560139B2 | ||
AU665293B2 (en) | Apparatus for encoding and defining symbols and assembling text in ideographic languages |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Designated state(s): JP |