WO1982000442A1 - Ideographic word selection system - Google Patents

Info

Publication number
WO1982000442A1
Authority
WO
WIPO (PCT)
Prior art keywords
word
ideographic
words
character
phonetic
Prior art date
Application number
PCT/US1981/001017
Other languages
English (en)
French (fr)
Inventor
R Johnson
Original Assignee
R Johnson
Application filed by R Johnson
Publication of WO1982000442A1

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B41: PRINTING; LINING MACHINES; TYPEWRITERS; STAMPS
    • B41J: TYPEWRITERS; SELECTIVE PRINTING MECHANISMS, i.e. MECHANISMS PRINTING OTHERWISE THAN FROM A FORME; CORRECTION OF TYPOGRAPHICAL ERRORS
    • B41J3/00: Typewriters or selective printing or marking mechanisms characterised by the purpose for which they are constructed
    • B41J3/01: Typewriters or selective printing or marking mechanisms characterised by the purpose for which they are constructed for special character, e.g. for Chinese characters or barcodes
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/018: Input/output arrangements for oriental characters
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/237: Lexical tools
    • G06F40/247: Thesauruses; Synonyms

Definitions

  • The present invention is directed to an ideographic word selection system, and specifically to a character processor which can rapidly enter Chinese, Japanese or Korean characters into a computer system, for example for printing purposes, the foregoing being done from a keyboard having a limited number of keys.
  • Word or character processing for Oriental languages such as Japanese, Chinese, Korean, etc., has been difficult because of the structure of the written language; that is, there is no limited alphabet, but rather thousands of different ideographic words and characters.
  • Other languages, such as Arabic or Farsi, have a written alphabet but also have numerous different ways of writing each letter; the resulting written language is difficult to process using a keyboard for entry because of the number of different characters which could be used.
  • If the pronunciation of a character or a word is used to access that character, a large set of homonyms will be produced because of the similar pronunciations of other characters or words.
  • U.S. patent 4,096,934, granted June 27, 1978 to Kirmser et al., discloses a method and apparatus for reproducing desired ideographs, where a phonetic spelling of the desired ideograph, along with a characteristic identification of this desired ideograph, is used by a computer to identify the desired ideograph.
  • Such characteristic identification is based on an operator making a judgment about the geometric shape of the character. This is a slow procedure and one which is liable to errors when operators try to increase their speed.
  • A suggested alternative of Kirmser, which is mentioned only briefly as opposed to the geometry method described at length, is to use the suggested meaning of the character described.
  • an ideographic word selection system for selecting a desired word of a language having a relatively unlimited number of ideographic words by use of a keyboard having a limited number of keys.
  • This system comprises a method for electronically storing information representing the ideographic words of a selected number of words of a language, where each ideographic word has associated with it its phonetic spelling, as well as the phonetic spelling of several words related in one or more ways to the ideographic word and the computerized graphic representation of the word.
  • Other information, such as English equivalent words, may also be stored with the word in electronic storage.
  • The method includes the use of the above-mentioned keyboard for inputting information, such as the phonetic spelling of a desired ideographic word, and also inputting the phonetic spelling of at least one of the words related to the ideographic word.
  • The stored and inputted information are then compared to select the desired ideographic word.
  • The selection of one word from among homonyms can be used to preselect the same word by subsequent entry of the pronunciation alone.
  • This optional method is especially suited to the data entry of specific text, in which repeated words are very likely and for which this method of entry is especially advantageous.
  • Use of a word or character having the same pronunciation as a previous different word or character would then require only that the related word be entered after the pronunciation.
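The optional preselection described above can be pictured as a small cache keyed by pronunciation. The following is a minimal sketch under stated assumptions, not the patent's implementation: the `Selector` class, the `HOMONYMS` table and its entries are all hypothetical and are not taken from the patent's figures.

```python
# Hypothetical homonym table: pronunciation -> {related word -> character}.
# The entries are illustrative only and do not come from the patent's figures.
HOMONYMS = {
    "ka": {"kaji": "火", "hanami": "花", "kagaku": "科"},
}


class Selector:
    """Remembers the last character chosen for each pronunciation."""

    def __init__(self, homonyms):
        self.homonyms = homonyms
        self.last_choice = {}  # pronunciation -> most recently selected character

    def enter(self, pronunciation, related=None):
        candidates = self.homonyms.get(pronunciation, {})
        if related is not None:
            # Pronunciation plus related word: select and remember the choice.
            choice = candidates.get(related)
            if choice is not None:
                self.last_choice[pronunciation] = choice
            return choice
        # Pronunciation alone: reuse the earlier selection, if there was one.
        return self.last_choice.get(pronunciation)
```

In this sketch, typing a pronunciation alone returns nothing until a related word has been used once; after that, the pronunciation alone repeats the earlier choice, and entering a different related word both selects the new character and replaces the remembered one.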
  • a method of selecting a desired word by use of a keyboard from an electronically stored list of a selected number of ideographic words of a language. This is done by the following steps: 1) storing in association with each ideographic word the phonetic spelling of the ideographic word, along with the phonetic spelling of several words related to the ideographic word; 2) inputting, by the use of the keyboard, the phonetic spelling of a desired ideographic word and also the phonetic spelling of at least one word related to the desired ideographic word; and 3) comparing the stored and inputted information to uniquely select the desired ideographic word.
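The three steps (store, input, compare) can be sketched roughly as follows. This is an illustration under assumptions rather than the method as claimed: the `Entry` class, the `select` function and the three sample entries are hypothetical, and a plain string stands in for the stored graphic representation.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One stored ideographic word (step 1: storing)."""
    number: int      # sequential character number
    glyph: str       # stands in for the stored graphic representation
    readings: set    # phonetic spellings of the word itself
    related: set = field(default_factory=set)  # phonetic spellings of related words


# Hypothetical entries, not taken from the patent's figures.
DICTIONARY = [
    Entry(1, "火", {"ka", "hi"}, {"kaji", "moeru"}),
    Entry(2, "花", {"ka", "hana"}, {"sakura", "hanami"}),
    Entry(3, "科", {"ka"}, {"kagaku", "kamoku"}),
]


def select(first, second, dictionary=DICTIONARY):
    """Steps 2 and 3: take the two typed spellings and compare them
    against the stored spellings, returning every matching entry."""
    matches = []
    for entry in dictionary:
        if first not in entry.readings:
            continue
        # The second word may be a related word or another reading of the word.
        if second in entry.related or second in (entry.readings - {first}):
            matches.append(entry)
    return matches
```

With these sample entries, `select("ka", "hanami")` narrows three homonyms of "ka" down to a single entry, while a second word that matches nothing returns an empty list.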
  • a word selection system which includes a table of electronically retrievable data.
  • This table includes data representing the ideographic words of a large number of desired words from a language, data representing the phonetic spelling of each ideographic word, and data representing the phonetic spelling of several words related to each ideographic word.
  • FIG. 1 is a block diagram embodying the system of the present invention.
  • Figure 2 illustrates a table of electronically retrievable data which is used in the memory of the present invention.
  • Figure 3 is a drawing in tabular form, illustrating an example of how characters would be selected in a
  • Figure 4 is a drawing in tabular form, illustrating how ideographic characters would be selected in Chinese.
  • The present invention is built around a linguistic and mnemonic feature of the relations between a spoken and a written language; there is often more than one way to pronounce a character, and it is easy for anyone who knows a language to think of synonyms or other related words when
  • An operator first enters on a keyboard the phonetic representation (Roman or kana) of a character, and then the phonetic representation of at least one other word - either a second pronunciation of the character, or a related word of some kind.
  • The phrase "related word," which is used for the second entry, is defined as a word which an ordinary person skilled in the use of a language would think of when encountering the ideographic character. Although most people would think of a variety of different responses, the total number of different responses to a request to think of a related word would be at least finite, and probably small.
  • The present invention makes use of the mnemonic device of asking the operator to enter a second word related to the first, or an alternative pronunciation, which will vary according to the operator and may vary on different occasions with the same operator.
  • The system then responds by accessing the desired character, distinguishing from among homonyms by having already provided a data base containing all of the common related words from which an operator might be expected to choose, or which might occur to an operator to enter.
  • character when the second word is entered.
  • The entry of a phonetic representation of a character (the first word) does not uniquely select from among homonyms. But the array which is constructed in electronic memory (which will be described below) makes a selection from among homonyms quickly and easily.
  • The desired word will be accessed uniquely by either: 1) entering a second related word, or 2) picking visually from a cathode ray tube display which of the homonyms displayed is the desired character to be entered.
  • This case of more than one of the homonyms having the same related word is anticipated to be relatively rare. It would not materially lower the average speed at which data are entered.
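The rare case in which a pronunciation and a related word still leave several homonyms can be sketched as a fallback to an on-screen choice. This is a minimal sketch covering only option 2 above: the `INDEX` table and the `resolve` function are hypothetical, and a callback stands in for the operator picking from the CRT display.

```python
# Hypothetical index: (pronunciation, related word) -> candidate characters.
# "shijian" is a toneless spelling shared by two different compounds, so the
# pair ("shi", "shijian") is deliberately ambiguous in this sketch.
INDEX = {
    ("bi", "gangbi"): ["笔"],
    ("bi", "bijiao"): ["比"],
    ("shi", "shijian"): ["时", "事"],
}


def resolve(first, second, choose_on_screen):
    """Return the selected character, asking the operator only when needed."""
    candidates = INDEX.get((first, second), [])
    if not candidates:
        return None                      # not in the dictionary
    if len(candidates) == 1:
        return candidates[0]             # unique selection, no display needed
    return choose_on_screen(candidates)  # operator picks from the display
```

The callback is only invoked when more than one candidate remains, which mirrors the expectation that visual selection is rarely needed.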
  • A "compound word" is composed of (a) the phonetic spelling in Roman or other phonetic symbols, such as kana, of the desired character, and (b) the similar phonetic spelling of a synonym or related word.
  • FIG. 1 illustrates the block diagram of the system and embodies the present invention as it might be applied to a Chinese or Japanese language character processor.
  • A keyboard 10 contains a limited alphabet or a limited number of keys of either Roman or katakana. That is, in the case of Japanese, it would be katakana, and in the case of Chinese, Pin-yin Roman.
  • Associated with the keyboard are a cathode ray tube (CRT) display screen 11 and a hard copy printer 12.
  • A computer and storage device 13 interrelates and controls all of the units of the system, which has as a last unit a key element: a dictionary which is related to the language being processed. Details of the dictionary are shown in Figure 2 and further explained in Figures 3 and 4.
  • An operator who is skilled in the particular language which is being used types in a phonetic spelling in Roman or katakana of a desired ideographic word. The operator next inputs the phonetic spelling of at least one word related to the desired ideographic word. Thereupon, this compound word input is compared to dictionary words, and printer 12 will type that particular word. Alternatively, it is stored for future use.
  • The CRT display screen 11 may be used where there is not a unique solution, and where perhaps an additional related word must be entered, or for further instructions to the operator.
  • Each ideographic character stored in the dictionary is given a sequential character number. These are associated with a character dot matrix graphics data set 17 containing sufficient binary control words to print or display the ideographic character.
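The link between a sequential character number and its dot matrix data can be sketched as a lookup from number to rows of bits. The layout below is an assumption for illustration only: the source does not specify the matrix size or the format of the binary control words, and the 5 x 5 cross pattern is a toy shape, not a stored character from the patent.

```python
WIDTH = 5  # assumed matrix width for this toy example

# Hypothetical graphics data set: character number -> one integer per row,
# where each set bit is a printed dot.
GRAPHICS = {
    1: [0b00100, 0b00100, 0b11111, 0b00100, 0b00100],
}


def render(number):
    """Turn the stored rows for a character number into printable lines."""
    lines = []
    for row in GRAPHICS[number]:
        # Read the bits from the most significant (left) to the least significant.
        line = "".join(
            "#" if (row >> (WIDTH - 1 - col)) & 1 else "."
            for col in range(WIDTH)
        )
        lines.append(line)
    return lines
```

Because selection yields only a character number, the same number can drive either the printer or the display, each reading the same stored rows.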
  • The Japanese language features a double system of usual pronunciation: the character may be pronounced according to a Chinese fashion (the On-Yomi) or the Japanese manner (the Kun-Yomi).
  • The character which is pronounced in the Chinese manner will have homonyms very different from those it would have in the Japanese style of pronunciation.
  • The present invention thus allows either pronunciation to be used for the first word entered, and the remaining pronunciation may be used as though it were a related word. Or, the operator may choose one of the pronunciations and then use a related word.
  • In columns 20A through 20F there are entered phonetic spellings of up to six related words.
  • The system is structured so that the keyboard entry of the first part of the compound word/character accesses identical character phonetic spellings of either the Chinese or Japanese style spellings of the pronunciations of the desired Kanji (in the case of a Japanese processor) character. If, for example, the first half of the compound word/character is an On-Yomi, a unique selection can be accomplished by entering the alternate spelling in Kun-Yomi or one of the associated related words. As discussed above, in either case, a unique solution is provided, and a word will be selected.
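One way to picture the table layout (character number, On-Yomi, Kun-Yomi, and up to six related words) is as two lookup indexes whose results are intersected. This is a sketch under assumptions, not the patent's storage format: `ROWS`, `build_index` and `select` are hypothetical names, and the two sample rows are not taken from Figure 3.

```python
from collections import defaultdict

MAX_RELATED = 6  # one slot per column 20A through 20F

# Hypothetical rows: (character number, On-Yomi, Kun-Yomi, related words).
ROWS = [
    (1, "ka", "hi", ["kaji", "moeru"]),
    (2, "ka", "hana", ["sakura", "hanami"]),
]


def build_index(rows):
    """Index character numbers by first-entry and second-entry spellings."""
    first_key = defaultdict(set)   # either reading -> character numbers
    second_key = defaultdict(set)  # reading or related word -> character numbers
    for number, on_yomi, kun_yomi, related in rows:
        if len(related) > MAX_RELATED:
            raise ValueError(f"character {number} has more than {MAX_RELATED} related words")
        for reading in (on_yomi, kun_yomi):
            first_key[reading].add(number)
            second_key[reading].add(number)  # the other reading may serve as the second entry
        for word in related:
            second_key[word].add(number)
    return first_key, second_key


def select(first, second, first_key, second_key):
    """Return the character numbers consistent with both typed spellings."""
    return first_key.get(first, set()) & second_key.get(second, set())
```

Entering the On-Yomi then the Kun-Yomi, or the reverse, reaches the same character number; repeating the same ambiguous reading twice adds no information and leaves every homonym in the result.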
  • The dictionary is constructed so that additional entries of characters, their numbers, graphics, On-Yomi and Kun-Yomi, and related words and synonyms can be made, limited only by the extent of available memory.
  • The Chinese systems then would be expressed as Mandarin-Wade-Giles, Cantonese-Wade-Giles, Mandarin-Pin-yin and Cantonese-Pin-yin.
  • The keyboard can be either Kana or Romaji.
  • In the Japanese version, the operator must be able to translate a character into its pronunciation. Most Japanese would read a character in either its On-Yomi or Kun-Yomi pronunciation; a word would be entered with its customary pronunciation. A related word would come quickly to mind in either case and could be entered through the keyboard as its Kana or Romaji pronunciation.
  • The advantage of the present system is that a context is provided for a character independent of the text in which the character is used.
  • The context (the second half of the compound word/character - the related word) enables machine selection of a unique character in almost every instance. And, when it does not, either the character is not in the repertoire of the dictionary and must be added, or all possible characters from the dictionary are displayed on the CRT for further selection by the operator. This last will be very rare.
  • The table of Figure 3 illustrates a dictionary in the form of Figure 2 for a Japanese character processor. Twenty characters are listed. The English translation is listed for each. As discussed above, in Japanese a character may be pronounced according to the Chinese fashion (the On-Yomi) or the Japanese manner (the Kun-Yomi). Thus, a character which is pronounced in the Chinese manner will have homonyms very different from those it would have in the Japanese style of pronunciation.
  • The present invention allows either pronunciation to be used for the first word entered, and the remaining pronunciation may be used as though it were a related word. Or the operator may choose one of the related pronunciations and then use a related word.
  • An example of the many different paths used to access the same is given for characters numbered 5 and 6, namely "KA" and "KE".
  • This illustrates a Chinese character processor, where the initial ideographic shape of the character is illustrated in 1 - 20.
  • A new Latin Romanized phonetic spelling (Pin-Yin) in the Mandarin dialect is illustrated, along with three related words and an English translation.
  • The characters themselves are stored as numbers as discussed above, specifically 1-20 in this simplified format, which can be routed to a printer and to the associated dot matrix graphics for display on the CRT.
  • The specific selection of a Chinese character by the present invention is as follows:
  • 2) The operator enters a space from the space bar.
  • The operator has performed a total of eight keystrokes, rapidly entering the pronunciations of both the desired character/word and a related word.
  • The related word from the table is a compound word containing "bi."
  • Most of the related words in the table will contain elements similar to the word/character to be selected, and greater speed in typing can be secured by using a memory key to contain the pronunciation of the word/character being accessed and then releasing it in a single keystroke.
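The keystroke saving from a memory key can be estimated with a short calculation. The sketch below rests on several assumptions that the surviving text does not confirm: one keystroke per Roman letter, one space between the two entries, no terminating key, and a memory key that replaces one occurrence of the pronunciation inside the related word. The related word "maobi" is a hypothetical example chosen to contain "bi"; the source does not preserve the word actually used.

```python
def keystrokes(pronunciation, related, use_memory_key=False):
    """Count keystrokes for 'pronunciation', a space, then 'related'."""
    total = len(pronunciation) + 1 + len(related)  # letters + space + letters
    if use_memory_key and pronunciation in related:
        # One memory-key press replaces retyping the pronunciation once.
        total -= len(pronunciation) - 1
    return total


plain = keystrokes("bi", "maobi")                              # typed in full
with_memory = keystrokes("bi", "maobi", use_memory_key=True)   # "bi" recalled
```

Under these assumptions the saving per entry equals the length of the pronunciation minus one, so it grows with longer pronunciations and disappears when the related word does not contain the pronunciation at all.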
  • Another example is:
  • The operator could also, by specifying the character through pronunciation and related words, use the graphics capability of an associated computer terminal to draw the character and add it to the repertoire of the system.
  • The present invention supplies an improved ideographic word selection system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)
  • Input From Keyboards Or The Like (AREA)
PCT/US1981/001017 1980-08-01 1981-07-30 Ideographic word selection system WO1982000442A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17468480A 1980-08-01 1980-08-01
US174684800801 1980-08-01

Publications (1)

Publication Number Publication Date
WO1982000442A1 (en) 1982-02-18

Family

ID=22637113

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1981/001017 WO1982000442A1 (en) 1980-08-01 1981-07-30 Ideographic word selection system

Country Status (2)

Country Link
JP (1) JPS57501254A
WO (1) WO1982000442A1

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4096934A (en) * 1975-10-15 1978-06-27 Philip George Kirmser Method and apparatus for reproducing desired ideographs
US4136395A (en) * 1976-12-28 1979-01-23 International Business Machines Corporation System for automatically proofreading a document
WO1980000105A1 (en) * 1978-06-14 1980-01-24 Logan Corp System for selecting graphic characters phonetically
US4270022A (en) * 1978-06-22 1981-05-26 Loh Shiu C Ideographic character selection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
IBM Technical Disclosure Bulletin, vol. 17, No. 8, Published January 1975, A. Rellano et al:, "Word Generation System For Typist", pp 2422-2423 *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4544276A (en) * 1983-03-21 1985-10-01 Cornell Research Foundation, Inc. Method and apparatus for typing Japanese text using multiple systems
US4787038A (en) * 1985-03-25 1988-11-22 Kabushiki Kaisha Toshiba Machine translation system
US5175803A (en) * 1985-06-14 1992-12-29 Yeh Victor C Method and apparatus for data processing and word processing in Chinese using a phonetic Chinese language
US4859091A (en) * 1986-06-20 1989-08-22 Canon Kabushiki Kaisha Word processor including spelling verifier and corrector
EP0265573A1 (fr) * 1986-10-30 1988-05-04 International Business Machines Corporation Procédé de transcription automatique sténotypie-français
EP0271619A1 (en) * 1986-12-15 1988-06-22 Yeh, Victor Chang-ming Phonetic encoding method for Chinese ideograms, and apparatus therefor
US6011554A (en) * 1995-07-26 2000-01-04 Tegic Communications, Inc. Reduced keyboard disambiguating system
US6307549B1 (en) 1995-07-26 2001-10-23 Tegic Communications, Inc. Reduced keyboard disambiguating system
WO1998020429A1 (en) * 1996-11-05 1998-05-14 Kanji Software, Inc. Method for converting non-phonetic characters into surrogate words for inputting into a computer
CN1296806C (zh) * 1997-01-24 2007-01-24 蒂吉通信系统公司 去多义性的简化键盘系统
WO1998033111A1 (en) * 1997-01-24 1998-07-30 Tegic Communications, Inc. Reduced keyboard disambiguating system
US5953541A (en) * 1997-01-24 1999-09-14 Tegic Communications, Inc. Disambiguating system for disambiguating ambiguous input sequences by displaying objects associated with the generated input sequences in the order of decreasing frequency of use
FR2775858A1 (fr) * 1998-03-03 1999-09-10 Koninkl Philips Electronics Nv Caracteres chinois dans un appareil electronique
US6231252B1 (en) * 1998-10-05 2001-05-15 Nec Corporation Character input system and method using keyboard
US6636162B1 (en) 1998-12-04 2003-10-21 America Online, Incorporated Reduced keyboard text input system for the Japanese language
US6646573B1 (en) 1998-12-04 2003-11-11 America Online, Inc. Reduced keyboard text input system for the Japanese language
US8938688B2 (en) 1998-12-04 2015-01-20 Nuance Communications, Inc. Contextual prediction of user words and user actions
US9626355B2 (en) 1998-12-04 2017-04-18 Nuance Communications, Inc. Contextual prediction of user words and user actions
US8972905B2 (en) 1999-12-03 2015-03-03 Nuance Communications, Inc. Explicit character filtering of ambiguous text entry
US8990738B2 (en) 1999-12-03 2015-03-24 Nuance Communications, Inc. Explicit character filtering of ambiguous text entry
US6999915B2 (en) * 2001-06-22 2006-02-14 Pierre Mestre Process and device for translation expressed in two different phonetic forms
US9786273B2 (en) 2004-06-02 2017-10-10 Nuance Communications, Inc. Multimodal disambiguation of speech recognition
WO2006043988A1 (en) * 2004-10-20 2006-04-27 Oracle International Corporation Computer-implemented methods and systems for entering and searching for non-roman-alphabet characters and related search systems
US7376648B2 (en) 2004-10-20 2008-05-20 Oracle International Corporation Computer-implemented methods and systems for entering and searching for non-Roman-alphabet characters and related search systems
FR2955952A1 (fr) * 2010-01-29 2011-08-05 Delta Process Procede de transcription instantanee de la parole

Also Published As

Publication number Publication date
JPS57501254A 1982-07-15

Similar Documents

Publication Publication Date Title
US7257528B1 (en) Method and apparatus for Chinese character text input
WO1982000442A1 (en) Ideographic word selection system
US5187480A (en) Symbol definition apparatus
US4544276A (en) Method and apparatus for typing Japanese text using multiple systems
US4559598A (en) Method of creating text using a computer
US7414616B2 (en) User-friendly Brahmi-derived Hindi keyboard
US4498143A (en) Method of and apparatus for forming ideograms
US5119296A (en) Method and apparatus for inputting radical-encoded chinese characters
US5903861A (en) Method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer
US5360343A (en) Chinese character coding method using five stroke codes and double phonetic alphabets
US4566065A (en) Computer aided stenographic system
USRE32773E (en) Method of creating text using a computer
US4868913A (en) System of encoding chinese characters according to their patterns and accompanying keyboard for electronic computer
US4173753A (en) Input system for sino-computer
GB2221780A (en) System for encoding a collection of ideographic characters
US5410306A (en) Chinese phrasal stepcode
US5378068A (en) Word processor for generating Chinese characters
Huang The input and output of Chinese and Japanese characters
WO2000043861A1 (en) Method and apparatus for chinese character text input
WO1990002992A1 (en) Symbol definition apparatus
KR940007932B1 (ko) 표의문자 식별장치 및 처리방법
KR20010067827A (ko) 다국어 한자 데이터 베이스 구조
JP2592020B2 (ja) 記録または表示用仮名キーを備えた入力装置
JPH0560139B2
AU665293B2 (en) Apparatus for encoding and defining symbols and assembling text in ideographic languages

Legal Events

Date Code Title Description
AK Designated states

Designated state(s): JP