EP1012745A1 - Method for converting non-phonetic characters into surrogate words for inputting into a computer

Method for converting non-phonetic characters into surrogate words for inputting into a computer

Info

Publication number
EP1012745A1
EP1012745A1 EP96945196A EP96945196A EP1012745A1 EP 1012745 A1 EP1012745 A1 EP 1012745A1 EP 96945196 A EP96945196 A EP 96945196A EP 96945196 A EP96945196 A EP 96945196A EP 1012745 A1 EP1012745 A1 EP 1012745A1
Authority
EP
European Patent Office
Prior art keywords
computer
surrogate
phrase
inputting
recited
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP96945196A
Other languages
German (de)
English (en)
French (fr)
Inventor
Kun Chun Chan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kanji Software Inc
Original Assignee
Kanji Software Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US08/744,021 (US5903861A)
Application filed by Kanji Software Inc
Publication of EP1012745A1
Current legal status: Withdrawn


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/018 Input/output arrangements for oriental characters

Definitions

  • The present invention relates to written non-phonetic characters of oriental languages, such as the Chinese, Japanese, Korean and Indian languages, and more particularly to a conversion method that creates new surrogate words to precisely represent such non-phonetic characters. The surrogate words are created with either English-style or native alphabets to represent the non-phonetic characters used in the Chinese, Japanese and Korean languages. The non-phonetic characters can therefore be easily input into a computer through an English-style or native alphabetic keyboard, a mouse or another phonetic inputting method. Moreover, such new surrogate words can be stored in a computer and precisely transmitted by E-mail (electronic mail). Non-phonetic characters of Chinese languages were derived from pictures by the ancient
  • The ideogram is a symbol that can be either a character or part of a character, and which denotes the meaning of that character by inference. Pure ideograms are rare. However, they can be found in many characters that do not have phonetic radicals but instead have two or more pictograms combined to infer a meaning that readers can understand. The pronunciation of this kind of character must be memorized, since it contains no phonetic radical. When the ideogram is used as a radical of a character, it is silent. The following are some examples of the ideogram.
  • (1) 明 is made of the sun, 日, and the moon, 月; therefore it means bright.
  • (2) 艷 consists of abundant, 豐, and color, 色; therefore it means strikingly beautiful.
  • (3) 好 is the combination of a son, 子, and a daughter, 女; hence it means good.
  • (4) 細 is a combination of silk, 糸, and small squares of rice field, 田; therefore it means tiny and fine.
  • (5) 林 is made of two trees, 木; therefore it means woods.
  • The pictogram is a symbol that is either a character or part of a character, and which is the approximate likeness of the object the character describes.
  • Pictograms are more common than ideograms, since the Chinese characters evolved from pictures. When the pictogram is used as a part of a character, it is silent. For example, 鳥 for bird, 馬 for horse, and 木 for wood or tree.
  • A pictogram not only bears the meaning of the character of which it is a part, but also expresses the meaning by showing the physical likeness of the object the character describes. This allows the character to be easily recognized and understood.
  • the radical is a part of a character.
  • There are usually two kinds of radicals in a character. One denotes the meaning and the other denotes the pronunciation of the character.
  • The former is known as a pictogram or an ideogram, depending on its shape or what it stands for. If the shape resembles an object, it is called a pictogram. If it does not resemble anything but has a meaning derived from other uses, or from inference, it is called an ideogram. Both remain silent when the character is pronounced.
  • Another kind of radical is known as a phonetic radical that bears the actual or approximate pronunciation of the character, hence it is sounded.
  • a character can be used as a radical, such as (1) ⁇ in ⁇ j , ⁇ . (2) H in ⁇ 8, ⁇ , $
  • Radicals of this kind are mostly used as phonetic radicals.
  • a radical can be used as a character,
  • Another unique feature in colloquial Chinese language is that it allows four ways to pronounce a given phonetic, i.e. four intonations.
  • the total combinations of pronunciations and intonations in Chinese language are about 1,544. This compares to about 13,200 commonly used characters. Theoretically speaking, each pronunciation/intonation combination represents about 8 to 9 characters. In reality, a lot of pronunciation/intonation combinations are not adequately used or not used at all.
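  • The "about 8 to 9 characters" figure follows directly from the two counts quoted above. A minimal sketch of that arithmetic (the variable names are illustrative and do not come from the patent):

```python
# Figures quoted in the text above.
combinations = 1544         # pronunciation/intonation combinations
common_characters = 13200   # commonly used characters

# Average number of characters per combination if usage were perfectly even.
average_load = common_characters / combinations
print(f"{average_load:.2f}")  # 8.55
```

  • Real usage is uneven, so this average understates the load on heavily used combinations.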
  • The Chinese people seem to over-use some of the combinations, such as ji, qi and xi. Such uneven usage causes certain combinations to represent more than 50 characters.
  • The applicant calls this phenomenon over-representation, a problem that makes oriental languages (including the Chinese, Japanese, Korean and Indian languages) very difficult to computerize in their original forms.
  • Chinese characters such as f ⁇ , jUt, ⁇ , ft, HJ, ⁇ £, , -b, , ffl, *, #t f, WL, Hi, ⁇ r, m, W, J& #f, ffi, If, j& s, £, , etc., having the same pronunciation/intonation combination of qi.
  • Chinese characters such as , $$, fljt, ⁇ , H " , t, f, S» ⁇ , Biff, D ⁇ , IS, , H&» J ., , ffl,etc. , having the same pronunciation/ intonation combination
  • the oriental languages such as Chinese, Japanese, Korean language, and Indian language
  • the computerization of such oriental languages is a substantial problem.
  • the input of the oriental characters into the computers or word processors becomes an extremely hard task.
  • the "shape" system and the "phonetic" system.
  • The "shape" system, such as the "CHANGJEI" or "DA YI" input system for Chinese, designates a plurality of shape symbols according to the shapes of the radicals of the characters, in which each combination of the shape symbols represents a unique character.
  • The drawback of the "shape" system is that it is really difficult to learn and use. Users have to study the specific way of dividing each character into predetermined shape symbols and learn by heart thousands of shape symbols representing different characters.
  • Although the shape system enables the user to precisely input the specific character into the computer or word processor, only a small number of skilled people, such as professional typists who have received special and intensive training, can use such a "shape" system.
  • Ordinary people are unable to input even one character by utilizing the "shape" system.
  • The learning process of the "shape" system is so complicated that most business people cannot spend the time needed to learn by heart all the input codes of the "shape" system.
  • The "shape" system is designed for those people whose careers are computer data
  • This method works fine when the pronunciation of a phrase is unique, but in real life a large number of phrases, especially those containing fewer than four characters, have identical pronunciation (are homonymous).
  • The software engineers design their programs to display the phrases in Chinese characters at the bottom of the screen, for the typist to select the one he or she desires. If the desired phrase or sentence is not there, the typist can hit the down arrow key to invoke the next phrase or sentence until the desired one is found. This searching or selection process makes the existing method cumbersome, time-consuming and sometimes frustrating.
  • The main object of the present invention is to provide a method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer.
  • Such newly created surrogate words are unique and can precisely represent the non-phonetic characters used in written oriental languages such as the Chinese, Japanese, Korean and Indian languages, thus facilitating the input of information in these languages into a computer.
  • Another object of the present invention is to provide a method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer, in which such newly created surrogate words are adapted to be transmitted through E-mail without losing any information.
  • Another object of the present invention is to provide a method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer, which paves the way for incorporating voice recognition and generation technology into computers that process information in the oriental languages.
  • Another object of the present invention is to provide a method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer, wherein a voice recognizing computer can be built using a sound card and software in the usual manner and trained to understand a person's pronunciation of each word in this sequence: first the prefix, then the indicator if applicable, then the separating mark (hyphen), then the suffix, then the marker(s) if applicable.
  • The reversed combination described above can also be used for this purpose, as long as the hyphen or * is pronounced first, then the suffix, then the hyphen or *, then the prefix. The user can therefore input an article into a computer by voice.
  • Another object of the present invention is to provide a method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer which is equipped with a voice generating system having the same sound card and software that can pronounce the suffixes of the surrogate words accurately, so that the computer can read out a document for the user to edit and print. Accordingly, users need not spend time and effort going over the document character by character for possible typing errors. This function can also help people who can speak but cannot write or read the respective non-phonetic language to check a document that has been input into the computer through the conversion method of the present invention.
  • Another object of the present invention is to provide a method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer. Such a method enables both Western and Eastern people to input oriental languages easily and precisely without any complicated learning process.
  • Another object of the present invention is to provide a method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer, which facilitates the input of phrases and sentences into the computer.
  • Another object of the present invention is to provide a method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer, in which such a method can also be utilized to teach speaking, reading and writing of a language whose written form is non-phonetic by using the theories and logic of the present invention.
  • Another object of the present invention is to provide a method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer, in which, based on the converting method of the present invention, the use of the pictographic/ideographic radical as the prefix of the surrogate word can establish a helpful process for children to memorize the Chinese or Kanji characters easily.
  • A method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer comprises the steps of (a) alphabetizing a pictographic/ideographic radical of each character according to its pronunciation in a respective language, with the resulting spelling then being used as a prefix for a newly created surrogate word,
  • Step (c) further comprises a distinction step of separating the prefix and the suffix with a separating mark, such as a hyphen "-" or a space, inserted between the prefix and the suffix of the surrogate word.
  • The last alphabet of the spelling for the specific pictogram/ideogram is repeated once, twice or thrice; the repeated alphabets are treated just as extra alphabets for distinguishing the radicals they represent and have no bearing on the pronunciation of the radicals.
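  • The repetition rule just described can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `disambiguate_prefix` is hypothetical, and the spelling "xi" is used purely as sample input.

```python
def disambiguate_prefix(spelling: str, repeats: int) -> str:
    """Append the last letter of `spelling` `repeats` extra times (0 to 3).

    The extra letters only distinguish radicals that share a spelling;
    they carry no pronunciation.
    """
    if not spelling:
        raise ValueError("spelling must be non-empty")
    if not 0 <= repeats <= 3:
        raise ValueError("the text describes at most three repetitions")
    return spelling + spelling[-1] * repeats


print(disambiguate_prefix("xi", 0))  # xi
print(disambiguate_prefix("xi", 2))  # xiii
```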
  • The prefixes are to be treated like the pictograms or ideograms that they represent: silent. Just like the phonetic radicals they represent, the suffixes are sounded. If the pronunciation of a Chinese prefix is unique, the indicator for its intonation is exempted. Since the suffixes are sounded, the demand for accuracy dictates that the indicator for the intonations should be present at all times with suffixes, except when the intonation is the first one of the four.
  • Brief Description of the Drawings
  • Fig. 1 is a chart illustrating the Pin Yin alphabets and Zhuyin Zimu.
  • Fig. 2 is a chart illustrating the Katakana, Hiragana and their English equivalents.
  • Fig. 3 is a chart illustrating the Hangul and their English equivalents.
  • Figs. 4-1 to 4-11 are a continuous chart illustrating the 214 p/i radicals of the Chinese language and the spelled surrogate prefix for each p/i radical, in which various systems of surrogate prefix can be obtained by literary pronunciation, habitual pronunciation, simplified special key(s) (both pinyin alphabets and zhuyin zimu) for computer input, and some simplified optional keys for computer input.
  • Fig. 5 is a chart illustrating a plurality of Chinese characters having the identical pronunciation of ji, wherein each character is precisely represented by a unique surrogate word achieved by the three steps of the conversion method of the present invention.
  • Fig. 6 is a chart illustrating the five steps used to convert the Chinese characters into surrogate words in English-style alphabets and Zhuyin Zimu according to the present invention.
  • Fig. 7 is a chart illustrating the five steps used to convert the Japanese characters into surrogate words in English-style alphabets, Katakana and Hiragana according to the present invention.
  • Fig. 8 is a chart illustrating the five steps used to convert the Korean characters into surrogate words in English-style alphabets and Hangul according to the present invention.
  • Fig. 9 is a chart illustrating the changes made to the alphabets used in the Pin Yin system according to the present invention.
  • Fig. 10 is a chart illustrating the steps used to convert the Chinese phrases into surrogate phrases in English-style alphabets according to the present invention.
  • Fig. 11 is a chart illustrating the steps used to convert the Chinese phrases into surrogate phrases in Zhuyin Zimu according to the present invention.
  • Fig. 12 is a chart illustrating the steps used to convert the phrases of the Chinese characters used in Japanese phrases into surrogate phrases in English-style alphabets, Hiragana and Katakana according to the present invention.
  • Fig. 13 is a chart illustrating the steps used to convert the phrases of Chinese characters used by the
  • Fig. 14 is a chart introducing how the pictographic/ideographic radicals can be illustrated visually.
  • Detailed Description of the Preferred Embodiment
  • Chinese, the written forms of the Japanese and Korean languages, which are derived from Chinese, and even the Indian languages are constituted by non-phonetic characters.
  • The present invention provides a method to convert such non-phonetic characters into phonetic components by using existing or newly created phonetic symbols.
  • The phonetic symbols are the visual signs that represent the phonetic components, consonants and vowels. They can be Latin-style or English-style alphabets and native alphabets such as Zhuyin Zimu, Katakana, Hiragana, Hangul, etc. Phonetic symbols can also be created by designating a set of signs to represent the consonants, vowels and intonations of any language.
  • The purpose of the conversion method of the present invention is to enable the computerization of these languages.
  • the phonetic symbols can be English-style alphabets, native alphabets such as Zhuyin Zimu, Katakana, Hiragana, Hangul or newly created signs or symbols
  • Each non-phonetic character is then converted to a unique and newly spelled "surrogate word" by the conversion process disclosed in the present invention
  • The newly spelled surrogate words can precisely represent the characters used in the respective language. Please refer to Figs. 1, 2 and 3 for these phonetic symbols and Figs. 6, 7 and 8 for the spelled surrogate words constituted of these phonetic symbols.
  • each non-phonetic character such as each typical Chinese character
  • a pictographic or ideographic radical (referred to as "p/i radical" in the following description) denoting the meaning of the character, and a phonetic radical denoting the pronunciation or the approximate pronunciation of the character
  • The p/i radicals can be coded precisely by 214 different sets of codes, each representing its corresponding p/i radical. Referring to Fig. 4, each p/i radical in fact has a specific pronunciation; for example, the Chinese character "0" is pronounced xi and "3 " is pronounced zhi.
  • The surrogate words "xi" and "zhi" precisely represent the p/i radicals "H" and "3 " respectively.
  • A character is broken into two radicals. Then the radicals are alphabetized into a prefix and a suffix to form a surrogate word representing the given character.
  • The present invention provides a conversion method of creating new surrogate words to precisely represent the non-phonetic characters used in each written oriental language.
  • Such a conversion method can entirely solve the problem of homonyms, so that the non-phonetic characters can be easily input into a computer through an ordinary alphabetic keyboard, a mouse or another phonetic inputting method by keying in sequentially the surrogate words created for the characters.
  • Hiragana, which is treated as plural in the present invention
  • Hiragana are a group of special Chinese characters adopted by the Japanese as consonants and vowels to denote the pronunciation of vocabulary of Japanese origin. Hiragana resemble script Chinese characters.
  • They are also a form of Japanese alphabet.
  • Katakana, which is treated as plural in the present invention, are a group of special Chinese characters adopted by the Japanese as consonants and vowels to denote the pronunciation of the Chinese characters and of vocabulary of foreign origin. Katakana resemble the shapes of the printed characters.
  • Katakana are also Japanese alphabets
  • The conversion method of the present invention, for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer, comprises the following steps.
  • Step one: Alphabetize a pictographic/ideographic radical of each character according to its pronunciation in a respective language, with the resulting spelling then being used as a prefix for a newly created surrogate word.
  • The p/i radicals can be represented in many ways. One way is to assign a unique number to each p/i radical. The 214 numbers for the 214 p/i radicals can then be used instead of the spelling. Another way is to use a combination of keys on a computer keyboard to represent the numbers assigned to the radicals. For example, by pressing one, two or all three of Ctrl, Alt and Shift and then pressing any one of the alphanumeric keys, we can easily have 214 key combinations to represent all 214 p/i radicals. But the applicant thinks that the phonetic representation is the most user-friendly of all methods, because the majority of these radicals can be easily pronounced and spelled. It requires little or no effort to memorize numbers or keys.
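  • As a rough check that modifier-key combinations are enough to cover 214 radicals, the sketch below counts them. It assumes that "alphanumeric keys" means the 26 letters plus the 10 digits; the patent does not spell this out, and the variable names are illustrative.

```python
from itertools import combinations
import string

modifiers = ("Ctrl", "Alt", "Shift")

# Every non-empty subset of the three modifier keys: 3 + 3 + 1 = 7.
modifier_sets = [c for r in range(1, 4) for c in combinations(modifiers, r)]

# Assumption: the alphanumeric keys are the 26 letters and the 10 digits.
keys = string.ascii_lowercase + string.digits

chords = [(mods, key) for mods in modifier_sets for key in keys]
print(len(modifier_sets), len(keys), len(chords))  # 7 36 252
```

  • Under that assumption there are 252 distinct chords, comfortably more than the 214 needed.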
  • Step two: Alphabetize a phonetic radical of each character according to its pronunciation in the respective language, or alphabetize the pronunciation of the character itself if the character does not have a phonetic radical or if the phonetic radical does not bear the actual pronunciation of its character.
  • The resulting spelling is then used as a suffix for the newly created surrogate word.
  • a Chinese language 1 ya for ⁇ p and as in 2 ⁇ and ft, in which Jf- is the phonetic radical of the character f ⁇ and 3 ⁇ pronounces ⁇ (l e ya) Also is the phonetic radical of the character H and $ pronounces ⁇ ( ⁇ e ya) 2 met for ⁇ , , jg, and ⁇ ) ⁇ as in f ⁇ fc 3 qi for ⁇ as in
  • Step three: Combine the prefix and suffix to create a newly spelled surrogate word for each specific "character" used in the written form of the respective language.
  • an inputting tool bar is programmed to present on the screen of the computer
  • The inputting tool bar comprises two separate input windows: one is a prefix inputting window for keying in the alphabetized spelling of the prefix of the surrogate word, and the other is a suffix inputting window for keying in the alphabetized spelling of the suffix of the surrogate word.
  • The surrogate word is then input into the computer in place of the character it represents.
  • the above step three further comprises a distinction step of separating the prefix and the suffix with a separating mark, such as a hyphen "-" or a space, which is inserted between the prefix and the suffix
  • a unique and newly spelled surrogate word is created to precisely represent a specific character
  • The surrogate word generally comprises a prefix and a suffix joined by a separating mark, in which the prefix is the spelling of the pronunciation of the p/i radical of the character and the suffix is the spelling of the pronunciation of the phonetic radical of the character or of the pronunciation of the character.
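  • The prefix, separating mark and suffix structure described above can be sketched as follows. The function names are hypothetical, and the sample word "yan-shif" is one of the surrogate words quoted elsewhere in this text.

```python
def make_surrogate_word(prefix: str, suffix: str, separator: str = "-") -> str:
    """Join prefix and suffix with a separating mark (hyphen by default)."""
    if not prefix:
        raise ValueError("a surrogate word needs a prefix")
    return f"{prefix}{separator}{suffix}"


def split_surrogate_word(word: str, separator: str = "-") -> tuple[str, str]:
    """Recover (prefix, suffix); the suffix is empty if nothing follows the mark."""
    prefix, _, suffix = word.partition(separator)
    return prefix, suffix


print(make_surrogate_word("yan", "shif"))  # yan-shif
print(split_surrogate_word("yan-shif"))    # ('yan', 'shif')
```

  • Keeping the separating mark explicit is what lets the word be split back into its two radicals without ambiguity.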
  • the inputting specific character is a p/i radical itself, such as # ⁇ » — > ft » A ' d: ' ⁇ " ' P ' ⁇ tC ' J3 etc
  • The user can immediately obtain the exact character after inputting only the prefix (the spelling for the p/i radical) and the separating mark, and does not need to key in the suffix.
  • The separating mark can also be omitted.
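  • When the character is itself a p/i radical, the lookup needs only the prefix, with or without the separating mark. The sketch below uses a hypothetical toy table: the three spellings are assumed for illustration and are not taken from the patent's Fig. 4.

```python
# Hypothetical toy table; the spellings are assumptions for illustration only.
RADICAL_CHARACTERS = {"mu": "木", "ma": "馬", "niao": "鳥"}


def lookup_radical_character(typed: str, marks: str = "-*"):
    """Return the character for a prefix typed alone, with or without the mark.

    Returns None when the prefix is not in the table.
    """
    return RADICAL_CHARACTERS.get(typed.rstrip(marks))


print(lookup_radical_character("mu-"))  # 木
print(lookup_radical_character("mu"))   # 木
```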
  • Although the surrogate words do not look like written Chinese characters, they are 100% derived from Chinese characters. According to the above three steps of the conversion process of the present invention, approximately 95% of the Chinese characters are each precisely represented by a corresponding unique surrogate word. In other words, when the user inputs a surrogate word created with the above three steps into a computer or word processor, a specific Chinese character can be precisely obtained. This conversion method therefore greatly increases the accuracy and speed of inputting Chinese characters. Furthermore, since the surrogate words are constituted of alphabets and can precisely represent the respective characters, they enable text in the non-phonetic characters of the Chinese, Japanese and Korean languages to be sent through E-mail without any confusion or loss of information.
  • A number of Chinese characters have an identical pronunciation; for example, the pronunciation yi has about 99 homonymous characters. This is the major problem that makes the non-phonetic characters difficult or even impossible to computerize.
  • In Fig. 5, a list of Chinese homonym characters is illustrated, wherein all the characters are pronounced identically.
  • The spelling of the shared pronunciation of each homonymous character shown in Fig. 5 is converted into the suffix of a surrogate word.
  • The non-phonetic characters have a common feature: no two characters are identical, i.e. each character has a different appearance. This feature becomes an important distinguishing factor of the non-phonetic characters. In practice, even when two characters have an identical pronunciation, they have different p/i radicals or even differently written phonetic radicals. In other words, even though they may have the same p/i radicals and are pronounced identically, they definitely have different shapes, so that they will not be misunderstood by a reader. By this point, one would probably have gathered that the
  • a first additional step is processed before the step three as described above
  • Intonations of the pronunciation of the prefixes and suffixes are to be indicated with consonants placed at the end of the spelling.
  • The first intonation bears no indicator.
  • The second intonation is denoted by the second Chinese consonant, "p" for the pinyin alphabet and "ㄆ" for zhuyin zimu.
  • The third intonation is denoted by the third consonant, "m" for the pinyin alphabet and "ㄇ" for zhuyin zimu.
  • The fourth intonation is denoted by the fourth consonant, "f" for the pinyin alphabet and "ㄈ" for zhuyin zimu.
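  • The intonation rule above amounts to a four-entry lookup. A minimal sketch for the pinyin alphabet (the names `TONE_INDICATOR` and `add_tone_indicator` are hypothetical):

```python
# Tone number -> indicator letter appended to the spelling (pinyin alphabet).
# The first intonation bears no indicator.
TONE_INDICATOR = {1: "", 2: "p", 3: "m", 4: "f"}


def add_tone_indicator(spelling: str, tone: int) -> str:
    """Append the consonant that marks the given intonation."""
    if tone not in TONE_INDICATOR:
        raise ValueError("tone must be 1, 2, 3 or 4")
    return spelling + TONE_INDICATOR[tone]


print(add_tone_indicator("shi", 4))  # shif
print(add_tone_indicator("shi", 1))  # shi
```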
  • This first additional step does not apply to the Japanese and Korean languages, as these languages do not have intonations in their colloquial form. Basically, by processing the above four steps, approximately 98% of the Chinese characters can be precisely represented by their corresponding specific surrogate words. Since the Chinese language has quite a few pictograms or ideograms with the same pronunciation and intonation, there are some extreme examples, as shown in Fig. 6, in which four homonymous Chinese characters having the identical pronunciation of shi are illustrated. A second additional step is processed after step three. In the second additional step, a marker is added after the last alphabet of the surrogate word to represent the next homonymous pictogram or ide
  • In the Chinese language (as shown in Fig. 6), the four homonymous characters are represented by: (1) yan-shif, (2) yan-shiff, (3) yan-shifff, (4) yan-shiffff; or (1) yan-shi1, (2) yan-shi2, (3) yan-shi3, (4) yan-shi4; or (1) yan-shia, (2) yan-shis, (3) yan-shid, (4) yan-shij.
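  • The three marker styles in these examples (repeating the last alphabet, appending a numeral, appending a home-row key) can be sketched as one helper. This is an illustration only: the function name and scheme labels are hypothetical, the caller supplies the base word, and only the keys "a", "s" and "d" are stated in the text, so the rest of `HOME_ROW` is an assumption.

```python
# Only "a", "s" and "d" are stated in the text; the remaining keys are assumed.
HOME_ROW = "asdfjkl"


def add_marker(base: str, rank: int, scheme: str) -> str:
    """Mark the rank-th homonym (1-based) that shares the surrogate word `base`."""
    if not base or rank < 1:
        raise ValueError("base must be non-empty and rank must be >= 1")
    if scheme == "repeat":
        # Rank 1 is the bare word; each further homonym repeats the last letter.
        return base + base[-1] * (rank - 1)
    if scheme == "digit":
        return base + str(rank)
    if scheme == "home":
        if rank > len(HOME_ROW):
            raise ValueError("not enough home-row keys for this rank")
        return base + HOME_ROW[rank - 1]
    raise ValueError(f"unknown scheme: {scheme}")


print(add_marker("yan-shif", 3, "repeat"))  # yan-shifff
print(add_marker("yan-shi", 2, "digit"))    # yan-shi2
print(add_marker("yan-shi", 3, "home"))     # yan-shid
```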
  • The prefixes are to be treated like the pictograms or ideograms that they represent: silent. Just like the phonetic radicals they represent, the suffixes are sounded. If the pronunciation of a Chinese prefix is unique, the indicator for its intonation is exempted. Since the suffixes are sounded, the demand for accuracy dictates that the indicator for the intonations be present at all times with suffixes, except when the intonation is the first one of the four.
  • Figs. 6 to 8 summarize the entire conversion process for four extreme exemplary characters, which illustrate all five steps of the present invention.
  • Fig. 6 illustrates four Chinese characters as examples.
  • Fig. 7 illustrates four Japanese characters.
  • Fig. 8 illustrates four Korean characters as examples.
  • The charts in Figs. 6 to 8 share the same format: each has five columns and at least eight rows. Starting from the left, Column 1 contains the five steps of the conversion process or a description of the columns to the right.
  • The four columns to the right contain the four characters mentioned in the above examples, one in each column, and the transformation they go through row by row. Row 1 is occupied by the four characters in their original forms, with pronunciation marked by English-style alphabets.
  • Row 2 houses the radicals derived from each character.
  • Row 3 shows the spelling for each and every radical.
  • This row actually details the effects of steps one and two
  • Row 4 illustrates the effect of step three, which applies only to the Chinese language
  • Row 5 shows how step four influences the characters, with explanation in column
  • The prefixes are to be silent (like the pictograms or ideograms they represent). Just like the phonetic radicals they stand for, the suffixes are sounded. If the pronunciation of a prefix is unique, the indicator for its intonation can be omitted.
  • The demand for accuracy dictates that the indicator for their intonation be present at all times, except when the intonation is the first one of the four.
  • The surrogate words are organized in such a manner that the prefix comes first, then the separating mark (the separating mark can be a hyphen, as with Latin alphabets; however, the hyphen can be replaced by other symbols such as * if native alphabets are used to spell the surrogate words), then the suffix.
  • Step one and step three as described above can be reversed, that is, the suffix precedes the prefix. The surrogate word will therefore be organized as "suffix-prefix".
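  • Because the reversed form uses the same two parts around the same separating mark, converting between the two orders is a simple swap. A minimal sketch (the function name is hypothetical; it assumes the word contains a single separating mark):

```python
def reverse_order(word: str, separator: str = "-") -> str:
    """Swap the two parts of a surrogate word around its separating mark."""
    first, mark, second = word.partition(separator)
    if not mark:
        # No separating mark present: nothing to swap.
        return word
    return f"{second}{mark}{first}"


print(reverse_order("yan-shif"))  # shif-yan
print(reverse_order("shif-yan"))  # yan-shif
```

  • Applying the swap twice returns the original word, which is why either order can stand for the same character.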
  • The unique surrogate word of the specific Chinese character is "yi-xi" according to this second embodiment (in the first embodiment, the surrogate word is "xi-yi"). Since the surrogate words created in the first embodiment and the second embodiment consist of the same alphabets, both of them can be used to precisely represent the same character. Besides, the user may preprogram
  • Alphabets of the "home keys" can be used to indicate the order of the homonyms.
  • The user can choose the desired character by typing a key, such as "a" for the first, "s" for the second and "d" for the third. This is just a matter of software design.
  • A voice recognizing computer can be built using a sound card and software in the usual manner and trained to understand a person's pronunciation of each word in this sequence: first the prefix, then the indicator if applicable, then the marker if applicable, then the separating mark (hyphen or asterisk), then the suffix, then the indicator if applicable, then the marker(s) if applicable.
  • The reversed combination described above can also be used for this purpose, as long as the hyphen or asterisk (*) is pronounced first, then the suffix, then the hyphen or asterisk (*), then the prefix.
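  • The two spoken orders can be told apart by whether the separating mark is heard first. The sketch below is a simplification under stated assumptions: the recognizer is assumed to have already merged any indicators and markers into the prefix and suffix tokens, and the function name is hypothetical.

```python
MARKS = ("-", "*")


def parts_from_spoken(tokens: list[str]) -> tuple[str, str]:
    """Return (prefix, suffix) from a recognized token sequence.

    Normal order:   prefix, mark, suffix
    Reversed order: mark, suffix, mark, prefix
    """
    if len(tokens) == 3 and tokens[1] in MARKS:
        prefix, _, suffix = tokens
        return prefix, suffix
    if len(tokens) == 4 and tokens[0] in MARKS and tokens[2] in MARKS:
        _, suffix, _, prefix = tokens
        return prefix, suffix
    raise ValueError("unrecognized token order")


print(parts_from_spoken(["yan", "-", "shif"]))       # ('yan', 'shif')
print(parts_from_spoken(["-", "shif", "-", "yan"]))  # ('yan', 'shif')
```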
  • the conversion method disclosed in the present invention can enable those people who can speak but cannot write the respective non-phonetic language to input the characters into a computer and print out an essay written in that language.
  • Another remarkable function of the present invention is that, by utilizing a voice-generating computer with the same sound card and software that can pronounce the suffixes accurately, the computer can read out a document for the user to edit and print, so that users need not spend time and energy checking the document character by character for possible typing errors.
  • This function can also help those people who can speak but cannot write or read the respective non-phonetic language to check their document which is inputted into the computer through the conversion method of the present invention.
  • the English-style alphabets used to create the surrogate words to precisely represent the characters in the written Chinese language are the ones used in the official Pinyin system in China, with some minor changes. These changes are intended to eliminate some exceptions in the Pinyin system, making it easier to use. Please refer to Fig. 1 for the entire set of Pinyin alphabets and Fig. 9 for the said changes. If Zhuyin Zimu is used to spell the newly invented surrogate words, the prefixes and the suffixes should be separated by an asterisk instead of a hyphen.
  • the alphabets used to create the surrogate words to precisely represent the characters in written Japanese are the ones proposed in the Hepburn system of romanization, commonly accepted by the Japanese to phonetically translate the Japanese language into English-style alphabets. If the new surrogate words are spelled with Katakana or Hiragana, the hyphen should also be replaced by an asterisk.
  • the alphabets used to create the surrogate words to precisely represent the characters used in the written Korean language are the ones commonly accepted by the Koreans to phonetically translate the Korean language into English-style alphabets. Also, an asterisk is used instead of a hyphen when the native alphabet is employed to spell a surrogate word for a character used in Korean language.
  • the consonant, the first alphabet or the last alphabet of a prefix should be adequate to represent most of the prefixes. With voice recognition technology, the abbreviation may not be necessary
  • the newly spelled surrogate words resemble the characters in that both kinds consist of radicals: pictographic or ideographic ones and phonetic ones
  • These surrogate words differ from the characters in at least two ways
  • the radicals of the newly spelled surrogate words are phoneticized, while the radicals in the characters, especially the pictographic/ideographic radicals, are not
  • the invented and spelled surrogate words are more uniform in construction, in that the pictographic/ideographic radicals always occupy the left portion of the surrogate words as prefixes, and the phonetic radicals always occupy the right portion of the surrogate words as suffixes
  • the pictographic/ideographic radicals in the characters can occupy the left, right, top or bottom portion of the characters
  • the aforesaid resemblance makes the new surrogate words more familiar to the users, while the differences make the new surrogate words logical and more scientific from the standpoint of phonology. As a
  • the manually inputting method comprises the steps of
  • the user can also simply input the predetermined character into the computer, which is equipped with a voice recognizing system, by means of an orally inputting method processed after executing the conversion method as disclosed above
  • the orally inputting method comprises the steps of: (1) pronouncing the prefix of the surrogate word created by the conversion method;
  • Figs. 10 to 13: by means of the surrogate words, a set of multi-syllabic vocabularies, a phrase or a sentence of written Chinese or Kanji characters used in the Chinese, Japanese and Korean languages can be keyed in through a simplified method utilizing a surrogate phrase or sentence, which is a unique set of codes (USC).
  • USC unique set of codes
  • the unique set of codes is a group of alphabets consisting of acronyms, labels and markers for precisely representing a phrase or sentence, in which the acronym refers to the abbreviation of the suffixes of a plurality of surrogate words representing a plurality of Chinese characters in a given phrase or sentence, and the label refers to the abbreviation of the prefixes of a plurality of surrogate words representing a plurality of Chinese characters in the above given phrase or sentence.
  • the marker is the repetition or repetitions of the last alphabet of the acronym or label.
  • the non-phonetic character is pronounced with a pronunciation of a phonetic radical of the non-phonetic character.
  • the specific phrase or sentence may be obtained by merely keying the acronym of the surrogate phrase or sentence, which is obtained by the above two steps, into the computer. If a few homonymous phrases or sentences still occur, an additional step of repeating the last alphabet of the acronym as a marker can be carried out.
  • the acronym and the label can also be pronounced by Zhuyin Zimu, Hiragana, Katakana or Hangul.
  • the acronym is keyed-in before the label. It is also possible to put the label in front of the acronym, then repeat the last alphabet of the acronym as a marker.
  • step (e) can be substituted by the step (e') of putting each alphabet of the label ahead of each alphabet of the acronym, or the step (e") of putting each alphabet of acronym ahead of each alphabet of the label.
  • the alternative surrogate phrase can be created in this manner: "SaTbUcVdWe” or "AsBtCuDvEw”.
  • the label can be any alphabet of a prefix or suffix of any character of the phrase; for example, one of the following can be used as a label:
  • G The first alphabet of the suffix of the last character of the phrase or sentence
  • H The last alphabet of the suffix of the last character of the phrase or sentence.
  • the label can be the number of strokes of any character, or of one of its radicals, in the phrase or sentence.
  • the preferable labels are made of the regular form of the first alphabet of the prefixes, because this follows the typist's train of thought. Also, the labels will always be in lower case, while the acronyms themselves are always in upper case. This arrangement avoids confusion on the part of the human as well as the computer. Since the spelling of a character is always in lower case, the "shift" key on the keyboard will serve as a signal to the computer that the user is going to input a phrase or sentence.
  • the theory of the surrogate phrase can be applied to any language that is burdened by homonyms, such as certain dialects in Indian language.
  • the predetermined phrase or sentence of non-phonetic characters can also be input into the computer by means of a manually inputting method processed after executing the conversion method for converting a plurality of non-phonetic characters of a phrase of a language into a surrogate phrase as described above.
  • the manually inputting method comprises the steps of: (1) typing the surrogate phrase or sentence created by the conversion method; and
  • the computer is equipped with a voice recognizing system
  • the user can also simply input the predetermined phrase or sentence into the computer by means of an orally inputting method processed after executing the conversion method as disclosed above.
  • the orally inputting method comprises the steps of:
  • using the pictographic/ideographic radical as a prefix of the surrogate word can establish a helpful process for children to memorize the Chinese or Kanji characters easily. Since every character has a pictographic/ideographic radical or is solely constituted by a pictographic or ideographic radical (many pictographic/ideographic radicals are themselves characters), the pictographic/ideographic radicals play a very important part in memorizing a character. Once one recognizes a pictographic/ideographic radical of a character, one can understand the meaning or the shape of the character. It is an important feature and characteristic of Chinese or Kanji characters. Moreover, a visual image is the easiest thing to memorize.
  • every pictographic/ideographic radical can be illustrated by a visual drawing, which is easier and more familiar for the user to memorize.
  • 目, pronounced "mu" and having the meaning of the phonetic word "eye", can be illustrated by a picture of an eye
  • 木, pronounced "mu" and having the meaning of the phonetic word "wood", can be illustrated by a drawing of a green tree
  • 身, pronounced "shen" and having the meaning of the phonetic word "body"
  • 骨, pronounced "gu" and having the meaning of the phonetic word "bone", can be illustrated by a picture of a bone
  • each pictographic/ideographic radical has a unique meaning
  • the spelling of the meaning of each pictographic/ideographic radical can also be used as the alphabetized spelling of the prefix of the surrogate word
  • "body" for 身, "bone" for 骨. Therefore, for example, the surrogate words for Chinese characters
  • pictorial icons have an additional function of helping the user or student to memorize the pictographic/ideographic radicals
  • the spelling of the pronunciation of each pictographic/ideographic radical can further be illustrated next to the representing pictorial icon
  • the memorizing process comprises the following steps, added after the above step (c): (d1) illustrating the prefix of the surrogate word by a visual picture which represents the meaning of the pictographic/ideographic radical represented by the prefix,
  • huo fire
  • a video shows a group of people sitting around an orange-red colored bonfire
  • a close-up of the bonfire will be shown
  • the video will then show the flames of the fire roaring to make a noise mimicking the pronunciation of 'huo,' and dancing to form the shape of the character ' f > i' bearing the same orange-red color
  • a video showing how a calligrapher smooths out the fuzzy edges of the picture and gives it the modern look '火', bearing the same color
  • we will show how the right half of the character is shrunk so that the character can be used as a pictographic radical in the same orange-red color
  • a video showing a woman stir-frying food in a wok on top of an orange-red colored fire, making noise with the iron spatula mimicking 'chao', can be shown together with the character '炒' (pronounced chao in Chinese) with its pictographic radical '火' in orange-red color
  • Other characters such as u » > 3 » 'J3j can be made in the same manner, leaving the pictographic/ideographic radical in orange-red color
  • the strokes of the radicals of this type can be made 'hollow,' that is, the stroke is outlined with black ink but the rest is left blank. Then tiny but visually discernible colored pictures of the objects the radicals represent can be used to fill in the blank space
  • these pictures will have to be the still pictures that are used to explain the radicals
  • 骨 gu bone
  • Fig 1 This list is intended to show what can be done to make the radicals easy to recognize and to remember
  • radicals can be illustrated the same way if they are related to tangible objects. There are some radicals that are related to intangible concepts. These intangible radicals can first be outlined to leave "blank space" in the strokes. Then their pronunciation in native alphabets can be put in the blank space
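
The composition rules described in the excerpts above (prefix, separating mark, suffix; homonym selection with the home keys; acronym, label and marker of a surrogate phrase) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patent's implementation: the function names, the `HOME_KEYS` string beyond "a", "s" and "d", and the placeholder syllables "mu" and "lin" are all hypothetical and do not come from the source.

```python
from itertools import chain

# Home-row keys used to pick among homonyms. The text names only "a", "s"
# and "d"; continuing along the rest of the home row is an assumption.
HOME_KEYS = "asdfghjkl"


def surrogate_word(prefix, suffix, native_alphabet=False, suffix_first=False):
    """Join prefix and suffix with the separating mark.

    A hyphen is used for Latin spelling and an asterisk for native alphabets
    (Zhuyin Zimu, kana, Hangul). suffix_first gives the reversed
    "suffix-prefix" organization.
    """
    separator = "*" if native_alphabet else "-"
    first, second = (suffix, prefix) if suffix_first else (prefix, suffix)
    return f"{first}{separator}{second}"


def pick_homonym(candidates, key):
    """Return the candidate selected by a home key ("a" = first, "s" = second, ...)."""
    return candidates[HOME_KEYS.index(key)]


def interleave(acronym, label, label_first=False):
    """Alternate acronym letters (upper case) with label letters (lower case).

    zip() stops at the shorter input, so both are assumed to have one letter
    per character of the phrase.
    """
    acronym, label = acronym.upper(), label.lower()
    pairs = zip(label, acronym) if label_first else zip(acronym, label)
    return "".join(chain.from_iterable(pairs))


def add_marker(code, repeats=1):
    """Append the marker: one or more repetitions of the last letter of a non-empty code."""
    return code + code[-1] * repeats
```

With the hypothetical placeholders, `surrogate_word("mu", "lin")` yields a hyphenated Latin spelling and `surrogate_word("mu", "lin", native_alphabet=True)` the asterisk form. Calling `interleave("STUVW", "abcde")` reproduces the shape of the "SaTbUcVdWe" pattern quoted in the text; whether the upper-case letters there are the acronym is inferred from the statement that acronyms are always upper case and labels lower case.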

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)
EP96945196A 1996-11-05 1996-12-10 Method for converting non-phonetic characters into surrogate words for inputting into a computer Withdrawn EP1012745A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US08/744,021 US5903861A (en) 1995-12-12 1996-11-05 Method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer
US744021 1996-11-05
PCT/US1996/019780 WO1998020429A1 (en) 1996-11-05 1996-12-10 Method for converting non-phonetic characters into surrogate words for inputting into a computer

Publications (1)

Publication Number Publication Date
EP1012745A1 true EP1012745A1 (en) 2000-06-28

Family

ID=24991115

Family Applications (1)

Application Number Title Priority Date Filing Date
EP96945196A Withdrawn EP1012745A1 (en) 1996-11-05 1996-12-10 Method for converting non-phonetic characters into surrogate words for inputting into a computer

Country Status (6)

Country Link
EP (1) EP1012745A1 (ja)
JP (1) JP2002516004A (ja)
KR (1) KR20000053095A (ja)
AU (1) AU1462097A (ja)
CA (1) CA2270956A1 (ja)
WO (1) WO1998020429A1 (ja)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8862988B2 (en) 2006-12-18 2014-10-14 Semantic Compaction Systems, Inc. Pictorial keyboard with polysemous keys for Chinese character output
US11017771B2 (en) 2019-01-18 2021-05-25 Adobe Inc. Voice command matching during testing of voice-assisted application prototypes for languages with non-phonetic alphabets
DE102019007797B4 (de) 2019-01-18 2023-11-30 Adobe Inc. Abgleichen von Stimmbefehlen während des Testens von stimmunterstützten App-Prototypen für Sprachen mit nichtphonetischen Alphabeten

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
JPS57501254A (ja) * 1980-08-01 1982-07-15
US5164900A (en) * 1983-11-14 1992-11-17 Colman Bernath Method and device for phonetically encoding Chinese textual data for data processing entry
GB2157864B (en) * 1984-03-07 1987-09-23 Hing Sun Chan A method of coding chinese characters according to phonetic transcriptions
EP0271619A1 (en) * 1986-12-15 1988-06-22 Yeh, Victor Chang-ming Phonetic encoding method for Chinese ideograms, and apparatus therefor
US5175803A (en) * 1985-06-14 1992-12-29 Yeh Victor C Method and apparatus for data processing and word processing in Chinese using a phonetic Chinese language
US4951202A (en) * 1986-05-19 1990-08-21 Yan Miin J Oriental language processing system

Non-Patent Citations (1)

Title
See references of WO9820429A1 *

Also Published As

Publication number Publication date
JP2002516004A (ja) 2002-05-28
AU1462097A (en) 1998-05-29
KR20000053095A (ko) 2000-08-25
CA2270956A1 (en) 1998-05-14
WO1998020429A1 (en) 1998-05-14

Similar Documents

Publication Publication Date Title
US6292768B1 (en) Method for converting non-phonetic characters into surrogate words for inputting into a computer
US5903861A (en) Method for specifically converting non-phonetic characters representing vocabulary in languages into surrogate words for inputting into a computer
US8328558B2 (en) Chinese / English vocabulary learning tool
US6604878B1 (en) Keyboard input devices, methods and systems
JPS6069726A (ja) キ−ボ−ド構成方法およびキ−ボ−ド
JPS61240365A (ja) コ−ド化システムを用いる、特定のキイボ−ドを有する文字印書装置
US6075469A (en) Three stroke Chinese character word processing techniques and apparatus
Osborn Letters of light: Arabic script in calligraphy, print, and digital design
US20050010391A1 (en) Chinese character / Pin Yin / English translator
Kessler et al. Writing systems: Their properties and implications for reading
WO2005022296A2 (en) Method and system for mutely communication in a spoken language
US20050027547A1 (en) Chinese / Pin Yin / english dictionary
US20040243746A1 (en) Character generation system
WO2000060560A1 (en) Text processing and display methods and systems
AU2006255605B2 (en) Method for learning chinese character script and chinese character-based scripts of other languages
US20050080612A1 (en) Spelling and encoding method for ideographic symbols
KR101559477B1 (ko) 한글을 이용한 다언어 입력시스템
EP1012745A1 (en) Method for converting non-phonetic characters into surrogate words for inputting into a computer
CN105045410A (zh) 一种形式化拼音和汉字对应识别的方法
Greenwood International cultural differences in software
Dasgupta et al. Forward Transliteration of Dzongkha Text to Braille
US5079702A (en) Phonetic multi-lingual word processor
Hall Local language software in South Asia
CN103576891A (zh) 一键快打字
HALL 15 Local Language Software in South Asia

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19990526

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): CH DE FR GB IT LI NL SE

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20020702