CN1115884A - Chinese character changing device - Google Patents

Publication number
CN1115884A
Authority
CN
China
Prior art keywords
syllable
dictionary
chinese
word
word length
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 94104871
Other languages
Chinese (zh)
Inventor
周峻慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

A Chinese character conversion device is disclosed that reduces unnecessary syllable segmentation and dictionary retrieval. When a phonetic symbol string to be converted is input, a dictionary word length retrieval part 11 takes the corresponding word length information out of a dictionary word length information part 12 for each syllable of the input string. On the basis of this information, a conversion control part 13 gives first priority to the syllable with the largest word length value and second priority to the syllable input earlier, and cuts out the phonetic symbol string of that word length, beginning at the selected syllable, as the retrieval key for the dictionary part 15. The string is then converted into the word retrieved by a dictionary retrieval part 14 and output. Consequently, the efficiency of Chinese character conversion, such as its speed and accuracy rate, is improved.

Description

Chinese character conversion device
The present invention relates to a Chinese character conversion device, and in particular to a Chinese character conversion device for the Chinese language.
More than 10,000 kinds of Chinese characters are used in Chinese text. For Chinese-language computers, including Chinese word processors, the most important problem is how to input the correct character from among them both accurately and at high speed. Conventional Chinese character input mechanisms include voice recognition, character recognition, and keyboards. Of these, keyboard input is the most reliable and is therefore widely used. Keyboard input of Chinese characters is further divided into input by pronunciation and input by character shape. The latter requires the input rules to be memorized in advance, which takes considerable time, and more time still is needed to become proficient. The former is the easiest to learn; it is not only widely used today but is also expected to become the mainstream of Chinese character input in the future.
An example of a Chinese character conversion device that takes pronunciation as input is the device described in Taiwan patent application No. 75105838. Fig. 6 is a structural diagram of this conventional pronunciation-input Chinese character conversion device. In the figure, 100 is an input part for inputting a phonetic character string of arbitrary length in pinyin, zhuyin (phonetic notation), romanization, or the like. (The term "phonetic character string" is used here because one syllable is usually represented by several phonetic characters. However, since Japanese does not clearly distinguish singular from plural, this specification also calls a single phonetic character a phonetic character string; "string" carries no strict plural meaning.) 180 is a dictionary part in which phonetic character strings and their corresponding words are registered (permanently stored). 140 is an NCHAR register that stores the number of syllables in the input phonetic character string. (Incidentally, in Chinese one character is in principle one syllable, so the number of syllables usually equals the number of characters.)
120 and 130 are, respectively, a PTR register and an NP register used when a phonetic character string is converted into words (that is, Chinese words made up of Chinese characters). The PTR register 120 stores the position of the first syllable of the portion of the phonetic character string that is cut out as the retrieval target. The NP register 130 stores the word length that is the current target of dictionary retrieval when the input phonetic character string is converted into words, that is, the number of characters (and hence syllables) making up a word. 150 is a comparing part. After words of a given length have been retrieved and converted, it decrements the value of the NP register by 1 so that words with one character fewer are retrieved next time; as a result, words made up of more characters are converted preferentially. 160 is a conversion control part. It moves the position held in the PTR register 120 backward one syllable at a time from the start of the input phonetic character string and checks whether the cut-out string contains syllables that have already been converted to Chinese characters. If none of the syllables has been converted and the dictionary part 180 contains a corresponding word, it converts the string into that word, so that phonetic character strings input earlier are converted preferentially. 170 is a dictionary retrieval part, which searches the dictionary part 180 using the syllable string sent from the conversion control part 160 as the retrieval key and, if a corresponding word exists, passes it to the conversion control part 160. 190 is an output part that outputs the result converted by the conversion control part 160.
With the PTR register 120, the NP register 130, and the comparing part 150, the longest-match method can be adopted: in Chinese character conversion, words with more syllables are given first priority, and syllables input earlier are given second priority. This method is a known technique that the present applicant has disclosed in Japanese patent applications (Japanese Patent Application Nos. Hei 5-75911 and Hei 5-75912) and elsewhere, so its explanation is omitted.
Some other matters are the same as in Japanese word processors, so their detailed description is also omitted. For example: data in the dictionary part 180 are registered in a fixed order of phonetic characters and then in ascending order of syllable count, and when several Chinese words correspond to the same phonetic characters, those used more frequently are shown first; the operator types at the keyboard of the input part 100 while checking the phonetic characters already input on a CRT display; the dictionary part 180 consists of high-speed semiconductor memory, a magnetic disk, or the like; the output part 190 consists of a CRT, a printing unit, or the like; and retrieval by the retrieval part is carried out by electronic comparison.
Needless to say, handling of conversion errors in which a character other than the one the operator intended is output, a learning function, and the like are also provided; since these too are well-known techniques, their explanation is omitted.
However, because the Chinese character conversion device described above converts by the longest-match method, for an input phonetic character string it must first, at the maximum word length that can be a conversion target, cut out syllable strings as dictionary retrieval targets while shifting backward one syllable at a time from the first input syllable, and must then repeat the same retrieval while reducing the word length by one each time. Efficiency in terms of conversion speed, accuracy rate, and so on is therefore poor. In particular, when the input text contains no words of two or more characters, the conversion accuracy for the phonetic character string representing that text falls still further. For example, suppose "wo3 de5 jia1 zai4 shan1 de5 na4 tou2" is input and the maximum word length registered in the dictionary part, and hence the maximum word length for retrieval, is 7. To find words in the dictionary part, the device first cuts out, from the front, the syllable strings of length 7: "wo3 de5 jia1 zai4 shan1 de5 na4" and "de5 jia1 zai4 shan1 de5 na4 tou2". If there is no corresponding word, the length is reduced by 1 to 6, giving "wo3 de5 jia1 zai4 shan1 de5", "de5 jia1 zai4 shan1 de5 na4", and "jia1 zai4 shan1 de5 na4 tou2". Strings of length 5, "wo3 de5 jia1 zai4 shan1", "de5 jia1 zai4 shan1 de5", and so on, are cut out in the same way. Only when the single syllables of length 1, "wo3", "de5", "jia1", "zai4", "shan1", "de5", "na4", "tou2", are finally cut out can the corresponding words "I", "'s" (possessive particle), "home", "at", "mountain", "'s", "that", "end" be detected, after which the conversion result "my home is at the far end of the mountain" is output. In such a case, since several Chinese characters often correspond to the same syllable, conversion efficiency becomes very poor in terms of both speed and accuracy. The object of the present invention is to solve this problem and provide a Chinese character conversion device with good conversion efficiency.
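The order in which the conventional longest-match method tries candidate strings can be sketched as follows. This is a minimal Python illustration, not code from the patent: the function name `longest_match_candidates` is hypothetical, and the sketch assumes that no word is found along the way, so every window is tried.

```python
def longest_match_candidates(syllables, max_len):
    """Yield (start, length, key) in the order the conventional method tries them.

    Assumption: no word is found on the way, so every window is a dictionary lookup.
    """
    n = len(syllables)
    for length in range(min(max_len, n), 0, -1):   # longest words first
        for start in range(n - length + 1):        # earlier input first
            key = " ".join(syllables[start:start + length])
            yield start, length, key


syllables = "wo3 de5 jia1 zai4 shan1 de5 na4 tou2".split()
candidates = list(longest_match_candidates(syllables, max_len=7))

print(len(candidates))
for start, length, key in candidates[:5]:
    print(length, key)
```

For the eight-syllable example with a maximum word length of 7, the window counts per length are 2, 3, 4, 5, 6, 7, and 8, so the sketch enumerates 35 candidate strings before the single-syllable words are reached.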
To achieve the above object, the Chinese character conversion device of the present invention is characterized by having: 1. an input part for inputting a phonetic character string; 2. a dictionary part in which phonetic character strings and their corresponding Chinese words are registered; 3. a dictionary word length information part in which every pronunciation of Chinese is registered in association with information on the word lengths of the Chinese words registered in the dictionary part that begin with that pronunciation; 4. a dictionary word length retrieval part that, for each syllable of the input phonetic character string, takes the corresponding word length information out of the dictionary word length information part; 5. a syllable cut-out part that, when cutting out from the input phonetic character string the syllables to be converted to Chinese characters, gives first priority to the pronunciation for which the dictionary word length retrieval part has taken out the larger word length and second priority, at equal word length, to the syllable input earlier, and cuts out the phonetic character string that begins at the syllable selected on these principles and whose length equals the word length taken out; 6. a dictionary retrieval part that searches the dictionary part for the corresponding Chinese word, using the phonetic character string cut out by the syllable cut-out part as the retrieval key; and 7. a conversion part that converts the phonetic character string into the corresponding Chinese characters according to the Chinese word retrieved by the dictionary retrieval part.
With this structure, the phonetic character string to be converted to Chinese characters is input through the input part. Phonetic character strings and their corresponding Chinese words are registered in the dictionary part. The dictionary word length information part holds, registered in advance, every pronunciation of Chinese in association with information on the word lengths of the Chinese words in the dictionary part that begin with that pronunciation. The dictionary word length retrieval part takes the corresponding word length information out of the dictionary word length information part for each syllable of the input phonetic character string. When cutting out from the input phonetic character string the syllables to be converted, the syllable cut-out part gives first priority to the pronunciation with the larger word length taken out by the dictionary word length retrieval part and second priority, at equal word length, to the syllable input earlier, and cuts out the continuous phonetic character string that begins at the pronunciation so selected and whose length is the word length taken out. The dictionary retrieval part searches the dictionary part for the corresponding Chinese word, using the phonetic character string cut out by the syllable cut-out part as the retrieval key. The conversion part converts the phonetic character string that was the retrieval target into the corresponding Chinese characters according to the Chinese word retrieved by the dictionary retrieval part.
Fig. 1 is a structural diagram of one embodiment of the Chinese character conversion device of the present invention;
Fig. 2 is a flowchart of the Chinese character conversion process of the embodiment of Fig. 1;
Fig. 3 is a conceptual diagram of the data structure of the dictionary part of the embodiment of Fig. 1;
Fig. 4 is a conceptual diagram of the data structure of the dictionary word length information part of the embodiment of Fig. 1;
Fig. 5 is a diagram of the dictionary word length information for the input syllables of a concrete example in the embodiment of Fig. 1;
Fig. 6 is a structural diagram of a conventional Chinese character conversion device.
The reference numerals in the figures denote the following. 10: input part, 11: dictionary word length retrieval part, 12: dictionary word length information part, 13: conversion control part, 14: dictionary retrieval part, 15: dictionary part, 16: output part.
The present invention is described below on the basis of an embodiment.
Fig. 1 is a structural diagram of one embodiment of the Chinese character conversion device of the present invention. Fig. 2 is the processing flowchart of this embodiment. In Fig. 1, 10 is an input part for inputting phonetic characters such as pinyin, zhuyin (phonetic notation), or romanization. 15 is a dictionary part in which phonetic character strings and their corresponding Chinese words are registered. 14 is a dictionary retrieval part that finds the corresponding Chinese word in the dictionary part 15, using a phonetic character string as the retrieval key. 12 is a dictionary word length information part in which every pronunciation of Chinese (each consisting of one syllable) is registered together with, as the information associated with that pronunciation, information on the word lengths of the Chinese words registered in the dictionary part 15 that begin with it. 11 is a dictionary word length retrieval part that, for each syllable of the input phonetic character string, takes the corresponding word length information out of the dictionary word length information part 12. 13 is a conversion control part. When cutting out from the input phonetic character string the syllables to be converted to Chinese characters, it gives first priority to the pronunciation (syllable) with the longer word length found by the dictionary word length retrieval part 11 and second priority, at equal word length, to the syllable input earlier. It cuts out the phonetic character string that begins at the syllable selected on these two principles and whose length equals the word length taken out, has the dictionary retrieval part 14 search for the Chinese word corresponding to the cut-out string, and, if there is a corresponding word, converts the string into the Chinese character string that makes up that word. (Note: the term "Chinese character string" is used here because the result may be not only a word but also a sentence or longer text. As explained for "phonetic character string", however, the result may also be a single Chinese character, so "string" carries no strict distinction between singular and plural.) 16 is an output part that outputs the result converted by the conversion control part 13.
Fig. 3 is a conceptual diagram of the data structure of the dictionary part 15 of this embodiment. Its basic structure consists of commonly used phonetic character strings and their corresponding Chinese words, arranged in the same order as in the conventional art.
Fig. 4 is a conceptual diagram of the data structure of the dictionary word length information part 12 of this embodiment. Basically, it registers information representing every pronunciation of Chinese and, as the data associated with the phonetic character string of each pronunciation, the word lengths of the Chinese words registered in the dictionary part 15 that begin with that pronunciation, that is, the number of syllables making up those words. A word length of 1 indicates that no word of two or more Chinese characters begins with that pronunciation. In addition, Chinese has no words that begin with certain pronunciations, for example "men5". There is therefore no need to cut a syllable string beginning with "men5" out of the input phonetic character string as a retrieval target for the dictionary part.
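One way to picture the relationship between the dictionary part and the dictionary word length information part is to derive the latter from the former. The sketch below is only an illustration under stated assumptions: the dictionary layout (a mapping from space-separated syllable strings to candidate words), the function name `build_word_length_info`, and the toy entries are all hypothetical, and the source does not say how the information part is actually built.

```python
from collections import defaultdict


def build_word_length_info(dictionary):
    """Map each first syllable to the word lengths (in syllables) that begin with it.

    Assumption: dictionary keys are space-separated syllable strings.
    """
    info = defaultdict(set)
    for key in dictionary:
        syllables = key.split()
        info[syllables[0]].add(len(syllables))
    # Store lengths longest first, since longer words are tried first.
    return {first: sorted(lengths, reverse=True) for first, lengths in info.items()}


# Toy dictionary (hypothetical entries, candidate words listed most frequent first).
toy_dictionary = {
    "wo3": ["我"],
    "wo3 men5": ["我们"],
    "da3": ["打"],
    "da3 qiu2": ["打球"],
    "qu4": ["去"],
}

info = build_word_length_info(toy_dictionary)
print(info)
```

In this toy data no entry begins with "men5", so "men5" never appears as a starting syllable in `info`, which mirrors the remark that strings beginning with "men5" need not be cut out as retrieval targets.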
The processing procedure of this embodiment is described below with reference to Fig. 2.
S1: Input a phonetic character string, then go to S2.
S2: Check whether the current input is the end-of-input key. If it is, go to S3. If not, return to S1 and wait for the next input.
S3: Retrieve the word length information corresponding to each input syllable.
S4: From the word length information of the syllables not yet converted, take the largest word length not yet processed and a syllable having that word length; if several syllables have the same word length, take the one input earliest. Then go to S5.
S5: Determine whether a continuous syllable string can be cut out of the input syllable string, with the syllable taken as the starting point and the word length taken as the cut-out length. If it can, go to S6. If not, return to S4.
S6: Check whether the cut-out syllable string contains a syllable that has already been converted to Chinese characters. If it contains none, go to S7; if it contains one, return to S4.
S7: Search the dictionary part for a word corresponding to the cut-out syllable string. If there is one, go to S8; if not, return to S4.
S8: Take the cut-out syllable string as the current conversion target, convert it into the word found in S7, and go to S9.
S9: Check whether all input syllables have been converted to Chinese characters. If they have, go to S10; if unconverted syllables remain, return to S4.
S10: Output the whole conversion result, a Chinese character string, and end the conversion of the input phonetic character string.
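Steps S3 to S10 can be sketched in Python as below. This is an illustrative reading of the flow, not the patented implementation: the function name `convert`, the dictionary layout, the word length table (reconstructed from the worked example in the text rather than taken from Fig. 5), and the Chinese characters used as dictionary values are all assumptions.

```python
def convert(syllables, dictionary, length_info):
    """Convert syllables to text following S3-S10; also return the keys looked up."""
    n = len(syllables)
    converted = [False] * n
    words_at = [None] * n      # word stored at the index where it starts
    lookups = []

    # S3: word length information for every input syllable.
    per_syllable = [length_info.get(s, [1]) for s in syllables]

    # S4: larger word length first, earlier input second.
    candidates = sorted(
        ((length, start)
         for start, lengths in enumerate(per_syllable)
         for length in lengths),
        key=lambda c: (-c[0], c[1]),
    )

    for length, start in candidates:
        if all(converted):                  # S9: everything is converted
            break
        end = start + length
        if end > n:                         # S5: not enough syllables to cut
            continue
        if any(converted[start:end]):       # S6: overlaps a converted syllable
            continue
        key = " ".join(syllables[start:end])
        lookups.append(key)
        words = dictionary.get(key)         # S7: dictionary retrieval
        if not words:
            continue
        words_at[start] = words[0]          # S8: take the most frequent word
        converted[start:end] = [True] * length

    # S10: assemble the output; unconverted syllables pass through as typed.
    pieces = []
    for i in range(n):
        if words_at[i] is not None:
            pieces.append(words_at[i])
        elif not converted[i]:
            pieces.append(syllables[i])
    return "".join(pieces), lookups


# Assumed data for illustration only.
length_info = {
    "wo3": [4, 2, 1], "men5": [1], "da3": [7, 4, 3, 2, 1],
    "qiu2": [4, 1], "qu4": [3, 2, 1], "ba5": [1],
}
dictionary = {"wo3 men5": ["我们"], "da3 qiu2": ["打球"], "qu4": ["去"], "ba5": ["吧"]}

text, lookups = convert("wo3 men5 da3 qiu2 qu4 ba5".split(), dictionary, length_info)
print(text)
print(lookups)
```

Sorting all (length, start) pairs once stands in for the repeated return to S4; with the assumed data the sketch performs eight dictionary lookups in total.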
The operation of the embodiment configured as above is now described using the concrete input "wo3 men5 da3 qiu2 qu4 ba5" as an example.
When this phonetic character string has been input (S1) and the end-of-input key has been pressed (S2), the dictionary word length retrieval part retrieves the word length information corresponding to each input syllable (S3). Fig. 5 shows the word length information for each syllable in input order. As Fig. 5 shows, among the input syllables only the pronunciation "da3" begins a word of the maximum word length "7" (a seven-syllable word meaning "insisting on getting to the bottom of the matter" is registered in the dictionary). The pronunciations "men5" and "ba5", on the other hand, begin no word of two or more syllables.
The conversion control part 13, having learned the word length information of each input syllable from the dictionary word length retrieval part 11, takes the maximum length "7" from the word length information of the syllables not yet converted and takes the syllable corresponding to "7", namely the third input "da3" (S4). It then checks whether 7 continuous syllables can be cut out starting from this third syllable of the input string (S5). Since the input itself has only 6 syllables, it judges this impossible and looks for the next possible starting point and length (S4).
This time it takes the largest unprocessed length, "4", from the word length information of the unconverted syllables, and then the highest-priority, that is, earliest input, syllable corresponding to length "4", which here is the first input "wo3" (S4). The conversion control part judges that 4 continuous syllables can be cut out starting from this syllable (S5) and checks whether the cut-out string "wo3 men5 da3 qiu2" contains any converted syllable (S6). Since all of them are unconverted, the dictionary retrieval part searches the dictionary part for a corresponding word, using all these syllables as the retrieval key (S7). There is no corresponding word, so processing moves on to find the next possible starting point and length (S4).
Since there is still a syllable whose word length information includes "4" and that has not yet been a retrieval target, the next-priority such syllable, the third input "da3", is taken (S4). The conversion control part judges that the 4 continuous syllables "da3 qiu2 qu4 ba5" can be cut out starting from this third syllable (S5) and checks whether the cut-out string contains a converted syllable (S6). Since none has been converted, the dictionary retrieval part searches the dictionary part for a corresponding word, using all these syllables as the retrieval key (S7). There is no corresponding word, so the next possible starting point and length are sought (S4).
Next, "qiu2", the next-priority syllable whose word length information is likewise "4", is taken (S4). Since this is the fourth input syllable, the conversion control part judges that 4 continuous syllables cannot be cut out starting from it (S5), and the next possible word length is sought (S4).
In the same way as for "7" and "4", the largest remaining length, "3", is taken from the word lengths of the unconverted syllables, and the earliest input syllable corresponding to "3", namely "da3", is taken first (S4). The conversion control part judges that the 3 continuous syllables "da3 qiu2 qu4" can be cut out starting from this third syllable (S5) and that all of them are unconverted (S6). The dictionary retrieval part searches the dictionary part for a corresponding word, using these syllables as the retrieval key (S7). There is no corresponding word, so processing moves on to the next possible starting point and length (S4).
Next, the next-priority syllable whose length is likewise "3", the fifth input "qu4", is taken (S4). The conversion control part judges that 3 continuous syllables cannot be cut out starting from this syllable (S5), and processing moves on to the next possible starting point and length (S4).
Now the largest length to be processed is "2". The earliest input syllable corresponding to length "2", the first syllable "wo3", is taken (S4). The conversion control part cuts out the 2 continuous syllables "wo3 men5" starting from this first syllable (S5); none of them has been converted (S6), so the dictionary retrieval part searches the dictionary part for a corresponding word, using them as the retrieval key (S7). Since the corresponding word "we" exists, "wo3 men5" is converted into "we" (S8).
The length to be processed is still "2", and the next input syllable corresponding to length "2", the third syllable "da3", is taken (S4). The conversion control part judges that the 2 continuous syllables "da3 qiu2" can be cut out starting from this third input (S5); neither has been converted (S6). The dictionary retrieval part searches the dictionary part with these syllables as the retrieval key and finds the corresponding word "play ball" (S7), so "da3 qiu2" is converted into "play ball" (S8).
The length to be processed is still "2", and the next-priority syllable corresponding to "2", the fifth input "qu4", is taken (S4). The conversion control part judges that the 2 continuous syllables "qu4 ba5" can be cut out starting from this fifth syllable (S5); neither has been converted (S6). The dictionary retrieval part then searches for a corresponding word, using these syllables as the retrieval key (S7). There is no corresponding word, so processing moves on to the next possible starting point and length (S4).
Now the length to be processed is "1". The syllables that have been input but not yet converted are looked up in the dictionary part and converted in input order. The fifth input "qu4" is cut out (S5); the dictionary retrieval part searches the dictionary part with "qu4" as the retrieval key, finds the corresponding character with the highest usage frequency, "go" (S7), and the conversion is carried out (S8).
Next, the sixth syllable "ba5" becomes the processing target (S4). Of the characters with the pronunciation "ba5", the one with the highest usage frequency, "(outer 1)", is found in the dictionary part (S7), and "ba5" is converted into "(outer 1)" (S8). At this point the conversion control part judges that all input syllables subject to conversion have been converted, outputs the conversion result (in order: "we", "play ball", "go", "(outer 1)") to the output part, and ends the Chinese character conversion process.
The present invention has been described above on the basis of an embodiment, but it is of course not limited to that embodiment; the following cases, for example, are also included within its scope.
(1) Conversion need not, as in Fig. 2, begin only after the whole phonetic character string has been input and the end-of-input key pressed. Instead, each time a tone key is input, that is, each time one syllable is input, conversion of the phonetic character string input so far may begin, working back from the end position of the input.
(2) The dictionary word length information part need not be a separate member; it may be made integral with the dictionary part.
(3) "Phonetic characters" means characters that, by representing the pronunciation of Chinese text, sentences, words, or characters, serve to specify the text, sentence, word, or character that the operator intends. They of course include the phonetic symbols (zhuyin) and second-form phonetic symbols used in Taiwan and the Roman-letter pinyin used on the mainland, and also phonetic characters such as Japanese kana and others such as the Hangul of Korean.
"Chinese text" means text based on ideographic characters. Its constituent elements are not limited to Chinese characters and Chinese words; they of course also include Arabic numerals, characters that from the Chinese point of view were created abroad in Japan, such as "Toge", and foreign words such as "East capital" (Tokyo). Chinese text in this sense also includes the Chinese-character portions of Japanese.
(4) be added with such function, it is some specific syllable, with the beginning of its syllable do not have very much or all words without its syllable beginning (as " ん " in the Japanese), even it is a lot of to constitute the number of words of its word in the case, the retrieval of dictionary portion is still taken turns in the back, otherwise, some specific syllable is because of there being the very high word of usage frequency that begins with this syllable, to this syllable, even its word length is short, also make retrieval preferential.
Under these situations, be easy to the note of calibrating (seal) achieve the goal by on the respective word length information, adding.
(5) For convenience of manufacture and the like, one component of the present invention may be physically or mechanically divided into several; conversely, several components may be physically or mechanically made into one; or the two approaches may be suitably combined.
Alternatively, an existing Chinese character conversion device may be made to store the program and data of the present invention so that it performs the functions of the present invention.
(6) A learning function may be added. Where the user's habits or the text being entered show notable tendencies in the words and Chinese characters being converted, in their frequency, or in the number of characters per word, the learning function handles these efficiently.
In these cases too, this is easily realized by providing additional simple storage means and counting means, using them to establish a priority order within the data of the dictionary word length information part, and finally updating the registered word length information accordingly.
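Modifications (4) and (6) can be pictured together in a short Python sketch. It is only an illustration under stated assumptions: the names `LengthEntry`, `UsageLearner`, `order_candidates`, and `demo_table`, the three mark values, the threshold of three uses, and the word length values are all hypothetical. The text itself says only that a mark is attached to the word length information and that simple storage and counting means are added.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical mark values: a smaller value is searched earlier.
PREFER, NORMAL, DEFER = 0, 1, 2


@dataclass
class LengthEntry:
    """Word length information for one starting syllable, plus a mark."""
    word_length: int      # assumed: length of the longest word starting here
    mark: int = NORMAL


def order_candidates(syllables: List[str], table: Dict[str, LengthEntry]) -> List[int]:
    """Return input positions in the order their dictionary search would run."""
    known = [i for i, s in enumerate(syllables) if s in table]
    # Mark first, then larger word length, then earlier input position.
    return sorted(known, key=lambda i: (table[syllables[i]].mark,
                                        -table[syllables[i]].word_length, i))


class UsageLearner:
    """Counts confirmed words per starting syllable and promotes frequent ones."""

    def __init__(self, threshold: int = 3) -> None:
        self.counts: Counter = Counter()
        self.threshold = threshold

    def record(self, first_syllable: str, table: Dict[str, LengthEntry]) -> None:
        self.counts[first_syllable] += 1
        entry = table.get(first_syllable)
        if entry is not None and self.counts[first_syllable] >= self.threshold:
            entry.mark = PREFER


def demo_table() -> Dict[str, LengthEntry]:
    # Made-up lengths; "n" stands for a syllable such as ん that rarely starts a word.
    return {"ka": LengthEntry(3), "n": LengthEntry(4, DEFER), "ji": LengthEntry(2)}
```

With `demo_table()`, `order_candidates(["ka", "n", "ji"], table)` gives `[0, 2, 1]`: the deferred syllable is searched last although its word length is the largest. After `record("ji", table)` has been called three times, the same call gives `[2, 0, 1]`, because the frequently used syllable has been promoted.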
In summary, with the present invention, when a syllable string input in phonetic characters is converted into written Chinese, it is not necessary to step through the input syllable string one syllable at a time, cutting out every possible run of syllables up to a single uniform maximum word length and searching the dictionary for each. Instead, the word length information for each syllable is consulted, first priority is given to the larger word length and second priority to the syllable input earlier, and only the syllables of words that can become search targets are cut out. Unnecessary dictionary searches are thereby reduced, so the efficiency of the Chinese character conversion device is improved considerably.

Claims (1)

1. A Chinese character conversion device, characterized by comprising: (1) an input part for inputting a phonetic character string; (2) a dictionary part in which phonetic character strings and their corresponding Chinese words are registered; (3) a dictionary word length information part in which every Chinese pronunciation is registered in correspondence with information on the word length of the Chinese words that are registered in said dictionary part and begin with that pronunciation; (4) a dictionary word length retrieval part that, for each syllable of the input phonetic character string, takes the corresponding word length information out of said dictionary word length information part; (5) a syllable segmentation part that, when cutting out from the input phonetic character string the syllables to be converted into Chinese characters, gives first priority to the pronunciation with the larger word length taken out by said dictionary word length retrieval part and, for equal word lengths, second priority to the syllable input earlier, and cuts out a phonetic character string that begins at the syllable selected on this principle and whose length equals the word length taken out for that syllable; (6) a dictionary retrieval part that, using the phonetic character string cut out by said syllable segmentation part as a search key, retrieves the corresponding Chinese word from said dictionary part; and (7) a conversion part that, according to the Chinese word retrieved from said dictionary part, converts said phonetic character string into the corresponding Chinese characters.
CN 94104871 1993-08-06 1994-04-26 Chinese character changing device Pending CN1115884A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP5196371A JP2997151B2 (en) 1993-08-06 1993-08-06 Kanji conversion device
JP196371/93 1993-08-06

Publications (1)

Publication Number Publication Date
CN1115884A true CN1115884A (en) 1996-01-31

Family

ID=16356752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 94104871 Pending CN1115884A (en) 1993-08-06 1994-04-26 Chinese character changing device

Country Status (2)

Country Link
JP (1) JP2997151B2 (en)
CN (1) CN1115884A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109643322A (en) * 2016-09-02 2019-04-16 株式会社日立高新技术 The processing system of the construction method of character string dictionary, the search method of character string dictionary and character string dictionary
CN109643322B (en) * 2016-09-02 2022-11-29 株式会社日立高新技术 Method for constructing character string dictionary, method for searching character string dictionary, and system for processing character string dictionary

Also Published As

Publication number Publication date
JPH0749858A (en) 1995-02-21
JP2997151B2 (en) 2000-01-11

Similar Documents

Publication Publication Date Title
CN1176456C (en) Automatic index based on semantic unit in data file system and searching method and equipment
US5164900A (en) Method and device for phonetically encoding Chinese textual data for data processing entry
CN1098500C (en) Method and apparatus for translation
Springmann et al. OCR of historical printings with an application to building diachronic corpora: A case study using the RIDGES herbal corpus
CN1008016B (en) Imput process system
JPH03224055A (en) Method and device for input of translation text
JP4738847B2 (en) Data retrieval apparatus and method
CN1115884A (en) Chinese character changing device
US20050080612A1 (en) Spelling and encoding method for ideographic symbols
JPH10269204A (en) Method and device for automatically proofreading chinese document
CN1019425B (en) Chinese input system and its key board
CN1043821C (en) Chinese character shifting apparatus
CN1111373A (en) Computer Chinese input scheme based on the Chinese Phonetic Alphabet
CN1323004A (en) Automatic conversion method from Chinese braille to Chinese character
CN1928789A (en) Chinese character input method for computer
CN1181529A (en) User's action recording device
CN1928790A (en) Novel spelling character
CN88100890A (en) The language input media of Chinese article writing device
CN1056457C (en) Hanyupinying writing inputing method for computer
JP3045886B2 (en) Character processing device with handwriting input function
CN1043490C (en) Muti-word exchanging apparatus and Chinese character exchanging apparatus
CN1609762A (en) Binary syllabification
CN1089736A (en) Fuzzy character transtormer
JPS6121581A (en) Character recognizer
JPH10240725A (en) Method for processing data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication