CN101324878A - Method and apparatus for automatically learning new words and character input system - Google Patents
Method and apparatus for automatically learning new words and character input system
- Publication number
- CN101324878A CNA2007101118424 CN200710111842
- Authority
- CN
- China
- Prior art keywords
- word string
- new word
- new
- dictionary
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
Abstract
The invention discloses an automatic new-word learning method for a target-language character input system that includes a word lexicon. The method comprises: a collection step, in which word strings that exist in neither a new word lexicon nor the word lexicon are collected from the input target-language corpus as collected word strings, the new word lexicon being used to store words that the word lexicon lacks; a first saving step, in which collected word strings that are not in an interim new word string dictionary are saved there as interim new word strings, and collected word strings that are in the interim new word string dictionary but not in a new word string dictionary are saved in the new word string dictionary as new word strings; and a second saving step, in which, when the user, while entering target-language text with the input system, selects a new word string that is held in the new word string dictionary and presented as an input candidate, that new word string is saved in the new word lexicon as a new word.
Description
Technical field
The present invention relates to text input, and in particular to an automatic new-word learning method for a target-language character input system, an apparatus that uses the method, and a character input system that uses the method. The invention can improve the input efficiency of languages such as Chinese or Japanese and is suited to portable information terminals such as mobile phones.
Background technology
Entering non-Western scripts, in particular East Asian scripts such as Chinese and Japanese, into digital devices such as computers has long been a difficult problem. To let computers process Chinese, a variety of character input methods have been developed, raising the level of automation in information processing.
To improve text input efficiency, some character input methods support learning new words during input. New words are usually learned in one of two ways: the user adds them manually, or the system learns them automatically.
One existing example of manual new-word learning for Chinese is Microsoft's Chinese input method. During input, the user opens the dedicated new-word learning tool that ships with the Chinese input system and adds a newly encountered word to the dictionary maintained by the input system. The next time the word is needed, it appears among the candidates for the user to select.
Another existing example of manual new-word learning for Chinese is the "old bridge" input method. With this method, while entering Chinese the user can use keys on the keyboard as selection keys to mark the start and end positions of a new word; the Chinese input system records the marked word and offers it in later input.
Chinese patent applications CN94104905.1 and CN94106045.4 disclose an automatic new-word learning method for Chinese. In this method, every new word string that the user enters is recorded as a new word. The system then accumulates how often the user uses each new word, keeps the frequently used ones, and deletes the rarely used ones. Such a Chinese input system can store new words permanently.
Another example of automatic new-word learning for Chinese is the "purple light" pinyin input method. In this method, new words are held temporarily in memory, their usage frequency is accumulated, and the order in which they appear as input candidates is adjusted according to that frequency.
Clearly, manual new-word learning cannot learn new words automatically while the user is entering Chinese continuously. The user must perform manual operations to help the Chinese input system learn each new word, which burdens the user and reduces input efficiency.
Moreover, existing automatic new-word learning methods all apply to input modes in which the pinyin or other character codes of a word string of two or more Chinese characters are converted at once. That is, the user enters the pinyin string or other code string for a word string of two or more Chinese characters and then selects the corresponding characters; new-word learning relies on the correspondence between the entered pinyin or code string and the resulting Chinese character string.
However, in the single-character Chinese input systems used in embedded devices and portable terminals, the user can enter the pinyin, strokes, or other code of only one Chinese character at a time, and each conversion yields a single character. (Single-character input appears mainly in embedded devices and portable terminals; one example is the T9 input method used in mobile phones.) Because every conversion uses the pinyin or other code of a single character, the user never enters the pinyin string or code string of two or more characters in succession. Existing automatic new-word learning methods, which rely on the correspondence between a pinyin or code string and a Chinese character string, therefore cannot be applied to such systems.
Furthermore, existing automatic new-word learning methods immediately store any word string that is missing from the dictionary and offer it as an input candidate, then decide whether to keep or delete it according to the accumulated number of times it is selected. Even word strings that the user rarely uses, or meaningless strings of two or more Chinese characters, are stored in the dictionary as new words and offered as candidates. As a result, the dictionary and the candidate list fill up with rarely used or meaningless word strings, which lowers both the processing efficiency of the Chinese input system and the efficiency with which the user selects candidates.
Summary of the invention
[Technical problem to be solved]
The present invention was made in view of the above problems. Its object is to provide an automatic new-word learning method and apparatus, and a character input system, that improve the input efficiency of target-language text such as Chinese or Japanese and are suited to portable information terminals such as mobile phones.
[Means for solving the technical problem]
In one aspect of the invention, an automatic new-word learning method is provided for a target-language character input system that includes a word lexicon. The method comprises: a collection step of collecting, from the input target-language corpus, word strings that are in neither a new word lexicon nor the word lexicon as collected word strings, the new word lexicon storing words that the word lexicon lacks; a first saving step of saving collected word strings that are not in an interim new word string dictionary into that dictionary as interim new word strings, and saving collected word strings that are in the interim new word string dictionary but not in a new word string dictionary into the new word string dictionary as new word strings; and a second saving step of, when the user, while entering target-language text with the character input system, selects a new word string from the new word string dictionary that is presented as an input candidate, saving that new word string in the new word lexicon as a new word.
Preferably, the collection step comprises: cutting the target-language corpus into segments at specific characters that occur in the continuously input corpus; and saving, as collected word strings, the segments that differ from the words in the new word lexicon and the word lexicon.
Preferably, the specific characters include at least one of: characters other than target-language characters, and target-language characters that form single-character words.
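The segmentation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the target language is Chinese, approximates "characters other than target-language characters" as anything outside the CJK Unified Ideographs block, and takes the set of single-character words as a caller-supplied parameter. The function names are hypothetical.

```python
import re

# Assumption: "specific characters" are approximated as anything that is not
# a CJK unified ideograph (punctuation, digits, Latin letters, whitespace).
NON_CJK = re.compile(r"[^\u4e00-\u9fff]+")


def cut_into_segments(corpus, single_char_words=frozenset()):
    """Cut the corpus at specific characters and return the segments."""
    segments = []
    for run in NON_CJK.split(corpus):
        current = ""
        for ch in run:
            if ch in single_char_words:
                # A single-character word also acts as a cut point.
                if current:
                    segments.append(current)
                current = ""
            else:
                current += ch
        if current:
            segments.append(current)
    return segments


def collect_word_strings(corpus, word_lexicon, new_word_lexicon,
                         single_char_words=frozenset()):
    """Keep only the segments that neither lexicon already contains."""
    known = set(word_lexicon) | set(new_word_lexicon)
    return [s for s in cut_into_segments(corpus, single_char_words)
            if s not in known]
```

Because the cut points come from the text itself, this step needs no pinyin-to-word correspondence, which is why it also works when characters are entered one at a time.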
Preferably, the first saving step comprises: if the new word string dictionary does not hold the collected word string, comparing the collected word string with the word strings in the interim new word string dictionary; if the interim new word string dictionary does not hold the collected word string, saving it in the interim new word string dictionary as an interim new word string; and if the interim new word string dictionary does hold the collected word string, saving it in the new word string dictionary as a new word string and deleting the corresponding interim new word string from the interim new word string dictionary.
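The two-stage routing in the first saving step can be sketched as below. The sketch assumes both dictionaries are plain Python sets and ignores word frequencies; the function name and the returned labels are hypothetical and serve only to make the three outcomes visible.

```python
def first_save_step(collected, interim_dict, new_string_dict):
    """Route one collected word string between the two dictionaries.

    Both dictionaries are plain sets here (an assumption for brevity).
    Returns a label naming the action taken.
    """
    if collected in new_string_dict:
        return "already-new-word-string"   # nothing to do
    if collected not in interim_dict:
        interim_dict.add(collected)        # first sighting
        return "saved-as-interim"
    # Second sighting: promote and remove the interim entry.
    interim_dict.discard(collected)
    new_string_dict.add(collected)
    return "promoted-to-new-word-string"
```

The effect is that a word string must be collected at least twice before it can ever be offered as a candidate, which filters out one-off strings.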
Preferably, the new word strings held in the new word string dictionary are presented to the user as input candidates of the target-language character input system.
Preferably, the new word string dictionary stores a flag for each new word string, in one-to-one correspondence, and each flag has a preset initial value.
Preferably, the second saving step further comprises: when the user selects another input candidate as the input word, increasing or decreasing the value of the new word string's flag by a predetermined amount.
Preferably, when the flag of a new word string reaches a predetermined value, that new word string is deleted from the new word string dictionary.
Preferably, new words are learned automatically while the user continuously enters the target-language corpus.
Preferably, the word frequencies of the collected word strings, interim new word strings, new word strings, and new words are counted and saved.
Preferably, the input candidates are sorted by word frequency.
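Sorting candidates by word frequency might look like the following sketch. The text does not say how ties are broken or how unseen strings are treated, so the choices here (frequency 0 for unseen strings, original order for ties) are assumptions, as is the function name.

```python
def rank_candidates(candidates, word_freq):
    """Return candidates ordered by descending word frequency.

    word_freq maps a word string to its count; unseen strings count as 0.
    sorted() is stable, so ties keep their original relative order.
    """
    return sorted(candidates, key=lambda w: word_freq.get(w, 0), reverse=True)
```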
In a second aspect of the invention, an automatic new-word learning apparatus is provided for a target-language character input system that includes a word lexicon. The apparatus comprises: a display unit that shows the target-language word string output by the input system as the target-language corpus, together with one or more other conversion results for the input as candidate word strings; a new word lexicon that stores words the word lexicon lacks; a word string collecting unit that collects, from the converted target-language corpus, word strings that are in neither the word lexicon nor the new word lexicon; an interim new word string dictionary that stores, as interim new word strings, those collected word strings that are in neither the new word string dictionary nor the interim new word string dictionary; a new word string dictionary that stores, as new word strings, those collected word strings that are in the interim new word string dictionary but not in the new word string dictionary; a first saving unit that saves the word strings collected by the word string collecting unit into the interim new word string dictionary or the new word string dictionary according to prescribed conditions; and a second saving unit that, when the candidate word string the user selects from those shown on the display unit is a new word string, saves it in the new word lexicon as a new word.
[Effects of the invention]
With the method and apparatus of the invention, new words are learned fully automatically as the user enters target-language text, with no manual operation, which improves text input efficiency.
In addition, the invention segments the target-language corpus at specific characters in the user's input and performs new-word learning on the resulting segments. Existing automatic methods, which learn from the correspondence between pinyin or other character codes and word strings, cannot be applied to single-character input systems; the method of the invention can, which makes it suitable for the single-character input systems of embedded devices.
In addition, the invention screens the collected word strings through several stages of usage-frequency statistics, deletes those with low usage frequency, and stores only frequently used word strings in the new word lexicon as new words, which improves the accuracy of new-word learning.
In addition, because of the same multi-stage screening, only frequently used word strings and the contents of the new word lexicon are offered to the character input system as candidates, which improves text input efficiency.
In addition, the invention records the word frequency (frequency of use) of each collected word string. When a word string is offered to the character input system as an input candidate, its word frequency is supplied as well, so the input system can rank candidates by it, which further improves input efficiency.
In addition, the contents of the interim new word string dictionary, the new word string dictionary, and the new word lexicon are retained after the character input system is closed, so learning results accumulate from session to session.
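One way to retain the three dictionaries across sessions is sketched below. The text does not specify a storage format; a single JSON file, the key names, and both function names are assumptions made for illustration.

```python
import json
from pathlib import Path


def save_dictionaries(path, interim, new_strings, new_words):
    """Write the three learned dictionaries to one JSON file."""
    state = {"interim": interim, "new_strings": new_strings, "new_words": new_words}
    Path(path).write_text(json.dumps(state, ensure_ascii=False), encoding="utf-8")


def load_dictionaries(path):
    """Read the dictionaries back, or start empty if no file exists yet."""
    p = Path(path)
    if not p.exists():
        return {}, {}, {}
    state = json.loads(p.read_text(encoding="utf-8"))
    return state["interim"], state["new_strings"], state["new_words"]
```

An input system would call `load_dictionaries` at startup and `save_dictionaries` on shutdown, so that an interim new word string seen in one session can still be promoted in the next.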
Description of drawings
The above and other objects, features, and advantages of the invention will become clearer from the following description of preferred embodiments, taken together with the accompanying drawings, in which:
Fig. 1 is a schematic block diagram of the Chinese input system and the automatic new-word learning apparatus according to an embodiment of the invention;
Fig. 2 is an overall flowchart of the automatic new-word learning method according to the embodiment;
Fig. 3 shows the interaction between the Chinese input system and the automatic new-word learning apparatus during each step shown in Fig. 2;
Fig. 4 shows the data structure of the word information stored in the interim new word string dictionary and the new word lexicon, and of the word information supplied to the Chinese input system as input candidates;
Fig. 5 shows the data structure of the word information stored in the new word string dictionary used in the embodiment;
Fig. 6 is a flowchart detailing word string collection;
Fig. 7 is a flowchart detailing the judgment of new word strings; and
Fig. 8 is a flowchart detailing the judgment of new words.
Embodiment
A preferred embodiment of the invention is described in detail below with reference to the drawings. Details and functions that are not necessary to the invention are omitted so as not to obscure its understanding.
Fig. 1 is a schematic block diagram of the Chinese input system and the automatic new-word learning apparatus according to the embodiment of the invention.
As shown in Fig. 1, the Chinese input system 100 comprises a candidate output 110, a word lexicon 120, a corpus output 130, and a first memory area 140.
After the Chinese input system 100 starts, the new word strings and new words held in the new word string dictionary 260 and the new word lexicon 270 of the new-word learning apparatus 200 are read into the first memory area 140.
The user enters codes such as pinyin or strokes by pressing keys on a physical keyboard or on a virtual keyboard shown on the screen. Based on the correspondence between the entered code and the words in the word lexicon 120 and the new word strings and new words in the first memory area 140, the candidate output 110 presents candidate words and word strings to the user.
The user selects the desired word or word string from these candidates. The corpus output 130 outputs the selections in input order, and they are stored in another memory area (for example, the second memory area) or displayed on the screen.
The automatic new-word learning apparatus 200 according to the embodiment comprises: a word string collecting part 220, which collects word strings from the input target-language corpus as collected word strings; a second memory area 230, which temporarily stores the collected word strings; a new word string saving part 240, which determines and saves interim new word strings and new word strings; an interim new word string dictionary 250, which holds collected word strings found in neither the interim new word string dictionary nor the new word string dictionary; a new word string dictionary 260, which holds collected word strings that were already in the interim new word string dictionary but not yet in the new word string dictionary; a new word saving part 210, which determines and saves new words; and a new word lexicon 270, which holds, as new words, word strings that are in the new word string dictionary but not in the new word lexicon.
As described above, during text input the corpus output 130 stores the Chinese corpus in the second memory area 230 in real time for word string collection. The word string collecting part 220 identifies specific characters in the input corpus, that is, symbols other than target-language text such as punctuation marks, digits, and letters of other languages (for example, the English alphabet), and cuts the input corpus into word strings at those characters. If a resulting word string is already stored in the built-in word lexicon 120 of the Chinese input system 100 or in the new word lexicon 270, it is not a new word and is deleted from the second memory area 230. If the word string is found in neither lexicon, it is retained in the second memory area 230 as a collected word string. The specific characters themselves are not collected.
Next, the new word string saving part 240 compares the collected word strings held in the second memory area 230 with the contents of the new word string dictionary 260. If the new word string dictionary 260 already holds a collected word string, that word string is deleted from the second memory area 230.
If the new word string dictionary 260 does not hold the collected word string, it is compared with the contents of the interim new word string dictionary 250. If the interim new word string dictionary 250 does not hold it either, the collected word string is saved in the interim new word string dictionary 250 as an interim new word string and deleted from the second memory area 230.
If the interim new word string dictionary 250 already holds the collected word string, it is saved in the new word string dictionary 260 as a new word string, deleted from the second memory area 230, and the matching interim new word string is deleted from the interim new word string dictionary 250. As noted above, the new word strings held in the new word string dictionary 260 serve as input candidates of the Chinese input system 100.
While the user enters text, the user's selections are passed to the new word saving part 210, which judges whether a new word string stored in the new word string dictionary 260 can become a new word.
When the input corpus shows that the user selected a new word string, presented as an input candidate of the Chinese input system 100, as the input word, the new word saving part 210 saves that new word string in the new word lexicon 270 as a new word and deletes it from the new word string dictionary 260. When the user instead selects another input candidate as the input word, the flag of the new word string (initial value 0) is decremented by 1. When the flag reaches M (M is a preset value, M < 0), the new word string is deleted from the new word string dictionary 260. Fig. 5 shows the data structure of the word information stored in the new word string dictionary 260 used in the embodiment.
Fig. 2 is an overall flowchart of the automatic new-word learning method of the invention. As shown in Fig. 2, while the Chinese corpus is being entered continuously (S110), the input corpus is cut into word strings at the specific characters described above (S120). It is then judged whether each collected word string is a new word string (S130), and further whether a new word string can become a new word (S140). Finally, new word strings and new words are stored in the new word string dictionary 260 and the new word lexicon 270, respectively.
As shown in Fig. 3, after the Chinese input system 100 starts, it supplies the word information held in its own word lexicon 120 to the automatic new-word learning apparatus 200. The apparatus 200 in turn places the word information it holds for use as candidates, for example the contents of the new word string dictionary 260 and the new word lexicon 270, in the first memory area 140 so that it can be presented to the user as input candidates.
During word string collection, the corpus output 130 of the Chinese input system continuously supplies the corpus entered by the user to the second memory area 230 of the automatic new-word learning apparatus 200.
During new word string judgment, while the Chinese input system 100 supplies the corpus to the apparatus 200, the apparatus 200 places the new word strings it has identified in the first memory area 140 as input candidate information, to be presented as candidates during the user's text input.
During new word judgment, while the Chinese input system 100 supplies the corpus to the apparatus 200, the apparatus 200 supplies the new words in the updated new word lexicon to the first memory area 140 as candidates, to be presented during the user's text input.
Fig. 6 is a flowchart detailing word string collection. As shown in Fig. 6, suppose the user continuously enters the corpus "science and technology develops very fast,". The word string collecting part 220 compares each pair of adjacent characters with the words stored in the word lexicon 120 and the new word lexicon 270 (S121).
Suppose that "science" and "technology" are in the built-in word lexicon 120 of the Chinese input system 100 and that "development" is in the new word lexicon 270, while "very fast" is in neither the word lexicon 120 nor the new word lexicon 270.
It is then judged whether the corpus contains words that exist in the word lexicon 120 or the new word lexicon 270 (S122). Because "very fast" is in neither, it is saved to the second memory area 230 (S123).
Next, it is judged whether one of the specific characters described above occurs in the input corpus (S124). When the specific character "," is detected, word string collection for the Chinese corpus segment delimited by "," ends (S125).
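The collection walk-through of Fig. 6 can be approximated with a greedy forward match against the known words, keeping whatever is left uncovered. This is only a sketch under several assumptions: the source describes comparing adjacent characters but does not fix a matching algorithm, so longest-match-first is a stand-in; unmatched runs shorter than two characters are discarded; and the Chinese sentence in the usage note is an assumed rendering of the translated example (consistent with the pinyin "feikuai" given later), not text taken from the source. The function name and `max_len` parameter are hypothetical.

```python
def collect_unknown_runs(segment, known_words, max_len=4):
    """Greedy forward matching; return runs of characters no known word covers."""
    runs, current, i = [], "", 0
    while i < len(segment):
        match = None
        # Try the longest known word (two or more characters) starting at i.
        for n in range(min(max_len, len(segment) - i), 1, -1):
            if segment[i:i + n] in known_words:
                match = segment[i:i + n]
                break
        if match:
            # A known word closes any pending unknown run.
            if len(current) >= 2:
                runs.append(current)
            current = ""
            i += len(match)
        else:
            current += segment[i]
            i += 1
    if len(current) >= 2:
        runs.append(current)
    return runs
```

With the assumed segment "科学技术发展飞快" and the known words "科学", "技术", and "发展", the only uncovered run is "飞快", which corresponds to the collected word string "very fast" in the example.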
Fig. 7 is a flowchart detailing the judgment of new word strings. As shown in Fig. 7, the new word string saving part 240 compares the collected word string "very fast" held in the second memory area 230 with the contents of the new word string dictionary 260 (S131).
If "very fast" is not in the new word string dictionary 260 but is already in the interim new word string dictionary 250, it is saved in the new word string dictionary and its word frequency is updated (S135). "very fast" is then deleted from the interim new word string dictionary and the second memory area 230 (S136), and the contents of the updated new word string dictionary are supplied to the Chinese input system 100 as candidate word information.
Fig. 4 shows the data structure of the word information stored in the interim new word string dictionary and the new word lexicon, and of the word information supplied to the Chinese input system as input candidates. As shown in Fig. 4, each word (word string) is stored together with its word frequency. The Chinese input system 100 presents candidates to the user according to word frequency.
If "very fast" is in neither the new word string dictionary 260 nor the interim new word string dictionary 250, it is saved in the interim new word string dictionary 250 (S134), and the collected word string "very fast" is then deleted from the second memory area 230 (S137).
Fig. 8 is a flowchart detailing the judgment of new words. As shown in Fig. 8, the flag of the word string "very fast", which has just been saved in the new word string dictionary, is set to its initial value 0 (S141).
When the user enters the pinyin "feikuai", "very fast" appears among the input candidates (S142). It is then judged whether the user takes this candidate as the input (S143). If the user selects the candidate "very fast", it is deleted from the new word string dictionary 260 (S146) and saved in the new word lexicon 270 (S147).
If the user selects a candidate other than "very fast", the flag of "very fast" in the new word string dictionary is decreased by a predetermined amount, for example 1 (S144).
It is then judged whether the flag equals M (M is a preset value, an integer less than 0) (S145). After this process has repeated enough times that the flag of "very fast" equals M, "very fast" is deleted from the new word string dictionary (S146), and the contents of the updated new word string dictionary are supplied to the Chinese input system as candidate word information.
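The judgment flow of Fig. 8 can be sketched as a single update function. The step labels in the comments refer to the flowchart steps named above; everything else is an assumption for illustration: the new word string dictionary is modeled as a mapping from word string to flag, the threshold default of -3 is an arbitrary stand-in for M, and the comparison uses "less than or equal" rather than strict equality so that a step larger than 1 cannot skip past M.

```python
def judge_new_word(new_string, chosen, new_string_dict, new_word_lexicon,
                   threshold=-3, step=1):
    """Update the dictionaries after the user picks `chosen` from the
    candidate list in which `new_string` was offered.

    new_string_dict maps each new word string to its flag (initial value 0).
    threshold plays the role of M and must be negative.
    """
    if new_string not in new_string_dict:
        return "not-tracked"
    if chosen == new_string:
        del new_string_dict[new_string]           # S146: leave the new word string dictionary
        new_word_lexicon.add(new_string)          # S147: becomes a new word
        return "promoted"
    new_string_dict[new_string] -= step           # S144: another candidate was chosen
    if new_string_dict[new_string] <= threshold:  # S145: flag has reached M
        del new_string_dict[new_string]           # S146: drop the unused string
        return "deleted"
    return "kept"
```

A new word string therefore leaves the new word string dictionary in one of two ways: it is promoted the first time the user picks it, or it is dropped after being passed over enough times.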
Fig. 5 shows the data structure of the stored word information of new word string dictionary used in the embodiment of the invention.As shown in Figure 5, each new word string is not only corresponding with corresponding word frequency, also stores accordingly with its zone bit.Thereby, when zone bit is predetermined value M, the word string in the new word string dictionary 260 can be deleted.
Therefore, according to method of the invention process the user import continuously Chinese in, by to the continuously collection of the Chinese language material of the input word string of carrying out of user, and the word string of being gathered carried out probability statistical analysis, the new word that does not have in the word lexicon 120 of the input system that learns Chinese automatically 100.Whole process does not need the user to carry out any operation, uses the user and finishes automatically when input in Chinese system 100 imports.
The related calculating of automatically learning new words method of the present invention is simple, resource is occupied less, and can support to comprise and the various input systems of the input system of individual character input " can only " (can't carry out the phonetic conversion of speech unit) be applicable to embedded system and portable terminal.
In addition, the automatically learning new words device 200 of the embodiment of the invention can be integrated in the input in Chinese system 100 as the new word study module of dictionary, also can be used as independent plug-in unit and is connected with Chinese character coding input method by interface, is installed on various input in Chinese system.
In the embodiment described above, when the user chooses a candidate other than this word string, a predetermined value, for example 1, is subtracted from the value of its flag bit. The invention is not limited to this, however; a predetermined value may instead be added to the flag bit of the word string. This achieves the same effect as the embodiment described above.
The invention has thus been described with reference to preferred embodiments. It should be understood that those skilled in the art can make various other changes, substitutions, and additions without departing from the spirit and scope of the invention. The scope of the invention is therefore not limited to the specific embodiments above, but is defined by the appended claims.
Claims (25)
1. A method for automatically learning new words, applicable to a target-language character input system that includes a word dictionary, the method comprising:
an acquisition step of collecting, from an input target-language corpus, word strings that exist in neither a new word dictionary nor the word dictionary as collected word strings, the new word dictionary being used to store words that do not exist in the word dictionary;
a first saving step of saving those collected word strings that are not present in a temporary new word string dictionary into the temporary new word string dictionary as temporary new word strings, and saving those collected word strings that are present in the temporary new word string dictionary but not in a new word string dictionary into the new word string dictionary as new word strings; and
a second saving step of, when the user, while entering target-language text with the target-language character input system, selects a new word string that is contained in the new word string dictionary and presented as an input candidate, saving that new word string in the new word dictionary as a new word.
2. the method for claim 1, wherein said acquisition step comprises:
With the specific character in the object language language material of continuous input with the described object language language material cutting section of being; And
Section that will be different with the word in new word lexicon and the word lexicon is preserved as gathering word string.
3. method as claimed in claim 2, wherein said specific character comprise character except the object language literal become with individual character speech the object language literal one of at least.
4. the method for claim 1, wherein said first preserves step comprises:
Do not preserve under the situation of described collection word string at new word string dictionary, the word string of described collection word string and interim new word string dictionary is compared;
Do not preserve under the situation of described collection word string at interim new word string dictionary, described collection word string is preserved into interim new word string dictionary as interim new word string; And
Preserve under the situation of described collection word string at interim new word string dictionary, described collection word string is preserved into new word string dictionary as new word string, and described interim new word string is deleted from interim new word string dictionary.
5. method as claimed in claim 4, the new word string that wherein said new word string dictionary is preserved is presented to the user as the input candidate item of object language character input system.
6. the method for claim 1, wherein said second preserves step comprises:
Under the situation that described new word string is chosen as the input word by the user as the input candidate item of object language character input system, described new word string is preserved into new word lexicon as new word; And
Described new word string is deleted from new word string dictionary.
7. method as claimed in claim 6 wherein, stored in the described new word string dictionary and the new one to one word string zone bit of described new word string, and described new word string zone bit has default initial value.
8. method as claimed in claim 7, described second preserves step also comprises:
Choose under the situation of other input candidate item as the input word the user, with the value increase or the minimizing predetermined number of described new word string zone bit.
9. method as claimed in claim 8 when the value of wherein said new word string zone bit is predetermined value, should new word string be deleted from new word string dictionary.
10. the method for claim 1, wherein in the continuous input object language of user language material, carry out the study of new word automatically.
11. described collection word string is wherein added up and preserved to the method for claim 1, interim new word string, new word string, the word frequency of new word.
12. method as claimed in claim 11, wherein said input candidate item sorts with word frequency.
13. A device for automatically learning new words, applicable to a target-language character input system that includes a word dictionary, the device comprising:
a display unit that displays the target-language word string that the target-language input system outputs as the target-language corpus, together with candidate word strings that are one or more other conversion results of the input word string;
a new word dictionary that stores words not present in the word dictionary;
a word string collecting unit that collects, from the converted target-language corpus, word strings present in neither the word dictionary nor the new word dictionary;
a temporary new word string dictionary that stores, as temporary new word strings, those collected word strings gathered by the word string collecting unit that are present in neither the new word string dictionary nor the temporary new word string dictionary;
a new word string dictionary that stores, as new word strings, those collected word strings gathered by the word string collecting unit that are present in the temporary new word string dictionary but not in the new word string dictionary;
a first saving unit that saves, according to prescribed conditions, those collected word strings gathered by the word string collecting unit that are not present in the temporary new word string dictionary and the new word string dictionary into the temporary new word string dictionary or the new word string dictionary; and
a second saving unit that, when the candidate word string the user selects from the target-language candidate word strings shown on the display unit is a new word string, saves it into the new word dictionary as a new word.
14. The device of claim 13, wherein the word string collecting unit cuts the continuously input target-language corpus into segments at specific characters in the corpus, and saves, as collected word strings, those segments that differ from the words in the new word dictionary and the word dictionary.
15. The device of claim 14, wherein the specific characters comprise at least one of: characters other than target-language characters, and target-language characters that form a word on their own.
16. The device of claim 13, wherein the first saving unit, when the new word string dictionary does not hold the collected word string, compares the collected word string with the word strings in the temporary new word string dictionary; when the temporary new word string dictionary does not hold the collected word string, saves the collected word string into the temporary new word string dictionary as a temporary new word string; and when the temporary new word string dictionary holds the collected word string, saves the collected word string into the new word string dictionary as a new word string and deletes the temporary new word string from the temporary new word string dictionary.
17. The device of claim 16, wherein the new word strings held in the new word string dictionary are presented to the user as input candidates of the target-language character input system.
18. The device of claim 13, wherein the second saving unit, when the new word string, presented as an input candidate of the target-language character input system, is chosen by the user as the input word, saves the new word string into the new word dictionary as a new word and deletes the new word string from the new word string dictionary.
19. The device of claim 18, wherein the new word string dictionary stores new-word-string flag bits in one-to-one correspondence with the new word strings, each new-word-string flag bit having a preset initial value.
20. The device of claim 19, wherein the second saving unit, when the user chooses another input candidate as the input word, increases or decreases the value of the new-word-string flag bit by a predetermined amount.
21. The device of claim 20, wherein, when the value of the new-word-string flag bit equals a predetermined value, the new word string is deleted from the new word string dictionary.
22. The device of claim 13, wherein new words are learned automatically while the user continuously inputs the target-language corpus.
23. The device of claim 13, wherein the word frequencies of the collected word strings, temporary new word strings, new word strings, and new words are counted and saved.
24. The device of claim 23, wherein the input candidates are sorted by word frequency.
25. A character input system comprising the device for automatically learning new words of claim 13.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200710111842 CN101324878B (en) | 2007-06-15 | 2007-06-15 | Method and apparatus for automatically learning new words and character input system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101324878A (en) | 2008-12-17 |
CN101324878B (en) | 2012-06-13 |
Family
ID=40188421
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200710111842 Expired - Fee Related CN101324878B (en) | 2007-06-15 | 2007-06-15 | Method and apparatus for automatically learning new words and character input system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101324878B (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3154875B2 (en) * | 1993-08-06 | 2001-04-09 | 松下電器産業株式会社 | Kanji conversion learning device |
CN1474304A (en) * | 2002-08-09 | 2004-02-11 | 无敌科技股份有限公司 | Computer executable word memory system and method |
CN100397392C (en) * | 2003-12-17 | 2008-06-25 | 北京大学 | Method and apparatus for learning Chinese new words |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109154940A (en) * | 2016-06-12 | 2019-01-04 | 苹果公司 | Learn new words |
CN109154940B (en) * | 2016-06-12 | 2022-04-19 | 苹果公司 | Learning new words |
Also Published As
Publication number | Publication date |
---|---|
CN101324878B (en) | 2012-06-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20120613; termination date: 20160615 |