CN108664141A - Input method with document context self-learning function - Google Patents

Info

Publication number
CN108664141A
Authority
CN
China
Prior art keywords
word
dictionary
response
input
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710209575.8A
Other languages
Chinese (zh)
Other versions
CN108664141B (en
Inventor
张威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing LLC
Priority to CN201710209575.8A
Publication of CN108664141A
Application granted
Publication of CN108664141B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02: Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023: Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233: Character input methods
    • G06F3/0237: Character input methods using prediction or retrieval techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

Various embodiments of the present disclosure provide a method implemented by an input method module. In the method, a first word outside the existing dictionary of the input method module is obtained from one or more words already contained in an electronic document. The first word is added to a custom dictionary of the input method module, the custom dictionary being specific to the electronic document and distinct from the existing dictionary. In response to receiving an input associated with the first word, the first word is obtained from the custom dictionary and displayed in the candidate word window of the input method module for selection by the user.

Description

Input method with document context self-learning function
Technical field
Embodiments of the present disclosure relate to information input and, more particularly, to an input method with a document context self-learning function.
Background
An input method module, or input method for short, allows a user to enter information such as characters and emoticons into an electronic device such as a mobile device or a personal computer (PC). In input methods for languages such as Chinese, the user enters words by typing pinyin letters. As is known, many input methods let the user enter a word consisting of multiple individual characters in one go. Because Chinese contains a large number of homophones, the user usually has to pick the intended word from multiple candidate words that share the same pinyin. Word-frequency adjustment techniques have been proposed for this purpose.
In addition, when the user enters a "new word" that is not in the dictionary of the input method module, the user may have to enter the characters of the word one by one. To address this, some input methods learn the new words a user creates as the user types. The learned new words can be saved in a dictionary, for example a user-specific new-word dictionary. Later, when the user enters a previously created new word, for example into another document, the word can be offered to the user as a candidate word, which eases input.
Summary
To further improve the user's efficiency and experience when using an input method, various embodiments of the present disclosure provide a method implemented by an input method module. According to the method, a first word outside the existing dictionary of the input method module can be obtained from one or more words already contained in an electronic document. The first word is added to a custom dictionary of the input method module, where the custom dictionary is specific to the electronic document and distinct from the existing dictionary. Thereafter, if an input associated with the first word is received, the first word can be obtained from the custom dictionary and displayed in the candidate word window of the input method module for selection by the user. In this way, new words need not be learned gradually from the user's input over a relatively long period; they can be learned directly from the existing document context.
This Summary is provided to introduce, in simplified form, a selection of concepts that are further described in the detailed description below. The Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to limit the scope of the claimed subject matter.
Brief description of the drawings
Fig. 1 is a block diagram showing a computer system according to an embodiment of the present disclosure;
Fig. 2 shows a schematic diagram of an electronic document according to one embodiment of the present disclosure;
Fig. 3 shows the user interface of a conventional input method;
Fig. 4 shows a flowchart of an input method according to one embodiment of the present disclosure;
Fig. 5 shows a user interface of an input method according to one embodiment of the present disclosure;
Fig. 6 shows a user interface of an input method according to one embodiment of the present disclosure; and
Fig. 7 shows a user interface of an input method according to one embodiment of the present disclosure.
In these drawings, the same or similar reference numerals denote the same or similar elements.
Detailed description
The present disclosure will now be discussed with reference to several example embodiments. It should be understood that these embodiments are discussed only to enable those of ordinary skill in the art to better understand and thus implement the disclosure, and do not imply any limitation on its scope.
As used herein, the term "comprising" and its variants are to be read as open-ended terms meaning "including but not limited to". The term "based on" is to be read as "based at least in part on". The terms "one embodiment" and "an embodiment" are to be read as "at least one embodiment". The term "another embodiment" is to be read as "at least one other embodiment". The terms "first", "second", and so on may refer to different or identical objects. Other explicit and implicit definitions may also appear below.
The basic principles and several example embodiments of the present disclosure are described below with reference to the drawings. Fig. 1 shows a block diagram of a device 100 in which multiple embodiments of the present disclosure can be implemented. It should be understood that the device 100 shown in Fig. 1 is only exemplary and should not constitute any limitation on the function and scope of the embodiments described in this disclosure. As shown in Fig. 1, device 100 takes the form of a general-purpose computing device. The components of device 100 can include, but are not limited to, one or more processors or processing units 110, a memory 120, a storage device 130, one or more communication units 140, one or more input devices 150, and one or more output devices 160.
In some embodiments, device 100 can be implemented as any of various user terminals or service terminals. A service terminal can be a server, a mainframe computing device, or the like provided by various service providers. A user terminal is, for example, any type of mobile, fixed, or portable terminal, including a cell phone, multimedia computer, multimedia tablet, internet node, communicator, desktop computer, laptop computer, notebook computer, netbook computer, tablet computer, personal communication system (PCS) device, personal navigation device, personal digital assistant (PDA), audio/video player, digital camera/camcorder, positioning device, television receiver, radio broadcast receiver, e-book device, gaming device, or any combination thereof, including the accessories and peripherals of these devices or any combination thereof. It is also foreseeable that device 100 can support any type of user-facing interface (such as "wearable" circuitry).
Processing unit 110 can be a real or virtual processor and can perform various processing according to programs stored in memory 120. In a multiprocessor system, multiple processing units execute computer-executable instructions in parallel to increase the parallel processing capability of device 100. Processing unit 110 can also be referred to as a central processing unit (CPU), microprocessor, controller, or microcontroller.
Device 100 typically includes multiple computer storage media. Such media can be any available media accessible to device 100, including but not limited to volatile and non-volatile media and removable and non-removable media. Memory 120 can be volatile memory (for example, registers, cache, random access memory (RAM)), non-volatile memory (for example, read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory), or some combination thereof. Memory 120 can include an input method module 122 configured to perform the functions of the various embodiments described herein. Note that in this disclosure the terms "input method", "input method module", and "input method platform" are used interchangeably. Input method module 122 can be accessed and run by processing unit 110 to implement the corresponding functions. Storage device 130 can be a removable or non-removable medium and can include a machine-readable medium that can be used to store information and/or data and that can be accessed within device 100.
Communication unit 140 enables communication with other computing devices over a communication medium. Additionally, the functions of the components of device 100 can be implemented by a single computing cluster or by multiple computing machines that communicate over communication connections. Device 100 can therefore operate in a networked environment using logical connections to one or more other servers, personal computers (PCs), or other general network nodes. As needed, device 100 can also communicate through communication unit 140 with one or more external devices (not shown) such as storage devices and display devices, with one or more devices that enable a user to interact with device 100, or with any device (for example, a network card or modem) that enables device 100 to communicate with one or more other computing devices. Such communication can be performed via an input/output (I/O) interface (not shown).
Input device 150 can be one or more of various input devices, such as a mouse, keyboard, touch screen, trackball, or voice input device. Output device 160 can be one or more output devices, such as a display, speaker, or printer.
The principles of embodiments of the present disclosure are discussed below taking a Chinese pinyin input method as an example. It should be noted, however, that the disclosure is not limited to any specific type of input method. For example, embodiments of the disclosure also apply to input methods for letter-based languages such as English or radical-based languages such as Japanese.
The user can enter English letters, pinyin letters, or radicals by means of an input device 150 such as a keyboard. Input method module 122 can receive the user's input from input device 150 and provide output (for example, candidate words) to an output device 160 such as a display for selection by the user. It will be understood that communication between input method module 122 and the input and output devices 150 and 160 can be realized by means of interfaces provided by the operating system (OS) on device 100. Examples of such interfaces include, but are not limited to, various application programming interfaces (APIs).
Fig. 2 shows a schematic diagram of an electronic document 200 according to one embodiment of the present disclosure. Fig. 3 shows the user interface 300 of a conventional input method. In certain embodiments, electronic document 200 can be a document with text input and editing capabilities, a web page with an information input field, an electronic form, or any document that can receive text input. In some embodiments, electronic document 200 can be created locally by the user on device 100. Alternatively, in certain embodiments, electronic document 200 can be received by the user from a source remote from device 100, for example by e-mail.
In the example shown in Fig. 2, electronic document 200 already contains the following content 210:
The prototype of string theory was invented in 1968 by 维内奇诺 (Gabriele Veneziano). It is said that he was originally looking for a mathematical function that could describe the strong force inside the atomic nucleus, and then found, in an old mathematics book, the 200-year-old Euler Beta function, which could describe the strong force he was trying to solve for.
Now suppose the user wishes to edit document 200, or to enter further content into it, using a conventional input method, where the edits or further input may include content identical to content already in the document. For example, the user may want to enter the word "维内奇诺" again. As shown in Fig. 3, the user first enters the pinyin letter sequence corresponding to "维内奇诺", namely "weineiqinuo" 310. The conventional input method then returns some candidate words in the candidate word window of its interface 300. These candidates may include, for example, "胃内奇诺", "胃内", and "维内".
As can be seen, under a conventional input method module, "维内奇诺" is not offered as a candidate word, because it is uncommon in most fields and is in effect a user-coined word. The user therefore has to select the four characters "维", "内", "奇", and "诺" one by one, which increases the input burden. During the user's first several inputs of such a word, known new-word learning functions cannot show the desired word in the candidate word window. This harms operating efficiency and user experience, especially for words that are relatively uncommon in general but need to be entered frequently in a specific field (for example, mathematics or physics).
The input method proposed by embodiments of the present disclosure includes a self-learning function based on document context. That is, without the user noticing or needing to know, it automatically learns so-called "new words" that lie outside existing dictionaries, such as the main dictionary, from the content already present in an open electronic document. When the user needs to enter such a new word again during subsequent editing of the same document, this self-learning function noticeably improves operating efficiency and user experience. In other words, unlike conventional "passive" learning, which picks up new words gradually as the user types, embodiments of the present disclosure can learn "actively" by directly using the information already in the document. New words are therefore learned much faster.
Fig. 4 shows a flowchart of a method 400 implemented by the input method module. It will be understood that method 400 can be implemented by input method module 122. For purposes of discussion, the following description still refers to the example described with reference to Fig. 2.
At 410, a word is obtained from one or more words already contained in electronic document 200, the obtained word lying outside the existing dictionary of input method module 122. That is, the word is not in the existing dictionary of input method module 122 and therefore could not otherwise be presented to the user as a candidate word during input. For convenience of discussion, the word obtained from the electronic document at 410 is called the "first word". Note that more than one first word can be obtained at 410.
According to embodiments of the present disclosure, the first word can be obtained from any text portion of the electronic document. For example, in the example shown in Fig. 2, one or more words can be obtained from body portion 210, or from any other portion such as a bibliography section at the end of the document or a header or footer (not shown).
In some embodiments, the first word can be obtained in response to electronic document 200 being opened by the user. For example, in some embodiments, the first word can be obtained in response to the user opening an electronic document 200 that he or she previously edited and saved. Alternatively or additionally, in other embodiments, previewing a received electronic document 200 in, for example, an e-mail application can also trigger the acquisition of the first word.
In some example embodiments, the first word can be obtained at 410 by word segmentation. Specifically, the text contained in electronic document 200 (for example, the text in portion 210) can be segmented into multiple words, each at least two Chinese characters long. Words that are not contained in the existing dictionary of input method module 122 can then be selected from the segmented words as first words.
In an example embodiment, the segmentation can be implemented by an artificial-intelligence-based algorithm such as a conditional random field (CRF) algorithm, whose detailed procedure is known and is not repeated here. Note that CRF is only an example and is not intended to limit the scope of the present disclosure in any way. On the contrary, any method capable of word segmentation, whether currently known or developed in the future, can be used in combination with embodiments of the present disclosure.
In the example of Fig. 2, content 210 can be segmented by means of the CRF algorithm as follows:
string theory / of / prototype / is / in / 1968 / year / by / 维内奇诺 / ( / Gabriele / Veneziano / ) / invented / . / there is / saying / claims / , / he / originally / is / wanting / look for / can / describe / atomic nucleus / inside / of / strong / force / of / mathematical / function / , / then / in / one / old / of / mathematics / book / in / found / has / 200 / years / history / of / Euler's Beta / function / , / this / function / can / describe / he / wanted / solve / of / strong / force / . /
Words are then selected and extracted from the segmented content 210. For Chinese, for example, words of length 2 or more can be extracted. Among these words, assume that the Chinese words for "string theory", "prototype", "invented", "saying", "originally", "describe", "atomic nucleus", "force", "mathematics", "function", "then", "one", "old", "found", "history", "can", "wanted", and "solve" are contained in an existing dictionary (for example, the main dictionary, a hot-word dictionary, or any other user-specific dictionary). By contrast, "维内奇诺" and "Euler's Beta" lie outside these existing dictionaries and can therefore be obtained as first words.
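The selection step at 410 can be sketched as follows. This is only a minimal illustration, not the patented implementation: the function names, the tiny token list (a hypothetical segmenter output for the start of the example sentence), and the toy existing dictionary are all assumptions. The check that a token consists only of Chinese characters is also an assumption, added so that tokens such as "1968" or "Gabriele" are skipped.

```python
def is_chinese_word(token: str) -> bool:
    """True if every character is in the basic CJK Unified Ideographs block."""
    return bool(token) and all("\u4e00" <= ch <= "\u9fff" for ch in token)


def extract_new_words(tokens, existing_dictionary, min_length=2):
    """Pick segmented words that are long enough and absent from the existing dictionary."""
    new_words = []
    for token in tokens:
        if len(token) < min_length or not is_chinese_word(token):
            continue  # single characters, digits, Latin text, punctuation
        if token in existing_dictionary or token in new_words:
            continue  # already known, or already collected
        new_words.append(token)
    return new_words


# Hypothetical segmenter output and a toy existing dictionary.
tokens = ["弦理论", "的", "雏形", "是", "在", "1968", "年", "由", "维内奇诺",
          "(", "Gabriele", "Veneziano", ")", "发明", "的", "。"]
existing_dictionary = {"弦理论", "雏形", "发明"}

print(extract_new_words(tokens, existing_dictionary))  # ['维内奇诺']
```

Here 维内奇诺 (pinyin "weineiqinuo") is the only token that is at least two characters long, purely Chinese, and missing from the toy dictionary, so it is the only first word returned.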
With continued reference to Fig. 4, at 420 the first words obtained at 410 (in the example of Fig. 2, "维内奇诺" and "Euler's Beta") are added to the custom dictionary of input method module 122. As described above, the custom dictionary is specific to electronic document 200 and distinct from the existing dictionary. In one embodiment, the custom dictionary can be generated in response to electronic document 200 being opened or to another predefined trigger condition. In certain embodiments, the custom dictionary can also be removed when electronic document 200 is closed or when another predetermined condition is met. In this way, the custom dictionary is bound only to the specific electronic document 200. This saves storage resources and keeps input method module 122 from occupying more and more of them.
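The document-bound lifecycle described above (create the dictionary when the document is opened, discard it when the document is closed) could look roughly like the sketch below. The class name, method names, and document identifier are hypothetical, and the segmented words are assumed to be supplied by some segmenter that is not shown.

```python
class InputMethodModule:
    """Keeps one custom dictionary per open document (illustrative design only)."""

    def __init__(self, existing_dictionary):
        self.existing_dictionary = set(existing_dictionary)
        self.custom_dictionaries = {}  # document id -> set of learned words

    def on_document_opened(self, doc_id, segmented_words):
        """Learn out-of-dictionary words of length >= 2 for this document only."""
        learned = {word for word in segmented_words
                   if len(word) >= 2 and word not in self.existing_dictionary}
        self.custom_dictionaries[doc_id] = learned
        return learned

    def on_document_closed(self, doc_id):
        """Drop the document's custom dictionary so it stops using storage."""
        self.custom_dictionaries.pop(doc_id, None)


ime = InputMethodModule({"函数", "数学"})
ime.on_document_opened("doc-200", ["维内奇诺", "函数", "欧拉贝塔"])
print(ime.custom_dictionaries["doc-200"] == {"维内奇诺", "欧拉贝塔"})  # True
ime.on_document_closed("doc-200")
print("doc-200" in ime.custom_dictionaries)  # False
```

Keying the dictionaries by document keeps words learned from one document from leaking into the candidates shown for another.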
Input method module 122 can continue operating to process the user's input. If a user input corresponding to any word previously stored in the custom dictionary is received, the word is obtained from the custom dictionary at 430 and presented to the user as a candidate word at 440, so that the user can select it for input into the electronic document.
Fig. 5 shows a user interface 500 presented by input method module 122 according to one embodiment of the present disclosure. Continuing the example of Fig. 2, assume the user wishes to enter "维内奇诺" in electronic document 200 again. To do so, the user enters the pinyin "weineiqinuo" 510 through interface 500. In response, input method module 122 searches the custom dictionary and finds the new word "维内奇诺", previously learned from the document context, that corresponds to the user's input. The word is thus obtained from the custom dictionary and presented in user interface 500. More specifically, in this example the word "维内奇诺" is displayed at the second position 522 in candidate word window 520 for selection by the user.
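As a rough illustration of steps 430 and 440, the sketch below looks up learned words by the pinyin the user typed. It assumes, purely for illustration, that the custom dictionary stores each learned word together with its pinyin; the source does not say how the word-to-pinyin association is kept, and the function name is hypothetical.

```python
def lookup_candidates(typed_pinyin, custom_dictionary):
    """Return learned words whose stored pinyin equals the typed sequence."""
    return [word for word, pinyin in custom_dictionary.items()
            if pinyin == typed_pinyin]


# Assumed storage format: learned word -> its pinyin letter sequence.
custom_dictionary = {"维内奇诺": "weineiqinuo", "欧拉贝塔": "oulabeita"}

print(lookup_candidates("weineiqinuo", custom_dictionary))  # ['维内奇诺']
print(lookup_candidates("nihao", custom_dictionary))        # []
```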
In certain embodiments, the position in candidate word window 520 of a new word learned from the context can be adjusted according to the user's interaction behavior. For example, if the new word is selected by the user, it can be displayed at the first position in candidate word window 520 the next time the user enters it. This is shown in Fig. 6: in this example, in response to the user entering "weineiqinuo" 510 again, the corresponding word "维内奇诺" is displayed at the first position 521 of candidate word window 520.
In some cases, a new word learned from the existing content of electronic document 200 may, for various reasons, not be what the user wants. For example, the learned word may contain a spelling error, in which case the user is unlikely to select it in subsequent operations. In another case, the learned word may simply be rare, so it is unlikely to be entered again. If such a word were still shown near the front of the candidate window, it would adversely affect the user's input. Here, this effect can be quantified as an "interference level" in the sense of the user's cognitive psychology, and the position of the candidate word can be adjusted according to that interference level.
In certain embodiments, if a first word learned at 410 is displayed as a candidate word but not selected by the user, the position at which the word is presented in the candidate word window is moved back for subsequent inputs. That is, the next time the user enters the letters corresponding to the word, its position is moved back by, for example, one place. Of course, the candidate word can also be moved back by more than one place in the candidate word window according to any appropriate strategy.
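The two position adjustments described above (moving a selected word to the front, moving an unselected word back) can be sketched with plain list operations. The function names and the default step of one position are illustrative assumptions; the source leaves the exact strategy open.

```python
def promote_to_front(candidates, word):
    """Return a new list with `word` first, as after the user selected it."""
    if word not in candidates:
        return list(candidates)
    return [word] + [c for c in candidates if c != word]


def demote(candidates, word, steps=1):
    """Return a new list with `word` moved back by `steps` positions."""
    if word not in candidates:
        return list(candidates)
    reordered = list(candidates)
    index = reordered.index(word)
    reordered.pop(index)
    # Clamp so the word never moves past the end of the list.
    reordered.insert(min(index + steps, len(reordered)), word)
    return reordered


window = ["胃内奇诺", "维内奇诺", "胃内", "维内"]
print(promote_to_front(window, "维内奇诺"))  # ['维内奇诺', '胃内奇诺', '胃内', '维内']
print(demote(window, "维内奇诺"))            # ['胃内奇诺', '胃内', '维内奇诺', '维内']
```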
In other embodiments, if the first word is not selected, the interference level that the word causes to the user's input can be determined from the number of times the word was previously not selected and from its current position in the candidate window. If the interference level reaches a predetermined threshold, the word can be removed from the custom dictionary. Such an embodiment is described below, still with reference to Fig. 2.
In the example shown in Fig. 2, once the learned first word "维内奇诺" has appeared in the candidate word window, the number of times it appears there without being selected can be recorded, together with the position at which it appears in candidate word window 520 each time. Each position in candidate word window 520 is assigned a corresponding weight. In general, positions nearer the front of candidate word window 520 have relatively higher weights. Table 1 shows example weights for candidate word positions: the 2nd position in the candidate word window has weight 3, the 3rd to 5th positions have weight 2, the 6th and 7th positions have weight 1, and all later positions (if any) have weight 0. It should be understood that these values are only examples and are not intended to limit the scope of the present disclosure in any way.
Table 1. Positions in the candidate word window and their weights

Position in candidate word window    Weight
2                                    3
3-5                                  2
6-7                                  1
8 and later (if any)                 0
Assume that the threshold for removing a new word from the custom dictionary is 6. With the weights in the table above, if a given word in the custom dictionary appears at the 2nd position in the candidate word window twice without being selected, the interference level is taken to be 2 × 3 = 6, which reaches the predetermined threshold; the first word is therefore deleted and no longer appears in the candidate word window. Alternatively, if the first word appears at any position in the 3rd to 5th range three times without being selected, the interference level is taken to be 3 × 2 = 6, which reaches the threshold, and the first word is likewise deleted. Alternatively, if the first word appears without being selected once at the 2nd position, once at any position in the 3rd to 5th range, and once at any position in the 6th to 7th range, the interference level is taken to be 1 × 3 + 1 × 2 + 1 × 1 = 6, which again reaches the threshold, and the first word is deleted. A candidate word that appears after the 7th position can be regarded as having no effect on the user's subsequent input, so a first word appearing after the 7th position need not be deleted.
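The arithmetic in this example can be checked with a short sketch. The weights and the threshold of 6 come from the text above; the function names are hypothetical. The source gives no weight for the 1st position, so this sketch treats it, like every other unlisted position, as weight 0, which is purely an assumption.

```python
# Example weights from the text: position 2 -> 3, positions 3-5 -> 2, positions 6-7 -> 1.
# Any position not listed (including position 1, which the source does not cover)
# is treated as weight 0 in this sketch.
POSITION_WEIGHTS = {2: 3, 3: 2, 4: 2, 5: 2, 6: 1, 7: 1}
REMOVAL_THRESHOLD = 6


def interference_level(unselected_positions):
    """Sum the weights of the positions at which the word was shown but not selected."""
    return sum(POSITION_WEIGHTS.get(position, 0) for position in unselected_positions)


def should_remove(unselected_positions, threshold=REMOVAL_THRESHOLD):
    """True once the accumulated interference level reaches the threshold."""
    return interference_level(unselected_positions) >= threshold


print(interference_level([2, 2]), should_remove([2, 2]))           # 6 True
print(interference_level([3, 4, 5]), should_remove([3, 4, 5]))     # 6 True
print(interference_level([2, 4, 7]), should_remove([2, 4, 7]))     # 6 True
print(interference_level([8, 9, 10]), should_remove([8, 9, 10]))   # 0 False
```

The first three calls reproduce the three cases worked through in the text; the last shows that appearances after the 7th position never accumulate toward removal.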
In this way, words that contain spelling errors, or rare words, do not occupy the candidate word window for long and interfere with the user's subsequent input; instead, they can be removed from the candidate word window promptly according to the predetermined threshold.
In certain embodiments, in addition to the first word learned at 410, the user's subsequent input may at the same time correspond to other words in other dictionaries of input method module 122 (for example, the main dictionary), and those dictionaries may have higher priority than the custom dictionary. For convenience of discussion, a word in such a higher-priority dictionary is called a "second word". When a first word and a second word both match, in certain embodiments the second word is displayed before the first word in the candidate word window.
Referring again to Fig. 5, in this example, when the user enters "weineiqinuo" 510, the second word "胃内奇诺", which also corresponds to "weineiqinuo" 510 and is stored in the existing dictionary, is displayed before the learned first word "维内奇诺". Only after the user has selected "维内奇诺" is the first word "维内奇诺" displayed before the second word "胃内奇诺" the next time the user enters "weineiqinuo" 510, as shown in Fig. 6.
The embodiment described above can be beneficial. It will be appreciated that, compared with a custom dictionary generated by learning from document context, other higher-priority dictionaries such as the main dictionary may be more reliable and trustworthy. Displaying candidate words from those dictionaries before candidate words from the custom dictionary may therefore recommend candidates to the user more accurately. It also gives the user more opportunities to confirm whether a learned word is in fact the correct word the user intends to enter.
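One way to picture this ordering is the sketch below, which lists second words from a higher-priority existing dictionary before first words from the custom dictionary. It assumes, for illustration only, that both dictionaries are indexed by pinyin; the function name and data layout are hypothetical.

```python
def build_candidates(typed_pinyin, existing_index, custom_index):
    """List second words (existing dictionary) before first words (custom dictionary)."""
    second_words = list(existing_index.get(typed_pinyin, []))
    first_words = [word for word in custom_index.get(typed_pinyin, [])
                   if word not in second_words]  # avoid showing a word twice
    return second_words + first_words


# Assumed layout: pinyin letter sequence -> list of matching words.
existing_index = {"weineiqinuo": ["胃内奇诺"]}
custom_index = {"weineiqinuo": ["维内奇诺"]}

print(build_candidates("weineiqinuo", existing_index, custom_index))
# ['胃内奇诺', '维内奇诺']
```

With these toy indexes the learned word lands in the second position, which matches the layout described for Fig. 5. Moving it to the front after the user selects it would be a separate reordering step.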
The principles and ideas of the present disclosure have been discussed above taking Chinese input as an example. It will be understood that the principles of the embodiments disclosed above apply equally to other languages. An embodiment for English input is described below, still with reference to the flowchart of method 400 shown in Fig. 4. In this case, a word or phrase outside the existing dictionary can be obtained from one or more words already contained in the electronic document (block 410). For example, suppose the electronic document contains the following content:
MircoSmartInput says:“Hello World”!
Assume that "MircoSmartInput" and "Hello World" are, respectively, a word and a phrase outside the existing dictionary of input method module 122. Accordingly, these new words are added to the document-specific custom dictionary (block 420). Referring to Fig. 7, when the user wishes to enter "MircoSmartInput" in the document again and begins typing part of the corresponding English letter sequence, for example "micro" 710, the previously learned word "MircoSmartInput" can be retrieved from the custom dictionary (block 430) and displayed in full at, for example, the fourth position 524 in candidate word window 520 (block 440) for selection by the user. In the same way, the user can conveniently enter words that were previously learned and saved in the custom dictionary.
It should be noted that all of the features described above apply to languages other than Chinese, and they are not repeated here. Moreover, Chinese is only an example of a radical-based language, and English is only an example of a letter-based language. Embodiments of the present disclosure apply to any other language, and the language itself does not limit the scope of the disclosure in any way.
It is listed below some example embodiments of the disclosure.
According to some embodiments, a method implemented by an input method module is provided. The method includes: obtaining a first word from one or more words already contained in an electronic document, the first word being outside an existing dictionary of the input method module; adding the first word to a customization dictionary of the input method module, the customization dictionary being specific to the electronic document and different from the existing dictionary; in response to receiving an input associated with the first word, obtaining the first word from the customization dictionary; and displaying the first word in a candidate word window of the input method module for selection by the user.
In some embodiments, obtaining the first word includes: obtaining the first word in response to the electronic document being opened.
In some embodiments, the method further includes: clearing the customization dictionary in response to the electronic document being closed.
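The open and close behavior described in the two embodiments above can be pictured with a small sketch. The class `DocumentSession` and its method names are assumptions made purely for illustration; the disclosure does not name any such interface.

```python
class DocumentSession:
    """Hypothetical holder that ties a customization dictionary to one open document."""

    def __init__(self, existing_words):
        self.existing = set(existing_words)
        self.custom = set()

    def on_open(self, document_words):
        # Learn when the document is opened: keep only words the
        # existing dictionary does not already contain.
        self.custom = {w for w in document_words if w not in self.existing}

    def on_close(self):
        # Clear the document-specific dictionary when the document is closed.
        self.custom.clear()


session = DocumentSession({"hello", "world"})
session.on_open(["hello", "MircoSmartInput"])
session.on_close()
```

Scoping the learned words to the lifetime of the document keeps them from leaking into unrelated documents.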
In some embodiments, the method further includes: in response to the first word not being selected in at least one subsequent operation, lowering the position at which the first word is presented in the candidate word window.
In some embodiments, lowering the position at which the first word is presented in the candidate word window includes: in response to the first word not being selected, determining an annoyance level caused by the first word to the user's input, based on the number of times the first word was previously not selected and the current position of the first word in the candidate word window; and in response to the annoyance level reaching a predetermined threshold, removing the first word from the customization dictionary.
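One way to read the demotion and removal logic is sketched below. The paragraph above names the inputs (the count of non-selections and the current position) but not the formula, so the product used here is an assumption, as are the threshold of 6, the window size of 5, and every identifier in the block.

```python
from dataclasses import dataclass


@dataclass
class LearnedWord:
    text: str
    position: int    # 1-based slot in the candidate word window
    misses: int = 0  # times the word was shown but not selected


def annoyance_level(word: LearnedWord, window_size: int = 5) -> int:
    # Assumed formula: an ignored word disturbs more when it has been
    # skipped often and when it occupies an earlier (more prominent) slot.
    return word.misses * (window_size - word.position + 1)


def on_not_selected(word, custom_dict, threshold=6, window_size=5) -> str:
    word.misses += 1
    if annoyance_level(word, window_size) >= threshold:
        # Threshold reached: drop the word from the customization dictionary.
        custom_dict.pop(word.text, None)
        return "removed"
    # Otherwise push the word one slot further down the window.
    word.position = min(word.position + 1, window_size)
    return "demoted"
```

With these assumed values, a word shown in the first slot is demoted after the first miss and removed after the second.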
In some embodiments, the method further includes: in response to the input also being associated with a second word in the existing dictionary that is different from the first word, displaying the second word before the first word in the candidate word window.
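The ordering rule can be illustrated with a short sketch in which candidates from the existing dictionary are listed ahead of candidates from the customization dictionary. The function name `build_candidates`, the prefix-matching rule, and the window size of 5 are assumptions for illustration only.

```python
def build_candidates(typed, existing_dict, custom_dict, window_size=5):
    """Return candidates with existing-dictionary words ahead of learned words."""
    prefix = typed.lower()
    from_existing = [w for w in existing_dict if w.lower().startswith(prefix)]
    from_custom = [
        w for w in custom_dict
        if w.lower().startswith(prefix) and w not in from_existing
    ]
    # Words from the more reliable existing dictionary come first.
    return (from_existing + from_custom)[:window_size]


window = build_candidates("mi", ["Microsoft", "mice"], ["MircoSmartInput"])
```

Here the learned word lands in the third slot, behind the two matches from the existing dictionary.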
In some embodiments, the first word is in Chinese characters and the input associated with the first word is at least one pinyin letter, and obtaining the first word includes: obtaining the one or more words by performing word segmentation on the content contained in the electronic document; and selecting, from the one or more words, a word outside the existing dictionary as the first word.
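A toy illustration of segmenting content and picking out words that the existing dictionary lacks is given below. It is not the segmentation method of the disclosure, which is not specified in this passage: the sketch assumes forward maximum matching against the existing dictionary and treats runs of unmatched characters as candidate new words. The sample vocabulary and sentence are invented, and punctuation handling is omitted.

```python
def segment(text, vocab, max_len=4):
    """Forward maximum matching; unmatched characters are emitted one by one."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def new_words(text, vocab, max_len=4):
    """Merge runs of unmatched single characters into candidate new words."""
    found, run = [], ""
    for token in segment(text, vocab, max_len) + [""]:  # "" flushes the last run
        if len(token) == 1 and token not in vocab:
            run += token
        else:
            if len(run) > 1 and run not in found:
                found.append(run)
            run = ""
    return found


vocab = {"我们", "使用", "输入法"}
learned = new_words("我们使用微软输入法", vocab)
```

A real input method would more likely rely on a statistical segmenter; this heuristic only shows where out-of-dictionary words would come from.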
According to some embodiments, a device is provided. The device includes: a processing unit; and a memory coupled to the processing unit and storing instructions which, when executed by the processing unit, perform the following actions: obtaining a first word from one or more words already contained in an electronic document, the first word being outside an existing dictionary of an input method module; adding the first word to a customization dictionary of the input method module, the customization dictionary being specific to the electronic document and different from the existing dictionary; in response to receiving an input associated with the first word, obtaining the first word from the customization dictionary; and displaying the first word in a candidate word window of the input method module for selection by the user.
In some embodiments, obtaining the first word includes: obtaining the first word in response to the electronic document being opened.
In some embodiments, the actions further include: clearing the customization dictionary in response to the electronic document being closed.
In some embodiments, the actions further include: in response to the first word not being selected in at least one subsequent operation, lowering the position at which the first word is presented in the candidate word window.
In some embodiments, lowering the position at which the first word is presented in the candidate word window includes: in response to the first word not being selected, determining an annoyance level caused by the first word to the user's input, based on the number of times the first word was previously not selected and the current position of the first word in the candidate word window; and in response to the annoyance level reaching a predetermined threshold, removing the first word from the customization dictionary.
In some embodiments, the actions further include: in response to the input also being associated with a second word in the existing dictionary that is different from the first word, displaying the second word before the first word in the candidate word window.
In some embodiments, the first word is in Chinese characters and the input associated with the first word is at least one pinyin letter, and obtaining the first word includes: obtaining the one or more words by performing word segmentation on the content contained in the electronic document; and selecting, from the one or more words, a word outside the existing dictionary as the first word.
According to some embodiments, a computer program product is provided. The computer program product is tangibly stored in a non-transitory computer-readable medium and includes machine-executable instructions which, when executed, cause a machine to perform the following actions: obtaining a first word from one or more words already contained in an electronic document, the first word being outside an existing dictionary of an input method module; adding the first word to a customization dictionary of the input method module, the customization dictionary being specific to the electronic document and different from the existing dictionary; in response to receiving an input associated with the first word, obtaining the first word from the customization dictionary; and displaying the first word in a candidate word window of the input method module for selection by the user.
In some embodiments, obtaining the first word includes: obtaining the first word in response to the electronic document being opened.
In some embodiments, the actions further include: clearing the customization dictionary in response to the electronic document being closed.
In some embodiments, the actions further include: in response to the first word not being selected in at least one subsequent operation, lowering the position at which the first word is presented in the candidate word window.
In some embodiments, lowering the position at which the first word is presented in the candidate word window includes: in response to the first word not being selected, determining an annoyance level caused by the first word to the user's input, based on the number of times the first word was previously not selected and the current position of the first word in the candidate word window; and in response to the annoyance level reaching a predetermined threshold, removing the first word from the customization dictionary.
In some embodiments, the actions further include: in response to the input also being associated with a second word in the existing dictionary that is different from the first word, displaying the second word before the first word in the candidate word window.
The functions described herein may be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be carried out. The program code may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In addition, although the operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Similarly, although the above discussion contains several specific implementation details, these should not be construed as limitations on the scope of the present disclosure. Certain features that are described in the context of separate embodiments may also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment may also be implemented in multiple embodiments, separately or in any suitable sub-combination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims (20)

1. A method implemented by an input method module, comprising:
obtaining a first word from one or more words already contained in an electronic document, the first word being outside an existing dictionary of the input method module;
adding the first word to a customization dictionary of the input method module, the customization dictionary being specific to the electronic document and different from the existing dictionary;
in response to receiving an input associated with the first word, obtaining the first word from the customization dictionary; and
displaying the first word in a candidate word window of the input method module for selection by a user.
2. The method according to claim 1, wherein obtaining the first word comprises:
obtaining the first word in response to the electronic document being opened.
3. The method according to claim 1, further comprising:
clearing the customization dictionary in response to the electronic document being closed.
4. The method according to claim 1, further comprising:
in response to the first word not being selected in at least one subsequent operation, lowering the position at which the first word is presented in the candidate word window.
5. The method according to claim 4, wherein lowering the position at which the first word is presented in the candidate word window comprises:
in response to the first word not being selected, determining an annoyance level caused by the first word to the user's input, based on the number of times the first word was previously not selected and the current position of the first word in the candidate word window; and
in response to the annoyance level reaching a predetermined threshold, removing the first word from the customization dictionary.
6. The method according to claim 1, wherein the existing dictionary has a higher priority than the customization dictionary, the method further comprising:
in response to the input also being associated with a second word in the existing dictionary of the input method module that is different from the first word, displaying the second word before the first word in the candidate word window.
7. The method according to claim 1, wherein the first word is in Chinese characters and the input associated with the first word is at least one pinyin letter, and wherein obtaining the first word comprises:
obtaining the one or more words by performing word segmentation on the content contained in the electronic document; and
selecting, from the one or more words, a word outside the existing dictionary as the first word.
8. A device, comprising:
a processing unit; and
a memory coupled to the processing unit and storing instructions which, when executed by the processing unit, perform the following actions:
obtaining a first word from one or more words already contained in an electronic document, the first word being outside an existing dictionary of an input method module;
adding the first word to a customization dictionary of the input method module, the customization dictionary being specific to the electronic document and different from the existing dictionary;
in response to receiving an input associated with the first word, obtaining the first word from the customization dictionary; and
displaying the first word in a candidate word window of the input method module for selection by a user.
9. The device according to claim 8, wherein obtaining the first word comprises:
obtaining the first word in response to the electronic document being opened.
10. The device according to claim 8, wherein the actions further comprise:
clearing the customization dictionary in response to the electronic document being closed.
11. The device according to claim 8, wherein the actions further comprise:
in response to the first word not being selected in at least one subsequent operation, lowering the position at which the first word is presented in the candidate word window.
12. The device according to claim 11, wherein lowering the position at which the first word is presented in the candidate word window comprises:
in response to the first word not being selected, determining an annoyance level caused by the first word to the user's input, based on the number of times the first word was previously not selected and the current position of the first word in the candidate word window; and
in response to the annoyance level reaching a predetermined threshold, removing the first word from the customization dictionary.
13. The device according to claim 8, wherein the actions further comprise:
in response to the input also being associated with a second word in the existing dictionary of the input method module that is different from the first word, displaying the second word before the first word in the candidate word window.
14. The device according to claim 8, wherein the first word is in Chinese characters and the input associated with the first word is at least one pinyin letter, and wherein obtaining the first word comprises:
obtaining the one or more words by performing word segmentation on the content contained in the electronic document; and
selecting, from the one or more words, a word outside the existing dictionary as the first word.
15. A computer program product tangibly stored in a non-transitory computer-readable medium and comprising machine-executable instructions which, when executed, cause a machine to perform the following actions:
obtaining a first word from one or more words already contained in an electronic document, the first word being outside an existing dictionary of an input method module;
adding the first word to a customization dictionary of the input method module, the customization dictionary being specific to the electronic document and different from the existing dictionary;
in response to receiving an input associated with the first word, obtaining the first word from the customization dictionary; and
displaying the first word in a candidate word window of the input method module for selection by a user.
16. The computer program product according to claim 15, wherein obtaining the first word comprises:
obtaining the first word in response to the electronic document being opened.
17. The computer program product according to claim 15, wherein the actions further comprise:
clearing the customization dictionary in response to the electronic document being closed.
18. The computer program product according to claim 15, wherein the actions further comprise:
in response to the first word not being selected in at least one subsequent operation, lowering the position at which the first word is presented in the candidate word window.
19. The computer program product according to claim 15, wherein lowering the position at which the first word is presented in the candidate word window comprises:
in response to the first word not being selected, determining an annoyance level caused by the first word to the user's input, based on the number of times the first word was previously not selected and the current position of the first word in the candidate word window; and
in response to the annoyance level reaching a predetermined threshold, removing the first word from the customization dictionary.
20. The computer program product according to claim 15, wherein the actions further comprise:
in response to the input also being associated with a second word in the existing dictionary of the input method module that is different from the first word, displaying the second word before the first word in the candidate word window.
CN201710209575.8A 2017-03-31 2017-03-31 Input method with document context self-learning function Active CN108664141B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710209575.8A CN108664141B (en) 2017-03-31 2017-03-31 Input method with document context self-learning function

Publications (2)

Publication Number Publication Date
CN108664141A true CN108664141A (en) 2018-10-16
CN108664141B CN108664141B (en) 2022-08-09

Family

ID=63784053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710209575.8A Active CN108664141B (en) 2017-03-31 2017-03-31 Input method with document context self-learning function

Country Status (1)

Country Link
CN (1) CN108664141B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109669551A (en) * 2018-11-06 2019-04-23 闽江学院 A kind of input method information processing method and device
CN109683724A (en) * 2018-11-12 2019-04-26 闽江学院 A kind of method and device for adding input method library
CN109683723A (en) * 2018-11-06 2019-04-26 闽江学院 A kind of control method and device handling library in input method system
CN109725740A (en) * 2018-11-12 2019-05-07 闽江学院 A kind of text editing processing method and processing device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060088356A1 (en) * 2004-08-13 2006-04-27 Bjorn Jawerth One-row keyboard and approximate typing
CN1912872A (en) * 2006-07-25 2007-02-14 北京搜狗科技发展有限公司 Method and system for abstracting new word
CN101334774A (en) * 2007-06-29 2008-12-31 北京搜狗科技发展有限公司 Character input method and input method system
CN101694608A (en) * 2008-12-04 2010-04-14 北京搜狗科技发展有限公司 Input method and system of same

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李中才等编著: "《快易通中文速录键盘教程》", 31 March 2011, 西南交通大学出版社 *

Also Published As

Publication number Publication date
CN108664141B (en) 2022-08-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant