CN101645065B - Method, device and input method system for determining the auxiliary lexicon to be loaded - Google Patents

Method, device and input method system for determining the auxiliary lexicon to be loaded

Publication number
CN101645065B
Authority
CN
China
Prior art keywords
user
auxiliary lexicon
feature
load
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN200810117750.1A
Other languages
Chinese (zh)
Other versions
CN101645065A (en
Inventor
张扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN200910137634.0A (patent CN101645088B)
Priority to CN200810117750.1A (patent CN101645065B)
Publication of CN101645065A
Application granted
Publication of CN101645065B
Legal status: Active
Anticipated expiration

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for determining the auxiliary lexicon that needs to be loaded. The method comprises: collecting a user's input information; analyzing the collected information and recording the features in it that can characterize the user's interests; and determining, according to the recorded features, the auxiliary lexicon that needs to be loaded. The invention also discloses a device and an input method system for determining the auxiliary lexicon that needs to be loaded. The embodiments of the invention improve the accuracy of judging user interest and provide a reliable basis for determining the auxiliary lexicon that needs to be loaded.

Description

Method, device and input method system for determining the auxiliary lexicon to be loaded
Technical field
The present invention relates to the field of input methods, and in particular to a method, a device and an input method system for determining the auxiliary lexicon that needs to be loaded.
Background
With the development of computer technology, input methods are receiving more and more attention as an important means of human-computer interaction. Users' demands for input accuracy and input speed have pushed input methods to become more intelligent, more user-friendly and more personalized.
Current input method systems (for Chinese, Japanese, Korean and so on) all rely on their lexicon systems to offer the user candidate words during input. To make input more fluent, one direction of development for input method software is to expand the number of entries it includes, so as to reduce as far as possible the number of times the user has to pick words character by character and to raise the accuracy of the first candidate. For this reason an input method system may have, in addition to its basic dictionary, auxiliary lexicons, commonly known as cell dictionaries. The entries in the basic dictionary are accepted and widely used by most input method users and are general-purpose. The entries in an auxiliary lexicon are specialized and personalized, for example a chemical industry dictionary, a financial management dictionary or a European and American film dictionary; such dictionaries are designed to meet the demand for intelligent input from users in different professional fields and with different interests. Making full and reasonable use of these dictionaries helps to improve the user's input accuracy and input speed.
However, a dictionary cannot simply be made as large and complete as possible, since that brings adverse effects such as more duplicate-code conflicts, lower performance and a larger software footprint. Users' professional fields and interests are also diverse, so a single large, all-encompassing dictionary is unrealistic. The solution is to load the relevant auxiliary lexicons according to each user's specific needs in specific fields, and the question of how to judge which auxiliary lexicons a user needs to load therefore becomes critical.
The current method of loading auxiliary lexicons is to load the corresponding auxiliary lexicon according to information about the user's current input environment, because the current input environment may characterize the user's professional field or interests. The current input environment includes the name of the current application, the title of the current window, the file name and so on. For example, if the user's current input environment is the interface of a certain online game, the auxiliary lexicon for that game is loaded automatically; when the user uses the input method in a document, the auxiliary lexicon matching the document title or the document content is loaded automatically.
However, document titles are usually chosen arbitrarily by the user, such as simply "document", "work" or "memorandum", and some documents just keep a generic default title such as "new file 1" or "new file 2". Judging from file names and program names which auxiliary lexicon to load is therefore of limited use. Judging from the content of the document is not accurate enough either, because the document was not necessarily written by the current user, so its content does not necessarily characterize the current user's interests. It may also lead to wrong judgments. For example, the characters for "chemical industry" are found in a document, so the chemical industry dictionary is loaded, but they actually occur only as part of a longer phrase (translated here as "procedure work"). The document is clearly not necessarily related to the chemical industry, and loading the dictionary would harm the user experience instead.
Therefore, the technical problem that those skilled in the art urgently need to solve is how to judge accurately the field a user belongs to or the user's interests, so as to provide a reliable basis for loading the auxiliary lexicons that are needed.
Summary of the invention
In view of this, the object of the present invention is to provide a method, a device and an input method system for determining the auxiliary lexicon that needs to be loaded, so as to solve the problem that the prior art is inaccurate when determining the lexicon that needs to be loaded for a user.
To achieve the above object, the invention provides the following solutions:
A method for determining the auxiliary lexicon that needs to be loaded, comprising:
collecting the input information of a user, the input information of the user comprising the user's input sequence;
analyzing the collected information, extracting from the input sequence the features in the input sequence that can characterize user interest, and recording the features in the information that can characterize user interest;
determining, according to the recorded features, the auxiliary lexicon that needs to be loaded.
Preferably:
it is judged whether the information contains a feature that matches a feature in a preset feature list and, if so, the matching feature is recorded.
Preferably, when a trigger condition is met, the auxiliary lexicon that needs to be loaded is determined according to the recorded features.
Preferably, statistics are computed on the recorded features, and the auxiliary lexicon that needs to be loaded is determined according to the statistical result.
Preferably, the statistics on the recorded features are computed based on a preset statistical model.
Preferably:
based on the preset statistical model, each auxiliary lexicon is scored according to the recorded features, and the auxiliary lexicons whose scores are higher than a preset threshold are determined to be the auxiliary lexicons that need to be loaded.
Preferably:
based on the preset statistical model, a comprehensive evaluation of all auxiliary lexicons is carried out according to the recorded features, the probability that each auxiliary lexicon needs to be loaded is calculated, and the auxiliary lexicons whose probability ranks above a preset threshold are determined to be the auxiliary lexicons that need to be loaded.
Preferably, the statistics on the recorded features are computed based on a preset rule model.
Preferably, the input information of the user further comprises:
the content of the user lexicon, the environment information of the user's input, and the user's input behavior.
Preferably, the method further comprises:
recommending or automatically loading the auxiliary lexicon that needs to be loaded.
Preferably, information about the loaded auxiliary lexicons is saved to a server.
A device for determining the auxiliary lexicon that needs to be loaded, comprising:
an information collection unit, configured to collect the input information of a user, the input information of the user comprising the user's input sequence;
an analysis unit, configured to analyze the collected information, extract from the input sequence the features in the input sequence that can characterize user interest, and record the features in the information that can characterize user interest;
a judging unit, configured to determine, according to the recorded features, the auxiliary lexicon that needs to be loaded.
Preferably, the analysis unit comprises:
a judgment subunit, configured to judge whether the information contains a feature that matches a feature in a preset feature list;
a recording subunit, configured to record the matching feature.
Preferably, the judging unit comprises:
a trigger subunit, configured to judge whether a trigger condition is met;
a first execution subunit, configured to determine, according to the recorded features, the auxiliary lexicon that needs to be loaded.
Preferably, the judging unit comprises:
a statistics subunit, configured to compute statistics on the recorded features;
a second execution subunit, configured to determine the auxiliary lexicon that needs to be loaded according to the statistical result.
Preferably:
the statistics subunit computes the statistics on the recorded features based on a preset statistical model.
Preferably, the statistics subunit comprises:
a scoring subunit, configured to score each auxiliary lexicon according to the recorded features, based on the preset statistical model;
a comparison subunit, configured to determine the auxiliary lexicons whose scores are higher than a preset threshold to be the auxiliary lexicons that need to be loaded.
Preferably, the statistics subunit comprises:
a probability calculation subunit, configured to carry out, based on the preset statistical model, a comprehensive evaluation of all auxiliary lexicons according to the recorded features and to calculate the probability that each auxiliary lexicon may need to be loaded;
a selection subunit, configured to determine the auxiliary lexicons whose probability ranks above a preset threshold to be the auxiliary lexicons that need to be loaded.
Preferably:
the statistics subunit computes the statistics on the recorded features based on a preset rule model.
Preferably:
the information collection unit is further configured to collect the content of the user lexicon, the environment information of the user's input, and the user's input behavior.
Preferably, the device further comprises:
an event response unit, configured to recommend or automatically load the auxiliary lexicon that needs to be loaded.
Preferably, the device further comprises:
an account management unit, configured to bind a user and to save information about the loaded auxiliary lexicons to a server.
An input method system, comprising:
an information collection unit, configured to collect the input information of a user, the input information comprising the user's input sequence, the content of the user lexicon, the environment information of the user's input, and the user's input behavior;
an analysis unit, configured to analyze the collected information, extract from the input sequence the features in the input sequence that can characterize user interest, and record the features in the information that can characterize user interest;
a judging unit, configured to determine, according to the recorded features, the auxiliary lexicon that needs to be loaded;
an event response unit, configured to recommend or automatically load the auxiliary lexicon that needs to be loaded.
Preferably, the judging unit comprises:
a trigger subunit, configured to judge whether a trigger condition is met;
a first execution subunit, configured to determine, according to the recorded features, the auxiliary lexicon that needs to be loaded.
Preferably, the judging unit comprises:
a statistics subunit, configured to compute statistics on the recorded features;
a second execution subunit, configured to determine the auxiliary lexicon that needs to be loaded according to the statistical result.
Preferably:
the statistics subunit computes the statistics on the recorded features based on a preset statistical model.
Preferably: the statistics subunit computes the statistics on the recorded features based on a preset rule model.
Preferably, the system further comprises:
an account management unit, configured to bind a user and to save information about the loaded auxiliary lexicons to a server.
According to the specific embodiments provided by the invention, the invention achieves the following technical effects:
First, the invention analyzes the user's input information and uses it as the main basis for analyzing and judging user interest. Because features that characterize user interest can be extracted more accurately from the user's input information, the accuracy of judging user interest is improved, which provides a reliable basis for determining the auxiliary lexicon that needs to be loaded.
Second, by accumulating data over a period of time, statistics can be computed on the recorded features, which further improves the accuracy of the judgment.
Third, when the user's input information is collected, not only the user's input sequence is collected; the analysis can also draw on combined information such as the user's current input environment, the user's input behavior and the entries in the user lexicon (the user's long-term input). This helps to filter out interfering information and makes the judged user interest more accurate.
Fourth, information about loaded auxiliary lexicons (the list of loaded auxiliary lexicons, entry lists, auxiliary lexicon usage and so on) can be stored on a server through account management. After logging in to the account, the user obtains this information from the server, so that it can also be used on other computers. In addition, since several people may use the same computer, the account management mechanism also avoids interference between different users.
Fifth, once the user's professional field or interests are known, the invention can load the corresponding auxiliary lexicons during the first-time setup wizard, during the upgrade wizard, or in real time while the user is typing, and the loading can be done by recommendation prompt or automatically, which makes loading more flexible and user-friendly. Loading by recommendation involves interaction with the user, which helps to obtain more information that accurately characterizes user interest and reduces the chance of harming the user experience.
Brief description of the drawings
Fig. 1 is a flowchart of the method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the first device provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the second device provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of the third device provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of the fourth device provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of the fifth device provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of the sixth device provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of the first input method system provided by an embodiment of the present invention;
Fig. 9 is a schematic diagram of the second input method system provided by an embodiment of the present invention;
Fig. 10 is a schematic diagram of the third input method system provided by an embodiment of the present invention;
Fig. 11 is a schematic diagram of the fourth input method system provided by an embodiment of the present invention;
Fig. 12 is a schematic diagram of the fifth input method system provided by an embodiment of the present invention.
Detailed description of the embodiments
The invention provides a method for determining the auxiliary lexicon that needs to be loaded. To make the above objects, features and advantages of the invention clearer, the invention is described in more detail below with reference to the drawings and specific embodiments.
Referring to Fig. 1, the method provided by the invention for determining the auxiliary lexicon that needs to be loaded comprises the following steps:
S101: collect the user's input information;
In the embodiments of the invention, the input information may be the user's input sequence, the user's input environment, the user's input behavior or similar information. The input mode may include keyboard symbols, handwriting, voice input and so on, so the input sequence may include coded character strings, handwriting information, voice input information and the like.
S102: analyze the collected information and record the features in it that can characterize user interest;
The invention uses the information entered by the user as the basis for judging user interest, and therefore extracts the features that can characterize user interest directly from the user's input information. For example, in the user's input sequence, the features may include the entries and frequencies of feature words such as "chemical industry" and "Warcraft". Feature words also differ in how discriminative they are: the entry "ask group", for instance, appears often in online games, but it cannot tell one online game from another. The features may therefore include not only the frequency of the feature words, the number of entries and how closely the feature words cluster in the input sequence, but also information such as the discrimination of each feature word.
S103: determine the auxiliary lexicon that needs to be loaded according to the recorded features.
As can be seen, the embodiment of the invention analyzes the user's input information and extracts from it the features that can characterize user interest. Because these features reflect the user's interests to a greater degree, the information representing user interest can be extracted more accurately than when interest is judged from the external input environment, which provides a more reliable basis for determining the auxiliary lexicon that needs to be loaded.
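Steps S101 to S103 can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the patented implementation: the function names, the contents of `FEATURE_TABLE` and the simple hit-count decision with `min_hits` are all hypothetical.

```python
from collections import Counter

# Hypothetical feature table: feature word -> auxiliary lexicon it points to.
FEATURE_TABLE = {
    "Alterac Valley": "warcraft",
    "battleground": "warcraft",
    "catalyst": "chemical",
}

def collect_input(committed_words):
    """S101: gather the user's input sequence (here, already-committed words)."""
    return list(committed_words)

def analyze(words):
    """S102: record the features in the input that can characterize interest."""
    return Counter(w for w in words if w in FEATURE_TABLE)

def determine_lexicons(features, min_hits=2):
    """S103: pick lexicons whose feature words were seen at least min_hits times."""
    hits = Counter()
    for word, count in features.items():
        hits[FEATURE_TABLE[word]] += count
    return sorted(lex for lex, n in hits.items() if n >= min_hits)
```

A real system would replace the hit count in `determine_lexicons` with one of the statistical or rule models described later in the text.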
In practice, to make it convenient to record the features in the user's input information, a preset feature table can be used. The features that can characterize user interest are stored in this feature table, which is loaded into memory when the user invokes the input method. When the user's input information is analyzed, the input can then be compared directly against the feature table. For example, if a feature word matching a feature in the table appears in the input information, the entry, word frequency, discrimination and other features of the matching feature word are recorded.
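The table lookup described above might look like the following sketch. The record layout, the table contents and the 0 to 1 range for discrimination are assumptions made for the example; the patent does not specify a data format.

```python
from dataclasses import dataclass

@dataclass
class FeatureRecord:
    entry: str
    discrimination: float  # assumed to lie in 0..1; higher means more specific
    frequency: int = 0

# Hypothetical preset feature table, loaded into memory with the input method.
PRESET_FEATURES = {"Alterac Valley": 0.9, "magic": 0.2}

def record_matches(input_words, records=None):
    """Compare each input word with the preset table and record the matches."""
    records = {} if records is None else records
    for word in input_words:
        if word in PRESET_FEATURES:
            rec = records.setdefault(
                word, FeatureRecord(word, PRESET_FEATURES[word]))
            rec.frequency += 1
    return records
```

Passing an existing `records` dictionary back in lets the counts accumulate across several calls.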
To make the method more effective, data can be accumulated over a period of time and the user's interest judged from the accumulated data. A trigger condition can therefore be set, and when the trigger condition is met, the auxiliary lexicon that needs to be loaded is determined from the recorded features. The trigger condition may be that the accumulation time has reached a predetermined length, that the accumulated data has reached a predetermined amount, that the user has finished using the input method, and so on. In addition, statistics can be computed over the accumulated features, and the auxiliary lexicon that needs to be loaded determined from the statistical result.
The recorded features can be aggregated in many ways. Several preferred ways are introduced below; they only illustrate how the invention may be implemented and should not be construed as limiting the invention.
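The three example trigger conditions (elapsed accumulation time, amount of accumulated data, end of the input session) can be combined as in the sketch below. The class name, the default limits and the injectable clock are assumptions chosen for illustration.

```python
import time

class Trigger:
    """Fires when any of the example trigger conditions is met."""

    def __init__(self, max_seconds=3600.0, max_records=200, now=time.monotonic):
        self.max_seconds = max_seconds  # assumed accumulation period
        self.max_records = max_records  # assumed amount of accumulated data
        self._now = now                 # clock is injectable for testing
        self._start = now()

    def should_fire(self, record_count, session_ended=False):
        elapsed = self._now() - self._start
        return (elapsed >= self.max_seconds
                or record_count >= self.max_records
                or session_ended)
```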
Mode one: statistics on the recorded features are computed based on a preset statistical model. The preset statistical model can be trained by the developer during the development phase from an annotated corpus (a number of input samples from individual users, each typically labeled by hand with whether a given dictionary should be loaded; the task can be regarded as a multi-class classification problem in machine learning). Because each feature has a different reference value for judging user interest, different features can be given different weights. For example, each entry has both a discrimination and a frequency: "Alterac Valley" and "rough pawl cave" are highly discriminative, and their appearance almost by itself implies an interest in Warcraft, but they occur relatively rarely; "magic" and "injury" are just the opposite, because these two entries also appear frequently in other contexts. Both factors generally need to be considered together so that each entry is given a suitable weight. The trained statistical model can therefore take the form of weight vectors over a set of features. For example, for the auxiliary lexicon of a certain category the weight vector may be <0.33, -0.11, 0.1 ... 0.03, 0.001>, meaning that the weight of feature 1 is 0.33 and the weight of feature 2 is -0.11 (a negative value means that the appearance of this feature counts against recommending this dictionary, that is, it is a negative feature). The features recorded from the input content can likewise be expressed as a vector; each feature parameter can simply be recorded as 1 if the feature appears and 0 if it does not, or be represented by a series of discrete values. In this way the qualitative problem of judging user interest is converted into a quantitative one, the statistical result is more reliable, and the auxiliary lexicon that needs to be loaded is then determined from that result.
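The encoding of recorded features as a vector can be sketched as follows. `FEATURE_ORDER` is a hypothetical feature list for one lexicon category, and the use of raw counts as the "discrete values" is an assumption; the text does not say which discrete encoding is intended.

```python
# Hypothetical feature order for one lexicon category; in the scheme above
# the matching weights would come from the trained statistical model.
FEATURE_ORDER = ["Alterac Valley", "magic", "injury", "fullscreen"]

def to_feature_vector(observed_counts, binary=True):
    """Encode recorded features as a vector aligned with FEATURE_ORDER.

    binary=True  -> 1 if the feature appeared, 0 otherwise
    binary=False -> the raw count, as one possible discrete encoding
    """
    vector = []
    for name in FEATURE_ORDER:
        count = observed_counts.get(name, 0)
        vector.append((1 if count > 0 else 0) if binary else count)
    return vector
```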
When the recorded features are aggregated with the preset model to determine the auxiliary lexicon that needs to be loaded, several approaches are again possible. For example, based on the preset statistical model, each auxiliary lexicon can be scored according to the recorded features, and the auxiliary lexicons whose scores exceed a certain preset threshold are determined to be those that need to be loaded. The score can be obtained as the dot product of the recorded feature vector and the feature weight vector in the model. For example, suppose that in one round of statistics the recorded feature vector for the auxiliary lexicon of a certain category is <1, 0, 1.33, 0.78, 0.46> and the feature weight vector in the statistical model is <0.33, -0.11, 0.1 ... 0.03, 0.001>. The score of this category for this round is then 1 × 0.33 + 0 × (-0.11) + ... + 0.46 × 0.001. It is then judged whether this score is greater than a certain preset threshold; if it is, the auxiliary lexicon of this category is determined to be one that needs to be loaded. Alternatively, the probability that each auxiliary lexicon needs to be loaded can be calculated from the recorded features based on the preset model. The auxiliary lexicons whose probability is higher than a preset threshold are then determined to be those that may need to be loaded, or the probabilities are sorted from high to low and the top-ranked auxiliary lexicons are determined to be those that need to be loaded.
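The dot-product scoring and the two selection strategies (score above a threshold, or the top-ranked probabilities) can be sketched as below. The function names and the dictionary-based model layout are assumptions, and the numbers in the usage note are made up for illustration rather than taken from the patent's partly elided example.

```python
def score(features, weights):
    """Dot product of a recorded feature vector and a lexicon's weight vector."""
    if len(features) != len(weights):
        raise ValueError("feature and weight vectors must have the same length")
    return sum(f * w for f, w in zip(features, weights))

def select_by_threshold(feature_vectors, model, threshold):
    """Return the lexicons whose score exceeds the preset threshold."""
    return [name for name, w in model.items()
            if score(feature_vectors[name], w) > threshold]

def select_top_k(probabilities, k):
    """Alternative: rank lexicons by load probability and keep the first k."""
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    return ranked[:k]
```

For instance, with a hypothetical weight vector `[0.5, -0.25, 0.25]` and a recorded feature vector `[1, 0, 2]`, `score` returns 1.0, so the lexicon is selected under a threshold of 0.75.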
Mode two: statistics on the recorded features are computed based on a preset rule model. The rule model can consist of rules that the developer has set from experience or by other means, and a rule can take a very simple form. For example: if 7 or more of 50 consecutively entered words hit the feature table of the Warcraft dictionary, the Warcraft dictionary is determined to be an auxiliary lexicon that may need to be loaded.
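The example rule (at least 7 hits among 50 consecutive words) maps naturally onto a sliding window. The sketch below is one possible reading; the class name and the choice to evaluate the rule after every word are assumptions.

```python
from collections import deque

class WindowRule:
    """Fires when at least min_hits of the last `window` words hit the table."""

    def __init__(self, feature_words, window=50, min_hits=7):
        self.feature_words = set(feature_words)
        self.min_hits = min_hits
        self._recent = deque(maxlen=window)  # 1 for a hit, 0 for a miss

    def feed(self, word):
        """Add one entered word and report whether the rule is satisfied."""
        self._recent.append(1 if word in self.feature_words else 0)
        return sum(self._recent) >= self.min_hits
```

Because the deque has a fixed maximum length, old words fall out of the window automatically and the hit count always refers to the most recent words only.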
The user input information collected in the embodiments of the invention may be the user's direct input sequence. In addition, the environment information of the user's input, the user's input behavior and the content the user entered in the past all help to some extent in judging the user's interests. If this information is taken into account, the user's interests can be judged more accurately, which provides a stronger basis for determining the auxiliary lexicon that needs to be loaded. In a preferred embodiment of the invention, therefore, the collected input information includes not only the user's direct input sequence but also the environment information of the input (including the host program, the list of installed programs and the firewall settings), the user's input behavior (including whether words are committed character by character, whether the backspace key is used, the average number of page turns when selecting a word, and so on) and the entries in the user lexicon (entries the user entered in the past). This information is analyzed in the same way, and the features appearing in it are recorded. For example, during analysis the user's direct input sequence can serve as the main basis for judgment while the user's input behavior is used to filter it: an entry that was entered by mistake and then deleted by the user is no longer counted as a candidate feature word, which filters noise out of the data. Finally, statistics are computed over all recorded features, so that the information characterizing user interest is obtained more accurately and it can be determined which auxiliary lexicons may need to be loaded.
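The noise-filtering idea above, where entries the user deleted after entering them are not counted as features, can be sketched as follows. The `(action, entry)` event format is an assumption made for the example; the patent does not describe how input behavior is logged.

```python
def filter_deleted(events):
    """Drop committed entries that the user later deleted.

    events is a list of (action, entry) tuples, where action is "commit"
    or "delete"; this event format is hypothetical.
    """
    kept = []
    for action, entry in events:
        if action == "commit":
            kept.append(entry)
        elif action == "delete" and entry in kept:
            # Remove the most recent occurrence of the deleted entry.
            idx = len(kept) - 1 - kept[::-1].index(entry)
            del kept[idx]
    return kept
```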
Once the above method has determined the auxiliary lexicons that may need to be loaded, the step of loading them can be carried out. The invention can load auxiliary lexicons automatically: after the auxiliary lexicons that may need to be loaded have been determined, they can be loaded directly if the confidence of the judgment is high. When the confidence is not very high, however, automatic loading may actually harm the user experience and annoy the user. The invention can therefore also load by recommendation: after the auxiliary lexicons that may need to be loaded have been determined, a recommendation is first made to the user, prompting that these auxiliary lexicons can be loaded, and they are loaded only after the user confirms or makes a selection.
Another benefit of recommended loading is that it enables interaction with the user, and the user's feedback allows the user's true interests to be judged better. Automatic loading also carries a hidden risk: the user may have several applications open at once and switch between them. Several auxiliary lexicons may then be judged to need loading, and if all of them were loaded automatically they might occupy too much memory and affect system performance. With recommendation, the user chooses which dictionaries to load according to actual need, and auxiliary lexicons that were judged to be needed but were not chosen by the user are not loaded, which avoids this problem.
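The choice between automatic and recommended loading can be expressed as a mapping from confidence to action. The two thresholds and the third "ignore" outcome are assumptions added for the example; the text only distinguishes high confidence (load directly) from lower confidence (recommend first).

```python
def choose_action(confidence, auto_threshold=0.9, recommend_threshold=0.5):
    """Map the confidence of a judgment to a loading action.

    Both thresholds are illustrative values, not taken from the patent.
    """
    if confidence >= auto_threshold:
        return "auto_load"
    if confidence >= recommend_threshold:
        return "recommend"
    return "ignore"
```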
Whether loading is automatic or recommended, however, the following problem may arise. Suppose the user's program is running in full-screen mode at the time (as with online games such as Warcraft) and it is judged that an auxiliary lexicon should be automatically loaded or recommended, so that a request to download the dictionary is sent to the network server or a recommendation prompt pops up. If the request is intercepted by the network firewall and causes a prompt box to pop up, this will greatly annoy a user who is in the middle of a game and harm the user experience. The embodiments of the invention can therefore also define recommendation rules, for example "if the user is running a program in full-screen mode, do not recommend or automatically load". In practice such a recommendation rule can be combined with the statistical model or rule model used for the judgment: for example, after the statistical score exceeds a certain threshold, the recommendation rule must also be satisfied before anything is recommended or automatically loaded. Alternatively, no such recommendation rule is defined, and whether the program is full-screen is instead treated as a negative feature of the statistical model whose weight is set to a negative value of large absolute value, which achieves the same purpose.
All of the above concerns recommending or automatically loading auxiliary lexicons while the user is typing in real time. To make the application more flexible, loading or recommendation can also be carried out the first time the input method is installed with the setup wizard, in which case the auxiliary lexicons that may need to be loaded are judged mainly from environment information such as the installed programs. It can also be carried out during the upgrade wizard, in which case the judgment is based mainly on the content of the user lexicon and the environment information.
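The two ways of handling full-screen mode described above, a hard recommendation rule applied after scoring or a strongly negative feature weight inside the score, can be compared in a short sketch. The function names and the weight of -10.0 are assumptions for illustration.

```python
def may_prompt(score, threshold, fullscreen):
    """Variant 1: a hard recommendation rule applied on top of the score."""
    return score > threshold and not fullscreen

def score_with_fullscreen_feature(base_score, fullscreen, fullscreen_weight=-10.0):
    """Variant 2: treat full-screen as a negative feature with a large weight."""
    return base_score + (fullscreen_weight if fullscreen else 0.0)
```

With a sufficiently large negative weight, the second variant pushes the score below any realistic threshold, so both variants suppress prompts while a full-screen program is running.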
In addition, the invention can implement an iterative recommendation process: when the accumulated data reveals a new user interest, and hence a new auxiliary lexicon that may need to be loaded, a new recommendation is made. Likewise, when it is judged that the user makes only limited use of a certain auxiliary lexicon, the user can be prompted to unload that dictionary.
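One iteration of this recommend-and-unload cycle can be sketched as a pair of set operations. How usage is measured is not specified in the text, so the `usage_counts` input and the `min_usage` cutoff are assumptions.

```python
def plan_updates(loaded, judged_needed, usage_counts, min_usage=5):
    """Return (to_recommend, to_suggest_unload) for one iteration.

    loaded:        set of lexicons currently loaded
    judged_needed: set of lexicons the latest statistics say are needed
    usage_counts:  how often each loaded lexicon supplied a chosen candidate
    min_usage:     assumed cutoff below which a lexicon counts as little used
    """
    to_recommend = sorted(judged_needed - loaded)
    to_unload = sorted(lex for lex in loaded
                       if usage_counts.get(lex, 0) < min_usage)
    return to_recommend, to_unload
```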
In above-described method, the relevant information (comprising the information such as the auxiliary lexicon list of loading, entry list, auxiliary lexicon service condition) having loaded auxiliary lexicon can be kept at computing machine this locality usually.But in actual applications, also may there are such two kinds of situations: a kind of is that different members in family uses same computer, then has more people to use same computer in the public places such as Internet bar.Because different people has different professional domains or interest usually, be kept at the relevant information loading auxiliary lexicon on same computer, the interference between different users may be caused.Another kind of situation is, same user may use different computing machines, as office computer, home computer, other portable computers etc., if the relevant information loading auxiliary lexicon to be kept at this locality, then this user cannot use these information on other computing machines.Therefore, in a preferred embodiment of the invention, account management mechanism can be adopted, distinguish different users according to account name, and utilize account management server to carry out synchronizing information, the relevant information loading auxiliary lexicon can be preserved on the server.After such user logs in, then can make corresponding recommendation according to different users, simultaneously, user can obtain from server and upgrade, the relevant information loading auxiliary lexicon is updated on any computing machine of current use, even if make user change computing machine, also can on the computing machine after replacing directly use loaded the relevant information of auxiliary lexicon.
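The account mechanism can be illustrated with an in-memory stand-in for the server. `LexiconServer` and `login_and_sync` are hypothetical names, and no real network protocol or storage format is implied.

```python
class LexiconServer:
    """In-memory stand-in for the account management server (hypothetical)."""

    def __init__(self):
        self._by_account = {}

    def save(self, account, loaded_lexicons):
        """Store the list of loaded lexicons under the given account name."""
        self._by_account[account] = sorted(loaded_lexicons)

    def fetch(self, account):
        """Return a copy of the stored list, or an empty list if unknown."""
        return list(self._by_account.get(account, []))

def login_and_sync(server, account):
    """On login, pull this account's lexicon list onto the current computer."""
    return set(server.fetch(account))
```

Keying the stored information by account rather than by machine is what keeps users of a shared computer from interfering with one another and lets one user carry the same lexicons across computers.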
To make the embodiments introduced above easier to understand, two examples from concrete application scenarios are described below.
Usage scenario one: a recommendation is made when input method user A upgrades the input method to a new version that supports auxiliary lexicon recommendation. The installer analyzes the local user thesaurus and determines that the entries in user A's thesaurus relate with high confidence to several professional/interest categories (chemical industry, financing, and European and American films), while the confidence for four other categories (Jiangxi, outdoor sports, automobile, and real estate) is relatively low. At one step of the setup wizard, the program dynamically generates a dialog box asking the user whether to load these auxiliary lexicons: the chemical industry, financing, and European and American film lexicons are checked by default; the Jiangxi, outdoor sports, automobile, and real estate lexicons are listed but not checked; and a "more lexicons" option is provided for searching lexicons. User A additionally checks the "Jiangxi" and "outdoor sports" auxiliary lexicons. After the auxiliary lexicons are loaded, the user's input data continue to be used in subsequent analysis; when the user's interests are found to have changed, new auxiliary lexicons may be recommended and the user may be prompted to unload existing ones.
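The dialog behavior in scenario one can be sketched with two confidence cutoffs. This is an illustration under assumptions: the confidence values and the `check_at` and `show_at` cutoffs are invented for the example, since the description only distinguishes "higher" from "relatively lower" confidence.

```python
def build_dialog(confidences, check_at=0.8, show_at=0.4):
    """Return (lexicon, checked_by_default) pairs for the recommendation dialog.

    Lexicons at or above check_at are pre-checked; those between show_at and
    check_at are listed unchecked; the rest are left to the search function.
    """
    items = []
    for name, conf in sorted(confidences.items(), key=lambda kv: -kv[1]):
        if conf >= check_at:
            items.append((name, True))
        elif conf >= show_at:
            items.append((name, False))
    return items


# Hypothetical confidence values for user A's thesaurus.
confidences = {
    "chemical industry": 0.92,
    "financing": 0.88,
    "Jiangxi": 0.55,
    "gardening": 0.10,
}
dialog = build_dialog(confidences)
```

Pre-checking only the high-confidence lexicons keeps the default choice conservative while still letting the user opt in to the weaker matches.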
Usage scenario two: user B plays the online game World of Warcraft in an Internet cafe. After entering full-screen mode, the user invokes the input method to discuss with another player going to the Alterac Valley battleground together to level up. The feature information obtained by the input method software includes: the host program is World of Warcraft; the user's input contains feature words with high discrimination such as "Alterac Valley", "rough pawl cave", "white wolf", "glug Longde", and "thunder lance medal"; it also contains auxiliary feature words with relatively low discrimination such as "forming a team", "alliance", "coordinate", "battlefield", and "magic bottle"; and the user committed these feature words to the screen character by character when typing them. By analysis, the input method judges that this user needs the Warcraft auxiliary lexicon, and it detects that the local firewall rule is to prompt the current user on network interaction. In this case the input method chooses to ask the user whether to load the lexicon only after the user exits the game. The user selects "load" and at the same time allows the firewall rule for network operations related to lexicon loading to be changed to "automatic". When user B next enters World of Warcraft, the input method software loads the Warcraft auxiliary lexicon, and updates to this lexicon are downloaded automatically. In this scenario, in an Internet cafe environment, the user has not logged in to an input method account, and user interest is judged in real time from the user's input sequence.
Corresponding to the method provided by the embodiments of the present invention, an embodiment of the present invention also provides a device for determining the auxiliary lexicon that needs to be loaded. Referring to Fig. 2, the device comprises:
Information acquisition unit U201, for collecting the input information of the user;
Analysis unit U202, for analyzing the collected information and recording the features in that information that can characterize user interest;
Judging unit U203, for determining the auxiliary lexicon that needs to be loaded according to the recorded features.
When the user invokes the input method to enter text, information acquisition unit U201 collects the user's input information; analysis unit U202 analyzes the collected information and records the features occurring in it that can characterize user interest; judging unit U203 then determines the auxiliary lexicon that needs to be loaded according to the recorded features.
Analysis unit U202 can analyze the collected information in various ways. Preferably, it judges whether the collected information contains a feature that matches a feature in a preset feature list and, if so, records the matching feature. Referring to Fig. 3, analysis unit U302 may further comprise:
Judgment subunit U3021, for judging whether the collected information contains a feature that matches a feature in the preset feature list;
Recording subunit U3022, for recording the matching feature.
Information acquisition unit U301 and judging unit U303 in Fig. 3 are identical to information acquisition unit U201 and judging unit U203 in Fig. 2, respectively.
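The judgment and recording subunits can be sketched as a lookup against a preset feature list. This is a minimal sketch: the feature words, their lexicon labels, and the use of plain substring matching are assumptions made for illustration, not details taken from the description.

```python
# Hypothetical preset feature list: feature word -> auxiliary lexicon it points to.
PRESET_FEATURES = {
    "Alterac Valley": "Warcraft",
    "battleground": "Warcraft",
    "polymer": "chemical industry",
}


def match_features(collected_text, preset=PRESET_FEATURES):
    """Judge which preset features occur in the collected text and record them."""
    recorded = []
    for feature in preset:
        # Judgment step: does the collected information contain this feature?
        if feature in collected_text:
            # Recording step: keep the matching feature for the judging unit.
            recorded.append(feature)
    return recorded


recorded = match_features("meet at Alterac Valley before the battleground opens")
```

The recorded list is what a judging unit would later consume; nothing is decided at this stage.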
To judge the user's interests more reliably, the auxiliary lexicon that needs to be loaded can be determined only after data have accumulated for some time. Therefore, referring to Fig. 4, judging unit U403 may comprise the following two subunits:
Trigger subunit U4031, for judging whether a trigger condition is met;
First execution subunit U4032, for determining the auxiliary lexicon that needs to be loaded according to the recorded features.
Information acquisition unit U401 and analysis unit U402 in Fig. 4 are identical to information acquisition unit U201 and analysis unit U202 in Fig. 2, respectively.
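The interplay of the trigger subunit and the first execution subunit can be sketched as a gate in front of the decision. The trigger condition used here (a minimum number of recorded features) is only an assumed example; the class and function names are likewise hypothetical.

```python
class TriggerSubunit:
    """Reports whether enough features have accumulated (assumed condition)."""

    def __init__(self, min_features):
        self.min_features = min_features

    def is_met(self, recorded):
        return len(recorded) >= self.min_features


def judge(recorded, trigger, decide):
    """Run the decision function only once the trigger condition holds."""
    if not trigger.is_met(recorded):
        return None  # keep accumulating data; no decision yet
    return decide(recorded)


trigger = TriggerSubunit(min_features=3)
early = judge(["a", "b"], trigger, decide=lambda feats: ["Warcraft"])
later = judge(["a", "b", "c"], trigger, decide=lambda feats: ["Warcraft"])
```

Returning `None` before the trigger fires makes the "not yet decided" state explicit, which differs from deciding that no lexicon is needed.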
In addition, referring to Fig. 5, judging unit U503 may also comprise the following two subunits:
Statistics subunit U5031, for performing statistics on the recorded features;
Second execution subunit U5032, for determining the auxiliary lexicon that needs to be loaded according to the statistical result.
Statistics subunit U5031 can perform statistics on the recorded features in various ways; one preferred way is to do so based on a preset statistical model. In this case, statistics subunit U5031 may comprise:
Scoring subunit U50311, for scoring each auxiliary lexicon according to the recorded features, based on the preset model;
Comparison subunit U50312, for determining the auxiliary lexicons whose scores are higher than a preset threshold to be the auxiliary lexicons that need to be loaded.
Information acquisition unit U501 and analysis unit U502 in Fig. 5 are identical to information acquisition unit U201 and analysis unit U202 in Fig. 2, respectively.
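The scoring and comparison subunits can be sketched as a weighted sum per lexicon followed by a threshold test. The additive weighting below is an assumed stand-in for the "preset statistical model", which the description does not spell out; the feature words and weights are hypothetical.

```python
# Hypothetical model: feature word -> (auxiliary lexicon, weight).
# High-discrimination feature words carry larger weights.
FEATURE_WEIGHTS = {
    "Alterac Valley": ("Warcraft", 3.0),
    "battleground": ("Warcraft", 1.0),
    "coordinates": ("Warcraft", 0.5),
    "polymer": ("chemical industry", 2.0),
}


def score_lexicons(recorded):
    """Scoring step: add up the weights of recorded features per lexicon."""
    scores = {}
    for feature in recorded:
        if feature in FEATURE_WEIGHTS:
            lexicon, weight = FEATURE_WEIGHTS[feature]
            scores[lexicon] = scores.get(lexicon, 0.0) + weight
    return scores


def lexicons_above(scores, threshold):
    """Comparison step: keep lexicons whose score exceeds the threshold."""
    return sorted(name for name, score in scores.items() if score > threshold)


scores = score_lexicons(["Alterac Valley", "battleground", "coordinates", "polymer"])
selected = lexicons_above(scores, threshold=3.0)
```

With these assumed weights a single low-discrimination word cannot push a lexicon over the threshold, which mirrors the distinction between feature words and auxiliary feature words.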
Referring to Fig. 6, statistics subunit U6031 may alternatively comprise:
Probability calculation subunit U60311, for comprehensively evaluating all auxiliary lexicons according to the recorded features, based on the preset model, and calculating the probability that each auxiliary lexicon needs to be loaded;
Selection subunit U60312, for determining the auxiliary lexicons whose probabilities rank ahead of a preset threshold to be the auxiliary lexicons that need to be loaded.
Information acquisition unit U601, analysis unit U602, judging unit U603, and second execution subunit U6032 in Fig. 6 are identical to information acquisition unit U501, analysis unit U502, judging unit U503, and second execution subunit U5032 in Fig. 5, respectively.
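The probability calculation and selection subunits can be sketched as normalization followed by a rank cutoff. Two assumptions are made here: probabilities are obtained by normalizing hypothetical per-lexicon scores so they sum to one, and the "preset threshold" is read as a rank cutoff (the top `cutoff` lexicons). The description does not fix either choice.

```python
def load_probabilities(scores):
    """Evaluate all lexicons together: normalize scores into probabilities."""
    total = sum(scores.values())
    if total == 0:
        return {name: 0.0 for name in scores}
    return {name: score / total for name, score in scores.items()}


def top_ranked(probabilities, cutoff):
    """Select the lexicons whose probability ranks within the first `cutoff` places."""
    ranked = sorted(probabilities, key=lambda name: (-probabilities[name], name))
    return ranked[:cutoff]


# Hypothetical raw scores for three auxiliary lexicons.
probs = load_probabilities({"Warcraft": 6.0, "chemical industry": 3.0, "automobile": 1.0})
chosen = top_ranked(probs, cutoff=2)
```

Unlike a per-lexicon score threshold, ranking compares the lexicons against each other, so a fixed number is selected even when all probabilities are modest.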
In another preferred manner, statistics subunit U6031 performs statistics on the recorded features based on preset rules.
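A rule-based variant can be sketched as a set of predicates over the recorded features. The single rule below (host program plus a minimum number of feature words) is purely hypothetical; the description does not list concrete rules at this point.

```python
def rule_warcraft(recorded):
    """Hypothetical rule: the game is the host program and two or more feature words appeared."""
    return (
        recorded.get("host_program") == "World of Warcraft"
        and len(recorded.get("feature_words", [])) >= 2
    )


# Preset rules: auxiliary lexicon name -> predicate over the recorded features.
RULES = {"Warcraft": rule_warcraft}


def apply_rules(recorded, rules=RULES):
    """Return the lexicons whose rule is satisfied by the recorded features."""
    return [name for name, rule in rules.items() if rule(recorded)]


in_game = apply_rules(
    {"host_program": "World of Warcraft", "feature_words": ["Alterac Valley", "battleground"]}
)
elsewhere = apply_rules(
    {"host_program": "text editor", "feature_words": ["Alterac Valley", "battleground"]}
)
```

Rules of this kind are easy to audit and can combine environmental information with input content, at the cost of being maintained by hand.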
To obtain information that accurately characterizes user interest and to filter out interfering information, in a preferred embodiment of the invention the user input information collected by information acquisition unit U201 may comprise the user's input sequence, the content of the user thesaurus, the environmental information of the user's input, and the input behavior. Analysis unit U202 analyzes all of this information and records the features in it that can characterize user interest.
After the auxiliary lexicon that needs to be loaded has been determined, it can be recommended to the user for loading or loaded automatically. Referring to Fig. 7, the device further comprises event response unit U704, for recommending or automatically loading the auxiliary lexicon that needs to be loaded.
To avoid interference between users when several users share one computer, and to avoid the problem that a user working on several computers cannot directly use the relevant information about loaded auxiliary lexicons, the device may further comprise account management unit U705, for binding the user and saving the relevant information about loaded auxiliary lexicons to a server. The user logs in to the server through account management unit U705, so that the user's identity can be recognized, recommendations can be made separately for different users, and the relevant information about loaded auxiliary lexicons is saved on the server. Even when the user invokes the input method on another computer, the information about loaded auxiliary lexicons can be obtained from the server through account management unit U705.
Information acquisition unit U701, analysis unit U702, and judging unit U703 in Fig. 7 are identical to information acquisition unit U201, analysis unit U202, and judging unit U203 in Fig. 2, respectively.
Fig. 8 shows an input method system. The system comprises:
Information acquisition unit U801, for collecting the input information of the user, the input information comprising the user's direct input sequence, the content of the user thesaurus, the environmental information of the user's input, and the input behavior;
Analysis unit U802, for analyzing the collected information and recording the features in that information that can characterize user interest;
Judging unit U803, for determining the auxiliary lexicon that needs to be loaded according to the recorded features;
Event response unit U804, for recommending or automatically loading the auxiliary lexicon that needs to be loaded.
In practical applications, the above units can be added to an existing input method system to realize the functions of determining the auxiliary lexicon that needs to be loaded and of recommending or automatically loading it. For ease of understanding, the system is described in detail below through a concrete application example.
Referring to Fig. 9, the input method system may comprise:
Input content receiving unit U901, for receiving the sequence (pinyin, Wubi, natural code, handwriting recognition results, voice sequences, or other input forms) entered by the end user through various input tools (QWERTY keyboard, 9-key keypad, handwriting tablet, etc.) and mapping it to a unified coded sequence.
Decoding unit U902, for parsing the coded sequence passed in by input content receiving unit U901 and handing the result to candidate generation unit U903 to generate candidates.
Candidate generation unit U903, for processing the decoded sequence and generating a candidate list, which is presented to the user for selection through event response unit U908. Candidate generation may first search the input method lexicons (basic lexicon and auxiliary lexicons) provided by resource management unit U904, together with the user thesaurus, for entries that match the input sequence; if none is found, words are composed, with lexicons from different sources given different weights and dynamic programming used to find the optimal path. Loading auxiliary lexicons related to the user's interests is an important supplement to the input method lexicons and can improve the fluency of user input to some extent.
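The weighted word composition mentioned for candidate generation can be sketched as a dynamic program over split points of the coded sequence. Everything concrete here is an assumption for illustration: the toy lexicon entries, the frequency-like scores, the source weights, and the choice to combine them as a sum of logarithms.

```python
import math

# Hypothetical lexicons: code string -> (word, frequency-like score).
BASIC = {"ao": ("奥", 0.3), "shan": ("山", 0.3)}
AUXILIARY = {"aoshan": ("奥山", 0.1)}
# Hypothetical weights given to lexicons from different sources.
SOURCE_WEIGHTS = {"basic": 1.0, "auxiliary": 2.0}


def best_path(code, sources):
    """Return the highest-scoring word sequence covering the whole code string."""
    # best[i] holds (log score, words) for the best segmentation of code[:i].
    best = [(-math.inf, [])] * (len(code) + 1)
    best[0] = (0.0, [])
    for end in range(1, len(code) + 1):
        for start in range(end):
            if best[start][0] == -math.inf:
                continue  # code[:start] cannot be segmented
            piece = code[start:end]
            for source, lexicon in sources:
                if piece in lexicon:
                    word, score = lexicon[piece]
                    total = best[start][0] + math.log(score * SOURCE_WEIGHTS[source])
                    if total > best[end][0]:
                        best[end] = (total, best[start][1] + [word])
    return best[len(code)][1]


with_aux = best_path("aoshan", [("basic", BASIC), ("auxiliary", AUXILIARY)])
without_aux = best_path("aoshan", [("basic", BASIC)])
```

In this toy setup the auxiliary entry wins once its lexicon is loaded, whereas without it the input falls back to two single-character entries, which illustrates how a loaded auxiliary lexicon can change the top candidate.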
Resource management unit U904, for providing the lexicon resources that candidate generation unit U903 uses to generate candidates, including the basic lexicon of the input method, the word-composition information base, the local user thesaurus, the user configuration information, and the auxiliary lexicons selected by the user or loaded automatically.
Information acquisition unit U905, for collecting the input information of the user, the input information comprising the long-term data of the user's input (the content of the user thesaurus) and the short-term data (the content the user is currently entering), as well as the environmental information of the current input and the user's input behavior information.
Analysis unit U906, for analyzing the collected information and recording the features in that information that can characterize user interest.
Judging unit U907, for determining the auxiliary lexicon that needs to be loaded according to the recorded features.
Event response unit U908, for recommending or automatically loading the auxiliary lexicon that needs to be loaded.
Input content receiving unit U901, decoding unit U902, candidate generation unit U903, and resource management unit U904 may be the basic functional units of an input method system. The input method system of the present invention may build on an existing input method system: it collects the user's input sequence and related context information, uses them to judge which auxiliary lexicon needs to be loaded, and then loads that lexicon by recommendation or automatically, so as to make input easier for the user.
Referring to Fig. 10, judging unit U1007 may comprise the following two subunits:
Trigger subunit U10071, for judging whether a trigger condition is met;
First execution subunit U10072, for determining the auxiliary lexicon that needs to be loaded according to the recorded features.
Input content receiving unit U1001, decoding unit U1002, candidate generation unit U1003, resource management unit U1004, information acquisition unit U1005, analysis unit U1006, and event response unit U1008 in Fig. 10 are identical to input content receiving unit U901, decoding unit U902, candidate generation unit U903, resource management unit U904, information acquisition unit U905, analysis unit U906, and event response unit U908 in Fig. 9, respectively.
Referring to Fig. 11, judging unit U1107 may also comprise the following two subunits:
Statistics subunit U11071, for performing statistics on the recorded features;
Second execution subunit U11072, for determining the auxiliary lexicon that needs to be loaded according to the statistical result.
Statistics subunit U11071 may perform statistics on the recorded features based on a preset statistical model, based on preset rules, or by other statistical methods.
Input content receiving unit U1101, decoding unit U1102, candidate generation unit U1103, resource management unit U1104, information acquisition unit U1105, analysis unit U1106, and event response unit U1108 in Fig. 11 are identical to input content receiving unit U901, decoding unit U902, candidate generation unit U903, resource management unit U904, information acquisition unit U905, analysis unit U906, and event response unit U908 in Fig. 9, respectively.
To avoid interference when several users share one computer, and to avoid the problem that a user working on several computers cannot directly use the relevant information about auxiliary lexicons, referring to Fig. 12, the input method system may further comprise account management unit U1209, for binding the user and saving the relevant information about loaded auxiliary lexicons to a server. The relevant information about loaded auxiliary lexicons may include the list of loaded auxiliary lexicons, the entry lists, auxiliary lexicon usage, and so on. After logging in through account management unit U1209, the user can interact with the remote server to obtain updates to the user thesaurus, the user configuration information, the relevant information about loaded auxiliary lexicons, and similar content.
Input content receiving unit U1201, decoding unit U1202, candidate generation unit U1203, resource management unit U1204, information acquisition unit U1205, analysis unit U1206, judging unit U1207, and event response unit U1208 in Fig. 12 are identical to input content receiving unit U901, decoding unit U902, candidate generation unit U903, resource management unit U904, information acquisition unit U905, analysis unit U906, judging unit U907, and event response unit U908 in Fig. 9, respectively.
The method, device, and input method system for determining the auxiliary lexicon that needs to be loaded, as provided by the present invention, have been described in detail above. Specific examples have been used to explain the principle and embodiments of the invention, and the description of the embodiments is intended only to help in understanding the method of the invention and its core idea. A person of ordinary skill in the art may, following the idea of the invention, make changes to the specific embodiments and the scope of application. In summary, this description should not be construed as limiting the present invention.

Claims (28)

1. A method for determining an auxiliary lexicon that needs to be loaded, characterized by comprising:
Collecting the input information of a user, the input information of the user comprising the input sequence of the user;
Analyzing the collected input sequence, and recording the features in the input sequence that can characterize user interest;
Determining the auxiliary lexicon that needs to be loaded according to the recorded features.
2. The method according to claim 1, characterized in that:
It is judged whether the information contains a feature that matches a feature in a preset feature list, and if so, the matching feature is recorded.
3. The method according to claim 1, characterized in that, when a trigger condition is met, the auxiliary lexicon that needs to be loaded is determined according to the recorded features.
4. The method according to claim 1, characterized in that statistics are performed on the recorded features, and the auxiliary lexicon that needs to be loaded is determined according to the statistical result.
5. The method according to claim 4, characterized in that the statistics are performed on the recorded features based on a preset statistical model.
6. The method according to claim 5, characterized in that:
Based on the preset statistical model, each auxiliary lexicon is scored according to the recorded features, and the auxiliary lexicons whose scores are higher than a preset threshold are determined to be the auxiliary lexicons that need to be loaded.
7. The method according to claim 5, characterized in that:
Based on the preset statistical model, all auxiliary lexicons are comprehensively evaluated according to the recorded features, the probability that each auxiliary lexicon needs to be loaded is calculated, and the auxiliary lexicons whose probabilities rank ahead of a preset threshold are determined to be the auxiliary lexicons that need to be loaded.
8. The method according to claim 4, characterized in that the statistics are performed on the recorded features based on a preset rule model.
9. The method according to claim 1, characterized in that the input information of the user further comprises: the content of the user thesaurus, the environmental information of the user's input, and the input behavior;
The method further comprises:
Analyzing the collected content of the user thesaurus, environmental information of the user's input, and input behavior, and recording the features in the content of the user thesaurus, the environmental information of the user's input, and the input behavior that can characterize user interest;
Determining the auxiliary lexicon that needs to be loaded according to the recorded features.
10. The method according to claim 1, characterized by further comprising:
Recommending or automatically loading the auxiliary lexicon that needs to be loaded.
11. The method according to claim 10, characterized in that the relevant information about loaded auxiliary lexicons is saved to a server.
12. A device for determining an auxiliary lexicon that needs to be loaded, characterized by comprising:
An information acquisition unit, for collecting the input information of a user, the input information of the user comprising the input sequence of the user;
An analysis unit, for analyzing the collected input sequence and recording the features in the input sequence that can characterize user interest;
A judging unit, for determining the auxiliary lexicon that needs to be loaded according to the recorded features.
13. The device according to claim 12, characterized in that the analysis unit comprises:
A judgment subunit, for judging whether the information contains a feature that matches a feature in a preset feature list;
A recording subunit, for recording the matching feature.
14. The device according to claim 12, characterized in that the judging unit comprises:
A trigger subunit, for judging whether a trigger condition is met;
A first execution subunit, for determining the auxiliary lexicon that needs to be loaded according to the recorded features.
15. The device according to claim 12, characterized in that the judging unit comprises:
A statistics subunit, for performing statistics on the recorded features;
A second execution subunit, for determining the auxiliary lexicon that needs to be loaded according to the statistical result.
16. The device according to claim 15, characterized in that:
The statistics subunit performs statistics on the recorded features based on a preset statistical model.
17. The device according to claim 16, characterized in that the statistics subunit comprises:
A scoring subunit, for scoring each auxiliary lexicon according to the recorded features, based on the preset statistical model;
A comparison subunit, for determining the auxiliary lexicons whose scores are higher than a preset threshold to be the auxiliary lexicons that need to be loaded.
18. The device according to claim 16, characterized in that the statistics subunit comprises:
A probability calculation subunit, for comprehensively evaluating all auxiliary lexicons according to the recorded features, based on the preset statistical model, and calculating the probability that each auxiliary lexicon needs to be loaded;
A selection subunit, for determining the auxiliary lexicons whose probabilities rank ahead of a preset threshold to be the auxiliary lexicons that need to be loaded.
19. The device according to claim 15, characterized in that:
The statistics subunit performs statistics on the recorded features based on a preset rule model.
20. The device according to claim 12, characterized in that:
The information acquisition unit is further used for collecting the content of the user thesaurus, the environmental information of the user's input, and the input behavior;
The analysis unit is further used for analyzing the collected content of the user thesaurus, environmental information of the user's input, and input behavior, and recording the features in the content of the user thesaurus, the environmental information of the user's input, and the input behavior that can characterize user interest.
21. The device according to claim 12, characterized by further comprising:
An event response unit, for recommending or automatically loading the auxiliary lexicon that needs to be loaded.
22. The device according to claim 21, characterized by further comprising:
An account management unit, for binding the user and saving the relevant information about loaded auxiliary lexicons to a server.
23. An input method system, characterized by comprising:
An information acquisition unit, for collecting the input information of a user, the input information comprising the input sequence of the user;
An analysis unit, for analyzing the collected input sequence and recording the features in the input sequence that can characterize user interest;
A judging unit, for determining the auxiliary lexicon that needs to be loaded according to the recorded features;
An event response unit, for recommending or automatically loading the auxiliary lexicon that needs to be loaded.
24. The system according to claim 23, characterized in that the judging unit comprises:
A trigger subunit, for judging whether a trigger condition is met;
A first execution subunit, for determining the auxiliary lexicon that needs to be loaded according to the recorded features.
25. The system according to claim 23, characterized in that the judging unit comprises:
A statistics subunit, for performing statistics on the recorded features;
A second execution subunit, for determining the auxiliary lexicon that needs to be loaded according to the statistical result.
26. The system according to claim 25, characterized in that:
The statistics subunit performs statistics on the recorded features based on a preset statistical model.
27. The system according to claim 25, characterized in that:
The statistics subunit performs statistics on the recorded features based on a preset rule model.
28. The system according to claim 23, characterized by further comprising:
An account management unit, for binding the user and saving the relevant information about loaded auxiliary lexicons to a server.
CN200810117750.1A 2008-08-05 2008-08-05 Determine the method for the auxiliary lexicon needing loading, device and input method system Active CN101645065B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN200910137634.0A CN101645088B (en) 2008-08-05 2008-08-05 Determine the method for auxiliary lexicon, device and the input method system that need to load
CN200810117750.1A CN101645065B (en) 2008-08-05 2008-08-05 Determine the method for the auxiliary lexicon needing loading, device and input method system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200810117750.1A CN101645065B (en) 2008-08-05 2008-08-05 Determine the method for the auxiliary lexicon needing loading, device and input method system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN200910137634.0A Division CN101645088B (en) 2008-08-05 2008-08-05 Determine the method for auxiliary lexicon, device and the input method system that need to load

Publications (2)

Publication Number Publication Date
CN101645065A CN101645065A (en) 2010-02-10
CN101645065B true CN101645065B (en) 2016-02-24

Family

ID=41656953

Family Applications (2)

Application Number Title Priority Date Filing Date
CN200810117750.1A Active CN101645065B (en) 2008-08-05 2008-08-05 Determine the method for the auxiliary lexicon needing loading, device and input method system
CN200910137634.0A Active CN101645088B (en) 2008-08-05 2008-08-05 Determine the method for auxiliary lexicon, device and the input method system that need to load

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN200910137634.0A Active CN101645088B (en) 2008-08-05 2008-08-05 Determine the method for auxiliary lexicon, device and the input method system that need to load

Country Status (1)

Country Link
CN (2) CN101645065B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1089375A (en) * 1992-12-31 1994-07-13 陈劲松 " frequent association environment " word input method
CN1490701A (en) * 2002-10-15 2004-04-21 英业达股份有限公司 Inputting method system with dynamic adjustable lexicon and method thereof
CN101051323A (en) * 2007-05-22 2007-10-10 北京搜狗科技发展有限公司 Character input method, input method system and method for updating word stock

Also Published As

Publication number Publication date
CN101645065A (en) 2010-02-10
CN101645088B (en) 2016-06-01
CN101645088A (en) 2010-02-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant