CN102163198B - A method and a system for providing new or popular terms - Google Patents

Info

Publication number
CN102163198B
Authority
CN
China
Prior art keywords
neologisms
user
hot word
word
network
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201010113873.5A
Other languages
Chinese (zh)
Other versions
CN102163198A (en)
Inventor
贾剑峰
张扬
王砚峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201010113873.5A
Publication of CN102163198A
Application granted
Publication of CN102163198B

Abstract

The invention discloses a method and a system for providing new or popular terms. The method comprises performing statistical analysis on the words and phrases that users enter through an input method system, so as to obtain new or popular terms, and providing those terms in real time to users on the internet through an input method server. With the method and system of the invention, the new or popular terms obtained can be applied in real time.

Description

Method and system for providing neologisms or hot words
Technical field
The present invention relates to the technical field of input methods, and in particular to a method and system for providing neologisms or hot words.
Background
With the spread of the internet and the progress of the times, everyone can publish their own views online, and the text people type is becoming more and more personalized. At the same time, as the number of internet users keeps growing, the volume of personal writing grows as well, and personalized neologisms keep emerging. In addition, different users can interact over the network, for example by starting discussions on topics of common interest. Such discussions are normally carried out in text, and in the process some hot words emerge as well.
Here, a neologism is a concise summary of a new event or new thing; in the broad sense the term covers new entries, new uses of old words, new phrases and other new language phenomena, such as "happy dinner". A hot word is a popular word; as a lexical phenomenon, it reflects the issues or things that people in a country or region are commonly concerned with during a given period.
An input method system serves as an interactive interface and provides the encoding method used to enter various symbols into a computer or other device (such as a mobile phone). Text that can only be entered through encoding must be entered into the computer by means of an input method system, and neologisms and hot words are no exception. However, because of their regional and temporal nature, neologisms and hot words lack accumulated statistical information comparable to that of ordinary entries. Without special handling, an input method system may not show the same intelligence when converting neologisms or hot words as it does when other common entries are entered.
To address this problem, in the prior art the input method server captures neologisms and hot words from the network using technologies such as search engines and web crawlers, and builds a dedicated neologism and hot word dictionary. The input method client can download this dictionary from the server to the local machine; alternatively, the server can push the dictionary to the input method client. The client's neologism and hot word dictionary is then updated on a fixed update cycle. For example, with an update cycle of one day, the client's dictionary is updated once a day.
However, with this prior-art method, the neologisms or hot words that are obtained cannot be applied in real time.
Summary of the invention
The invention provides a method and system for providing neologisms or hot words, which helps the obtained neologisms or hot words to be applied in real time.
The invention provides the following solutions:
A method for providing neologisms or hot words, comprising:
obtaining the words entered by a user through an input method system, where the words are those the user selects and confirms from the multiple candidate words offered by the input method each time the user finishes entering a coded character string;
collecting the words the user selects through the input method and performing statistics on them to determine whether a word meets the preset condition for a neologism or hot word, and if so, recording the neologism or hot word;
providing the neologism or hot word in real time to users in the network through the input method server;
wherein providing the neologism or hot word in real time to users in the network comprises:
while a user is entering words, applying the neologism or hot word in real time to provide word candidates for the user in the network;
wherein applying the neologism or hot word in real time to provide word candidates for the user in the network comprises:
when the word candidates include a candidate that shares the same code as the neologism or hot word (a code collision), judging the probability that the user in the network needs to enter the neologism or hot word, and if the probability meets a preset condition, providing the neologism or hot word to the user as a candidate;
wherein the method further comprises:
obtaining the user characteristic information corresponding to the neologism or hot word;
judging the probability that the user in the network needs to enter the neologism or hot word and, if the probability meets the preset condition, providing the neologism or hot word to the user as a candidate comprises:
if the user in the network has the user characteristic information corresponding to the neologism or hot word, providing the neologism or hot word to the user as a candidate;
and/or,
obtaining keywords that have a semantic collocation relationship with the neologism or hot word;
judging the probability that the user in the network needs to enter the neologism or hot word and, if the probability meets the preset condition, providing the neologism or hot word to the user as a candidate comprises:
if the context of the user's current input contains such a keyword, providing the neologism or hot word to the user as a candidate.
Preferably, the user characteristic information includes the user's location information, and providing the neologism or hot word to the user as a candidate if the user in the network has the corresponding user characteristic information comprises:
if the user in the network is located in the region corresponding to the location information, providing the neologism or hot word to the user as a candidate.
Preferably, when the coded character string entered by the user in the network contains at least two entries, applying the neologism or hot word in real time to provide word candidates for the user in the network further comprises:
applying in real time the correspondence between the neologism or hot word and the keyword to compose words for the coded character string, and providing the composition results to the user in the network.
Preferably, applying in real time the correspondence between the neologism or hot word and the keyword to compose words for the coded character string, and providing the composition results to the user in the network, comprises:
obtaining the composition results for the coded character string and scoring each composition result;
if a composition result contains the neologism or hot word and also contains a keyword corresponding to that neologism or hot word, increasing the score of that composition result;
providing the composition results to the user in the network according to the final score of each composition result.
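The scoring step above can be pictured with a short Python sketch. It is a minimal illustration under stated assumptions: the function name `rank_compositions`, the fixed `boost` value and the shape of the inputs (a list of word lists with base scores, and a mapping from each neologism or hot word to its keywords) are all hypothetical, since the text does not say how scores are computed or by how much they are increased.

```python
def rank_compositions(results, neologism_keywords, boost=1.0):
    """Re-rank composition results for one coded character string.

    results: list of (words, score) pairs, where `words` is the list of
        entries that make up one composition result.
    neologism_keywords: dict mapping a neologism or hot word to the set
        of keywords that collocate with it.
    """
    ranked = []
    for words, score in results:
        word_set = set(words)
        for neologism, keywords in neologism_keywords.items():
            # Raise the score only when the result contains both the
            # neologism or hot word and one of its keywords.
            if neologism in word_set and word_set & keywords:
                score += boost
                break
        ranked.append((words, score))
    # Highest final score first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

The boost is applied at most once per result so that a result containing several keywords is not favored without limit; that is a design choice of this sketch, not something the text specifies.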
Preferably, when at least two composition results are provided, the method further comprises:
displaying the composition results that contain the neologism or hot word so that they are distinguished from the other composition results.
Preferably, when the coded character string entered by the user in the network contains at least two entries, applying the neologism or hot word in real time to provide word candidates for the user in the network further comprises:
when the at least two entries include the neologism or hot word, providing the neologism or hot word to the user in the network as a candidate;
if the user in the network accepts the neologism or hot word, composing words forward and/or backward starting from the neologism or hot word, so as to provide the user with a complete candidate for the coded character string.
Preferably, providing the neologism or hot word in real time to users in the network comprises:
displaying the neologism or hot word to users in the network in real time, and providing an entry point for obtaining the related information corresponding to the neologism or hot word.
Preferably, performing statistics on the words entered by users through the input method system and obtaining neologisms or hot words from them comprises:
obtaining the user characteristic information of each user in the network, and classifying the users based on that information to obtain at least two user categories;
obtaining, from the words entered by the users, the neologisms or hot words for each user category.
Preferably, providing the neologism or hot word in real time to users in the network comprises:
judging whether the user in the network belongs to the user category corresponding to the neologism or hot word, and if so, providing the neologism or hot word to that user.
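One way to picture per-category statistics is the following Python sketch. The names `hot_words_by_class` and `words_for_user`, the `min_count` threshold and the idea of identifying a category by a single label (for example a region) are assumptions introduced for illustration; the text leaves the classification criteria open.

```python
from collections import Counter, defaultdict


def hot_words_by_class(selections, user_class, min_count=2):
    """Count selected words separately for each user category.

    selections: iterable of (user_id, word) pairs collected from input.
    user_class: dict mapping user_id to a category label.
    Returns a dict mapping each category to the set of words selected
    at least `min_count` times by users of that category.
    """
    counts = defaultdict(Counter)
    for user_id, word in selections:
        counts[user_class[user_id]][word] += 1
    return {
        label: {word for word, n in counter.items() if n >= min_count}
        for label, counter in counts.items()
    }


def words_for_user(user_id, user_class, class_words):
    """Return only the words recorded for the user's own category."""
    return class_words.get(user_class[user_id], set())
```

Counting within a category rather than over all users is what allows a word used by a small group to reach the threshold even when it would be lost in network-wide statistics.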
Preferably:
providing the neologism or hot word in real time to users in the network through the input method server comprises: providing the recorded neologism or hot word to input method users in real time according to a preset rule.
Preferably, after the neologism or hot word is obtained, the method further comprises: saving the neologism or hot word in a neologism or hot word dictionary on the input method server;
providing the neologism or hot word in real time to users in the network through the input method server comprises: providing, through the input method server, the neologisms or hot words in that dictionary to users in the network in real time.
A system for providing neologisms or hot words, comprising:
an acquiring unit, configured to obtain the words entered by a user through an input method system, where the words are those the user selects and confirms from the multiple candidate words offered by the input method each time the user finishes entering a coded character string;
a collecting unit, configured to collect the words the user selects through the input method;
a statistics unit, configured to perform statistics on the words to determine whether a word meets the preset condition for a neologism or hot word, and if so, to record the neologism or hot word;
a neologism or hot word providing unit, configured to provide the neologism or hot word in real time to users in the network through the input method server;
wherein the neologism or hot word providing unit comprises:
a candidate providing unit, configured to apply the neologism or hot word in real time, while a user is entering words, to provide word candidates for the user in the network;
wherein the candidate providing unit comprises:
a judging unit, configured to, when the word candidates include a candidate that shares the same code as the neologism or hot word, judge the probability that the user in the network needs to enter the neologism or hot word and, if the probability meets a preset condition, provide the neologism or hot word to the user as a candidate;
wherein the system further comprises:
a characteristic acquiring unit, configured to obtain the user characteristic information corresponding to the neologism or hot word;
the judging unit being specifically configured to judge whether the user in the network has the user characteristic information corresponding to the neologism or hot word, and if so, to provide the neologism or hot word to the user as a candidate;
and/or,
a keyword acquiring unit, configured to obtain keywords that have a semantic collocation relationship with the neologism or hot word;
the judging unit being specifically configured to judge whether the context of the user's current input contains such a keyword, and if so, to provide the neologism or hot word to the user as a candidate.
Preferably, the user characteristic information includes the user's location information, and the judging unit is specifically configured to judge whether the user in the network is located in the region corresponding to the location information, and if so, to provide the neologism or hot word to the user as a candidate.
Preferably, when the coded character string entered by the user in the network contains at least two entries, the candidate providing unit further comprises:
a composition unit, configured to apply in real time the correspondence between the neologism or hot word and the keyword to compose words for the coded character string, and to provide the composition results to the user in the network.
Preferably, the composition unit comprises:
a composition result obtaining subunit, configured to obtain the composition results for the coded character string and to score each composition result;
a score adjusting subunit, configured to increase the score of a composition result when that result contains the neologism or hot word and also contains a keyword corresponding to it;
a result providing subunit, configured to provide the composition results to the user in the network according to the final score of each composition result.
Preferably, when at least two composition results are provided, the system further comprises:
a first display unit, configured to display the composition results that contain the neologism or hot word so that they are distinguished from the other composition results.
Preferably, when the coded character string entered by the user in the network contains at least two entries, the candidate providing unit further comprises:
a first providing unit, configured to provide the neologism or hot word to the user in the network when the at least two entries include the neologism or hot word;
a recomposition unit, configured to judge whether the user in the network accepts the neologism or hot word and, if so, to compose words forward and/or backward starting from it, so as to provide the user with a complete candidate for the coded character string.
Preferably, the neologism or hot word providing unit comprises:
a related information display unit, configured to display the neologism or hot word to users in the network in real time and to provide an entry point for obtaining the related information corresponding to it.
Preferably, the acquiring unit comprises:
a classification subunit, configured to obtain the user characteristic information of each user in the network and to classify the users based on that information, obtaining at least two user categories;
an obtaining subunit, configured to obtain, from the words entered by the users, the neologisms or hot words for each user category.
Preferably, the neologism or hot word providing unit is specifically configured to judge whether the user in the network belongs to the user category corresponding to the neologism or hot word, and if so, to provide the neologism or hot word to that user.
Preferably,
the neologism or hot word providing unit is specifically configured to provide the recorded neologism or hot word to input method users in real time according to a preset rule.
Preferably, the system further comprises:
a storage unit, configured to save the neologism or hot word, after it is obtained, in a neologism or hot word dictionary on the input method server;
the neologism or hot word providing unit being specifically configured to provide, through the input method server, the neologisms or hot words in that dictionary to users in the network in real time.
According to the specific embodiments provided by the invention, the invention achieves the following technical effects:
The invention performs statistics on the words that users enter through an input method system, obtains neologisms or hot words from them, and provides those neologisms or hot words to users in the network in real time through the input method server. The obtained neologisms or hot words can therefore be applied in real time. In addition, because the invention obtains neologisms or hot words from the words users enter, it can improve the accuracy and efficiency of obtaining neologisms and hot words.
Furthermore, because information such as user location is taken into account when obtaining neologisms or hot words, neologisms or hot words used by a small group of users can be extracted and made available to the other users within that group. In other words, because neologisms or hot words may be regional in nature, statistics over all users in the network may fail to find them. The invention, by contrast, can perform statistics over a subset of users, find as many of these neologisms or hot words as possible, and provide them to other users in the network.
Brief description of the drawings
To explain the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method provided by an embodiment of the invention;
Fig. 2 is a flowchart of another method provided by an embodiment of the invention;
Fig. 3 is a flowchart of yet another method provided by an embodiment of the invention;
Fig. 4 is a schematic diagram of a system provided by an embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments that a person of ordinary skill in the art obtains from the embodiments of the invention without creative effort fall within the scope of protection of the invention.
Embodiment 1
Referring to Fig. 1, the method for providing neologisms or hot words according to this embodiment of the invention comprises the following steps:
S101: perform statistics on the words that users enter through the input method system, and obtain neologisms or hot words from them;
In this embodiment, neologisms or hot words can be obtained directly from the words users enter. Compared with obtaining them from articles on the network, this has the following benefit: while entering text, a user actively segments the sentence they want to type into words. Using this amounts to making full use of the information the user supplies when entering words with the input method, so what is collected are items that the user regards as words or phrases. By contrast, to obtain neologisms or hot words from articles on the network, the article must first be split into sentences at punctuation marks, long sentences must then be segmented (a program cuts a complete sentence into characters or words), and only then can the resulting characters or words be checked for neologisms or hot words. Machine segmentation inevitably introduces errors and consumes more computing and storage resources. Obtaining neologisms or hot words directly from the words users enter therefore improves the efficiency of obtaining neologisms and avoids the errors introduced by machine segmentation.
In a specific implementation, the words the user selects are obtained during input, where the words the user selects are those selected and confirmed from the multiple candidate words offered by the input method each time the user finishes entering a coded character string. The selected words are then compared with the existing words, and the user's personalized words are obtained from the comparison result. These personalized words are then screened according to their time labels, frequency characteristics and so on, and neologisms or hot words are obtained from them.
More specifically, whether a word entered by a user is a neologism or hot word can be judged as follows. If a personalized word is found not to belong to the existing words, it can be judged to be a neologism. If a personalized word is found to be used very frequently within a period of time, it can further be checked against the existing words: if it is an existing word, it may be a hot word or a new use of an old word; if it is not an existing word, it may be both a hot word and a neologism.
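The judgment just described can be written down as a small Python sketch. The function name `classify_word`, the `hot_threshold` value and the use of a simple count over a recent period are assumptions for illustration; the text only says that the usage frequency within a period of time is "very high".

```python
def classify_word(word, existing_lexicon, recent_counts, hot_threshold=100):
    """Label a user-selected word as a neologism, a hot word, or both.

    existing_lexicon: set of words already known to the input method.
    recent_counts: dict mapping a word to how many times it was selected
        in the recent period under consideration.
    """
    labels = set()
    # A word absent from the existing lexicon is treated as a neologism.
    if word not in existing_lexicon:
        labels.add("neologism")
    # A word used very often in the recent period is treated as a hot
    # word, whether or not it already exists in the lexicon.
    if recent_counts.get(word, 0) >= hot_threshold:
        labels.add("hot word")
    return labels
```

An existing word that comes back labeled only as a hot word corresponds to the "hot word or new use of an old word" case in the text.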
Obtaining neologisms or hot words in this way is also more flexible: users can take deliberate actions that make the server obtain hot words more efficiently. For example, a user who wants the server to recognize an entry as a hot word can enter that entry repeatedly within a short time, or enter it several times in succession. If the server detects this behavior, it can add the entry directly to the hot word dictionary as a hot word.
The description above also shows the relationship between neologisms and hot words: a neologism is not necessarily a hot word, and a hot word is not necessarily a neologism, but a word can also be both.
S102: provide the neologisms or hot words to users in the network in real time through the input method server.
After neologisms or hot words are obtained in step S101, they can also be saved in the neologism or hot word dictionary on the input method server. The input method server then provides the neologisms or hot words in that dictionary to users in the network in real time.
It should first be noted that the method provided by this embodiment applies both to desktop input methods and to network input methods. With a network input method, the client only has input, output and communication functions, and the actual computation is done by the input method server. The executing entity of step S101 is therefore the input method server: it can compile statistics on the words entered by every user in the network and identify neologisms or hot words among them, which in step S102 are saved directly in the server's neologism or hot word dictionary.
With a desktop input method, the client of the input method system has computing and storage capabilities, so the executing entity of step S101 can be the client. The client can evaluate the words entered by the user of that client and, if it finds a neologism, send it to the input method server in step S102. Alternatively, the executing entity of step S101 can be the input method server. For example, the client can upload the user lexicon periodically or on its own initiative; because the user lexicon records the words the user has entered and how often they are used, this amounts to synchronizing the user's words and usage frequencies to the input method server. The input method server then compares each user's words with the existing words to obtain neologisms or hot words, and in step S102 saves them directly in the server's neologism or hot word dictionary.
Note that judging neologisms or hot words on the client amounts to judging them from the words entered by a single user, whereas the server can judge them from the words entered by all users in the network, which is particularly effective for obtaining hot words. A hot word is judged by how frequently a word is used. For a given entry, each individual user may not use it very often, so a client may not judge it to be a hot word; yet many users may in fact have used the entry during the same period, so it is likely to be a hot word. Only the server, which sees every user's entry usage as a whole, can judge such an entry to be a hot word. In addition, because a network input method records in real time which words are committed to the screen, it can capture short-term peaks in the input of an entry, which gives it an advantage in timeliness.
In short, once neologisms or hot words are obtained, they can be saved to the input method server promptly, so the input method server can provide them to every user in the network in real time without waiting for the user's local dictionary to finish updating.
In summary, when this embodiment performs statistics on the words users enter through the input method system and obtains neologisms or hot words from them, it can proceed as follows: collect the words the user selects through the input method (including words the user has selected but not yet committed to the screen, as well as words selected and committed to the screen), determine whether a word meets the preset condition for a neologism or hot word, and if so, have the input method server record it. The preset condition can be set as required. For example, if an entry is used by many users within a short time (the length of which can also be set as required), the entry is recorded as a hot word. Or, if an entry has not appeared in the previous dictionary and the number of users entering it exceeds a threshold (which can also be set as required), the entry is recorded as a neologism. The user intervention described earlier can also be used: if a user enters an entry several times in succession, or the same user enters the same entry repeatedly within a short time, the user is assumed to want the entry recognized as a hot word, and the entry can then also be included as a hot word, and so on.
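The three example conditions above could be combined on the server roughly as in the following Python sketch. The class name `WordStats`, all threshold parameters and the sliding-window bookkeeping are assumptions made for illustration; the text states only that the time span and thresholds can be set as required.

```python
from collections import defaultdict


class WordStats:
    """Server-side statistics over words selected by users (a sketch)."""

    def __init__(self, lexicon, window_seconds=3600,
                 hot_users=3, new_users=2, repeat_limit=5):
        self.lexicon = set(lexicon)           # existing dictionary entries
        self.window_seconds = window_seconds  # the "short time" span
        self.hot_users = hot_users            # distinct users for a hot word
        self.new_users = new_users            # distinct users for a neologism
        self.repeat_limit = repeat_limit      # repeats by one user
        self.events = defaultdict(list)       # word -> [(timestamp, user_id)]
        self.recorded = {}                    # word -> set of labels

    def observe(self, user_id, word, timestamp):
        """Register one selection and return the labels it qualifies for."""
        events = self.events[word]
        events.append((timestamp, user_id))
        # Keep only the selections that fall inside the time window.
        cutoff = timestamp - self.window_seconds
        events[:] = [(t, u) for t, u in events if t >= cutoff]

        users = {u for _, u in events}
        labels = set()
        # Used by many users within a short time: hot word.
        if len(users) >= self.hot_users:
            labels.add("hot word")
        # Not in the previous dictionary and entered by enough users: neologism.
        if word not in self.lexicon and len(users) >= self.new_users:
            labels.add("neologism")
        # User intervention: one user entering the entry repeatedly.
        if sum(1 for _, u in events if u == user_id) >= self.repeat_limit:
            labels.add("hot word")
        if labels:
            self.recorded.setdefault(word, set()).update(labels)
        return labels
```

Because the counts are kept per word across all users, an entry that no single user types often can still cross the hot word threshold, which is the advantage of server-side judgment noted earlier.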
Correspondingly, providing the neologisms or hot words to users in the network in real time through the input method server can specifically mean providing the recorded neologisms or hot words to input method users in real time according to a preset rule. There can be many such rules. For example, the neologisms or hot words can be applied in real time to provide word candidates while a user is entering words. That is, after the user enters a coded character string, the input method system has to display the words corresponding to that string, based on the dictionary, for the user to choose from. In this process the user may need to enter a neologism or hot word, and the input method server can then use the collected neologisms or hot words directly to provide candidates to the user.
The case of applying the neologisms or hot words in real time to provide word candidates while a user is entering words is described in detail below.
Specifically, with a network input method, after the server receives content awaiting conversion, such as the coded character string entered by the user, it can match the corresponding neologisms or hot words directly according to the input method rules, and if there is a match, feed the neologism or hot word straight back to the user.
With a desktop input method, the input method client also has a channel for interacting with the server. After receiving the coded character string or other content to be converted, the client can first convert it using the local dictionary. If the local dictionary contains no entry that matches completely, the user may need a neologism that has not yet been updated locally, so the client can send the coded character string to the server. On receiving it, the server matches it against the neologism or hot word dictionary and, if there is a matching entry, returns it to the client.
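The desktop fallback flow can be sketched in Python as below. The functions `convert` and `make_server_lookup` are hypothetical, and the server is represented by an in-process function standing in for what would be a network request in practice; dictionaries keyed by the full coded character string are likewise a simplification.

```python
def make_server_lookup(neologism_dict):
    """Build a stand-in for the server's neologism or hot word matching.

    neologism_dict maps a coded character string to a list of words.
    """
    def lookup(code):
        return neologism_dict.get(code, [])
    return lookup


def convert(code, local_dict, server_lookup):
    """Convert a coded character string, asking the server only on a miss."""
    # First try the client's local dictionary.
    local = local_dict.get(code, [])
    if local:
        return list(local)
    # No complete local match: the word may be a neologism that has not
    # been updated locally yet, so send the string to the server.
    return list(server_lookup(code))
```

With the example from the text, a client whose local dictionary has no entry for the pinyin "bingsheng" would receive the candidate from the server as soon as the server's dictionary contains it.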
Of course, the client can also send the coded character string entered by the user directly to the server, in which case the string is converted locally and on the server in parallel. That is, the client computes the conversion candidates for the coded character string and also sends the string to the server. If the string is long, the server segments it and judges whether it contains a neologism or hot word; if one is found, it is sent to the client to be displayed or used to influence the client's candidates. The display can show only the neologism or hot word; if the user accepts it, the client regenerates the candidates for the whole coded character string on the basis of that neologism or hot word. The client can also report to the server that the user selected the neologism or hot word, which increases its usage frequency.
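How the server might look for a neologism or hot word inside a long coded character string is not specified. The Python sketch below uses a plain left-to-right longest-match scan as a stand-in; the function name `find_neologisms` and the `max_len` limit are assumptions, and a real pinyin segmenter would also respect syllable boundaries, which this sketch ignores.

```python
def find_neologisms(code, neologism_codes, max_len=12):
    """Scan a long coded string for codes of known neologisms or hot words.

    neologism_codes: dict mapping a code (for example a pinyin string)
        to the neologism or hot word it stands for.
    Returns a list of (start, end, word) tuples, left to right.
    """
    hits = []
    i = 0
    while i < len(code):
        match = None
        # Try the longest possible piece first, then shorter ones.
        for j in range(min(len(code), i + max_len), i, -1):
            piece = code[i:j]
            if piece in neologism_codes:
                match = (i, j, neologism_codes[piece])
                break
        if match:
            hits.append(match)
            i = match[1]  # continue after the matched piece
        else:
            i += 1
    return hits
```

The returned positions mark where the neologism or hot word sits in the string, which is the information a client would need in order to compose the rest of the candidate before and after it once the user accepts the word.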
Thus the invention in effect implements an iterative process: new words or hot words are obtained from the words users enter, and at the same time the words obtained are used in real time to provide input method service to other users. When a user selects a new word or hot word that was offered, the selection can be fed back to the server so that the server can promptly optimize and update its stored information and offer input method users better candidates.
A practical example illustrates the application of this embodiment.
Suppose a TV series named "Soldier Sage" premieres on January 1, and the update cycle of user A's existing scheduled-update mode is 2 days. User A can then obtain the new word "Soldier Sage" on January 3 at the earliest. If user A wants to enter "Soldier Sage" on January 1 or January 2, the only option is character-by-character selection: first "soldier", then "sage". With the scheme of embodiment one of the present invention, as long as the input method server has saved "Soldier Sage" in the new-word or hot-word dictionary, entering its pinyin "bingsheng" immediately triggers the client's access to the server. Whether user A uses an online input method or a local client, the entry "Soldier Sage" can be typed directly on January 1 itself.
In the embodiment above, once new words or hot words have been obtained, they are applied without distinction to all users when providing candidate words to users in the network. In practice, however, some new words may be formed from parts of existing words, and some hot words may themselves be existing entries, so a new word or hot word may share its input code with an existing word. For example, the new word "blog fight" and the existing word "fight" share the same code. If a user enters the pinyin string "bodou", how to present the corresponding candidates is a problem worth considering. The prior-art method presets the weight of new words higher than that of existing words, so that when codes collide the new word is recommended first. But a user who simply wants to enter the existing word "fight" can then no longer commit it by pressing the space bar and may need extra keystrokes; a user who has never seen the spelling "blog fight" may also be puzzled.
To solve this problem, when the candidate words include a candidate that shares its code with the new word or hot word, the embodiment of the present invention can estimate the probability that the user in the network needs to enter the new word or hot word; if the probability satisfies a preset condition, the new word or hot word is offered to that user as the first-choice candidate. This addresses how to keep the first-choice candidate accurate when codes collide. The probability can be estimated in several ways, some of which are described below by way of example.
Implementation one
In practice, some new words or hot words may be needed only by users who share certain characteristics (for example, users with a particular identity or in a particular region). A new word that arises in an online game may be needed only by users who play that game; new words for a city's new buildings or restaurants may be used only by users in that city; a news event in a particular place may interest only users in that area.
Therefore, in this implementation, after a new word or hot word has been identified, the user characteristic information corresponding to it can be obtained and saved on the input method server. Then, when candidates are provided to users in the network, the new word or hot word is applied in real time to provide candidate words to users who have those user characteristics.
Specifically, referring to Fig. 2, the method provided by this implementation can include the following steps:
S201: collect statistics on the words that users enter through the input method system, and obtain new words or hot words from them;
Step S201 can be the same as S101 and is not described again here.
S202: obtain the user characteristic information corresponding to the new words or hot words;
Because step S101 obtains new words or hot words from the words users enter, it is known, for both online and desktop input methods, which user or users entered each new word or hot word. From the characteristic information of those users, the user characteristic information corresponding to the new word or hot word can then be derived.
User characteristic information can be obtained from users' registration information and can include, for example, a user's location, identity, age, and interests. Specifically, after a new word or hot word is obtained, the users who entered it are first identified; their characteristics are then read from their registration information, and the characteristics they have in common are selected as the user characteristic information corresponding to that new word or hot word.
User characteristic information can also be obtained in other ways. For example, the user's IP address can be obtained and the user's current location inferred from the IP range; or the cell dictionaries the user has selected can be obtained, from which the user's interests can be roughly inferred.
S203: while a user is entering words, the input method server applies the new words or hot words in real time to provide candidate words to users in the network who have the user characteristics. Specifically, if a user in the network has the user characteristic information corresponding to a new word or hot word, that word is offered to the user as a candidate.
That is, before a new word or hot word is used to provide candidate words to a user in the network, the user's characteristics need to be checked. Specifically, if the code string entered by a user hits a new word or hot word, the user's registration information can be retrieved before the word is returned, to determine whether the user has the user characteristic information corresponding to that word; if so, the word is offered to the user. The word can be offered as the first-choice candidate, so that a user who does want it can commit the entry simply by pressing the space bar, which clearly improves input efficiency. Alternatively, the word can be shown outside the candidate box, for example in a blank part of the input box, to indicate that this entry is special relative to the others; a corresponding selection key can also be provided, and the entry is committed when the user presses that key.
User characteristic information can include the user's location, in which case the new words or hot words are applied in real time to provide candidate words to users in the region corresponding to that location. For example, when the hot word "big fire" is obtained, analysis of the characteristics of the users who entered it shows that they are located near Wudaokou ("five road junctions") in Beijing. The user characteristic information corresponding to the hot word "big fire" can therefore be determined as "user located near Wudaokou, Beijing", and this information is saved. Then, when a user enters the pinyin "dahuo", the user's location is first obtained from the user's registration information. If the user is indeed near Wudaokou, "big fire" is offered directly as the first-choice candidate. Otherwise, the user can be assumed not to want the hot word "big fire", and candidates are provided in the conventional way: the candidates are sorted by word frequency, user dictionary, and so on, giving candidates such as "everybody", "bulk production", and "obtaining greatly". Alternatively, "big fire" can be shown as the second choice or as the last option on the first page, leaving the user to decide whether to enter it.
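The location check in this example might look like the sketch below. The region identifiers and the function name are assumptions; the fallback shown here places the hot word last, which corresponds to the "last option on the first page" alternative mentioned in the text.

```python
from typing import List, Set


def order_candidates(user_region: str,
                     hot_word: str,
                     hot_word_regions: Set[str],
                     conventional: List[str]) -> List[str]:
    """Place a regional hot word first only for users in its region."""
    others = [c for c in conventional if c != hot_word]
    if user_region in hot_word_regions:
        # The user shares the location feature saved for the hot word.
        return [hot_word] + others
    # Otherwise keep the conventional order and show the hot word last,
    # so the user can still pick it deliberately.
    return others + [hot_word]


# Illustrative data; "region-a" stands for the area where the hot word arose.
CONVENTIONAL = ["everybody", "bulk production"]
NEARBY = order_candidates("region-a", "big fire", {"region-a"}, CONVENTIONAL)
ELSEWHERE = order_candidates("region-b", "big fire", {"region-a"}, CONVENTIONAL)
```

Keeping the conventional ranking untouched for users outside the region is the point of the design: the hot word never displaces the candidate such users most likely want.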
As can be seen, because information such as user location is taken into account when obtaining new words or hot words, words used by a small group of users can be extracted and offered to the other users in that group. In other words, because new words or hot words may be regional, statistics over all users in the network might fail to find them; the present invention can instead compute statistics over a subset of users, which finds as many such words as possible and makes them available to other users in the network.
Implementation two
In the preceding implementation, when the candidates include one that shares its code with the new word or hot word, characteristics of the new word or hot word itself are used to decide whether it should be offered to a given user as the first choice. Some new words or hot words, however, have no obvious user characteristic information. For the new word "blog fight" mentioned above, for example, no user characteristic information may be obtainable, so the preceding method may be unable to decide whether to offer it to the user as the first choice.
For this reason, this implementation provides the following method. When or after a new word or hot word is obtained, information such as its language environment and the number of times it adjoins preceding and following entries can also be obtained. From this information, keywords that frequently co-occur with the new word or hot word can be derived. These keywords and the corresponding new word or hot word form a semantic collocation relation, which is saved. Then, when the code string entered by a user hits a new word or hot word, the context of the user's current input can be obtained; if it contains a keyword corresponding to that word, the word can be offered to the user. Here too the word can be offered as the first-choice candidate, although other presentations are possible.
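A context check based on saved collocation keywords could be sketched as follows. The data structure (a mapping from each new word to a set of co-occurring keywords) and the function name are assumptions made for illustration; the keywords shown are sample data.

```python
from typing import Dict, List, Set


def offer_with_context(new_word: str,
                       context: List[str],
                       candidates: List[str],
                       collocations: Dict[str, Set[str]]) -> List[str]:
    """Promote `new_word` to first choice if the context holds a keyword."""
    keywords = collocations.get(new_word, set())
    if keywords.intersection(context):
        # A collocated keyword appears in what the user has just typed.
        return [new_word] + [c for c in candidates if c != new_word]
    # No supporting context: leave the conventional order unchanged.
    return list(candidates)


# Illustrative collocation table (sample data, not from a real dictionary).
COLLOCATIONS = {"blog fight": {"blog", "online"}}
WITH_CONTEXT = offer_with_context(
    "blog fight", ["online"], ["fight", "blog fight"], COLLOCATIONS)
WITHOUT_CONTEXT = offer_with_context(
    "blog fight", ["park"], ["fight", "blog fight"], COLLOCATIONS)
```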
The discussion so far assumes that the user enters a code string for the new word or hot word alone; for example, a user who wants "big fire" enters the code string "dahuo". In practice, however, the code string may be longer and contain at least two entries, possibly including a new word or hot word.
For this case, the semantic collocation relations described above can also be used to compose phrases for a code string that contains a new word. That is, the correspondence between new words or hot words and their keywords is applied in real time to compose phrases for the code string, and the composition results are offered to users in the network.
For example, the keywords that form semantic collocation relations with the new word "blog fight" (a quarrel or mutual attack carried out between internet users through blogs) include "Han Han", "blog", "online", and "Li Chengpeng". If the user enters the pinyin string "hanhanzaiwangshangyurenbodou" (Han Han, online, with someone, "blog fight" or "fight"), the semantic collocation relation between "online" and "blog fight" is found during phrase composition, so the candidate "Han Han has a blog fight with someone online" is offered preferentially. Alternatively, phrases can first be composed using the contextual semantic collocation relations, and the result then checked for matching new words or hot words.
Because candidates other than those for new words or hot words must also be produced, and for ease of processing, embodiments of the present invention can add a new-word or hot-word model on top of the existing general model and user model.
The general model is a general framework. It supplies common vocabulary, grammar, and semantics, and produces multiple candidates with corresponding scores: λ_common;
The user model looks up the user's own words and habitual usage among the general model's candidates and adds a corresponding score to candidates that match the user's habits: λ_user;
The new-word or hot-word model adds a corresponding score to candidates that contain a new word or hot word together with its surrounding context and collocations: λ_new.
Finally, the three scores are weighted and combined to obtain the best candidates, which are then packaged according to the user's configuration and sent back to the client. Not all three models need to be used in every conversion; they are used as the situation requires. That is, the general model and/or the user model can be used to obtain phrase composition results for the code string and to score each result. The new-word model then checks each result: if a composition result contains a new word or hot word and also contains a keyword corresponding to that word, the score of that result is increased. Finally, the composition results are offered to users in the network according to their final scores. For example, the results can be sorted by final score and offered in order, or only the highest-scoring result can be offered.
Taking the user who enters "hanhanzaiwangshangyurenbodou" as an example again, the general model can first build several candidates from all the vocabulary in the general dictionary, the user dictionary, and the new-word or hot-word dictionary, such as:
1. "Han Han fights with someone online";
2. "Han Han has a blog fight with someone online";
3. "Han Han shells beans online".
The new-word or hot-word model then scores each candidate using the saved semantic collocation relations and adjusts the candidates' combined weights. In the example above, "blog fight" is found to be closely related to the rest of the sentence, so the score of the second candidate is increased. The scores of each candidate are then merged, and either the highest-scoring composition result is selected and sent to the user, or the highest-scoring candidate is sent to the user as the first choice.
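The score adjustment in this example can be sketched as below. The base scores, the bonus value, and the simple additive merge are all assumptions chosen for illustration; the description says only that the scores are weighted and merged, without giving values or a formula.

```python
from typing import Dict, List, Set

# Keywords collocated with the new word, taken from the example above.
COLLOCATIONS: Dict[str, Set[str]] = {
    "blog fight": {"Han Han", "blog", "online", "Li Chengpeng"},
}


def new_word_score(words: List[str],
                   collocations: Dict[str, Set[str]],
                   bonus: float = 0.3) -> float:
    """Return a bonus if a new word appears with one of its keywords."""
    present = set(words)
    for word in words:
        keywords = collocations.get(word)
        if keywords and keywords & present:
            return bonus  # assumed fixed bonus for a matched collocation
    return 0.0


def rank(candidates: Dict[str, List[str]],
         base_scores: Dict[str, float],
         collocations: Dict[str, Set[str]]) -> List[str]:
    """Merge base and new-word scores, then sort from best to worst."""
    totals = {
        name: base_scores[name] + new_word_score(words, collocations)
        for name, words in candidates.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


# Segmented candidates and hypothetical general/user-model scores.
CANDIDATES = {
    "fight": ["Han Han", "online", "with someone", "fight"],
    "blog fight": ["Han Han", "online", "with someone", "blog fight"],
    "shell beans": ["Han Han", "online", "shell beans"],
}
BASE_SCORES = {"fight": 0.6, "blog fight": 0.5, "shell beans": 0.2}
RANKED = rank(CANDIDATES, BASE_SCORES, COLLOCATIONS)
```

With these assumed numbers the second candidate starts behind the first but overtakes it once the collocation bonus is added, which mirrors the adjustment the paragraph describes.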
Note that phrase composition is normally entered only when no entry is found in any dictionary (including the new-word dictionary). For a desktop input method, if the new words needed for composition are not available locally, a new-word request is sent to the server in parallel. The composition module first performs conventional composition on its own; as soon as the result of the server communication arrives, it reads the new-word dictionary and updates the composition result. If the wait times out, the original composition result is returned.
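The "compose locally, wait briefly for the server, fall back on timeout" behavior can be sketched with a thread pool. The function names, the timeout values, and the stand-in server functions are assumptions; only the overall control flow follows the text.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Optional


def compose(code: str,
            compose_locally: Callable[[str], str],
            ask_server: Callable[[str], Optional[str]],
            timeout_s: float) -> str:
    """Compose locally while querying the server; prefer the server result."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(ask_server, code)   # parallel new-word request
    local_result = compose_locally(code)     # conventional composition
    try:
        server_result = future.result(timeout=timeout_s)
    except FutureTimeout:
        server_result = None                 # timed out: keep local result
    finally:
        pool.shutdown(wait=False)            # do not block on a slow server
    return server_result if server_result else local_result


# Hypothetical stand-ins for the local composer and the server.
def local_compose(code: str) -> str:
    return "local:" + code


def fast_server(code: str) -> str:
    return "server:" + code


def slow_server(code: str) -> str:
    time.sleep(0.5)
    return "server:" + code
```

Shutting the pool down without waiting matters here: a context-managed pool would block until the slow request finished, which would defeat the timeout.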
When providing composition results to the user, the server can return the complete composed entry, or return to the client only the new word or hot word contained in it (for example, the request might be "eat happy dinner" for the new word "happy dinner"). After the client displays the new word, if the user accepts it, the server or the client can compose phrases forward and/or backward from that word to produce complete candidates for the whole code string.
For a desktop input method, after the user has edited an entry, the editing steps are compared to check whether an existing new word is involved. If the string finally committed contains an existing new word, the context of that word in the sentence is recorded and the local usage frequency of the word is updated promptly. If it contains no existing new word, it may itself be an undiscovered new word, so it is sent to the new-word communication module and reported to the server so that new words are monitored in real time.
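The client-side handling after a commit can be pictured with the following sketch. The containers used for the local frequencies and the report queue, and the function name, are assumptions; recording of the surrounding context is omitted for brevity.

```python
from typing import Dict, List


def after_commit(committed: str,
                 known_new_words: List[str],
                 local_freq: Dict[str, int],
                 report_queue: List[str]) -> List[str]:
    """Update local frequencies, or queue the string for the server."""
    used = [word for word in known_new_words if word in committed]
    if used:
        for word in used:
            # A known new word was committed: bump its local frequency.
            local_freq[word] = local_freq.get(word, 0) + 1
    else:
        # No known new word: the string may itself be an undiscovered
        # new word, so hand it to the reporting queue.
        report_queue.append(committed)
    return used


# Illustrative state for a single client.
FREQ: Dict[str, int] = {}
QUEUE: List[str] = []
FIRST = after_commit("eat happy dinner", ["happy dinner"], FREQ, QUEUE)
SECOND = after_commit("something else", ["happy dinner"], FREQ, QUEUE)
```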
Embodiment two
Embodiment one above assumes that, after a new word or hot word has been found, the corresponding user characteristic information or the keywords in a semantic collocation relation are obtained. However, the new words or hot words there are found from statistics over individual users or over all users in the network, so some domain-specific or regional new words or hot words may be missed. For example, a newly opened restaurant in the Beijing area is named "happy dinner". Within the Beijing area, "happy dinner" shows the high frequency and burstiness characteristic of a new word, but within the set of all users its frequency characteristics may be hard to detect. Likewise, in the earlier example, a big fire may occur near Wudaokou ("five road junctions") in Beijing; for users near Wudaokou the usage frequency of "big fire" may rise suddenly over a short time, but statistics over all users might not reveal this hot word.
To solve this problem, embodiment two of the present invention provides the following method. First, the user characteristic information of each user in the network is obtained and used to classify users. For example, the user's IP range, the application in which the user's input method system is currently running, and the preferred cell dictionaries the user has selected can all serve as the basis for classification; the same user can obviously belong to several categories at once. Then, when determining whether a word entered by users is a new word or hot word, the word is checked under each category separately. If a word shows the characteristics of a new word or hot word under some category, it can be treated as a new word or hot word of that category and saved in the classified dictionary for that category, or tagged with that category's label.
Referring to Fig. 3, the method provided by embodiment two includes the following steps:
S301: obtain user characteristic information and classify users on that basis to obtain a plurality of user categories;
S302: obtain, from the words entered by the users, new words or hot words for each user category;
S303: through the input method server, offer the new words or hot words for each user category in real time to users of the corresponding category in the network.
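Per-category discovery along the lines of the steps above can be sketched as follows. The detection rule (a minimum count plus a burst ratio against an earlier window) and its thresholds are assumptions made for the example; the criteria actually used to recognize a new word or hot word are those of step S101 and are not restated here.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple


def hot_words_by_category(
        events: Iterable[Tuple[Set[str], str]],
        previous: Dict[Tuple[str, str], int],
        min_count: int = 5,
        burst_ratio: float = 3.0) -> Dict[str, Set[str]]:
    """Count words per user category and flag sudden high-frequency words."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for categories, word in events:
        # One user may belong to several categories at the same time.
        for category in categories:
            counts[category][word] += 1
    found: Dict[str, Set[str]] = {}
    for category, counter in counts.items():
        for word, n in counter.items():
            before = previous.get((category, word), 0)
            # Assumed rule: frequent now, and far above the earlier window.
            if n >= min_count and n > burst_ratio * before:
                found.setdefault(category, set()).add(word)
    return found


# Illustrative input events: (categories of the user, committed word).
EVENTS = ([({"region:beijing"}, "happy dinner")] * 6
          + [({"region:shanghai"}, "happy dinner")]
          + [({"region:beijing"}, "hello")] * 6)
PREVIOUS = {("region:beijing", "hello"): 10}
FOUND = hot_words_by_category(EVENTS, PREVIOUS)
```

In this sample, "happy dinner" is flagged only for the Beijing category, while the long-established "hello" is not flagged because its count has not risen above its earlier level.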
In short, in embodiment two, determining whether the words users enter include new words or hot words requires knowing not only the words users select but also each user's IP range, the application in which each user's input method system is currently running, the preferred cell dictionaries each user has selected, and similar information.
Note that because some user characteristics can change, the classification of users is not fixed and may be a dynamic process. For example, while a user is entering words, the application in which the input method system is running may change. A user who is playing an online game can be assigned, together with the other users playing it, to the category for that game. If the user later exits the game and opens an instant messaging program to chat with friends, the user can be reassigned to the category for that program, together with its other users. A user may also chat with friends through instant messaging while playing an online game; the application in which the input method is running then switches frequently, the user's category changes accordingly at any time, and the words entered in the different applications can be evaluated separately. In addition, users continually log in to or out of the input method server, so the membership of each category also changes continually: new users join a category and existing users leave it.
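Because categories follow the user's current state, they can be recomputed whenever that state changes. The sketch below derives a category set from the three classification bases named in this embodiment; the label format and the function name are assumptions.

```python
from typing import Iterable, Set


def current_categories(ip_region: str,
                       active_app: str,
                       cell_dictionaries: Iterable[str]) -> Set[str]:
    """Derive a user's categories from the current session state."""
    categories = {"region:" + ip_region, "app:" + active_app}
    # Each selected cell dictionary contributes an interest category.
    categories.update("dict:" + name for name in cell_dictionaries)
    return categories


# The same user before and after switching from a game to a chat program.
IN_GAME = current_categories("beijing", "game-x", ["cooking"])
IN_CHAT = current_categories("beijing", "chat-y", ["cooking"])
```

Recomputing the set per input event means that words typed in the game and in the chat program are counted under different application categories, as the paragraph above describes.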
Note also that, in addition to evaluating users' input per category as described above, evaluation can still be performed over all words of all users. The two approaches do not conflict; they complement and reinforce each other.
In the method of embodiment two, new words or hot words are discovered on the basis of different user categories, which raises the likelihood that they will be found. Moreover, a new word or hot word found from the users of a given category carries that category as an attribute. In other words, labels such as users' interest labels and region labels can be used: the high-frequency, hot vocabulary entered by users under each label is counted and taken as entries for that category.
For example, all users are currently indexed by category according to IP range. If "happy dinner" is found to have a short-term high frequency within the IP ranges of the Beijing area, it can be added as a new word or hot word to the classified dictionary for the Beijing area. As shown in Table 1, an entry for "happy dinner" can be added to this classified dictionary.
Table 1
Note that although the users in each category may change constantly, the new words or hot words already found in a classified dictionary, or belonging to a category, do not change with the users in that category. For example, if the new word "happy dinner" was found through users in the Beijing area and at some moment the users in that category are all offline, "happy dinner" is still retained and still has the category attribute "Beijing area".
Therefore, in a preferred implementation, this property can be used to solve the problem of how to rank the candidates when they include a candidate that shares its code with the new word or hot word. Specifically, when step S303 applies the new words or hot words for each user category in real time to provide candidate words to users in the network, it can determine whether a user belongs to the user category corresponding to the new word or hot word; if so, the word is offered to that user. Here too the word can be offered as the first-choice candidate, which improves the accuracy of the first choice. Other ways of offering the word can also be used to avoid disturbing the user's normal input.
That is, because some new words or hot words are found from the users of a particular category, those words may carry a category attribute, and it may be mainly users in that category who need them. Therefore, when the code string entered by a user hits a new word or hot word, it can first be determined whether the word has a category attribute and, if so, whether the user belongs to that category. If the user does, the word can be offered to the user. Otherwise, candidates are provided in the conventional way, or the word is shown as the second choice or as the last option on the first page, leaving the user to decide whether to display the new word or hot word.
Note that because the server offers new words or hot words to users in the network, it can record the overall word frequency of each new word or hot word across all users in the network. Suppose user A has entered a new word 10 times, user B 5 times, and user C 20 times, and no other user has entered it; the overall word frequency of the new word is then 10 + 5 + 20 = 35. This overall word frequency can be used as the local word frequency for every user in the network. Even a user who has used the new word only a few times then obtains a high local word frequency, so the next time the user enters it, the word may move up in the candidate list.
Alternatively, the relative word frequency of a new word or hot word across all network users can be used to update a user's local word frequency. Suppose a new word is used 50 times within a specified period and all entries sharing its pronunciation are used 100 times in total (assuming 50 uses reaches the collection threshold at which the word is recorded as a new word). The relative word frequency of the new word on the server is then 50%. If the entries sharing that pronunciation have been used 20 times locally, the word frequency assigned to the new word when it is added to the local dictionary is set to 20, so that the new word accounts for 20 / (20 + 20) = 50% of the total uses in the local dictionary. The relative word frequency of the new word in the user's local dictionary can likewise be set to 50% directly.
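Both frequency updates reduce to simple arithmetic, sketched below. The function names are hypothetical, and the second function rests on one reading of the example: the server-side total includes the new word itself, while the local count covers only the other entries with the same pronunciation.

```python
from typing import Dict


def overall_frequency(per_user_counts: Dict[str, int]) -> int:
    """Sum the per-user input counts of one new word."""
    return sum(per_user_counts.values())


def local_count_for_share(server_new: int,
                          server_total: int,
                          local_other: int) -> float:
    """Local count x such that x / (x + local_other) equals the server share."""
    share = server_new / server_total
    if not 0 <= share < 1:
        raise ValueError("share must be in [0, 1)")
    # Solving x / (x + local_other) = share for x.
    return share * local_other / (1 - share)


# The two worked examples from the text.
TOTAL = overall_frequency({"A": 10, "B": 5, "C": 20})
LOCAL = local_count_for_share(server_new=50, server_total=100, local_other=20)
```

With the numbers from the text, the first call yields the overall frequency of 35 and the second yields a local count of 20, which gives the new word a 50% share locally.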
The embodiments above describe in detail the method of providing candidate words according to the present invention. In other embodiments, given the nature of new words or hot words, external resource information about a new word or hot word can also be obtained, after it is found, from resources such as search engines. For a new word, this can be a definition or related news summaries; for a hot word, hot news related to it. When a user enters a new word or hot word, this external resource information can be offered to the user, so that the user obtains more information through the input method. For example, information such as an encyclopedia explanation or event navigation is shown when the user clicks or hovers the mouse. Alternatively, a link to the external resource can be offered, so that the user can click the link to open the corresponding page, which amounts to a shortcut for obtaining information: a hot word can be displayed together with a shortcut to related hot news, and a new word together with a shortcut to its encyclopedia entry or related news.
In specific implementations, the embodiments and implementations can be combined, and the user information, environment information, and so on required by the various implementations can be obtained together. This can include an entry's pinyin, its frequency, the number of times it adjoins preceding and following entries, the application environments and web interaction environments in which it is used, and so on; external resource information, such as the entry's encyclopedia definition and related news, can also be obtained from external resources. In embodiments of the present invention, this information is called the rich information of an entry, and the database that stores it is called the rich information base. Whichever implementation is used, the required information can be retrieved from the rich information base.
In addition, in each embodiment of the present invention, when the candidate words include a candidate that shares its code with the new word or hot word, the client can display the new word or hot word differently from the other candidates. Specifically, it can be marked in a prompt position with a special font, color, or the like. For example, the new word or hot word can be shown separately, in a position such as the code display box rather than in the candidate display box, and its display position can be raised or lowered according to information such as its application environment and the user's short-term input history. The color of the new word or hot word can also be changed, or a special mark added to it. In these ways more information is given to the user, who can see that this word is special compared with ordinary words.
When the code string entered by the user contains multiple entries and several composition results are all offered to the user, the client can likewise display entries that contain a new word or hot word differently from the other entries, again marking them in a prompt position with a special font, color, or the like. For example, such an entry can be shown separately in the code display box rather than in the candidate display box, and its display position can be raised or lowered according to information such as the application environment of the new word or hot word and the user's short-term input history. The color of the entry can also be changed, the new word or hot word within it can be highlighted, or a special mark can be added. These approaches likewise tell the user that the entry contains a new word or hot word.
The above describes the use of new words or hot words in providing input candidates to users. In practice, the new words or hot words can also be displayed to users in the network in real time, together with an entry point for obtaining the corresponding related information. In this case, even if the user is not currently entering words, the input method system can bring newly obtained new words or hot words to the user's attention. The input method system then serves as a tool through which the user obtains information: it offers newly collected new words or hot words to the user in real time and also provides an entry point to related information. A user who notices a new word or hot word offered by the input method and is interested in it can obtain the related information through that entry point. For example, a hot word may correspond to a hot news event; after the hot word is displayed, the user can reach the details of the event through the corresponding entry point. The new word or hot word itself can serve as the entry point; that is, it can be presented like a link, and the user obtains the related information directly by clicking it with the mouse. Providing the related information may require a browser; how the input method system invokes a browser belongs to the prior art and is not described here.
Corresponding to the method for providing neologisms or hot words provided by the embodiments of the present invention, the embodiments also provide a system for providing neologisms or hot words. Referring to Fig. 4, the system comprises:
Acquiring unit 401, configured to perform statistics on the words that users input through the input method system and to obtain neologisms or hot words from them;
Neologism or hot word providing unit 402, configured to provide the neologisms or hot words in real time to users in the network through the input method server.
Specifically, the neologism or hot word providing unit may provide candidates to the user while the user is inputting words. Alternatively, if the user is not currently inputting an entry related to a given neologism or hot word, or is not inputting words at all, the newly obtained neologisms or hot words can still be shown to the user. Because these neologisms or hot words are usually associated with a news event, a trending event, or the like and therefore have related information, an entry point for displaying that related information can be provided to the user at the same time.
Accordingly, neologism or hot word providing unit 402 may comprise: a candidate providing unit, configured to apply the neologisms or hot words in real time, while a user is inputting words, to provide word candidates to the user in the network.
Alternatively, neologism or hot word providing unit 402 may comprise: a related information display unit, configured to display the neologisms or hot words in real time to users in the network and to provide an entry point for obtaining the related information corresponding to the neologisms or hot words.
The candidate providing unit comprises:
Judging unit, configured to, when the word candidates include a candidate that shares the same code as the neologism or hot word, judge the probability that the user in the network needs to input the neologism or hot word, and, if the probability meets a preset condition, provide the neologism or hot word to the user in the network as a candidate.
In a specific implementation, the system may further comprise:
Feature acquiring unit, configured to obtain the user feature information corresponding to the neologism or hot word;
The judging unit is specifically configured to judge whether the user in the network has the user feature information corresponding to the neologism or hot word and, if so, to provide the neologism or hot word to the user in the network as a candidate.
The user feature information includes the user's location information, and the judging unit is specifically configured to judge whether the user in the network is located in the region corresponding to the location information and, if so, to provide the neologism or hot word to the user in the network as a candidate.
Alternatively, the system may further comprise:
Keyword acquiring unit, configured to obtain a keyword that has a semantic collocation relation with the neologism or hot word;
The judging unit is specifically configured to judge whether the context of the current input of the user in the network contains the keyword and, if so, to provide the neologism or hot word to the user in the network as a candidate.
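The two checks described above, one on user feature information such as location and one on collocated keywords in the input context, can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation disclosed in the patent: the `HotWord` record, the `should_offer` function, and the sample data are hypothetical, and treating either check alone as sufficient is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class HotWord:
    """Hypothetical record for a neologism or hot word."""
    text: str
    regions: set = field(default_factory=set)   # locations associated with the word
    keywords: set = field(default_factory=set)  # keywords with a semantic collocation


def should_offer(word, user_region, context_tokens):
    """Return True if the hot word should be offered as a candidate.

    Assumption: either matching check is enough on its own.
    """
    in_region = user_region in word.regions
    has_keyword = any(token in word.keywords for token in context_tokens)
    return in_region or has_keyword


# Hypothetical sample data.
expo = HotWord("expo", regions={"Shanghai"}, keywords={"ticket", "pavilion"})
offer_by_region = should_offer(expo, "Shanghai", [])
offer_by_keyword = should_offer(expo, "Beijing", ["buy", "ticket"])
offer_neither = should_offer(expo, "Beijing", ["weather"])
```

A real system would presumably combine such signals into a probability and compare it with a threshold; the boolean form here is chosen only to keep the sketch short.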
When the code string input by the user in the network covers at least two entries, the candidate providing unit further comprises:
Word-grouping unit, configured to apply in real time the correspondence between the neologism or hot word and the keyword, to group words for the code string, and to provide the word-grouping results to the user in the network.
The word-grouping unit comprises:
Word-grouping result obtaining subunit, configured to obtain the word-grouping results for the code string and to score each word-grouping result;
Score adjusting subunit, configured to increase the score of a word-grouping result when that result contains the neologism or hot word and also contains the keyword corresponding to that neologism or hot word;
Result providing subunit, configured to provide the word-grouping results to the user in the network according to the final score of each word-grouping result.
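The scoring, score adjustment, and ranking steps of the three subunits can be sketched as below. This is a hedged sketch only: the function name `rank_groupings`, the data layout, and the fixed `boost` value are assumptions made for illustration, since the text does not specify how large the score increase is or how base scores are computed.

```python
def rank_groupings(results, hot_word_keywords, boost=1.0):
    """Rank word-grouping results, boosting those that pair a hot word with its keyword.

    results: list of (entries, base_score), where entries is a list of words.
    hot_word_keywords: dict mapping each hot word to its set of collocated keywords.
    boost: hypothetical fixed increment; the source does not give a value.
    """
    ranked = []
    for entries, score in results:
        for hot_word, keywords in hot_word_keywords.items():
            # Boost only if the result holds both the hot word and one of its keywords.
            if hot_word in entries and any(e in keywords for e in entries):
                score += boost
                break
        ranked.append((entries, score))
    # Highest final score first; ties keep their original order.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical data: "H" is a hot word and "K" is its collocated keyword.
ranked = rank_groupings([(["A", "B"], 2.0), (["H", "K"], 1.5)], {"H": {"K"}})
unboosted = rank_groupings([(["A", "B"], 2.0), (["H", "C"], 1.5)], {"H": {"K"}})
```

With these inputs the result containing both "H" and "K" moves ahead of the result that initially scored higher, while a result containing "H" without its keyword keeps its base score.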
Accordingly, when at least two word-grouping results are provided, the system further comprises:
First display unit, configured to display the word-grouping results that contain the neologism or hot word differently from the other word-grouping results.
Alternatively, when the code string input by the user in the network covers at least two entries, the candidate providing unit further comprises:
Initial providing unit, configured to provide the neologism or hot word to the user in the network when the at least two entries include the neologism or hot word;
Re-grouping unit, configured to judge whether the user in the network accepts the neologism or hot word and, if so, to group words forward and/or backward starting from the neologism or hot word, so as to provide the user in the network with a complete candidate for the code string.
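Grouping words forward and backward from an accepted neologism or hot word can be illustrated with the sketch below. It is an assumption-laden example rather than the patent's algorithm: the code string is modeled as a list of syllables, the span occupied by the hot word is given as indices, and the remaining syllables on each side are converted with a simple greedy longest-match lookup in a hypothetical `lexicon`.

```python
def group_around(syllables, start, end, hot_word, lexicon):
    """Build a complete candidate around an accepted hot word.

    syllables: the code string split into syllables.
    start, end: the slice syllables[start:end] that the hot word covers.
    lexicon: hypothetical dict mapping a tuple of syllables to a word.
    """
    def convert(segment):
        words, i = [], 0
        while i < len(segment):
            # Try the longest remaining span first, then shorter ones.
            for j in range(len(segment), i, -1):
                word = lexicon.get(tuple(segment[i:j]))
                if word is not None:
                    words.append(word)
                    i = j
                    break
            else:
                # No lexicon match: keep the raw syllable and move on.
                words.append(segment[i])
                i += 1
        return words

    # Convert the part before the hot word, keep the hot word, convert the part after.
    return convert(syllables[:start]) + [hot_word] + convert(syllables[end:])


# Hypothetical data.
lexicon = {("wo",): "I", ("kan",): "watch", ("dian", "ying"): "movie"}
candidate = group_around(
    ["wo", "kan", "a", "fan", "da", "dian", "ying"], 2, 5, "Avatar", lexicon
)
```

Fixing the accepted word first narrows the search on both sides, which is the apparent motivation for grouping outward from it.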
To obtain neologisms or hot words in a timely manner, acquiring unit 401 may comprise:
Classification subunit, configured to obtain the user feature information of each user in the network and to classify the users in the network based on that user feature information, obtaining at least two user classes;
Obtaining subunit, configured to obtain, from the words input by the users, the neologisms or hot words for each user class.
The neologism or hot word providing unit is specifically configured to judge whether the user in the network belongs to the user class corresponding to the neologism or hot word and, if so, to provide the neologism or hot word to the user in the network as a candidate.
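Per-class collection and per-class delivery can be sketched as follows. The sketch makes several assumptions that are not in the text: users are already assigned a class label in a `user_class` mapping, a word counts as hot for a class once it has been input at least `min_count` times, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def hot_words_by_class(inputs, user_class, min_count=2):
    """Collect hot words separately for each user class.

    inputs: iterable of (user_id, word) pairs taken from user input.
    user_class: dict mapping user_id to a class label (assumed precomputed).
    min_count: hypothetical frequency threshold.
    """
    counts = defaultdict(Counter)
    for user_id, word in inputs:
        counts[user_class[user_id]][word] += 1
    return {
        label: {word for word, n in counter.items() if n >= min_count}
        for label, counter in counts.items()
    }


def words_for_user(user_id, user_class, class_hot_words):
    """Return only the hot words recorded for the class the user belongs to."""
    return class_hot_words.get(user_class.get(user_id), set())


# Hypothetical data.
user_class = {"u1": "gamers", "u2": "gamers", "u3": "students"}
inputs = [("u1", "raid"), ("u2", "raid"), ("u3", "exam"), ("u3", "exam"), ("u1", "exam")]
class_hot_words = hot_words_by_class(inputs, user_class)
for_gamer = words_for_user("u2", user_class, class_hot_words)
for_student = words_for_user("u3", user_class, class_hot_words)
```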
In practice, acquiring unit 401 may specifically be configured to collect the words that users select through the input method and to determine statistically whether each word meets the preset condition for a neologism or hot word; if so, the input method server records the neologism or hot word;
Accordingly, neologism or hot word providing unit 402 may specifically be configured to provide the recorded neologisms or hot words to input method users in real time according to a preset rule.
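The collect, test, record, and provide cycle can be sketched as below. The text does not state what the preset condition or the preset rule is, so both are assumptions here: a word is recorded once it is absent from a base lexicon and has been selected at least `min_count` times, and recorded words are offered most-frequent first. The class and method names are hypothetical.

```python
from collections import Counter


class HotWordCollector:
    """Server-side sketch that records words users select and offers the hot ones."""

    def __init__(self, base_lexicon, min_count=3):
        self.base_lexicon = set(base_lexicon)  # words already known to the input method
        self.min_count = min_count             # hypothetical preset condition
        self.counts = Counter()
        self.recorded = set()

    def on_select(self, word):
        """Count one user selection and record the word if it meets the condition."""
        self.counts[word] += 1
        if word not in self.base_lexicon and self.counts[word] >= self.min_count:
            self.recorded.add(word)

    def offer(self, limit=10):
        """Hypothetical preset rule: most frequently selected first, then alphabetical."""
        ordered = sorted(self.recorded, key=lambda w: (-self.counts[w], w))
        return ordered[:limit]


# Hypothetical data.
collector = HotWordCollector(base_lexicon={"hello"})
for word in ["hello"] * 5 + ["newterm"] * 3 + ["viral"] * 4 + ["rare"]:
    collector.on_select(word)
offered = collector.offer()
```

A production system would likely use time windows so that old counts decay; the cumulative counter here keeps the example small.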
The system may further comprise:
Storage unit, configured to store the neologisms or hot words, after they are obtained, in a neologism or hot word dictionary at the input method server;
Accordingly, neologism or hot word providing unit 402 is specifically configured to provide, through the input method server, the neologisms or hot words in the neologism or hot word dictionary in real time to users in the network.
The method and system for providing neologisms or hot words provided by the present invention have been described in detail above. Specific examples have been used herein to explain the principle and embodiments of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. Meanwhile, a person of ordinary skill in the art may, following the idea of the invention, make changes to the specific embodiments and the scope of application. In summary, the content of this description should not be construed as limiting the present invention.

Claims (22)

1. A method for providing neologisms or hot words, characterized in that it comprises:
Performing statistics on the words that users input through an input method system, and obtaining neologisms or hot words therefrom;
Providing, through an input method server, the neologisms or hot words in real time to users in a network;
Wherein performing statistics on the words that users input through the input method system and obtaining neologisms or hot words therefrom comprises: collecting the words that users select through the input method, and determining statistically whether each word meets a preset condition for a neologism or hot word; if so, the input method server records the neologism or hot word;
Providing, through the input method server, the neologisms or hot words in real time to users in the network comprises: providing the recorded neologisms or hot words to input method users in real time according to a preset rule;
Wherein providing the neologisms or hot words in real time to users in the network comprises:
While a user is inputting words, applying the neologisms or hot words in real time to provide word candidates to the user in the network;
Wherein applying the neologisms or hot words in real time to provide word candidates to the user in the network comprises:
When the word candidates include a candidate that shares the same code as the neologism or hot word, judging the probability that the user in the network needs to input the neologism or hot word, and, if the probability meets a preset condition, providing the neologism or hot word to the user in the network as a candidate;
Wherein the method further comprises: obtaining user feature information corresponding to the neologism or hot word;
Judging the probability that the user in the network needs to input the neologism or hot word and, if the probability meets the preset condition, providing the neologism or hot word to the user in the network as a candidate comprises:
If the user in the network has the user feature information corresponding to the neologism or hot word, providing the neologism or hot word to the user in the network as a candidate;
And/or,
Obtaining a keyword that has a semantic collocation relation with the neologism or hot word;
Judging the probability that the user in the network needs to input the neologism or hot word and, if the probability meets the preset condition, providing the neologism or hot word to the user in the network as a candidate comprises:
If the context of the current input of the user in the network contains the keyword, providing the neologism or hot word to the user in the network as a candidate.
2. The method according to claim 1, characterized in that the user feature information includes the user's location information, and that providing the neologism or hot word to the user in the network as a candidate if the user in the network has the user feature information corresponding to the neologism or hot word comprises:
If the user in the network is located in the region corresponding to the location information, providing the neologism or hot word to the user in the network as a candidate.
3. The method according to claim 1, characterized in that, when the code string input by the user in the network covers at least two entries, applying the neologisms or hot words in real time to provide word candidates to the user in the network further comprises:
Applying in real time the correspondence between the neologism or hot word and the keyword, grouping words for the code string, and providing the word-grouping results to the user in the network.
4. The method according to claim 3, characterized in that applying in real time the correspondence between the neologism or hot word and the keyword, grouping words for the code string, and providing the word-grouping results to the user in the network comprises:
Obtaining the word-grouping results for the code string, and scoring each word-grouping result;
If a word-grouping result contains the neologism or hot word and also contains the keyword corresponding to that neologism or hot word, increasing the score of that word-grouping result;
Providing the word-grouping results to the user in the network according to the final score of each word-grouping result.
5. The method according to claim 3 or 4, characterized in that, when at least two word-grouping results are provided, the method further comprises:
Displaying the word-grouping results that contain the neologism or hot word differently from the other word-grouping results.
6. The method according to claim 1, characterized in that, when the code string input by the user in the network covers at least two entries, applying the neologisms or hot words in real time to provide word candidates to the user in the network further comprises:
When the at least two entries include the neologism or hot word, providing the neologism or hot word to the user in the network as a candidate;
If the user in the network accepts the neologism or hot word, grouping words forward and/or backward starting from the neologism or hot word, so as to provide the user in the network with a complete candidate for the code string.
7. The method according to claim 1, characterized in that providing the neologisms or hot words in real time to users in the network comprises:
Displaying the neologisms or hot words in real time to users in the network, and providing an entry point for obtaining the related information corresponding to the neologisms or hot words.
8. The method according to claim 1, characterized in that performing statistics on the words that users input through the input method system and obtaining neologisms or hot words therefrom comprises:
Obtaining the user feature information of each user in the network, and classifying the users in the network based on the user feature information to obtain at least two user classes;
Obtaining, from the words input by the users, the neologisms or hot words for each user class.
9. The method according to claim 8, characterized in that providing the neologisms or hot words in real time to users in the network comprises:
Judging whether the user in the network belongs to the user class corresponding to the neologism or hot word and, if so, providing the neologism or hot word to the user in the network.
10. The method according to claim 1, characterized in that:
Providing, through the input method server, the neologisms or hot words in real time to users in the network comprises: providing the recorded neologisms or hot words to input method users in real time according to a preset rule.
11. The method according to claim 1, characterized in that, after the neologisms or hot words are obtained, the method further comprises: storing the neologisms or hot words in a neologism or hot word dictionary at the input method server;
Providing, through the input method server, the neologisms or hot words in real time to users in the network comprises: providing, through the input method server, the neologisms or hot words in the neologism or hot word dictionary in real time to users in the network.
12. A system for providing neologisms or hot words, characterized in that it comprises:
Acquiring unit, configured to perform statistics on the words that users input through an input method system and to obtain neologisms or hot words therefrom;
Neologism or hot word providing unit, configured to provide, through an input method server, the neologisms or hot words in real time to users in a network;
Wherein the acquiring unit is specifically configured to collect the words that users select through the input method and to determine statistically whether each word meets a preset condition for a neologism or hot word; if so, the input method server records the neologism or hot word;
The neologism or hot word providing unit is specifically configured to provide the recorded neologisms or hot words to input method users in real time according to a preset rule;
Wherein the neologism or hot word providing unit comprises:
Candidate providing unit, configured to apply the neologisms or hot words in real time, while a user is inputting words, to provide word candidates to the user in the network;
Wherein the candidate providing unit comprises:
Judging unit, configured to, when the word candidates include a candidate that shares the same code as the neologism or hot word, judge the probability that the user in the network needs to input the neologism or hot word, and, if the probability meets a preset condition, provide the neologism or hot word to the user in the network as a candidate;
Wherein the system further comprises: a feature acquiring unit, configured to obtain the user feature information corresponding to the neologism or hot word;
The judging unit is specifically configured to judge whether the user in the network has the user feature information corresponding to the neologism or hot word and, if so, to provide the neologism or hot word to the user in the network as a candidate;
And/or,
Keyword acquiring unit, configured to obtain a keyword that has a semantic collocation relation with the neologism or hot word;
The judging unit is specifically configured to judge whether the context of the current input of the user in the network contains the keyword and, if so, to provide the neologism or hot word to the user in the network as a candidate.
13. The system according to claim 12, characterized in that the user feature information includes the user's location information, and the judging unit is specifically configured to judge whether the user in the network is located in the region corresponding to the location information and, if so, to provide the neologism or hot word to the user in the network as a candidate.
14. The system according to claim 12, characterized in that, when the code string input by the user in the network covers at least two entries, the candidate providing unit further comprises:
Word-grouping unit, configured to apply in real time the correspondence between the neologism or hot word and the keyword, to group words for the code string, and to provide the word-grouping results to the user in the network.
15. The system according to claim 14, characterized in that the word-grouping unit comprises:
Word-grouping result obtaining subunit, configured to obtain the word-grouping results for the code string and to score each word-grouping result;
Score adjusting subunit, configured to increase the score of a word-grouping result when that result contains the neologism or hot word and also contains the keyword corresponding to that neologism or hot word;
Result providing subunit, configured to provide the word-grouping results to the user in the network according to the final score of each word-grouping result.
16. The system according to claim 14 or 15, characterized in that, when at least two word-grouping results are provided, the system further comprises:
First display unit, configured to display the word-grouping results that contain the neologism or hot word differently from the other word-grouping results.
17. The system according to claim 12, characterized in that, when the code string input by the user in the network covers at least two entries, the candidate providing unit further comprises:
Initial providing unit, configured to provide the neologism or hot word to the user in the network when the at least two entries include the neologism or hot word;
Re-grouping unit, configured to judge whether the user in the network accepts the neologism or hot word and, if so, to group words forward and/or backward starting from the neologism or hot word, so as to provide the user in the network with a complete candidate for the code string.
18. The system according to claim 12, characterized in that the neologism or hot word providing unit comprises:
Related information display unit, configured to display the neologisms or hot words in real time to users in the network and to provide an entry point for obtaining the related information corresponding to the neologisms or hot words.
19. The system according to claim 12, characterized in that the acquiring unit comprises:
Classification subunit, configured to obtain the user feature information of each user in the network and to classify the users in the network based on the user feature information, obtaining at least two user classes;
Obtaining subunit, configured to obtain, from the words input by the users, the neologisms or hot words for each user class.
20. The system according to claim 19, characterized in that the neologism or hot word providing unit is specifically configured to judge whether the user in the network belongs to the user class corresponding to the neologism or hot word and, if so, to provide the neologism or hot word to the user in the network.
21. The system according to claim 12, characterized in that:
The neologism or hot word providing unit is specifically configured to provide the recorded neologisms or hot words to input method users in real time according to a preset rule.
22. The system according to claim 12, characterized in that it further comprises:
Storage unit, configured to store the neologisms or hot words, after they are obtained, in a neologism or hot word dictionary at the input method server;
The neologism or hot word providing unit is specifically configured to provide, through the input method server, the neologisms or hot words in the neologism or hot word dictionary in real time to users in the network.
CN201010113873.5A 2010-02-24 2010-02-24 A method and a system for providing new or popular terms Active CN102163198B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010113873.5A CN102163198B (en) 2010-02-24 2010-02-24 A method and a system for providing new or popular terms

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010113873.5A CN102163198B (en) 2010-02-24 2010-02-24 A method and a system for providing new or popular terms

Publications (2)

Publication Number Publication Date
CN102163198A CN102163198A (en) 2011-08-24
CN102163198B true CN102163198B (en) 2014-10-22

Family

ID=44464431

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010113873.5A Active CN102163198B (en) 2010-02-24 2010-02-24 A method and a system for providing new or popular terms

Country Status (1)

Country Link
CN (1) CN102163198B (en)

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103370677A (en) * 2011-06-29 2013-10-23 宇龙计算机通信科技(深圳)有限公司 Mobile terminal and method, system for inputting network hot words into mobile terminal
CN102955825B (en) * 2011-08-30 2016-04-06 北京搜狗科技发展有限公司 A kind of method and system upgrading input method dictionary
CN103150310A (en) * 2011-12-07 2013-06-12 腾讯科技(深圳)有限公司 Method and device for extracting hot spot information
CN102710795B (en) * 2012-06-20 2015-02-11 北京奇虎科技有限公司 Hotspot collecting method and device
CN103678298B (en) * 2012-08-30 2016-04-13 腾讯科技(深圳)有限公司 A kind of information displaying method and equipment
CN105164672A (en) * 2013-05-01 2015-12-16 惠普发展公司,有限责任合伙企业 Content classification
CN104345899B (en) * 2013-08-08 2018-01-19 阿里巴巴集团控股有限公司 Field conversion method and client for input method
US8768712B1 (en) * 2013-12-04 2014-07-01 Google Inc. Initiating actions based on partial hotwords
CN104834638B (en) * 2014-02-10 2019-07-05 腾讯科技(深圳)有限公司 A kind of hot word methods of exhibiting, device and electronic equipment
WO2016093836A1 (en) 2014-12-11 2016-06-16 Hewlett Packard Enterprise Development Lp Interactive detection of system anomalies
CN104572846B (en) * 2014-12-12 2018-10-16 百度在线网络技术(北京)有限公司 A kind of hot word recommendation methods, devices and systems
CN105069064B (en) * 2015-07-29 2019-04-30 百度在线网络技术(北京)有限公司 Acquisition methods and device, the method for pushing and device of vocabulary
TWI614718B (en) * 2016-01-21 2018-02-11 Gamania Digital Entertainment Co Ltd Method for accumulating corresponding scores according to types of information transmitted by terminal devices
CN106125955B (en) * 2016-06-23 2019-05-07 百度在线网络技术(北京)有限公司 A kind of method and apparatus for the offer hot word in input method is applied
CN107544685A (en) * 2016-06-29 2018-01-05 百度在线网络技术(北京)有限公司 Information-pushing method and device
CN106445915B (en) * 2016-09-14 2020-04-28 安徽科大讯飞医疗信息技术有限公司 New word discovery method and device
CN106933379A (en) * 2017-02-13 2017-07-07 北京奇虎科技有限公司 The generation method and device of a kind of dictionary
US10419269B2 (en) 2017-02-21 2019-09-17 Entit Software Llc Anomaly detection
CN107423444B (en) * 2017-08-10 2020-05-19 世纪龙信息网络有限责任公司 Hot word phrase extraction method and system
CN109426356B (en) * 2017-09-01 2022-07-15 百度在线网络技术(北京)有限公司 Information input method and device
CN108182174B (en) * 2017-12-27 2019-03-26 掌阅科技股份有限公司 New words extraction method, electronic equipment and computer storage medium
CN108399013B (en) * 2018-03-16 2022-08-09 北京搜狗科技发展有限公司 User word adding method and device
CN110750706B (en) * 2018-07-19 2023-04-28 阿里巴巴集团控股有限公司 Search hotword determining method, device and system and electronic equipment
CN109214167B (en) * 2018-08-01 2021-04-16 深圳市文鼎创数据科技有限公司 Intelligent key safety equipment and key recovery method and storage medium thereof
CN110471537A (en) * 2019-08-22 2019-11-19 广东创能科技股份有限公司 A kind of WEB cloud input method based on B/S framework
CN112559699A (en) * 2020-11-09 2021-03-26 联想(北京)有限公司 Information interaction method, device and equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1924858A (en) * 2006-08-09 2007-03-07 北京搜狗科技发展有限公司 Method and device for fetching new words and input method system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101290632B (en) * 2008-05-30 2011-09-14 北京搜狗科技发展有限公司 Input method for user words participating in intelligent word-making and input method system

Also Published As

Publication number Publication date
CN102163198A (en) 2011-08-24

Similar Documents

Publication Publication Date Title
CN102163198B (en) A method and a system for providing new or popular terms
CN106663125B (en) Question generation device and recording medium
CN101390042B (en) Disambiguating ambiguous characters
CN101420313B (en) Method and system for clustering customer terminal user group
CN102096717B (en) Search method and search engine
CN106709040B (en) Application search method and server
CN104598588B (en) Microblog users label automatic generating calculation based on double focusing class
US20160147866A1 (en) Processing user profiles
CN1924858B (en) Method and device for fetching new words and input method system
CN1936893B (en) Method and system for generating input-method word frequency base based on internet information
JP6759308B2 (en) Maintenance equipment
CN110888990B (en) Text recommendation method, device, equipment and medium
CN106062730A (en) Systems and methods for actively composing content for use in continuous social communication
CN111708740A (en) Mass search query log calculation analysis system based on cloud platform
CN105759983A (en) System and method for inputting text into electronic devices
WO2011042907A1 (en) Method and system for assisting in typing
CN101385025A (en) Analyzing content to determine context and serving relevant content based on the context
CN104991943A (en) Music searching method and apparatus
CN103023753A (en) Method, client-side and system for interactive content correlation output in instant messaging interaction
JP2008529179A (en) Method and apparatus for accessing mobile information in natural language
CN110298029A (en) Friend recommendation method, apparatus, equipment and medium based on user's corpus
WO2001053970A2 (en) A system and method for matching requests for information with sources thereof
JP6994289B2 (en) Programs, devices and methods for creating dialogue scenarios according to character attributes
CN101923556A (en) Method and device for searching webpages according to sentence serial numbers
CN105279159B (en) The reminding method and device of contact person

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant