CN103631929B - Intelligent prompt method, module and system for search - Google Patents


Publication number
CN103631929B
Authority
CN
China
Prior art keywords
word
candidate word
suffix
search
prefix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310653732.6A
Other languages
Chinese (zh)
Other versions
CN103631929A (en
Inventor
罗晶
尹岩
严敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Wisedu Information Co Ltd
Original Assignee
Jiangsu Wisedu Information Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Wisedu Information Co Ltd
Priority to CN201310653732.6A
Publication of CN103631929A
Application granted
Publication of CN103631929B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2425 Iterative querying; Query formulation based on the results of a preceding query

Abstract

The invention discloses an intelligent prompt method, module and system for search. In the method, the server performs the following steps: a segmenter splits the input into a prefix word and a suffix word; the two words are expanded with synonyms into a prefix synonym list and a suffix synonym list; the hot word suffix tree is then traversed to find hot words that match by prefix and/or by suffix, which become the candidate words; finally, the probability of each candidate word is calculated by analyzing users' historical search behavior. The client performs the following steps: it calculates the local relevance of each candidate word, calculates the estimated click value of each candidate word, and selects the candidate words to display according to the estimated click value. Because prompt words are obtained by matching on both the prefix word and the suffix word, are expanded with synonyms, and take into account both the search intent of many users and the local relevance, the prompt words come closer to the user's search intent.

Description

Intelligent prompt method, module and system for search
Technical field
The present invention relates to keyword search in data search and data mining, and in particular to artificial intelligence applied to keyword input.
Background technology
Intelligent prompting is a method that helps users clarify what they intend to input, lets them input quickly, and improves the user experience. It is mainly used in search engines and development platforms, where prompts are offered to the user automatically, based on the user's input, in presentation forms such as drop-down boxes or tags.
At present, mainstream search engines first gather statistics on the user search history stored on the server and build a popular-word dictionary according to the search frequency of each search word. After the user enters a keyword, candidate prompt words are looked up in the popular-word dictionary by string prefix matching, filtered by search frequency, and displayed in order below the search box. Because this kind of intelligent prompting finds candidate prompt words by string prefix matching, it may omit candidate prompt words that are relevant to the search keyword. Because it filters candidates by their search frequency in the popular-word dictionary without taking the current user's local search history into account, the prompt words offered may deviate from the user's search intent. These problems are rooted in habitual language usage. In Chinese, a modifier always precedes the noun it modifies. In "casual pants", for example, "casual" is only a modifier and "pants" is the head noun. After the user enters "casual pants" on the client, prefix matching returns only content related to "casual", whereas the user mainly intends to search for content related to "pants". The prompt words therefore deviate markedly from the user's search intent.
Summary of the invention
The problem to be solved by the present invention is how to make the prompt words offered by a search engine more reasonable.
To solve this problem, the present invention adopts the following scheme:
An intelligent prompt method for search according to the present invention involves a client and a server connected by a network, and comprises the following steps:
S21: the client obtains an initial string;
S22: the client sends the initial string to the server;
S29: the server receives the initial string;
S3: the server searches the hot words according to the initial string and obtains a candidate word information list;
S41: the server sends the candidate word information list to the client;
S49: the client receives the candidate word information list;
S5: the client obtains a candidate word list from the candidate word information list;
S91: the client displays the candidate word list;
The method is characterized in that step S3 comprises:
S31: the server splits the initial string with a segmenter to obtain a prefix word and a suffix word;
S32: the server looks up the prefix word and the suffix word in a thesaurus to obtain prefix synonyms and suffix synonyms;
S33: the server traverses the hot word suffix tree to find hot words that match by prefix and/or suffix, thereby obtaining the candidate word information list;
Wherein the thesaurus is the database in which the server stores synonym relations between keywords; the hot word suffix tree is built by the server from the high-frequency search hot words in the hot word bank, using the generalized suffix tree data structure; the hot word bank is the database in which the server stores hot word information; the hot word information includes the hot word, the hot word sequence number and the hot word search frequency; a prefix match means that the prefix of a hot word matches the prefix word or a prefix synonym; a suffix match means that the suffix of a hot word matches the suffix word or a suffix synonym.
Further, the intelligent prompt method for search according to the present invention is characterized in that the method also comprises:
S34: the server calculates the probability of each candidate word by analyzing the user historical search behavior database;
Wherein the user historical search behavior database is used to store historical behavior information.
Further, the intelligent prompt method for search according to the present invention is characterized in that step S34 comprises:
S34a1: the server searches the user historical search behavior database for historical behavior information whose original string is identical to the initial string and whose clicked hot word is identical to the candidate word, thereby obtaining the click frequency of the candidate word;
S34a2: the server normalizes the click frequencies of the candidate words to obtain the probability of each candidate word;
Wherein the historical behavior information includes the original string, the clicked hot word and the click frequency.
Further, the intelligent prompt method for search according to the present invention is characterized in that step S34 comprises:
S34b1: look up the historical behavior information for the candidate word in the user historical search behavior database;
S34b2: count, within this historical behavior information, the click frequencies under the different prefix matching modes and the different suffix matching modes;
S34b3: take the natural logarithm of the click frequencies under the different prefix matching modes and suffix matching modes to obtain the logit values under those modes;
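S34b4: calculate the values of the parameters $\beta_0$, $\beta_1$ and $\beta_2$ in the two-variable linear regression equation $\operatorname{logit}(p) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$;
S34b5: calculate the probability of the candidate word according to the formula $p = \dfrac{e^{z}}{1 + e^{z}}$, where $z = \beta_0 + \beta_1 x_1 + \beta_2 x_2$;
S34b6: normalize the probabilities of the candidate words;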
Wherein the historical behavior information includes the clicked hot word and the click frequencies of the nine candidate word match types.
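Steps S34b5 and S34b6 can be sketched as follows. This is a minimal illustration that assumes the standard logistic model over the two matching-mode variables; the parameter values are hypothetical placeholders for whatever step S34b4 would produce, and the function names are not taken from the patent.

```python
import math

# Numeric codes for the matching modes, as given in the description.
NO_MATCH, SYNONYM_MATCH, EXACT_MATCH = 1, 4, 5


def selection_probability(x1, x2, b0, b1, b2):
    """S34b5: logistic probability for prefix mode x1 and suffix mode x2."""
    z = b0 + b1 * x1 + b2 * x2
    return 1.0 / (1.0 + math.exp(-z))


def normalize(probabilities):
    """S34b6: scale the probabilities so that they sum to 1."""
    total = sum(probabilities)
    return [p / total for p in probabilities]


# Hypothetical fitted parameters; real values would come from step S34b4.
B0, B1, B2 = -4.0, 0.3, 0.6
modes = [(EXACT_MATCH, NO_MATCH),
         (NO_MATCH, EXACT_MATCH),
         (SYNONYM_MATCH, SYNONYM_MATCH)]
raw = [selection_probability(x1, x2, B0, B1, B2) for x1, x2 in modes]
print(normalize(raw))
```

With these assumed parameters a suffix match is weighted more heavily than a prefix match, which mirrors the motivation in the background section; actual weights depend entirely on the click data.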
Further, the intelligent prompt method for search according to the present invention is characterized in that step S5 comprises:
S51: the client calculates the local relevance of each candidate word in the candidate word information list according to the local historical search database;
S52: the client calculates the estimated click value of each candidate word from the candidate word's local relevance and its candidate word information;
S53: the client selects the candidate word list from the candidate word information list according to the estimated click values of the candidate words;
Wherein the local historical search database is where the client stores local historical search information; the local historical search information includes the local historical search string, the local historical search time and the local historical search frequency.
Step S51 comprises:
S511: using the segmenter, split the local historical search strings in the local historical search database and the candidate words in the candidate word information list into a keyword list, and calculate the statistical frequency of each keyword;
S512: build the keyword space vector from the statistical frequencies of the keywords in the keyword list;
S513: build the candidate word space vector from the statistical frequencies, in the keyword list, of the keywords into which the candidate word is split;
S514: calculate the cosine of the angle between the keyword space vector and the candidate word space vector to obtain the local relevance of the candidate word.
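Steps S511 to S514 can be sketched as below. This is one possible reading of the steps, not a definitive implementation: whitespace splitting stands in for the segmenter, and the function name `local_relevance` and the sample data are assumptions made for illustration.

```python
import math
from collections import Counter


def local_relevance(history_strings, candidates, segment=str.split):
    """Cosine between the keyword vector and each candidate word vector."""
    # S511: keyword frequencies over local history strings and candidate words.
    counts = Counter()
    for text in list(history_strings) + list(candidates):
        counts.update(segment(text))
    vocabulary = sorted(counts)

    # S512: keyword space vector, one component per keyword.
    keyword_vector = [counts[k] for k in vocabulary]
    keyword_norm = math.sqrt(sum(v * v for v in keyword_vector))

    relevance = {}
    for candidate in candidates:
        # S513: keep the frequency only for keywords the candidate contains.
        own = set(segment(candidate))
        vector = [counts[k] if k in own else 0 for k in vocabulary]
        norm = math.sqrt(sum(v * v for v in vector))
        # S514: cosine of the angle between the two vectors.
        dot = sum(a * b for a, b in zip(keyword_vector, vector))
        relevance[candidate] = dot / (keyword_norm * norm) if keyword_norm and norm else 0.0
    return relevance


history = ["casual pants", "cotton pants"]
scores = local_relevance(history, ["casual pants", "leisure suit"])
print(scores)
```

A candidate that shares keywords with the user's own past searches scores higher than one that does not, which is the effect the local relevance is meant to capture.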
Further, the intelligent prompt method for search according to the present invention is characterized in that calculating the statistical frequency of a keyword in step S511 includes weighting the statistical frequency by time.
Further, the intelligent prompt method for search according to the present invention is characterized in that, in step S52:
CTR = A×R×C; where CTR is the estimated click value of the candidate word; A is the probability of the candidate word; R is the local relevance of the candidate word; C is a constant determined by the type of the candidate word.
Further, the intelligent prompt method for search according to the present invention is characterized in that, in step S52:
CTR = A×R×C×P; where CTR is the estimated click value of the candidate word; A is the probability of the candidate word; R is the local relevance of the candidate word; C is a constant determined by the type of the candidate word; P is the search frequency of the candidate word; in this case the candidate word information also includes the search frequency of the candidate word.
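The two CTR formulas and the selection in step S53 can be sketched as follows. The dictionary keys, the function names and the sample values are assumptions for illustration; omitting the search frequency reduces the product to the three-factor form CTR = A×R×C.

```python
def estimated_click_value(probability, relevance, type_constant, search_frequency=1):
    """CTR = A*R*C*P; with the default P = 1 this is CTR = A*R*C."""
    return probability * relevance * type_constant * search_frequency


def choose_candidates(candidate_infos, limit):
    """S53: keep the `limit` candidates with the highest estimated click value."""
    ranked = sorted(
        candidate_infos,
        key=lambda info: estimated_click_value(
            info["probability"], info["relevance"],
            info["type_constant"], info.get("search_frequency", 1)),
        reverse=True,
    )
    return [info["word"] for info in ranked[:limit]]


infos = [
    {"word": "casual pants", "probability": 0.6, "relevance": 0.9, "type_constant": 1.0},
    {"word": "casual shirt", "probability": 0.3, "relevance": 0.5, "type_constant": 1.0},
    {"word": "cotton trousers", "probability": 0.1, "relevance": 0.9, "type_constant": 2.0},
]
print(choose_candidates(infos, limit=2))
```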
An intelligent prompt device for search according to the present invention is characterized by comprising:
A word segmentation device, for splitting the initial string to obtain the prefix word and the suffix word;
A synonym expansion device, for looking up the prefix word and the suffix word in the thesaurus to obtain prefix synonyms and suffix synonyms;
A suffix tree traversal device, for traversing the hot word suffix tree to find hot words that match by prefix and/or suffix, thereby obtaining the candidate word information list; a prefix match means that the prefix of a hot word matches the prefix word or a prefix synonym; a suffix match means that the suffix of a hot word matches the suffix word or a suffix synonym;
A hot word bank construction device, for managing and maintaining the database that stores hot word information;
A suffix tree construction device, for managing and maintaining the hot word suffix tree; the hot word suffix tree is built by the server from the high-frequency search hot words in the hot word bank, using the generalized suffix tree data structure;
A historical behavior analysis device, for calculating the probability of each candidate word by analyzing the user historical search behavior database;
A user historical search behavior database device, for storing historical behavior information.
Further, an intelligent prompt system for search according to the present invention comprises a client and a server connected by a network, and is characterized in that:
The server comprises:
A word segmentation module, for splitting the initial string to obtain the prefix word and the suffix word;
A synonym expansion module, for looking up the prefix word and the suffix word in the thesaurus to obtain prefix synonyms and suffix synonyms;
A suffix tree traversal module, for traversing the hot word suffix tree to find hot words that match by prefix and/or suffix, thereby obtaining the candidate word information list; a prefix match means that the prefix of a hot word matches the prefix word or a prefix synonym; a suffix match means that the suffix of a hot word matches the suffix word or a suffix synonym;
A hot word bank construction module, for managing and maintaining the database that stores hot word information;
A suffix tree construction module, for managing and maintaining the hot word suffix tree; the hot word suffix tree is built by the server from the high-frequency search hot words in the hot word bank, using the generalized suffix tree data structure;
A historical behavior analysis module, for calculating the probability of each candidate word by analyzing the user historical search behavior database;
A user historical search behavior database module, for storing historical behavior information;
The client comprises:
A local relevance calculation module, for calculating the local relevance of each candidate word in the candidate word information list according to the local historical search database;
An estimated click value calculation module, for calculating the estimated click value of each candidate word from the candidate word's local relevance and its candidate word information;
A candidate word selection module, for selecting the candidate word list from the candidate word information list according to the estimated click values of the candidate words;
A local historical search database storage module, for storing local historical search information, which includes the local historical search string, the local historical search time and the local historical search frequency;
The local relevance calculation module comprises:
A keyword distribution statistics module, for using the segmenter to split the local historical search strings in the local historical search database and the candidate words in the candidate word information list into a keyword list, and for calculating the statistical frequency of each keyword;
A keyword space vector construction module, for building the keyword space vector from the statistical frequencies of the keywords in the keyword list;
A candidate word space vector construction module, for building the candidate word space vector from the statistical frequencies, in the keyword list, of the keywords into which the candidate word is split;
A vector cosine calculation module, for calculating the cosine of the angle between the keyword space vector and the candidate word space vector to obtain the local relevance of the candidate word.
The technical effects of the present invention are as follows:
1. Prompt words are obtained by matching on both the prefix word and the suffix word, combined with synonyms, so they come closer to the meaning the language expresses.
2. Prefix word and suffix word matching is implemented by building a generalized suffix tree of hot words, combined with hot word sequence numbers, so the lookup is fast and consumes little CPU time.
3. The final prompt words take a calculated probability into account; this probability aggregates the search intent of many users, so the prompt words come closer to the user's search intent.
4. The final prompt words take local relevance into account; the user's search intent is analyzed from the user's own search history, so the prompt words come closer to the user's search intent.
Detailed description of the invention
The present invention is described in further detail below.
One, the application scenario of the present invention and applied environment
The present invention is applied to intelligent prompting in search engines. When searching, the user types the string to be searched into a text edit box on a web page. The device, method or system of the present invention then displays, in a drop-down box below the text edit box, several prompt words that the user may want to search for. After the user selects a prompt word from the drop-down box, the search engine searches on that prompt word. The user may also ignore the drop-down box and keep typing, in which case the search engine searches on the text entered. The benefit of the intelligent prompt drop-down box is that it makes input easier and reduces the effort and time the user spends typing. The main process of obtaining prompt words can be summarized in the following steps:
S21: the client obtains an initial string;
S22: the client sends the initial string to the server;
S29: the server receives the initial string;
S3: the server searches the hot words according to the initial string and obtains a candidate word information list;
S41: the server sends the candidate word information list to the client;
S49: the client receives the candidate word information list;
S5: the client obtains a candidate word list from the candidate word information list;
S91: the client displays the candidate word list;
In the above process, the client mainly takes the form of a web page, although it can also be implemented as a dedicated application. A web-page client generally runs on the user terminal, and the user accesses the search engine's server through the web page. In the present invention the client may also be located on the server side. This case can be understood as an application divided into a client module and a server module, which are respectively the client and the server of the present invention. The "network" connecting the two should then be understood as a communication channel in the broadest sense, for example local memory, a pipe, or a socket.
In step S21 above, "the client obtains an initial string" corresponds to the user typing the string to be searched into the text edit box of the web page, as described earlier. Given the above understanding of the client, this step may also take other forms. In general, the initial string is a string entered manually and captured by the client while the user is still typing; it is not necessarily the string the user finally wants to search for.
In step S91 above, "the client displays the candidate word list" corresponds to displaying, in a drop-down box below the text edit box of the web page, several prompt words that the user may want to search for. A candidate word is a prompt word, and several prompt words make up the candidate word list.
The above process can be regarded as the prior art on which the present invention builds, since many search engines do implement intelligent prompting with these steps. The present invention solves its problem through the specific implementation of step S3 and step S5. The remainder of this specification therefore concentrates on the specific implementation of steps S3 and S5 and on the technical content related to them. The other steps in the above process will be understood by those skilled in the art and are not described in detail.
Two, the basic conception in this specification
A keyword, as referred to in the present invention, is a word that expresses a specific meaning and is obtained by splitting a string with the segmenter. For example, splitting "casual pants" yields two keywords, "casual" and "pants".
The prefix word, as referred to in the present invention, is the first of the keywords obtained by splitting a string with the segmenter. For example, splitting "casual pants" yields the two keywords "casual" and "pants", of which "casual" is the prefix word.
The suffix word, as referred to in the present invention, is the last of the keywords obtained by splitting a string with the segmenter. For example, splitting "casual pants" yields the two keywords "casual" and "pants", of which "pants" is the suffix word.
Those skilled in the art will understand that if splitting a string yields only one keyword, that keyword is both the prefix word and the suffix word.
A candidate word, as referred to in the present invention, is a string consisting of one or more keywords.
A candidate word list, as referred to in the present invention, can be understood as an array of candidate words.
Candidate word information, as referred to in the present invention, consists of a candidate word together with its attribute information, or of the candidate word alone. The attribute information may include the search frequency of the candidate word, the probability of the candidate word and/or the local relevance of the candidate word.
A candidate word information list, as referred to in the present invention, can be understood as an array of candidate word information entries.
The segmenter, as referred to in the present invention, is a module or device that splits a string into keywords, mainly by dictionary lookup. Those skilled in the art will understand that segmenters are prior art. When implementing the present invention, the segmenter may be purchased commercially or built in-house.
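The definitions of prefix word and suffix word above can be illustrated with a toy dictionary-based segmenter. This is only a sketch: forward maximum matching is one common dictionary-lookup technique, chosen here as an assumption, and the dictionary contents and function names are hypothetical. The string 休闲裤 is the Chinese for "casual pants" (休闲 "casual" + 裤 "pants").

```python
def segment(text, dictionary):
    """Forward maximum matching: take the longest dictionary word at each position."""
    max_len = max(map(len, dictionary))
    keywords, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when no dictionary word matches.
            if piece in dictionary or size == 1:
                keywords.append(piece)
                i += size
                break
    return keywords


def prefix_and_suffix(text, dictionary):
    """Return (prefix word, suffix word): the first and last keywords."""
    keywords = segment(text, dictionary)
    return keywords[0], keywords[-1]


DICTIONARY = {"休闲", "裤", "运动", "鞋"}  # hypothetical toy dictionary
print(prefix_and_suffix("休闲裤", DICTIONARY))
```

When the string splits into a single keyword, the same keyword is returned as both prefix word and suffix word, matching the definition above.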
A hot word, as referred to in the present invention, is a string consisting of one or more keywords that the server stores as a record of user search history.
Hot word information, as referred to in the present invention, includes the hot word, the hot word sequence number and the hot word search frequency. The hot word sequence number is used to build an index for fast lookup, and the hot word search frequency counts the number of times the hot word has been searched.
Three, embodiment 1
In this embodiment, step S3 above is implemented by the following steps:
S31: the server splits the initial string with the segmenter to obtain the prefix word and the suffix word;
S32: the server looks up the prefix word and the suffix word in the thesaurus to obtain prefix synonyms and suffix synonyms;
S33: the server traverses the hot word suffix tree to find hot words that match by prefix and/or suffix, thereby obtaining the candidate word information list.
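Steps S32 and S33 can be sketched as follows. Note the substitution: a plain linear scan with `str.startswith` and `str.endswith` stands in for traversal of the generalized suffix tree, so the sketch shows only the matching semantics and not the lookup efficiency. The thesaurus, the hot words and the function names are hypothetical.

```python
def expand(word, thesaurus):
    """S32: the word itself plus any synonyms listed in the thesaurus."""
    return [word] + thesaurus.get(word, [])


def find_candidates(prefix_word, suffix_word, thesaurus, hot_words):
    """S33: keep hot words whose prefix and/or suffix matches."""
    prefixes = tuple(expand(prefix_word, thesaurus))
    suffixes = tuple(expand(suffix_word, thesaurus))
    return [w for w in hot_words
            if w.startswith(prefixes) or w.endswith(suffixes)]


THESAURUS = {"pants": ["trousers"], "casual": ["leisure"]}  # hypothetical
HOT_WORDS = ["casual shirt", "cotton trousers", "leisure suit",
             "casual pants", "running shoes"]
candidates = find_candidates("casual", "pants", THESAURUS, HOT_WORDS)
print(candidates)
```

Here "cotton trousers" is found only through the suffix synonym, which is exactly the kind of candidate that pure prefix matching would miss.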
In this embodiment, the thesaurus is the database in which the server stores synonym relations between keywords. The thesaurus is generally supplied by a commercial dictionary, but can also be built in-house.
In this embodiment, step S31 is implemented by the word segmentation module or device, that is, the segmenter described above. Those skilled in the art will understand that the prefix word and the suffix word produced by step S31 may be identical. In that case the prefix synonyms and the suffix synonyms are also identical, so step S32 can be simplified: only the synonyms of the prefix word, or only those of the suffix word, need to be looked up.
In this embodiment, step S32 is implemented by the synonym expansion module or device. Those skilled in the art will understand that a word may have several synonyms, so the prefix synonyms and suffix synonyms obtained in step S32 are each generally a list.
In this embodiment, step S33 is implemented by the suffix tree traversal module or device. Here a prefix match means that the prefix of a hot word matches the prefix word or a prefix synonym, and a suffix match means that the suffix of a hot word matches the suffix word or a suffix synonym. The "and/or" in "prefix and/or suffix match" means that a hot word found by the search may satisfy the prefix match, the suffix match, or both. The suffix tree traversal module or device works by traversing the hot word suffix tree. The hot word suffix tree is built by the server from the high-frequency search hot words in the hot word bank, using the generalized suffix tree data structure. It is built by the suffix tree construction module or device, which manages and maintains the hot word suffix tree. As is well known, a suffix tree is a tree data structure that supports efficient string matching and querying. A suffix tree represents a single string, whereas a generalized suffix tree can represent multiple strings. Building and traversing a generalized suffix tree is prior art and is not described further in this specification. Note that the hot words in the hot word suffix tree come from the hot word bank, but the tree does not contain every hot word in the bank, only those searched with high frequency. The high-frequency hot words can be obtained by ranking all hot words in the hot word bank by search frequency: first sort the hot words in the bank in descending order of search frequency, then take the first N hot words of the sorted result. N is generally set in advance, for example 10000 or 100000. A more efficient approach is to apply a threshold filter on the search frequency before sorting, so that only hot words whose search frequency exceeds a set threshold are sorted.
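The selection of high-frequency hot words described above (threshold filter, then top N by search frequency) can be sketched as follows. The function name, the mapping from hot word to search frequency, and the sample values are assumptions for illustration.

```python
import heapq


def high_frequency_hot_words(hot_word_bank, n, min_frequency=0):
    """Return the n most-searched hot words whose frequency exceeds min_frequency."""
    # Threshold filter first, so that fewer entries need to be ranked.
    eligible = [(word, freq) for word, freq in hot_word_bank.items()
                if freq > min_frequency]
    top = heapq.nlargest(n, eligible, key=lambda item: item[1])
    return [word for word, _ in top]


bank = {"casual pants": 120, "cotton trousers": 45,
        "leisure suit": 3, "running shoes": 300}
print(high_frequency_hot_words(bank, n=2, min_frequency=10))
```

`heapq.nlargest` returns the entries in descending order of the key, so the result is already ordered from most to least searched.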
In this embodiment, the hot word bank described above is the database in which the server stores hot word information; this database also serves to store the user search history. Storing the user search history is implemented by the hot word bank construction module or device, which manages and maintains the database that stores hot word information. Hot word information includes the hot word, the hot word sequence number and the hot word search frequency. The user search history is stored as follows: after the user submits a search string to the server through the client and requests a search, the server, while performing the search, also adds the search string to the hot word bank as a hot word. If the hot word bank already contains this search string, the corresponding hot word search frequency is increased by 1; otherwise the search string is saved to the hot word bank and its search frequency is set to 1.
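The update rule for the hot word bank can be sketched with an in-memory stand-in for the database. The class and field names are hypothetical, and assigning sequence numbers in insertion order is an assumption; the source does not say how sequence numbers are allocated.

```python
from dataclasses import dataclass


@dataclass
class HotWordInfo:
    word: str
    serial: int      # hot word sequence number, used as a lookup index
    frequency: int   # number of times the hot word has been searched


class HotWordBank:
    """In-memory stand-in for the database that stores hot word information."""

    def __init__(self):
        self._by_word = {}

    def record_search(self, searched):
        """Add the search string as a hot word, or increase its search frequency."""
        info = self._by_word.get(searched)
        if info is None:
            # Assumption: sequence numbers are assigned in insertion order.
            info = HotWordInfo(searched, serial=len(self._by_word) + 1, frequency=1)
            self._by_word[searched] = info
        else:
            info.frequency += 1
        return info


hot_word_bank = HotWordBank()
hot_word_bank.record_search("casual pants")
print(hot_word_bank.record_search("casual pants"))
```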
Note that the candidate word information list obtained in step S33 is an array of candidate word information entries. In this embodiment the candidate word information consists of the hot word alone, so the candidate word list obtained in step S5 is simply the candidate word information list. In other embodiments, including those that follow, the candidate word information may contain more, such as the hot word sequence number of the candidate word and the attribute information of the candidate word.
Four, embodiment 2
This embodiment builds on embodiment 1. Specifically, one step is added after step S33 of embodiment 1, namely step S34: the server calculates the probability of each candidate word by analyzing the user historical search behavior database.
Step S34 of this embodiment is implemented by the historical behavior analysis module or device. The problem it addresses is as follows: for a specific candidate word, use statistical analysis of user historical searches to obtain the probability that the user intends to input this candidate word, given that the user has entered the initial string. The input to this step is the candidate word information list obtained in step S33. The output is also a candidate word information list, in which each candidate word information entry additionally carries the probability of the candidate word.
The probability of a candidate word is calculated by analyzing user historical search behavior. The user historical search behavior data is stored in the user historical search behavior database; this storage is implemented by the user historical search behavior database device or module. The user historical search behavior database stores historical behavior information. Step S34 can be implemented in many ways. This specification gives two of them, Implementation 1 and Implementation 2. Implementation 1 is a simple approach. Implementation 2 applies a logistic regression algorithm to a statistical analysis of the match types of the candidate words.
Implementation 1
Suppose the historical behavior information consists of the original string, the clicked hot word and the click frequency. The server searches the user historical search behavior database for the historical behavior information whose original string is identical to the initial string and whose clicked hot word is identical to the candidate word. The click frequency in that historical behavior information can serve as the probability of the candidate word. Because a click frequency is an integer greater than 0, whereas a probability in the usual sense is a value between 0 and 1, the click frequencies of the candidate words can instead be normalized and the normalized values used as the probabilities. The normalization may proceed as follows: suppose the candidate word information list contains K candidate words with click frequencies $c_1, c_2, \ldots, c_K$; then the probability of the i-th candidate word is $p_i = c_i / \sum_{j=1}^{K} c_j$. In this implementation, the above process can be summarized as:
It is identical with init string that S34a1: server searches original character string in user's historical search behavior database And click on the historical behavior information that hot word is identical with candidate word, it is thus achieved that the click frequency of candidate word;
S34a2: server does the probability of normalized acquisition candidate word according to candidate word is clicked on the frequency.
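As a rough illustration of step S34a2, the normalization can be sketched in C++ as follows. The function name `NormalizeClickCounts` is hypothetical, and dividing each click frequency by the total is an assumed normalization; the text only requires that the result lie between 0 and 1.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Convert raw click frequencies into probabilities that sum to 1.
// Assumption: each frequency is divided by the total over all K candidates.
std::vector<double> NormalizeClickCounts(const std::vector<int>& clicks) {
    const long long total = std::accumulate(clicks.begin(), clicks.end(), 0LL);
    assert(total > 0 && "at least one candidate word must have been clicked");
    std::vector<double> probs;
    probs.reserve(clicks.size());
    for (int c : clicks) {
        probs.push_back(static_cast<double>(c) / static_cast<double>(total));
    }
    return probs;
}
```

For click frequencies 1, 3, and 4 this yields 0.125, 0.375, and 0.5.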
In this implementation, the historical behavior information is generated as follows. After the client performs step S91, the user can select from the candidate word list displayed in step S91. Once the user has made a selection, the initial string and the selected candidate word are sent together to the server with a retrieval request. On receiving the initial string and the selected candidate word, the server, in addition to performing the retrieval and the aforementioned step of adding the selected candidate word to the hot word library, also adds the initial string and the selected candidate word to the user historical search behavior database. Here the initial string is the original string of the historical behavior information, and the selected candidate word is the clicked hot word. They are added to the user historical search behavior database as follows: if the database already holds a record associating this original string with this clicked hot word, the corresponding click frequency is incremented by 1; otherwise the original string and the clicked hot word are saved to the hot word library and the corresponding click frequency is set to 1.
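The update rule for the historical behavior information can be sketched as below. `SearchBehaviorStore` is a hypothetical in-memory stand-in for the user historical search behavior database; the text does not prescribe a storage layout, so the keyed map is an assumption made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical in-memory stand-in for the user historical search behavior
// database. Key: (original string, clicked hot word); value: click frequency.
class SearchBehaviorStore {
public:
    // Record that the user typed `original` and then selected `hotWord`.
    // Returns the click frequency after the update.
    int RecordClick(const std::string& original, const std::string& hotWord) {
        assert(!hotWord.empty() && "a selected candidate word is required");
        // operator[] value-initializes a missing entry to 0, so a new pair
        // ends up with frequency 1 and an existing pair is incremented by 1.
        return ++clicks_[std::make_pair(original, hotWord)];
    }

    // Click frequency for the pair, or 0 if no record exists.
    int ClickCount(const std::string& original, const std::string& hotWord) const {
        auto it = clicks_.find(std::make_pair(original, hotWord));
        return it == clicks_.end() ? 0 : it->second;
    }

private:
    std::map<std::pair<std::string, std::string>, int> clicks_;
};
```

The value returned by `ClickCount` is the click frequency that step S34a1 looks up for a given initial string and candidate word.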
Implementation 2
Suppose the historical behavior information includes a clicked hot word and the click frequencies of nine candidate word match types. The nine match types comprise five basic types: no match, prefix match, suffix match, prefix synonym match, and suffix synonym match; and four composite types: prefix and suffix match, prefix and suffix synonym match, prefix match with suffix synonym match, and prefix synonym match with suffix match. The nine match types are reduced to two independent variables, $x_1$ and $x_2$. $x_1$ is the prefix match mode; its possible values are prefix not matched, prefix synonym matched, and prefix matched, coded as 1, 4, and 5 respectively. $x_2$ is the suffix match mode; its possible values are suffix not matched, suffix synonym matched, and suffix matched, coded as 1, 4, and 5 respectively. The probability that the candidate word is selected is then:
$$p = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}$$ where $\beta_0$, $\beta_1$, $\beta_2$ are parameters to be determined. The parameters $\beta_0$, $\beta_1$, $\beta_2$ are calculated as follows.
The probability that the candidate word is not selected is: $$1 - p = \frac{1}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}$$
The ratio of the probability that the candidate word is selected to the probability that it is not selected is: $$\frac{p}{1 - p} = e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}$$
Applying the logit transformation gives: $$\operatorname{logit}(p) = \ln\frac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$
In this implementation, the logit values and the values of $x_1$ and $x_2$ can be obtained from the click frequencies of the candidate word match types in the historical behavior information.
Suppose the click frequencies of the nine candidate word match types stored in the historical behavior information for one clicked hot word are:
{73, 98, 119, 67, 89, 342, 137, 123, 99}.
The following data can then be obtained:
x1                            x2                            Click frequency      Logit value
1 (prefix not matched)        1 (suffix not matched)        73                   4.29
1 (prefix not matched)        4 (suffix synonym matched)    89+137+123=349       5.86
1 (prefix not matched)        5 (suffix matched)            119+342+99=560       6.33
4 (prefix synonym matched)    1 (suffix not matched)        67+137+99=303        5.71
4 (prefix synonym matched)    4 (suffix synonym matched)    137                  4.92
4 (prefix synonym matched)    5 (suffix matched)            99                   4.60
5 (prefix matched)            1 (suffix not matched)        98+342+123=563       6.33
5 (prefix matched)            4 (suffix synonym matched)    123                  4.81
5 (prefix matched)            5 (suffix matched)            342                  5.83
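The table can be reproduced from the nine stored click frequencies with a short sketch. The enum names, `RegressionRow`, and `BuildRegressionRows` are hypothetical; the grouping of the nine counts into rows simply mirrors the sums shown in the table, and the logit value is taken as the natural logarithm of the aggregated click frequency, as the table's numbers indicate.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Index of each match type in the nine stored click frequencies,
// in the order in which the text lists the types.
enum MatchType {
    kNoMatch = 0, kPrefix, kSuffix, kPrefixSyn, kSuffixSyn,
    kPrefixSuffix, kPrefixSuffixSyn, kPrefixMatchSuffixSyn, kPrefixSynSuffixMatch
};

struct RegressionRow {
    int x1;        // prefix mode: 1 = not matched, 4 = synonym matched, 5 = matched
    int x2;        // suffix mode, same coding
    int clicks;    // aggregated click frequency
    double logit;  // natural logarithm of the aggregated click frequency
};

std::vector<RegressionRow> BuildRegressionRows(const std::array<int, 9>& c) {
    // Each entry is {x1, x2, aggregated clicks}, following the table row by row.
    const int grouped[9][3] = {
        {1, 1, c[kNoMatch]},
        {1, 4, c[kSuffixSyn] + c[kPrefixSuffixSyn] + c[kPrefixMatchSuffixSyn]},
        {1, 5, c[kSuffix] + c[kPrefixSuffix] + c[kPrefixSynSuffixMatch]},
        {4, 1, c[kPrefixSyn] + c[kPrefixSuffixSyn] + c[kPrefixSynSuffixMatch]},
        {4, 4, c[kPrefixSuffixSyn]},
        {4, 5, c[kPrefixSynSuffixMatch]},
        {5, 1, c[kPrefix] + c[kPrefixSuffix] + c[kPrefixMatchSuffixSyn]},
        {5, 4, c[kPrefixMatchSuffixSyn]},
        {5, 5, c[kPrefixSuffix]},
    };
    std::vector<RegressionRow> rows;
    for (const auto& g : grouped) {
        assert(g[2] > 0 && "the logarithm needs a positive click frequency");
        rows.push_back({g[0], g[1], g[2], std::log(static_cast<double>(g[2]))});
    }
    return rows;
}
```

Called with {73, 98, 119, 67, 89, 342, 137, 123, 99}, the function returns nine rows whose click frequencies and rounded logarithms match the table.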
From the data in the above table, the values of the parameters $\beta_0$, $\beta_1$, $\beta_2$ for this clicked hot word can be obtained by bivariate linear regression. The probability of the current candidate word is then calculated with the aforementioned formula for the probability that a candidate word is selected. The resulting probabilities can further be normalized. In this implementation, the above process can be summarized as the following steps:
S34b1: look up the historical behavior information in the user historical search behavior database according to the candidate word;
S34b2: from this historical behavior information, count the click frequencies under the different prefix match modes and the different suffix match modes;
S34b3: take the natural logarithm of the click frequencies under the different prefix match modes and suffix match modes to obtain the logit values under those modes;
S34b4: by bivariate linear regression, calculate the values of the parameters $\beta_0$, $\beta_1$, $\beta_2$ in the formula $\operatorname{logit}(p) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$;
S34b5: calculate the probability of the candidate word according to the formula $p = e^{z} / (1 + e^{z})$, where $z = \beta_0 + \beta_1 x_1 + \beta_2 x_2$;
S34b6: normalize the probabilities of the candidate words.
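Steps S34b4 and S34b5 might look like the following sketch. It assumes an ordinary least-squares fit solved through the normal equations with Cramer's rule; the text does not say which regression procedure is used, and the names `Sample`, `FitBivariateLinear`, and `SelectionProbability` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

struct Sample { double x1, x2, y; };  // y is the logit value of one table row

// Determinant of a 3x3 matrix.
double Det3(const std::array<std::array<double, 3>, 3>& m) {
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Least-squares fit of y = b0 + b1*x1 + b2*x2, solving the normal
// equations (X^T X) b = X^T y with Cramer's rule. Returns {b0, b1, b2}.
std::array<double, 3> FitBivariateLinear(const std::vector<Sample>& data) {
    std::array<std::array<double, 3>, 3> a{};  // X^T X
    std::array<double, 3> r{};                 // X^T y
    for (const Sample& s : data) {
        const double row[3] = {1.0, s.x1, s.x2};
        for (int i = 0; i < 3; ++i) {
            r[i] += row[i] * s.y;
            for (int j = 0; j < 3; ++j) a[i][j] += row[i] * row[j];
        }
    }
    const double det = Det3(a);
    assert(std::fabs(det) > 1e-12 && "the samples must not be degenerate");
    std::array<double, 3> b{};
    for (int k = 0; k < 3; ++k) {
        auto ak = a;  // copy of X^T X with column k replaced by X^T y
        for (int i = 0; i < 3; ++i) ak[i][k] = r[i];
        b[k] = Det3(ak) / det;
    }
    return b;
}

// Step S34b5: p = e^z / (1 + e^z), written in the equivalent form 1 / (1 + e^-z).
double SelectionProbability(const std::array<double, 3>& b, double x1, double x2) {
    const double z = b[0] + b[1] * x1 + b[2] * x2;
    return 1.0 / (1.0 + std::exp(-z));
}
```

Each row of the table supplies one `Sample` (its $x_1$, $x_2$, and logit value); the fitted parameters are then passed to `SelectionProbability` together with the match modes of the current candidate word.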
In this implementation, the user historical search behavior database can be an independent database, or it can be merged with the aforementioned hot word library, in which case the hot word library stores the historical behavior information. When the hot word library stores the historical behavior information, it also stores the click frequencies of the candidate word match types; the hot words in the hot word library are the clicked hot words of the historical behavior information, and the sum of the click frequencies of the nine match types is the search frequency of the hot word in embodiment 1. In this implementation, the candidate word information in the candidate word information list that step S33 outputs, and that step S34 takes as input, includes at least two items: the hot word sequence number and the match type of the candidate word. The leaf nodes of the hot word suffix tree store hot word sequence numbers. When step S33 is performed, each candidate word obtained by traversing the hot word suffix tree carries the hot word sequence number stored at the leaf node and, according to how it was matched, its candidate word information also carries the match type of the candidate word.
In this implementation, the historical behavior information is produced by the process of saving the user's search history. This process differs from the corresponding process in implementation 1 in that the frequency must additionally be counted separately for each match type of the candidate word. In this implementation, after the user selects from the candidate word list displayed in step S91, the initial string and the selected candidate word are sent together to the server with a retrieval request (for this process, see implementation 1 of this embodiment above).
From the two implementations above, those skilled in the art will understand that different realizations of step S34 generally require different mathematical methods, and since many comparable estimation methods exist, step S34 can likewise be realized in many ways. Those skilled in the art will also understand that the probability of a candidate word obtained in step S34 is only an estimate and cannot in practice be fully accurate, so error should be tolerated, as should differences in the parameters under the two implementations. Finally, the probability calculated in step S34 merely serves as input to subsequent processing, so in practice the product of the probability obtained above and the search frequency of the candidate word can also be used as the probability of the candidate word.
V. Embodiment 3
This embodiment builds on embodiment 1 or embodiment 2; specifically, it further improves and optimizes step S5 of embodiment 1 or embodiment 2. In this embodiment, step S5 comprises the following steps:
S51: the client calculates the local relevance of each candidate word in the candidate word information list according to the local historical search database;
S52: the client calculates the estimated click value of each candidate word according to the local relevance of the candidate word and the candidate word information;
S53: the client selects the candidate word list from the candidate word information list according to the estimated click values of the candidate words;
where the local historical search database is the client's store for local historical search information, which includes the local historical search string, the local historical search time, and the local historical search frequency. Step S51 is implemented by a local relevance calculation device or module; step S52 by an estimated click value calculation device or module; step S53 by a candidate word selection device or module.
Step S51 comprises the following steps:
S511: use the word segmenter to split the local historical search strings in the local historical search database and the candidate words in the candidate word information list into a keyword list, and calculate the statistical frequency of each keyword;
S512: build the keyword space vector from the statistical frequencies of the keywords in the keyword list;
S513: build the candidate word space vector from the statistical frequencies, in the keyword list, of the keywords obtained by splitting the candidate word;
S514: calculate the cosine of the keyword space vector and the candidate word space vector to obtain the local relevance of the candidate word.
Step S511 is implemented by a keyword distribution statistics device or module; step S512 by a keyword space vector construction device or module; step S513 by a candidate word space vector construction device or module; step S514 by a vector cosine calculation device or module. Step S511 is further divided into two steps. Step S511a: the word segmenter splits the local historical search strings in the local historical search database into a keyword list, and the statistical frequency of each keyword is calculated. Step S511b: the word segmenter splits the candidate words in the candidate word information list into the keyword list, and the statistical frequency of each keyword is calculated. Steps S511a and S511b produce one and the same keyword list. The process by which step S51 calculates the local relevance of a candidate word is illustrated below with an example.
Suppose the content of the local historical search database is an array lhi of length n, defined as follows:
struct LocalHistInfo
{
    String sSearch;     // local historical search string
    DateTime tRecent;   // time of the most recent search
    int nCount;         // local historical search frequency
} lhi[n];
Each element of the array lhi is one item of local historical search information, represented by the structure LocalHistInfo. In it, sSearch is the local historical search string; tRecent is the local historical search time, which records the time of the most recent search; and nCount is the local historical search frequency. Step S511a can be implemented by the following process:
for (int i = 0; i < n; i++)
{
    struct LocalHistInfo item = lhi[i];   // work on a copy so that lhi itself is not modified
    StringArray arKeys;
    WordSplit(item.sSearch, arKeys);      // split the local historical search string into keywords with the word segmenter
    item.nCount = TimeWeightCount(item.tRecent, item.nCount);  // time-weighted frequency step
    for (int j = 0; j < arKeys.GetCount(); j++)
    {
        // add each keyword, with the time-weighted local historical search frequency, to vKey
        vKey.Add(arKeys[j], item.nCount);
    }
}
The above process constitutes step S511a. Here vKey is an instance of the class VecterKey, which represents the keyword list, and Add is a method of the class VecterKey, defined as follows:
class VecterKey {
    Array<KeyItem *> m_arData;
public:
    // Adds nCount to the frequency of sKey, creating the keyword if needed.
    // Returns true if the keyword already existed.
    bool Add(string sKey, int nCount)
    {
        KeyItem * pItem = NULL;
        bool bFind = FindKey(sKey, pItem);  // check whether the keyword already exists
        if (!bFind)                          // not found: create a new keyword
        {
            pItem = new KeyItem;
            pItem->sKey = sKey;
            pItem->nCount = nCount;          // initial search frequency of this keyword
            m_arData.Add(pItem);
        }
        else                                 // found: accumulate into the existing keyword
        {
            pItem->nCount += nCount;         // accumulate the search frequency of this keyword
        }
        return bFind;
    }  // end of Add
};  // end of VecterKey
Here KeyItem is the structure that represents a keyword and can be expressed as:
struct KeyItem
{
string sKey;
int nCount;
};
In the above structure, sKey is the keyword and nCount is the statistical frequency of that keyword.
Similarly, step S511b proceeds like step S511a: each candidate word is split and the resulting keywords are added to the aforementioned vKey, except that the local historical search frequency of a candidate word can be a fixed value of 1 or the search frequency of the hot word stored in the server's hot word library (see embodiment 1). Note that when the local historical search strings in the local historical search database are added to the keyword list vKey, there is a time-weighted frequency calculation step, TimeWeightCount. Those skilled in the art will understand that this step can be omitted; it is a preferred implementation of the invention. The time-weighted calculation adjusts the local historical search frequency according to the interval between the local historical search time and the current time. A simple method is: if the interval is more than one month, the weight coefficient is 1; if the interval is between two weeks and one month, the weight coefficient is 2; if the interval is between one week and half a month, the weight coefficient is 3; if the interval is less than one week, the weight coefficient is 5.
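The simple weighting scheme described above can be sketched as follows. The day thresholds (7 days for one week, 14 days for two weeks or half a month, 30 days for one month) and the treatment of the boundary days are assumptions, as is the choice to apply the coefficient by multiplication; the text states only that the frequency is adjusted by the coefficient. Both function names are hypothetical.

```cpp
#include <cassert>

// Weight coefficient for a local search record, given the number of days
// since the most recent search. Thresholds of 7, 14, and 30 days are assumed.
int TimeWeightCoefficient(int daysSinceLastSearch) {
    assert(daysSinceLastSearch >= 0);
    if (daysSinceLastSearch < 7) return 5;    // less than one week
    if (daysSinceLastSearch < 14) return 3;   // one week to half a month
    if (daysSinceLastSearch <= 30) return 2;  // two weeks to one month
    return 1;                                 // more than one month
}

// Hypothetical counterpart of TimeWeightCount: scales the stored frequency
// by the coefficient (multiplication is an assumption).
int TimeWeightedCount(int daysSinceLastSearch, int count) {
    return TimeWeightCoefficient(daysSinceLastSearch) * count;
}
```

Under these assumptions, a string searched 4 times with its most recent search 3 days ago contributes a weighted frequency of 20.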
After steps S511a and S511b have been performed, the keyword list vKey is obtained. Extracting the statistical frequencies of all keywords in vKey yields the keyword space vector of step S512, Ks_Vector = $\{v_1, v_2, v_3, \dots, v_m\}$, where m is the number of keywords in vKey, so that the keyword space vector has m dimensions, and each component $v_i$ is the statistical frequency of the corresponding keyword.
Splitting a candidate word in the candidate word information list with the word segmenter yields several keywords, denoted HintKeys. For each keyword in vKey, if the keyword is present in HintKeys, the corresponding component is set to the statistical frequency of that keyword; otherwise the component is set to 0. This likewise yields an m-dimensional candidate word space vector Hs_Vector = $\{w_1, w_2, w_3, \dots, w_m\}$. In Hs_Vector, if a component $w_i$ is 0, the corresponding keyword is not in the keyword list HintKeys obtained by splitting the candidate word; otherwise the keyword is in HintKeys. This process of obtaining the candidate word space vector Hs_Vector is step S513.
From the m-dimensional keyword space vector Ks_Vector obtained in step S512 and the m-dimensional candidate word space vector Hs_Vector obtained in step S513, the cosine value λ is obtained with the vector cosine formula: $$\lambda = \frac{\sum_{i=1}^{m} v_i w_i}{\sqrt{\sum_{i=1}^{m} v_i^2}\,\sqrt{\sum_{i=1}^{m} w_i^2}}$$
Calculating the cosine value λ with the vector cosine formula is step S514. The cosine value λ can be used as the local relevance of the candidate word. In practice, the cosine values of the candidate words can also be normalized before being used as the local relevance: let the cosine values of the candidate words be $\{\lambda_1, \lambda_2, \lambda_3, \dots, \lambda_K\}$, where K is the number of candidate words; then the local relevance of candidate word i is: $R_i = \lambda_i / \sum_{j=1}^{K} \lambda_j$
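Steps S512 to S514 can be sketched together as follows. The keyword list is modeled as a `std::map` from keyword to statistical frequency, standing in for vKey, and HintKeys as a `std::set`; these containers and the names `Cosine` and `LocalRelevance` are assumptions for illustration. Returning 0 when either vector is all zeros is also an assumption, since the text does not cover that case.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Cosine of the angle between two equal-length vectors (step S514).
// Returns 0 if either vector is all zeros (assumed convention).
double Cosine(const std::vector<double>& v, const std::vector<double>& w) {
    assert(v.size() == w.size());
    double dot = 0.0, vv = 0.0, ww = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        dot += v[i] * w[i];
        vv += v[i] * v[i];
        ww += w[i] * w[i];
    }
    if (vv == 0.0 || ww == 0.0) return 0.0;
    return dot / (std::sqrt(vv) * std::sqrt(ww));
}

// Local relevance of one candidate word.
// keywordCounts: keyword -> statistical frequency (the contents of vKey).
// hintKeys: keywords obtained by splitting the candidate word (HintKeys).
double LocalRelevance(const std::map<std::string, int>& keywordCounts,
                      const std::set<std::string>& hintKeys) {
    std::vector<double> ks, hs;  // Ks_Vector (S512) and Hs_Vector (S513)
    for (const auto& kv : keywordCounts) {
        ks.push_back(kv.second);
        // Component is the keyword's frequency if the candidate contains it, else 0.
        hs.push_back(hintKeys.count(kv.first) ? kv.second : 0.0);
    }
    return Cosine(ks, hs);
}
```

With keyword frequencies 3 and 4 and a candidate word that contains only the first keyword, the vectors are (3, 4) and (3, 0), and the cosine is 9 / (5 × 3) = 0.6.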
Calculating the estimated click value of a candidate word in step S52 follows step S51; its input depends on the local relevance calculated in step S51. This specification gives two implementations for calculating the estimated click value of a candidate word in step S52:
Implementation 1: CTR = A × R × C, where CTR is the estimated click value of the candidate word; A is the probability of the candidate word; R is the local relevance of the candidate word; and C is a constant determined by the type of the candidate word.
Implementation 2: CTR = A × R × C × P, where CTR is the estimated click value of the candidate word; A is the probability of the candidate word; R is the local relevance of the candidate word; C is a constant determined by the type of the candidate word; and P is the search frequency of the candidate word.
In both implementations, the probability of the candidate word is the probability obtained in embodiment 2, so both implementations build on embodiment 2. The type of the candidate word in "C is a constant determined by the type of the candidate word" is the match type of the candidate word described above. The match type is obtained during step S33 of embodiment 1 and generally takes one of nine values; for the nine match types see embodiment 2, which is not repeated here. C in the two implementations is thus a constant determined by which of the nine match types applies, and those skilled in the art can set its concrete values according to the specific application of the invention. The search frequency in "P is the search frequency of the candidate word" in implementation 2 is the search frequency stored in the hot word library. From these two implementations, those skilled in the art can derive others. For example:
Implementation 3: CTR = A × R, where CTR is the estimated click value of the candidate word; A is the probability of the candidate word; and R is the local relevance of the candidate word.
Implementation 4: CTR = A × R × P, where CTR is the estimated click value of the candidate word; A is the probability of the candidate word; R is the local relevance of the candidate word; and P is the search frequency of the candidate word.
Implementation 5: CTR = R × P, where CTR is the estimated click value of the candidate word; R is the local relevance of the candidate word; and P is the search frequency of the candidate word.
Note that, as stated at the end of embodiment 2, the product of the probability of the candidate word and the search frequency of the candidate word can be used as the probability of the candidate word. In that case A stands for A × P, so implementation 1 is equivalent to implementation 2, and implementation 3 is equivalent to implementation 4.
Implementation 5 does not take the probability of the candidate word as input, so it need not build on embodiment 2 and only needs to build on embodiment 1. This embodiment aims at realizing step S5, whereas embodiments 1 and 2 aim at realizing step S3. Therefore, if the inputs and outputs of a particular realization of step S5 do not involve step S3, this embodiment can also constitute a complete technical solution on its own, without building on embodiment 1 or embodiment 2, and achieve the purpose of the invention.
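The first two formulas for the estimated click value can be written out as a small sketch. The struct and function names are hypothetical, and the numeric values used for C in any real deployment are left to the implementer, as the text says.

```cpp
#include <cassert>

struct CandidateScoreInputs {
    double probability;      // A: probability of the candidate word (step S34)
    double localRelevance;   // R: local relevance of the candidate word (step S51)
    double typeConstant;     // C: constant chosen for the candidate's match type
    double searchFrequency;  // P: search frequency stored in the hot word library
};

// CTR = A x R x C
double ClickEstimateARC(const CandidateScoreInputs& in) {
    assert(in.localRelevance >= 0.0 && "a cosine of non-negative vectors is non-negative");
    return in.probability * in.localRelevance * in.typeConstant;
}

// CTR = A x R x C x P
double ClickEstimateARCP(const CandidateScoreInputs& in) {
    return ClickEstimateARC(in) * in.searchFrequency;
}
```

The remaining variants (A × R, A × R × P, R × P) follow the same pattern with the unused factors dropped.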
A simple implementation of step S53 is: sort the candidate words in descending order of estimated click value to obtain a candidate word ranking queue, then select the first 10 or 20 candidate words from the queue as the final candidate word list. The candidate words handled in step S5 come from the candidate word information list of step S41. Those skilled in the art will understand that, by analogy with step S53, embodiment 1 or 2 can also include a step S39 before step S41: sort the candidate word information in descending order of the probability of the candidate word or of the search frequency of the candidate word to obtain a candidate word information ranking queue, then select the first 20 or 30 items from the queue as the final candidate word information list on which step S41 is performed. Whether or not step S39 is present affects neither the technical solutions of embodiment 1, embodiment 2, or this embodiment, nor the scope of protection of the invention.
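The simple form of step S53 amounts to a descending sort followed by truncation, as in this sketch. `ScoredCandidate` and `SelectTopCandidates` are hypothetical names, and the use of a stable sort (so that ties keep their incoming order) is a design choice of the sketch, not something the text requires.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ScoredCandidate {
    std::string word;
    double clickEstimate;  // estimated click value (CTR) from step S52
};

// Sort candidates by estimated click value, highest first, and keep at most `limit`.
std::vector<ScoredCandidate> SelectTopCandidates(std::vector<ScoredCandidate> candidates,
                                                 std::size_t limit) {
    assert(limit > 0);
    std::stable_sort(candidates.begin(), candidates.end(),
                     [](const ScoredCandidate& a, const ScoredCandidate& b) {
                         return a.clickEstimate > b.clickEstimate;
                     });
    if (candidates.size() > limit) candidates.resize(limit);
    return candidates;
}
```

Calling the function with a limit of 10 or 20 gives the final candidate word list described above; the same routine, keyed on probability or search frequency, would serve for the optional step S39.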

Claims (7)

1. An intelligent prompt method for search, involving a client and a server connected via a network, the method comprising the following steps:
S21: the client obtains an initial string;
S22: the client sends the initial string to the server;
S29: the server receives the initial string;
S3: the server searches hot words according to the initial string to obtain a candidate word information list;
S41: the server sends the candidate word information list to the client;
S49: the client receives the candidate word information list;
S5: the client obtains a candidate word list from the candidate word information list;
S91: the client displays the candidate word list;
characterized in that said step S3 comprises:
S31: the server splits the initial string with a word segmenter to obtain a prefix word and a suffix word;
S32: the server looks up the prefix word and the suffix word in a synonym library to obtain prefix synonyms and suffix synonyms;
S33: the server traverses a hot word suffix tree to find hot words with a prefix match and/or a suffix match, obtaining the candidate word information list;
S34: the server calculates the probability of each candidate word by analyzing a user historical search behavior database;
wherein said prefix word is the first keyword among the keywords obtained by splitting a string with the word segmenter; said suffix word is the last keyword among the keywords obtained by splitting a string with the word segmenter; said synonym library is the server's database for storing synonym relations between keywords; said hot word suffix tree is built by the server from the high-frequency search hot words in a hot word library using the generalized suffix tree data structure; said hot word library is the server's database for storing hot word information; said hot word information includes the hot word, a hot word sequence number, and a hot word search frequency; said prefix match means that the prefix of a hot word matches said prefix word or a prefix synonym; said suffix match means that the suffix of a hot word matches said suffix word or a suffix synonym; said user historical search behavior database is used to store historical behavior information; and said step S5 comprises:
S51: the client calculates the local relevance of each candidate word in the candidate word information list according to a local historical search database;
S52: the client calculates the estimated click value of each candidate word according to the local relevance of the candidate word and the candidate word information;
S53: the client selects the candidate word list from the candidate word information list according to the estimated click values of the candidate words;
wherein said local historical search database is the client's store for local historical search information; said local historical search information includes the local historical search string, the local historical search time, and the local historical search frequency; and said step S51 comprises:
S511: use the word segmenter to split the local historical search strings in the local historical search database and the candidate words in the candidate word information list into a keyword list, and calculate the statistical frequency of each keyword;
S512: build a keyword space vector from the statistical frequencies of the keywords in the keyword list;
S513: build a candidate word space vector from the statistical frequencies, in the keyword list, of the keywords obtained by splitting the candidate word;
S514: calculate the cosine of the keyword space vector and the candidate word space vector to obtain the local relevance of the candidate word.
2. The intelligent prompt method for search according to claim 1, characterized in that said step S34 comprises:
S34a1: the server looks up, in the user historical search behavior database, the historical behavior information whose original string is identical to the initial string and whose clicked hot word is identical to the candidate word, and obtains the click frequency of the candidate word;
S34a2: the server normalizes the click frequencies of the candidate words to obtain the probability of each candidate word;
wherein said historical behavior information includes the original string, the clicked hot word, and the click frequency.
3. The intelligent prompt method for search according to claim 1, characterized in that said step S34 comprises:
S34b1: look up the historical behavior information in the user historical search behavior database according to the candidate word;
S34b2: from this historical behavior information, count the click frequencies under the different prefix match modes and the different suffix match modes;
S34b3: take the natural logarithm of the click frequencies under the different prefix match modes and suffix match modes to obtain the logit values under those modes;
S34b4: by bivariate linear regression, calculate the values of the parameters $\beta_0$, $\beta_1$, $\beta_2$ in the formula $\operatorname{logit}(p) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$;
S34b5: calculate the probability of the candidate word according to the formula $p = e^{z} / (1 + e^{z})$, where $z = \beta_0 + \beta_1 x_1 + \beta_2 x_2$;
S34b6: normalize the probabilities of the candidate words;
wherein said historical behavior information includes the clicked hot word and the click frequencies of nine candidate word match types; said nine candidate word match types are: no match, prefix match, suffix match, prefix synonym match, suffix synonym match, prefix and suffix match, prefix and suffix synonym match, prefix match with suffix synonym match, and prefix synonym match with suffix match.
4. The intelligent prompt method for search according to claim 1, characterized in that calculating the statistical frequency of a keyword in said step S511 includes a step of calculating the frequency with time weighting.
5. The intelligent prompt method for search according to claim 1, characterized in that in said step S52:
CTR = A × R × C, where CTR is the estimated click value of the candidate word; A is the probability of the candidate word; R is the local relevance of the candidate word; and C is a constant determined by the type of the candidate word.
6. The intelligent prompt method for search according to claim 1, characterized in that in said step S52:
CTR = A × R × C × P, where CTR is the estimated click value of the candidate word; A is the probability of the candidate word; R is the local relevance of the candidate word; C is a constant determined by the type of the candidate word; and P is the search frequency of the candidate word.
7. An intelligent prompt system for search, comprising a client and a server connected via a network, characterized in that:
said server comprises:
a word segmentation module, for splitting the initial string to obtain a prefix word and a suffix word; said prefix word is the first keyword among the keywords obtained by splitting a string with the word segmenter; said suffix word is the last keyword among the keywords obtained by splitting a string with the word segmenter;
a synonym expansion module, for looking up the prefix word and the suffix word in a synonym library to obtain prefix synonyms and suffix synonyms;
a suffix tree traversal module, for traversing a hot word suffix tree to find hot words with a prefix match and/or a suffix match, obtaining a candidate word information list; said prefix match means that the prefix of a hot word matches said prefix word or a prefix synonym; said suffix match means that the suffix of a hot word matches said suffix word or a suffix synonym;
a hot word library construction module, for managing and maintaining the database that stores hot word information;
a suffix tree construction module, for managing and maintaining the hot word suffix tree; said hot word suffix tree is built by the server from the high-frequency search hot words in the hot word library using the generalized suffix tree data structure;
a historical behavior analysis module, for calculating the probability of each candidate word by analyzing a user historical search behavior database;
a user historical search behavior database module, for storing historical behavior information;
said client comprises:
a local relevance calculation module, for calculating the local relevance of each candidate word in the candidate word information list according to a local historical search database;
an estimated click value calculation module, for calculating the estimated click value of each candidate word according to the local relevance of the candidate word and the candidate word information;
a candidate word selection module, for selecting a candidate word list from the candidate word information list according to the estimated click values of the candidate words;
a local historical search database storage module, for storing local historical search information, said local historical search information including the local historical search string, the local historical search time, and the local historical search frequency;
said local relevance calculation module comprises:
a keyword distribution statistics module, for using the word segmenter to split the local historical search strings in the local historical search database and the candidate words in the candidate word information list into a keyword list and calculating the statistical frequency of each keyword;
a keyword space vector construction module, for building a keyword space vector from the statistical frequencies of the keywords in the keyword list;
a candidate word space vector construction module, for building a candidate word space vector from the statistical frequencies, in the keyword list, of the keywords obtained by splitting the candidate word;
a vector cosine calculation module, for calculating the cosine of the keyword space vector and the candidate word space vector to obtain the local relevance of the candidate word.
CN201310653732.6A 2013-12-09 2013-12-09 A kind of method of intelligent prompt, module and system for search Active CN103631929B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310653732.6A CN103631929B (en) 2013-12-09 2013-12-09 A kind of method of intelligent prompt, module and system for search

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310653732.6A CN103631929B (en) 2013-12-09 2013-12-09 A kind of method of intelligent prompt, module and system for search

Publications (2)

Publication Number Publication Date
CN103631929A (en) 2014-03-12
CN103631929B (en) 2016-08-31

Family

ID=50212970

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310653732.6A Active CN103631929B (en) 2013-12-09 2013-12-09 A kind of method of intelligent prompt, module and system for search

Country Status (1)

Country Link
CN (1) CN103631929B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103914569B (en) * 2014-04-24 2018-09-07 百度在线网络技术(北京)有限公司 Input creation method, the device of reminding method, device and dictionary tree-model
CN105224554A (en) * 2014-06-11 2016-01-06 阿里巴巴集团控股有限公司 Search word is recommended to carry out method, system, server and the intelligent terminal searched for
CN104750873A (en) * 2015-04-22 2015-07-01 百度在线网络技术(北京)有限公司 Popular search term push method and device
CN105488121A (en) * 2015-11-24 2016-04-13 魏强 Accurate retrieval system
CN106126500B (en) * 2016-06-22 2019-02-22 广东亿迅科技有限公司 A kind of statistical method being associated with hot word
CN107665217A (en) * 2016-07-29 2018-02-06 苏宁云商集团股份有限公司 A kind of vocabulary processing method and system for searching service
CN108319603A (en) * 2017-01-17 2018-07-24 腾讯科技(深圳)有限公司 Object recommendation method and apparatus
CN108227954A (en) * 2017-12-29 2018-06-29 北京奇虎科技有限公司 A kind of method, apparatus and electronic equipment that search input associational word is provided
CN108319376B (en) * 2017-12-29 2021-11-26 北京奇虎科技有限公司 Input association recommendation method and device for optimizing commercial word promotion
CN108241740A (en) * 2017-12-29 2018-07-03 北京奇虎科技有限公司 The generation method and device of a kind of search input associational word of timeliness
CN110286775A (en) * 2018-03-19 2019-09-27 北京搜狗科技发展有限公司 A kind of dictionary management method and device
CN108536763B (en) * 2018-03-21 2021-02-05 创新先进技术有限公司 Pull-down prompting method and device
CN108846016B (en) * 2018-05-05 2021-08-20 复旦大学 Chinese word segmentation oriented search algorithm
CN109739367A (en) * 2018-12-28 2019-05-10 北京金山安全软件有限公司 Candidate word list generation method and device
CN109933217B (en) * 2019-03-12 2020-05-01 北京字节跳动网络技术有限公司 Method and device for pushing sentences
CN113032819A (en) * 2019-12-09 2021-06-25 阿里巴巴集团控股有限公司 Method and system for determining search prompt words and information processing method
CN111488426B (en) * 2020-04-17 2024-02-02 支付宝(杭州)信息技术有限公司 Query intention determining method, device and processing equipment
CN111782947B (en) * 2020-06-29 2022-04-22 北京达佳互联信息技术有限公司 Search content display method and device, electronic equipment and storage medium
CN112925900B (en) * 2021-02-26 2023-10-03 北京百度网讯科技有限公司 Search information processing method, device, equipment and storage medium
CN114817690A (en) * 2022-06-28 2022-07-29 江西医之健科技有限公司 Data searching method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102930022A (en) * 2012-10-31 2013-02-13 中国运载火箭技术研究院 User-oriented information search engine system and method
CN103258023A (en) * 2013-05-07 2013-08-21 百度在线网络技术(北京)有限公司 Recommendation method and search engine for search candidate words
CN103425687A (en) * 2012-05-21 2013-12-04 阿里巴巴集团控股有限公司 Retrieval method and system based on queries

Also Published As

Publication number Publication date
CN103631929A (en) 2014-03-12

Similar Documents

Publication Publication Date Title
CN103631929B (en) A kind of method of intelligent prompt, module and system for search
CN105488024B (en) The abstracting method and device of Web page subject sentence
Liu et al. Context-based collaborative filtering for citation recommendation
CN104615767B (en) Training method, search processing method and the device of searching order model
CN104484339B (en) A kind of related entities recommend method and system
CN107239512B (en) A kind of microblogging comment spam recognition methods of combination comment relational network figure
CN106547864B (en) A kind of Personalized search based on query expansion
CN106484764A (en) User's similarity calculating method based on crowd portrayal technology
CN104462327B (en) Calculating, search processing method and the device of statement similarity
Amami et al. A graph based approach to scientific paper recommendation
Kim et al. A framework for tag-aware recommender systems
CN102456057B (en) Search method based on online trade platform, device and server
CN109597995A (en) A kind of document representation method based on BM25 weighted combination term vector
CN104281565A (en) Semantic dictionary constructing method and device
CN107832319B (en) Heuristic query expansion method based on semantic association network
An et al. A heuristic approach on metadata recommendation for search engine optimization
CN103914490B (en) Webpage operation method and system
CN101840438B (en) Retrieval system oriented to meta keywords of source document
Elfida et al. Enhancing to method for extracting Social network by the relation existence
CN106599304B (en) Modular user retrieval intention modeling method for small and medium-sized websites
Zhang et al. Co-ranking multiple entities in a heterogeneous network: Integrating temporal factor and users’ bookmarks
CN104794200B (en) A kind of event distribution subscription method of the support fuzzy matching based on body
CN108932247A (en) A kind of method and device optimizing text search
TWI621952B (en) Comparison table automatic generation method, device and computer program product of the same
Albathan et al. Enhanced n-gram extraction using relevance feature discovery

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 211100, No. 100, general road, Jiangning Economic Development Zone, Jiangsu, Nanjing

Applicant after: JIANGSU WISEDU EDUCATION INFORMATION TECHNOLOGY CO., LTD.

Address before: 211100, No. 100, general road, Jiangning Economic Development Zone, Jiangsu, Nanjing

Applicant before: Jiangsu Wisedu Information Technology Co., Ltd.

COR Change of bibliographic data
C14 Grant of patent or utility model
GR01 Patent grant