CN108804532A - Method and device for mining query intent and for recognizing query intent - Google Patents
- Publication number
- CN108804532A (CN201810416613.1A)
- Authority
- CN
- China
- Prior art keywords
- intention
- major class
- intended
- keyword
- url
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
This application discloses a method and device for mining query intent and for recognizing query intent, belonging to the field of communication technology. The method includes: extracting the keywords from the query information to be processed and obtaining multiple keyword sets; calculating the distribution probability of the query information over each major-class intent, based on the distribution probabilities obtained in advance between each keyword set and each major-class intent; determining the target major-class intent accordingly; and further determining the sub-intents of the query information based on the intent type of each keyword within the target major-class intent. This improves the efficiency and accuracy of query intent recognition, extends the query scope of the query information, and increases the diversity of its query results.
Description
Technical field
This application relates to the field of communication technology, and in particular to a method and device for mining query intent and for recognizing query intent.
Background
This section is intended to provide background or context for the embodiments of this application set out in the claims. The description here is not admitted to be prior art merely because it is included in this section.
Currently, when a search is performed according to query information entered by a user, keywords can be extracted from the query information by word segmentation, and search results are then returned by keyword matching. However, because the keywords extracted from the query information may be ambiguous in several ways, the search results obtained may differ greatly from the user's query intent.
To address this, the user's query intent is recognized and the search is performed on that basis, so that the search results that best meet the user's needs can be returned more accurately. In the prior art, the user's query intent is generally determined by matching the query information against mined query intent templates. In this approach, on the one hand, mining query intent templates takes a long time; on the other hand, the mined templates cannot cover all user query intents, so template coverage is low. How to improve the efficiency and accuracy of recognizing the user's query intent is therefore a problem worth considering.
Summary of the invention
The embodiments of this application provide a method and device for mining query intent and for recognizing query intent, so as to improve the efficiency and accuracy of mining and recognition when the user's query intent is mined and recognized.
In a first aspect, a method for mining query intent includes:
for any keyword set, obtaining the intent mining information corresponding to that keyword set;
based on the association between intent mining information and major-class intents, determining the major-class intent corresponding to each item of intent mining information, where a major-class intent is a query intent obtained by classification according to topic;
based on the major-class intent corresponding to each item of intent mining information, determining the distribution probability of the keyword set in each major-class intent, and determining the query intent of the keyword set according to the distribution probabilities.
Preferably, the intent mining information includes at least one of the following: URLs, URL titles, URL click data, and intent supplementary information, where the URLs and URL titles are obtained from the search results of a search performed with the keyword set, and the URL click data is determined for the URLs according to click log data.
Preferably, determining the distribution probability of the keyword set in each major-class intent, based on the major-class intent corresponding to each item of intent mining information, specifically includes:
for each major-class intent corresponding to the items of intent mining information, determining the URL distribution probability between the keyword set and that major-class intent based on the URLs and URL titles corresponding to that major-class intent;
determining the click distribution probability of the keyword set in that major-class intent according to the URL click data corresponding to that major-class intent;
determining the intent supplementary information distribution probability of the keyword set in that major-class intent according to the items of intent supplementary information corresponding to that major-class intent;
determining the distribution probability of the keyword set in that major-class intent based on the URL distribution probability, the click distribution probability, and the intent supplementary information distribution probability of the keyword set in that major-class intent.
Preferably, determining the URL distribution probability between the keyword set and the major-class intent, based on the URLs and URL titles corresponding to that major-class intent, includes:
determining the URL matching degree between the keyword set and the major-class intent based on the URLs corresponding to that major-class intent;
determining the title matching degree between the keyword set and the major-class intent based on the URL titles corresponding to that major-class intent;
determining the URL distribution probability between the keyword set and the major-class intent based on the URL matching degree and the title matching degree between them.
Preferably, determining the URL matching degree between the keyword set and the major-class intent, based on the URLs corresponding to that major-class intent, includes:
obtaining the position rank in the search results of each URL corresponding to the major-class intent;
determining the matching attenuation index corresponding to each URL position rank, based on the association between URL position ranks and matching attenuation indices;
taking the sum of the matching attenuation indices of the URLs corresponding to the major-class intent as the URL matching degree between the keyword set and that major-class intent.
Preferably, determining the title matching degree between the keyword set and the major-class intent, based on the URL titles corresponding to that major-class intent, includes:
for each URL title corresponding to the major-class intent, determining the cosine similarity between that URL title and each word vector contained in the major-class intent, and taking the maximum of these cosine similarities as the semantic similarity between that URL title and the major-class intent;
obtaining the position rank of each URL title in the search results, and determining the matching attenuation index corresponding to each title position rank based on the association between title position ranks and matching attenuation indices;
for each URL title corresponding to the major-class intent, determining the matching product of its semantic similarity to the major-class intent and its corresponding matching attenuation index;
taking the sum of the matching products as the title matching degree between the keyword set and the major-class intent.
In a second aspect, a method for recognizing query intent includes:
obtaining keyword sets based on the keywords extracted from received query information, and obtaining the distribution probabilities between each keyword set and each major-class intent that were obtained using the method of the first aspect;
for each major-class intent, determining the distribution probability of the query information in that major-class intent according to the distribution probability of each keyword set in that major-class intent;
determining the target major-class intent corresponding to the query information based on the distribution probability of the query information in each major-class intent;
determining the intent type of each keyword according to the determined target major-class intent, and determining the sub-intents of the query information based on the intent type of each keyword, where the intent types are obtained by dividing the major-class intent according to resource requirements.
Preferably, determining the distribution probability of the query information in the major-class intent, according to the distribution probability of each keyword set in that major-class intent, specifically includes:
for each major-class intent, determining the query matching degree between the query information and that major-class intent as the sum of the products of each keyword set's distribution probability in that major-class intent and its corresponding weight;
obtaining the maximum distribution probability among the keyword sets in that major-class intent;
determining the distribution probability of the query information in that major-class intent based on the maximum distribution probability and the query matching degree between the query information and each major-class intent.
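The aggregation just described can be sketched in Python as follows. The text names the two ingredients (a weighted sum over keyword sets and the maximum per-set probability) but does not say how they are combined, so the convex mix controlled by `alpha`, the final normalization, and all function and parameter names are assumptions made for illustration only.

```python
from typing import Dict, List


def query_distribution(set_probs: List[Dict[str, float]],
                       weights: List[float],
                       alpha: float = 0.5) -> Dict[str, float]:
    """set_probs[j][c] is keyword set j's distribution probability in class c."""
    classes = set().union(*set_probs)
    scores: Dict[str, float] = {}
    for c in classes:
        probs = [p.get(c, 0.0) for p in set_probs]
        # Query matching degree: weighted sum over keyword sets.
        match_degree = sum(w * p for w, p in zip(weights, probs))
        # Maximum distribution probability among the keyword sets.
        max_prob = max(probs)
        # ASSUMPTION: combine the two with a convex mix; the source does not
        # specify the combination rule.
        scores[c] = alpha * max_prob + (1.0 - alpha) * match_degree
    total = sum(scores.values())
    # ASSUMPTION: normalize so the scores sum to 1 across classes.
    return {c: s / total for c, s in scores.items()} if total else scores


def target_class(dist: Dict[str, float]) -> str:
    """Pick the class with the highest query-level probability."""
    return max(dist, key=dist.get)
```

With two keyword sets that both favor a hypothetical "music" class, `target_class(query_distribution(...))` would select "music" as the target major-class intent.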
Preferably, determining the intent type of each keyword according to the determined target major-class intent specifically includes:
for each keyword, determining the intent type corresponding to that keyword based on the association between the keyword and the intent types contained in the target major-class intent;
when more than one intent type is determined, determining the keyword distribution probability of the keyword under each intent type based on the obtained resource distribution probability of the keyword under each intent type and on user click data, and taking the intent type with the highest keyword distribution probability as the intent type of the keyword, where the keyword distribution probability characterizes how the search results returned for the keyword are distributed as resources across the intent types.
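A minimal sketch of the tie-breaking step above, assuming the resource distribution probability and a click share per intent type are already available as dictionaries. The linear blend with weight `beta`, the function name, and the example type names are hypothetical; the text only states that both signals are used and that the highest-scoring type wins.

```python
from typing import Dict, List, Optional


def pick_intent_type(candidates: List[str],
                     resource_prob: Dict[str, float],
                     click_share: Dict[str, float],
                     beta: float = 0.5) -> Optional[str]:
    """Return the intent type of a keyword within the target major class."""
    if not candidates:
        return None
    if len(candidates) == 1:
        # Only one associated intent type: no disambiguation needed.
        return candidates[0]
    # ASSUMPTION: blend the two signals linearly to get a keyword
    # distribution probability per intent type.
    score = {
        t: beta * resource_prob.get(t, 0.0) + (1.0 - beta) * click_share.get(t, 0.0)
        for t in candidates
    }
    return max(candidates, key=score.get)
```

For a keyword associated with both a hypothetical "movie" type and a "tv_series" type, the type with the larger blended score is returned.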
Preferably, the intent types include subject types and demand types; and
determining the sub-intents of the query information based on the intent type of each keyword specifically includes:
based on the association between each keyword and the intent types, determining that the keywords corresponding to a subject type are subject words and that the keywords corresponding to a demand type are demand words;
based on the association between intent types and associated words, obtaining the associated words corresponding to the intent type of the subject word; and
based on the association between demand words and associated words, obtaining the associated words corresponding to the demand words;
determining any combination of the demand words and the obtained associated words to be a sub-intent of the query information.
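One possible reading of "any combination" is sketched below: pool the demand words with both groups of associated words, then enumerate combinations up to a small size. The size cap `max_size`, the de-duplication step, and all names and example words are assumptions for illustration, not details taken from the text.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def sub_intents(demand_words: Sequence[str],
                type_associated: Sequence[str],
                demand_associated: Sequence[str],
                max_size: int = 2) -> List[Tuple[str, ...]]:
    """Enumerate candidate sub-intents as combinations of pooled words."""
    pool: List[str] = []
    for w in [*demand_words, *type_associated, *demand_associated]:
        if w not in pool:  # keep first occurrence, preserve order
            pool.append(w)
    result: List[Tuple[str, ...]] = []
    # ASSUMPTION: limit combinations to max_size words to keep the set small.
    for r in range(1, max_size + 1):
        result.extend(combinations(pool, r))
    return result
```

With one demand word and one associated word from each source, the pool has three words, giving three single-word and three two-word candidates.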
Preferably, after the sub-intents of the query information are determined based on the intent type of each keyword, the method further includes:
searching for the query information based on the combination of the subject word with each sub-intent, and obtaining search results.
In a third aspect, a device for mining query intent includes:
a search unit, configured to obtain, for any keyword set, the intent mining information corresponding to that keyword set;
a first determination unit, configured to determine the major-class intent corresponding to each item of intent mining information based on the association between intent mining information and major-class intents, where a major-class intent is a query intent obtained by classification according to topic;
a second determination unit, configured to determine the distribution probability of the keyword set in each major-class intent based on the major-class intent corresponding to each item of intent mining information.
In a fourth aspect, a device for recognizing query intent includes:
an acquiring unit, configured to obtain keyword sets based on the keywords extracted from received query information, and to obtain the distribution probabilities between each keyword set and each major-class intent that were obtained using the device of the third aspect;
a first determination unit, configured to determine, for each major-class intent, the distribution probability of the query information in that major-class intent according to the distribution probability of each keyword set in that major-class intent;
a second determination unit, configured to determine the target major-class intent corresponding to the query information based on the distribution probability of the query information in each major-class intent;
a third determination unit, configured to determine the intent type of each keyword according to the determined target major-class intent, and to determine the sub-intents of the query information based on the intent type of each keyword, where the intent types are obtained by dividing the major-class intent according to resource requirements.
In a fifth aspect, a terminal device is provided, including at least one processing unit and at least one storage unit, where the storage unit stores a computer program that, when executed by the processing unit, causes the processing unit to perform the steps of any of the above methods for mining query intent and for recognizing query intent.
In a sixth aspect, a computer-readable medium is provided, storing a computer program executable by a terminal device; when the program runs on the terminal device, it causes the terminal device to perform the steps of any of the above methods for mining query intent and for recognizing query intent.
In the method and device for mining query intent and for recognizing query intent provided by the embodiments of this application, keyword sets are obtained based on the keywords extracted from received query information; for each major-class intent, the distribution probability of the query information in that major-class intent is determined according to the distribution probability of each keyword set in that major-class intent; the target major-class intent corresponding to the query information is determined based on the distribution probability of the query information in each major-class intent; and the intent type of each keyword is determined according to the determined target major-class intent, with the sub-intents of the query information determined based on the intent type of each keyword, where the intent types are obtained by dividing the major-class intent according to resource requirements. In this way, the predetermined distribution probabilities between keyword sets and major-class intents are used to determine the distribution probability of the query information in each major-class intent, and from that the target major-class intent and the sub-intents. This improves the efficiency and accuracy of query intent recognition, extends the query scope of the query information, and improves the diversity of subsequent search results.
Other features and advantages of this application are set out in the description that follows; in part they become apparent from the description, or are understood by implementing this application. The objectives and other advantages of this application can be realized and obtained through the structures particularly pointed out in the written description, the claims, and the drawings.
Brief description of the drawings
The drawings described here are provided for further understanding of this application and form part of it. The illustrative embodiments of this application and their descriptions are used to explain this application and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of an application scenario according to an embodiment of the present invention;
Fig. 2a is a schematic diagram of a method for mining query intent in an embodiment of this application;
Fig. 2b is a schematic diagram of search results in an embodiment of this application;
Fig. 3a is an implementation flowchart of a method for recognizing query intent in an embodiment of this application;
Fig. 3b is a distribution probability diagram of query information in an embodiment of this application;
Fig. 3c is a schematic diagram of intent types in an embodiment of this application;
Fig. 3d is a first schematic diagram of the syntactic structure of query information expansion in an embodiment of this application;
Fig. 3e is a second schematic diagram of the syntactic structure of query information expansion in an embodiment of this application;
Fig. 3f is an architecture diagram of the method for recognizing query intent in an embodiment of this application;
Fig. 4 is a structural schematic diagram of a device for mining query intent in an embodiment of this application;
Fig. 5 is a structural schematic diagram of a device for recognizing query intent in an embodiment of this application;
Fig. 6 is a structural schematic diagram of a terminal device in an embodiment of this application.
Detailed description
To improve the efficiency and accuracy of recognizing the user's query intent, the embodiments of this application provide a method and device for mining query intent and for recognizing query intent.
First, some of the terms used in the embodiments of this application are explained for ease of understanding by those skilled in the art.
1. Terminal device: a device on which various applications can be installed and which can display objects provided in the installed applications. The electronic device may be mobile or fixed, for example a mobile phone, a tablet computer, various wearable devices, an in-vehicle device, a personal digital assistant (PDA), a point-of-sale terminal (point of sales, POS), or any other electronic device capable of the above functions.
2. Major-class intent: obtained by classifying the intents of users' queries according to topic.
The preferred embodiments of this application are described below with reference to the drawings. It should be understood that the preferred embodiments described here are only used to describe and explain this application and are not intended to limit it, and that, where no conflict arises, the embodiments of this application and the features in those embodiments may be combined with one another.
Currently, when a search is performed according to query information entered by a user, keywords are usually first extracted from the query information by word segmentation, and uniform resource locator (URL) titles, that is, document titles, are then returned by keyword matching. However, because query information is usually short and the keywords extracted from it may be ambiguous in several ways, the search results obtained may differ greatly from the user's real query intent.
For example, the query information entered by the user is "rural love 10", and the returned URL title is "Lecture 10: how to deal with one partner in a relationship coming from the countryside". Clearly, the returned search result merely matches the keywords contained in the query information and is unrelated to the real intent of the user's query.
In view of this, the embodiments of this application provide a scheme for mining query intent and for recognizing query intent: keyword sets are obtained based on the keywords extracted from received query information; for each major-class intent, the distribution probability of the query information in that major-class intent is determined according to the distribution probability of each keyword set in that major-class intent; the target major-class intent corresponding to the query information is determined based on the distribution probability of the query information in each major-class intent; the intent type of each keyword is determined according to the determined target major-class intent; and the sub-intents of the query information are determined based on the intent type of each keyword, where the intent types are obtained by dividing the major-class intent according to resource requirements.
The method for mining query intent and for recognizing query intent provided by the embodiments of this application can be applied to a terminal device, which may be a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), and so on.
To improve the efficiency and accuracy of recognizing the user's query intent, the embodiments of the present invention provide a solution. Referring to the application scenario shown in Fig. 1, a client with a search function is installed on user equipment 11, and user 10 sends a query request to server 12 through the client installed on user equipment 11. After receiving the query request, server 12 obtains keyword sets based on the keywords extracted from the received query information; for each major-class intent, determines the distribution probability of the query information in that major-class intent according to the distribution probability of each keyword set in that major-class intent; determines the target major-class intent corresponding to the query information based on the distribution probability of the query information in each major-class intent; determines the intent type of each keyword according to the determined target major-class intent; and determines the sub-intents of the query information based on the intent type of each keyword, where the intent types are obtained by dividing the major-class intent according to resource requirements. In this way, the efficiency and accuracy of recognizing the user's query intent are improved.
It should be noted that user equipment 11 and server 12 are communicatively connected through a network, which may be a local area network, a wide area network, and so on. User equipment 11 may be a portable device (for example, a mobile phone, a tablet, or a laptop) or a personal computer (PC). Server 12 may be any device capable of providing Internet services. The client on user equipment 11 may be any client with a search function, such as WeChat or QQ Browser.
It should be noted that the above application scenario is shown only to aid understanding of the spirit and principles of the present invention, and the embodiments of the present invention are not limited in this respect. On the contrary, the embodiments of the present invention can be applied to any applicable scenario.
Fig. 2a shows an implementation flowchart of a method for mining query intent provided by this application. The specific implementation flow of the method is as follows:
Step 200: For any keyword set, the server obtains the intent mining information corresponding to that keyword set.
Specifically, the intent mining information includes at least one of the following: uniform resource locators (Uniform Resource Locator, URL), URL titles, URL click data, and intent supplementary information.
Here, the intent supplementary information consists of the entity resources obtained through the keywords, that is, the entity resources set in advance for each major-class intent. Because the coverage of URLs and URL titles is incomplete, entity resources are set in advance for each intent class to increase coverage. For example, the entity resources in the shopping intent class are product names, brand names, and so on, and those in the music intent class are singers, songs, albums, and so on.
When obtaining the intent mining information, any one of the following ways, or any combination of them, may be used:
The first way: the server searches for the keyword set to be mined through a search engine (for example, Baidu or WeChat) and obtains the items of intent mining information, including URLs and URL titles, that are returned for the keywords. When searching based on the keyword set, the keywords need not appear adjacent to one another.
For example, Fig. 2b is a schematic diagram of search results: the results obtained after a Baidu search for the keyword set {credit card, cash withdrawal}, including the URLs and URL titles.
The second way: the server performs a log search for the keyword set according to click log data and obtains the URL click data of each URL.
The third way: the server performs an intent supplementary information search for the keyword set in a set of intent supplementary information set up in advance for entity resources (for example, product names, singers, songs).
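The three acquisition ways can be pictured as filling one record per keyword set. The Python sketch below is only an illustration: the `IntentMiningInfo` record, the `collect` helper, and the shapes of `search_fn`, `click_log`, and `entity_names` are hypothetical stand-ins for a search engine call, a click log lookup, and a preset entity resource set, none of which are specified in the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, Iterable, List, Set, Tuple


@dataclass
class IntentMiningInfo:
    keyword_set: FrozenSet[str]
    # (url, title) pairs in the order the search results were returned
    results: List[Tuple[str, str]] = field(default_factory=list)
    url_clicks: Dict[str, int] = field(default_factory=dict)
    supplementary: List[str] = field(default_factory=list)


def collect(keywords: Iterable[str],
            search_fn: Callable[[FrozenSet[str]], List[Tuple[str, str]]],
            click_log: Dict[str, int],
            entity_names: Set[str]) -> IntentMiningInfo:
    keywords = list(keywords)
    ks = frozenset(keywords)
    # Way 1: search with the keyword set to get URLs and URL titles.
    results = list(search_fn(ks))
    # Way 2: look up click counts for the returned URLs (hypothetical log shape).
    clicks = {url: click_log.get(url, 0) for url, _ in results}
    # Way 3: match keywords against the preset entity resources.
    supplementary = [k for k in keywords if k in entity_names]
    return IntentMiningInfo(ks, results, clicks, supplementary)
```

Keeping the results in rank order matters for the later steps, which penalize URLs and titles by their position in the search results.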
Step 201: Based on the association between intent mining information and major-class intents, the server determines the major-class intent corresponding to each item of intent mining information.
Specifically, before mining query intent, the server classifies the intents of user queries by topic in advance to obtain the major-class intents, and establishes the associations between major-class intents and intent mining information.
These associations include the association between URLs and major-class intents, the association between URL titles and major-class intents, the association between URL click data and major-class intents, and the association between intent supplementary information and major-class intents.
Optionally, the major-class intents can be divided into: travel, games, sports, music, video, software, literature, food, medical care, finance, automobiles, real estate, animation, education, technology, military, shopping, inspirational ("chicken soup") articles, entertainment, mother and baby, fashion, official accounts, common queries (weather, logistics, etc.), people, news and information, pictures, question and answer, and encyclopedia (experience, knowledge).
In this way, the major-class intent corresponding to each item of intent mining information can be determined from that item.
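As a minimal illustration of applying a pre-built association, the Python sketch below maps URLs to major-class intents by host name. Keying the association on the host is purely an assumption made here for the example, and the table contents and function names are hypothetical; the text does not say how the associations are represented.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse

# HYPOTHETICAL association table: host name -> major-class intent.
HOST_TO_CLASS: Dict[str, str] = {
    "music.example.com": "music",
    "shop.example.com": "shopping",
}


def class_of_url(url: str,
                 host_to_class: Dict[str, str] = HOST_TO_CLASS) -> Optional[str]:
    """Return the major-class intent associated with a URL, or None if unknown."""
    return host_to_class.get(urlparse(url).netloc)


def classes_of_urls(urls: List[str]) -> List[Optional[str]]:
    """Map ranked search-result URLs to classes, preserving rank order."""
    return [class_of_url(u) for u in urls]
```

The same lookup pattern would apply to URL titles, click data, and intent supplementary information, each with its own association table.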
Step 202: Based on the major-class intent corresponding to each item of intent mining information, the server determines the URL distribution probability between the keyword set and each major-class intent.
Specifically, for each major-class intent, the server performs the following steps:
based on the major-class intent corresponding to each URL, determine the URL matching degree between the keyword set and the major-class intent; based on the major-class intent corresponding to each URL title, determine the title matching degree between the keyword set and the major-class intent; and based on the URL matching degree and the title matching degree between the keyword set and the major-class intent, determine the URL distribution probability between the keyword set and the major-class intent. The distribution probability is positively correlated with both the URL matching degree and the title matching degree.
Specifically, the following steps may be performed when determining the URL matching degree between the keyword set and the major-class intent:
obtain the position rank in the search results of each URL corresponding to the major-class intent; determine the matching attenuation index corresponding to each URL position rank, based on the association between URL position ranks and matching attenuation indices; and calculate the sum of the matching attenuation indices of the URLs so determined as the URL matching degree between the keyword set and the major-class intent.
Optionally, the following formula may be used when calculating the URL matching degree:
$$\mathrm{url}_{match} = \sum_{i=1}^{m} \mathrm{indicator}(\mathrm{url}_i, c)\cdot \mathrm{pos}(i)$$
where url_match is the URL matching degree, url_i is the i-th URL obtained based on the keyword set, c is a major-class intent, m is the total number of URLs corresponding to the keyword set, indicator(url_i, c) is a 0-1 function that equals 1 if the major-class intent corresponding to url_i is c and 0 otherwise, pos(i) is a position penalty function negatively correlated with i, and i and m are positive integers.
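The definitions above (a 0-1 indicator, a position penalty pos(i) that decreases with i, and a sum over the m returned URLs) suggest a straightforward computation, sketched here in Python. The concrete penalty pos(i) = 1/i is an assumption chosen only because it decreases with rank; the text does not give the actual function, and the function names are illustrative.

```python
from typing import Callable, Optional, Sequence


def reciprocal_pos(rank: int) -> float:
    """ASSUMED position penalty: decreases as the rank grows."""
    return 1.0 / rank


def url_match(url_classes: Sequence[Optional[str]],
              c: str,
              pos: Callable[[int], float] = reciprocal_pos) -> float:
    """Sum pos(i) over the ranked URLs whose major-class intent is c.

    url_classes[i-1] is the class associated with the i-th search result.
    """
    return sum(
        pos(i)
        for i, cls in enumerate(url_classes, start=1)
        if cls == c  # indicator(url_i, c) == 1
    )
```

For ranked classes `["finance", "shopping", "finance"]` and c = "finance", only ranks 1 and 3 contribute, giving 1 + 1/3 under the assumed penalty.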
Specifically, the following steps may be performed when determining the title matching degree between the keyword set and the major-class intent:
First, for each URL title of the keyword set, perform the following steps: calculate the cosine similarity between the URL title and each word vector of the major-class intent, and take the maximum of these cosine similarities as the semantic similarity between that URL title and the major-class intent.
Then, obtain the position rank of each URL title in the search results, and determine the matching attenuation index corresponding to each title position rank based on the association between title position ranks and matching attenuation indices. Based on the sum of the matching products of the semantic similarity between each URL title and the major-class intent and the corresponding matching attenuation index, obtain the title matching degree between the keyword set and the major-class intent, where the title matching degree is positively correlated with the matching products.
Optionally, the following formula may be used when calculating the title matching degree:
$$\mathrm{title}_{match} = \sum_{l=1}^{n} \max_{1 \le k \le y} \mathrm{cosine}(\mathrm{title}_l\_\mathrm{vector},\ \mathrm{word}_k\_c\_\mathrm{vector})\cdot \mathrm{pos}(l)$$
where title_match is the title matching degree, cosine() is the function for calculating the cosine similarity between word vectors, title_l_vector is the title vector of the l-th URL title obtained based on the keyword set, word_k_c_vector is the k-th word vector contained in major-class intent c, pos(l) is a position penalty function negatively correlated with l, n is the total number of URL titles, y is the total number of word vectors set for the major-class intent, k is the index of the word vector, and l, n and y are positive integers.
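Following the definitions above, the title matching degree can be sketched as a rank-penalized sum of each title's best cosine similarity to the class's word vectors. In the Python sketch below, vectors are plain lists of floats, the penalty pos(l) = 1/l is an assumption (the text only says it decreases with l), and how title vectors are produced from titles is outside the sketch.

```python
import math
from typing import Callable, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def title_match(title_vectors: Sequence[Sequence[float]],
                class_word_vectors: Sequence[Sequence[float]],
                pos: Callable[[int], float] = lambda l: 1.0 / l) -> float:
    """Sum over ranked titles of (max cosine to any class word vector) * pos(l)."""
    total = 0.0
    for l, title_vec in enumerate(title_vectors, start=1):
        # Semantic similarity: best match among the class's word vectors.
        similarity = max(
            (cosine(title_vec, w) for w in class_word_vectors),
            default=0.0,
        )
        total += similarity * pos(l)
    return total
```

Taking the maximum rather than the mean means a title counts as on-topic as soon as it is close to any one representative word of the class.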
Optionally, the following formula may be used when determining the URL distribution probability between the keyword set and the major-class intent based on the URL matching degree and the title matching degree between them:
where P_url(c | j) is the URL distribution probability between keyword set j and major-class intent c, url_match is the URL matching degree, title_match is the title matching degree, jqv is the search popularity set in advance for the keyword set and is a constant, and w1 and w2 are weight values.
Step 203: The server determines the click distribution probability of the keyword set in the major-class intent according to the URL click data corresponding to that major-class intent.
The click distribution probability is determined from the distribution, over the major-class intents, of the URL click data corresponding to the query information.
Step 204: The server determines the intent supplementary information distribution probability of the keyword set in the major-class intent according to the intent supplementary information corresponding to that major-class intent.
The intent supplementary information distribution probability is determined from the distribution, over the major-class intents, of the intent supplementary information corresponding to the query information.
In this way, user preferences can be determined from the users' URL click data, and the intent supplementary information improves search coverage.
Step 205: The server determines the distribution probability of the keyword set in the major-class intent based on the URL distribution probability, the click distribution probability and the intent supplementary information distribution probability of the keyword set in that major-class intent.
Specifically, based on the determined click distribution probability and intent supplementary information distribution probability, the server performs the following step for each major-class intent:
Multiply the click distribution probability, the intent supplementary information distribution probability and the determined URL distribution probability under the major-class intent by their respective weights, and take the sum of the products as the distribution probability of the keyword set in the major-class intent.
Optionally, the adjusted distribution probability can be calculated with the following formula:
$$P(c \mid j) = P_d(c \mid j)\,w_d + P_n(c \mid j)\,w_n + P_{url}(c \mid j)\,w_{url}$$
where P(c|j) is the distribution probability of the keyword set in the major-class intent, P_d is the click distribution probability, P_n is the intent supplementary information distribution probability, P_url(c|j) is the URL distribution probability, j is the keyword set, c is the major-class intent, and w_d, w_n and w_url are weights.
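The weighted combination in step 205 can be illustrated with a short Python sketch. The function name and the weight values in the usage note are assumptions chosen for illustration; the text does not specify how the weights are set.

```python
def combined_distribution_probability(
    p_click: float,
    p_supplement: float,
    p_url: float,
    w_click: float,
    w_supplement: float,
    w_url: float,
) -> float:
    """P(c|j) = P_d(c|j)*w_d + P_n(c|j)*w_n + P_url(c|j)*w_url."""
    return p_click * w_click + p_supplement * w_supplement + p_url * w_url
```

For instance, with hypothetical probabilities 0.5, 0.2 and 0.8 and hypothetical weights 0.3, 0.2 and 0.5, the call `combined_distribution_probability(0.5, 0.2, 0.8, 0.3, 0.2, 0.5)` sums the three weighted terms 0.15, 0.04 and 0.40.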
The embodiments of this application describe only how to determine the distribution probability between one keyword set and one major-class intent. Based on the same principle, the distribution probability between each keyword set and each major-class intent can be determined, which is not repeated here.
Step 206: The server determines the query intent of the keyword set based on the distribution probabilities of the keyword set in the major-class intents.
The distribution probabilities show how the keyword set is distributed over the major-class intents, from which the corresponding query intent is determined.
In this way, a database of the distribution probabilities between the keyword sets and the major-class intents is established, so that in subsequent processing the query intent of query information can be further identified based on these distribution probabilities.
Fig. 3a is an implementation flowchart of a query intent identification method provided by this application. The specific implementation flow of the method is as follows:
Step 300: The server obtains keyword sets based on the keywords extracted from the received query information.
Specifically, the server performs word segmentation on the query information to extract several keywords, and permutes and combines some or all of the obtained keywords to obtain multiple keyword sets. A keyword set is an ordered combination of any n keywords, where n is an integer not greater than the total number of keywords in the query information.
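One way to enumerate the keyword sets of step 300 is sketched below. It assumes that an "ordered combination" means a subset of the keywords kept in their original order, which is one reading of the text; the function name and the optional `max_size` parameter are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Sequence, Tuple


def keyword_sets(
    keywords: Sequence[str], max_size: Optional[int] = None
) -> List[Tuple[str, ...]]:
    """Return every order-preserving combination of 1..n keywords.

    n never exceeds the number of keywords in the query.
    """
    limit = len(keywords) if max_size is None else min(max_size, len(keywords))
    return [
        combo
        for size in range(1, limit + 1)
        for combo in combinations(keywords, size)
    ]
```

The number of sets grows exponentially with the number of keywords, so in practice a small `max_size` would likely be used for long queries.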
Step 301: For each major-class intent, the server determines the query matching degree between the query information and the major-class intent based on the distribution probabilities between the keyword sets and that major-class intent.
Specifically, in step 301, the query matching degree is positively correlated with the distribution probability between a keyword set and the major-class intent.
Optionally, the query matching degree between the query information and a major-class intent can be calculated with the following formula:
$$\mathrm{score}(c \mid q) = \sum_{j \in q} w_j \, P(c \mid j)$$
where score(c|q) is the query matching degree between the query information and major-class intent c, q is the collection of all keyword sets of the query information, w_j is the weight of keyword set j, and P(c|j) is the distribution probability between keyword set j and major-class intent c.
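A minimal Python sketch of the query matching degree follows. It assumes the score is the weighted sum of the per-keyword-set distribution probabilities, as the symbol definitions above suggest; the dictionary layout and the function name are illustrative assumptions.

```python
from typing import Dict, Tuple

KeywordSet = Tuple[str, ...]


def query_match_score(
    set_weights: Dict[KeywordSet, float],
    set_probabilities: Dict[KeywordSet, float],
) -> float:
    """score(c|q): sum over keyword sets j of w_j * P(c|j) for one major-class intent c.

    Keyword sets with no stored probability for this intent contribute 0.
    """
    return sum(
        weight * set_probabilities.get(keyword_set, 0.0)
        for keyword_set, weight in set_weights.items()
    )
```

Treating a missing probability as 0 is a design choice of this sketch: a keyword set that was never mined for a given intent simply adds nothing to that intent's score.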
Step 302: The server determines the distribution probability of the query information in each major-class intent based on the query matching degrees between the query information and the major-class intents.
Specifically, when performing step 302, the following formula can be used:
where P(c|q) is the distribution probability of query information q in major-class intent c, score(c|q) is the query matching degree between query information q and major-class intent c, b is the total number of major-class intents, x is the index of a major-class intent, p(c|j) is the distribution probability between keyword set j and major-class intent c, and a is a constant.
Step 303: The server determines the target major-class intent of the query information according to the distribution probabilities of the query information in the major-class intents.
Specifically, the server selects the t major-class intents with the highest distribution probabilities as the target major-class intents, where t is a positive integer.
Fig. 3b shows the distribution probabilities of a piece of query information. For example, for the query "A Chinese Odyssey: Pandora's Box", the distribution probability is 0.6 in video, 0.12 in encyclopedia (baike), 0.07 in entertainment (ent) and 0.06 in animation (dongman). The major-class intent with the highest distribution probability, video, is therefore the target major-class intent of this query. Based on the same principle, the distribution probabilities and target major-class intents of other query information can be determined, which is not repeated here.
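The top-t selection of step 303 can be sketched as follows, using the probabilities quoted for Fig. 3b as sample data. The function name is hypothetical.

```python
from typing import Dict, List


def target_major_class_intents(distribution: Dict[str, float], t: int = 1) -> List[str]:
    """Return the t major-class intents with the highest distribution probability."""
    ranked = sorted(distribution.items(), key=lambda item: item[1], reverse=True)
    return [intent for intent, _ in ranked[:t]]


# Distribution probabilities quoted for the example query in Fig. 3b.
fig_3b = {"video": 0.6, "baike": 0.12, "ent": 0.07, "dongman": 0.06}
top_one = target_major_class_intents(fig_3b, t=1)
```

With t = 1 only the video intent is selected; a larger t keeps several candidate intents for the later sub-intent steps.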
Step 304: The server determines the intent type of each keyword.
Specifically, for each keyword and each target major-class intent, the server performs the following steps:
First, the server obtains the intent types set for the target major-class intent and determines the intent type of the keyword. Intent types are obtained by dividing a major-class intent according to resource requirements.
For example, Fig. 3c is a schematic diagram of the intent types. Optionally, the intent types included in the video intent can be movie name, TV series name, variety show name, actor, playback, person, film review, ticketing, music, plot, official account, video, video demand, and so on.
Then, the server judges whether the number of intent types is one. If so, that intent type is determined to be the intent type of the keyword. Otherwise, the server multiplies the obtained resource distribution probability and user click data of the keyword under each intent type by their respective weights and takes the sum of the products as the keyword distribution probability of the keyword under that intent type, and determines the intent type with the highest keyword distribution probability to be the final intent type of the keyword. The keyword distribution probability characterizes the resource distribution and user preference, under each intent type, of the search results returned for the keyword.
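The disambiguation rule of step 304 can be sketched in Python as below. The function name, the dictionary inputs and the equal default weights are assumptions; the text only states that the two signals are multiplied by their respective weights and summed.

```python
from typing import Dict, Sequence


def keyword_intent_type(
    candidate_types: Sequence[str],
    resource_probability: Dict[str, float],
    click_share: Dict[str, float],
    w_resource: float = 0.5,  # assumed weight, not given in the text
    w_click: float = 0.5,     # assumed weight, not given in the text
) -> str:
    """Pick the intent type of a keyword among the candidate types."""
    if len(candidate_types) == 1:
        # A single candidate is taken directly as the keyword's intent type.
        return candidate_types[0]
    scores = {
        intent_type: w_resource * resource_probability.get(intent_type, 0.0)
        + w_click * click_share.get(intent_type, 0.0)
        for intent_type in candidate_types
    }
    # The type with the highest keyword distribution probability wins.
    return max(scores, key=scores.get)
```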
Step 305: The server determines the sub-intents of the query information based on the intent type of each keyword.
Specifically, intent types can further be divided into subject types and demand types. For example, in the keyword set "Forever Young film review", the intent type of "Forever Young" is movie name, which is also a subject type, and "film review" is a demand type.
Based on the association between keywords and intent types, the server determines that the keywords corresponding to a subject type are subject words and that the keywords corresponding to a demand type are demand words. It obtains the associated words corresponding to the intent type of the subject word based on the association between intent types and associated words, and obtains the associated words corresponding to the demand word based on the association between demand words and associated words. Any combination of the demand word and the obtained associated words is determined to be a sub-intent of the query information.
Step 306: The server searches for the query information based on the combination of the subject word and each sub-intent, and obtains search results.
Specifically, the server combines the sub-intents with OR, combines the result with the subject word with AND, and then performs the search to obtain the search results.
If the subject word were AND-combined with each sub-intent separately, the excessive number of search terms would increase retrieval pressure and reduce retrieval efficiency. In the embodiments of this application, an OR node is therefore added to the existing syntax tree structure of the query information to expand the query information, which improves retrieval efficiency.
For example, Fig. 3d is a first schematic diagram of the syntax structure of expanded query information. The query is "Brotherhood of Blades where to watch". The intent type of the subject word "Brotherhood of Blades" is determined to be movie name, and "watch" is the demand word. The associated words obtained for movie name are "film review", "stills" and "director".
As another example, Fig. 3e is a second schematic diagram of the syntax structure of expanded query information. The query is "Wolf Warrior 2 highlights". The intent type of the subject word "Wolf Warrior 2" is determined to be movie name, and "highlights" is the demand word. The associated words obtained for movie name are "film review", "stills" and "director".
As another example, referring to Fig. 3e, the second schematic diagram of the syntax structure of expanded query information, the query is "The Reader variety show". The intent type of the subject word "The Reader" is determined to be variety show name, and "variety show" is the demand word. The associated words obtained for variety show name are "guest", "review" and "watch".
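The expansion described in steps 305 and 306 can be sketched as a single boolean query in which the sub-intents are joined under one OR node and the result is AND-ed with the subject word. This is a simplified illustration: it treats the demand word and each associated word as one sub-intent each, and the function name and the query syntax are hypothetical rather than those of any particular search engine.

```python
from typing import List, Sequence


def expand_query(subject_word: str, demand_word: str, associated_words: Sequence[str]) -> str:
    """Build 'subject AND (sub_intent_1 OR sub_intent_2 ...)' as a plain string."""
    sub_intents: List[str] = [demand_word]
    for word in associated_words:
        if word not in sub_intents:  # avoid repeating the demand word
            sub_intents.append(word)
    or_node = " OR ".join(f'"{s}"' for s in sub_intents)
    return f'"{subject_word}" AND ({or_node})'


expanded = expand_query(
    "Brotherhood of Blades", "watch", ["film review", "stills", "director"]
)
```

A single OR node keeps the expanded query to one search request, whereas pairing the subject word with each sub-intent separately would issue one request per sub-intent.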
In this way, the target major-class intent of the query information can be determined quickly from the predetermined distribution probabilities between keyword sets and major-class intents. The sub-intents are then determined from the intent type of each keyword, and the query information is expanded with the sub-intents, which increases the diversity of the search results.
Further, Table 1 is an evaluation table for query intent identification. The server searches with WeChat and with Baidu, where top denotes query information selected by popularity and Random denotes query information obtained at random. The accuracy of the results obtained by searching with the major-class intent is lower than the accuracy of the results obtained by searching with the query intent determined after expansion.
Table 1
Search engine | Major-class intent (top) | Major-class intent (Random) | Sub-intent (top) | Sub-intent (Random) |
WeChat | 84.3% | 80.6% | 88.2% | 75.3% |
Baidu | 96.2% | 95.5% | 93.4% | 84.1% |
Fig. 3f is an architecture diagram of the query intent identification method. The server determines in advance the distribution probability of each keyword set in each major-class intent according to the distribution probability between each keyword set and each major-class intent, the click distribution probability and the intent supplementary information distribution probability. Then, when identifying the query intent of query information, the server obtains the corresponding distribution probabilities based on the keyword sets of the query information, determines the target distribution probability of the query information, determines the sub-intents of the query information from the target major-class intent and the intent type of each keyword, and finally obtains the query results of the query information.
Based on the same inventive concept, the embodiments of this application further provide a query intent mining apparatus. Because the principle by which the apparatus and device solve the problem is similar to that of the query intent mining method, the implementation of the apparatus can refer to the implementation of the method, and repeated details are omitted.
Fig. 4 is a schematic structural diagram of a query intent mining apparatus provided by the embodiments of this application, which includes:
a search unit 40, configured to obtain, for any keyword set, the intent mining information corresponding to the keyword set;
a first determination unit 41, configured to determine the major-class intent corresponding to each piece of intent mining information based on the association between intent mining information and major-class intents, where a major-class intent is a query intent obtained by topic classification;
a second determination unit 42, configured to determine the distribution probability of the keyword set in each major-class intent based on the major-class intents corresponding to the pieces of intent mining information.
Preferably, the intent mining information includes at least one of the following: URLs, URL titles, URL click data and intent supplementary information, where the URLs and URL titles are obtained from the search results of a search performed with the keyword set, and the URL click data is determined for the URLs from click log data.
Preferably, when determining the distribution probability of the keyword set in each major-class intent based on the major-class intents corresponding to the pieces of intent mining information, the second determination unit 42 is specifically configured to:
for each major-class intent corresponding to the intent mining information, determine the URL distribution probability between the keyword set and the major-class intent based on the URLs and URL titles corresponding to that major-class intent;
determine the click distribution probability of the keyword set in the major-class intent according to the URL click data corresponding to that major-class intent;
determine the intent supplementary information distribution probability of the keyword set in the major-class intent according to the intent supplementary information corresponding to that major-class intent;
determine the distribution probability of the keyword set in the major-class intent based on the URL distribution probability, the click distribution probability and the intent supplementary information distribution probability of the keyword set in that major-class intent.
Preferably, when determining the URL distribution probability between the keyword set and the major-class intent based on the URLs and URL titles corresponding to that major-class intent, the second determination unit 42 is further configured to:
determine the URL matching degree between the keyword set and the major-class intent based on the URLs corresponding to that major-class intent;
determine the title matching degree between the keyword set and the major-class intent based on the URL titles corresponding to that major-class intent;
determine the URL distribution probability between the keyword set and the major-class intent based on the URL matching degree and the title matching degree between them.
Preferably, when determining the URL matching degree between the keyword set and the major-class intent based on the URLs corresponding to that major-class intent, the second determination unit 42 is further configured to:
obtain the URL position rank, in the search results, of each URL corresponding to the major-class intent;
determine the matching attenuation index corresponding to each URL position rank based on the association between URL position ranks and matching attenuation indices;
determine the sum of the matching attenuation indices of the URLs corresponding to the major-class intent to be the URL matching degree between the keyword set and the major-class intent.
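The URL matching degree described above reduces to a sum of rank-dependent attenuation values, which can be sketched as follows. The reciprocal-rank attenuation is an assumption made for illustration, since the text only refers to an association between position ranks and matching attenuation indices, and the function names are hypothetical.

```python
from typing import Callable, Iterable


def reciprocal_attenuation(rank: int) -> float:
    """Assumed matching attenuation index for a 1-based position rank."""
    return 1.0 / rank


def url_match(
    intent_url_ranks: Iterable[int],
    attenuation: Callable[[int], float] = reciprocal_attenuation,
) -> float:
    """Sum the attenuation indices of the URLs that belong to one major-class intent.

    intent_url_ranks holds the search-result ranks of those URLs.
    """
    return sum(attenuation(rank) for rank in intent_url_ranks)
```

Under this sketch, an intent whose URLs appear near the top of the search results receives a larger matching degree than one whose URLs appear only lower down.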
Preferably, when determining the title matching degree between the keyword set and the major-class intent based on the URL titles corresponding to that major-class intent, the second determination unit 42 is further configured to:
for each URL title corresponding to the major-class intent, determine the cosine similarity between the URL title and each word vector contained in the major-class intent, and take the maximum of these cosine similarities as the semantic similarity between the URL title and the major-class intent;
obtain the title position rank of each URL title in the search results, and determine the matching attenuation index corresponding to each title position rank based on the association between title position ranks and matching attenuation indices;
determine, for each URL title corresponding to the major-class intent, the matching product of its semantic similarity to the major-class intent and the corresponding matching attenuation index;
determine the sum of the matching products to be the title matching degree between the keyword set and the major-class intent.
Based on the same inventive concept, the embodiments of this application further provide a query intent identification apparatus. Because the principle by which the apparatus and device solve the problem is similar to that of the query intent identification method, the implementation of the apparatus can refer to the implementation of the method, and repeated details are omitted.
Fig. 5 is a schematic structural diagram of a query intent identification apparatus provided by the embodiments of this application, which includes:
an acquiring unit 50, configured to obtain keyword sets based on the keywords extracted from the received query information, and to obtain the distribution probabilities between the keyword sets and the major-class intents produced by the query intent mining apparatus described above;
a first determination unit 51, configured to determine, for each major-class intent, the distribution probability of the query information in the major-class intent according to the distribution probabilities of the keyword sets in that major-class intent;
a second determination unit 52, configured to determine the target major-class intent corresponding to the query information based on the distribution probabilities of the query information in the major-class intents;
a third determination unit 53, configured to determine the intent type of each keyword according to the determined target major-class intent, and to determine the sub-intents of the query information based on the intent type of each keyword, where intent types are obtained by dividing a major-class intent according to resource requirements.
Preferably, when determining the distribution probability of the query information in a major-class intent according to the distribution probabilities of the keyword sets in that major-class intent, the second determination unit 52 is specifically configured to:
for each major-class intent, determine the query matching degree between the query information and the major-class intent according to the sum of the products of the distribution probabilities of the keyword sets in the major-class intent and their respective weights;
obtain the maximum distribution probability of the keyword sets in the major-class intent;
determine the distribution probability of the query information under the major-class intent based on the maximum distribution probability and the query matching degrees between the query information and the major-class intents.
Preferably, when determining the intent type of each keyword according to the determined target major-class intent, the third determination unit 53 is specifically configured to:
for each keyword, determine the intent type corresponding to the keyword based on the association between the keyword and the intent types contained in the target major-class intent;
when the number of intent types is more than one, determine the keyword distribution probability of the keyword under each intent type based on the obtained resource distribution probability and user click data of the keyword under each intent type, and determine the intent type with the highest keyword distribution probability to be the intent type of the keyword, where the keyword distribution probability characterizes the resource distribution, under each intent type, of the search results returned for the keyword.
Preferably, the intent types include subject types and demand types; and
when determining the sub-intents of the query information based on the intent type of each keyword, the third determination unit 53 is specifically configured to:
determine, based on the association between keywords and intent types, that the keywords corresponding to a subject type are subject words and that the keywords corresponding to a demand type are demand words;
obtain the associated words corresponding to the intent type of the subject word based on the association between intent types and associated words; and
obtain the associated words corresponding to the demand word based on the association between demand words and associated words;
determine any combination of the demand word and the obtained associated words to be a sub-intent of the query information.
Preferably, after determining the sub-intents of the query information based on the intent type of each keyword, the third determination unit 53 is specifically configured to:
search for the query information based on the combination of the subject word and each sub-intent, and obtain search results.
For convenience of description, the parts above are divided by function into modules (or units) and described separately. Of course, when this application is implemented, the functions of the modules (or units) can be realized in one or more pieces of software or hardware.
Based on the same technical concept, the embodiments of this application further provide a terminal device 600. Referring to Fig. 6, the terminal device 600 is used to implement the methods described in the method embodiments above, for example the embodiment shown in Fig. 3a. The terminal device 600 may include a memory 601, a processor 602, an input unit 603 and a display panel 604.
The memory 601 is used to store the computer program executed by the processor 602. The memory 601 may mainly include a program storage area and a data storage area, where the program storage area can store the operating system, the application programs required by at least one function, and so on, and the data storage area can store data created through use of the terminal device 600. The processor 602 may be a central processing unit (CPU), a digital processing unit, or the like. The input unit 603 can be used to obtain user instructions input by the user. The display panel 604 is used to display information input by the user or information provided to the user. In the embodiments of this application, the display panel 604 is mainly used to display the display interfaces of the application programs in the terminal and the control objects shown in each display interface. Optionally, the display panel 604 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The embodiments of this application do not limit the specific connection medium between the memory 601, the processor 602, the input unit 603 and the display panel 604. In Fig. 6, the memory 601, the processor 602, the input unit 603 and the display panel 604 are connected by a bus 605, which is drawn as a thick line. The connections between other components are shown schematically only and are not limiting. The bus 605 can be divided into an address bus, a data bus, a control bus and so on. For ease of illustration, only one thick line is drawn in Fig. 6, but this does not mean that there is only one bus or only one type of bus.
The memory 601 may be a volatile memory, such as a random-access memory (RAM). The memory 601 may also be a non-volatile memory, such as a read-only memory, a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD). The memory 601 may also be any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these. The memory 601 may be a combination of the memories above.
The processor 602 is used to implement the embodiment shown in Fig. 3a, including:
the processor 602 calling the computer program stored in the memory 601 to execute the embodiment shown in Fig. 3a.
The embodiments of this application further provide a computer-readable storage medium that stores the computer-executable instructions the processor above needs to execute, including the program the processor needs to execute.
In some possible embodiments, the various aspects of the query intent mining method provided by this application can also be implemented as a program product that includes program code. When the program product runs on a terminal device, the program code causes the terminal device to execute the steps of the query intent mining method according to the various exemplary embodiments of this application described above in this specification. For example, the terminal device can execute the embodiment shown in Fig. 3a.
The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of these. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, a random-access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of these.
The program product for the query intent mining method of the embodiments of this application may use a portable compact disc read-only memory (CD-ROM), include program code, and run on a computing device. However, the program product of this application is not limited to this. In this document, a readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in connection with an instruction execution system, apparatus or device.
A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of these. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device.
The program code contained on a readable medium can be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of these.
The program code for performing the operations of this application can be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
It should be noted that although several units or subunits of the apparatus are mentioned in the detailed description above, this division is merely exemplary and not mandatory. In fact, according to the embodiments of the present application, the features and functions of two or more of the units described above may be embodied in a single unit. Conversely, the features and functions of one unit described above may be further divided and embodied by multiple units.
In addition, although the operations of the method of the present application are described in a particular order in the drawings, this does not require or imply that the operations must be performed in that particular order, or that all of the illustrated operations must be performed to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the present application.
Claims (15)
1. A query intent mining method, characterized by comprising:
for any keyword set, obtaining intent mining information corresponding to the keyword set;
determining, based on an association between intent mining information and major-category intents, the major-category intent corresponding to each piece of intent mining information, wherein a major-category intent is a query intent obtained by topic classification;
determining, based on the major-category intents corresponding to the pieces of intent mining information, the distribution probability of the keyword set in each major-category intent, and determining the query intent distribution of the keyword set according to these distribution probabilities.
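As a non-limiting illustration of claim 1, the following Python sketch shows one way the distribution of a keyword set over major-category intents could be computed. The lookup table `INFO_TO_CATEGORY`, the example hosts and category names, and the choice to normalize simple vote counts are all assumptions made for this example; the claim does not prescribe them.

```python
from collections import Counter

# Hypothetical association between intent mining information (here, URL hosts)
# and major-category intents. The entries are invented for illustration.
INFO_TO_CATEGORY = {
    "music.example.com": "music",
    "video.example.com": "video",
    "novel.example.com": "novel",
}


def intent_distribution(mining_info):
    """Return {major category: probability} for one keyword set.

    Assumption: every piece of mining information that maps to a category
    casts one vote, and the votes are normalized so they sum to 1.
    """
    votes = Counter(
        INFO_TO_CATEGORY[item] for item in mining_info if item in INFO_TO_CATEGORY
    )
    total = sum(votes.values())
    if total == 0:
        return {}  # nothing could be mapped to a major-category intent
    return {category: count / total for category, count in votes.items()}


# Mining information gathered for one keyword set (illustrative values).
mined = [
    "music.example.com",
    "music.example.com",
    "video.example.com",
    "unknown.example.com",  # no known category, so it is ignored
]
distribution = intent_distribution(mined)
```

In this sketch, unmapped items are simply skipped; a real system might instead route them to a fallback category.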
2. The method of claim 1, characterized in that the intent mining information comprises at least one of the following: a uniform resource locator (URL), a URL title, URL click data, and intent supplementary information, wherein the URL and the URL title are obtained from the search results of a search performed with the keyword set, and the URL click data is determined for the URL from click log data.
3. The method of claim 2, characterized in that determining, based on the major-category intents corresponding to the pieces of intent mining information, the distribution probability of the keyword set in each major-category intent specifically comprises:
for each major-category intent corresponding to the pieces of intent mining information, determining the URL distribution probability between the keyword set and the major-category intent based on the URLs and URL titles corresponding to the major-category intent;
determining the click distribution probability of the keyword set in the major-category intent according to the URL click data corresponding to the major-category intent;
determining the intent-supplementary-information distribution probability of the keyword set in the major-category intent according to the intent supplementary information corresponding to the major-category intent;
determining the distribution probability of the keyword set in the major-category intent based on the URL distribution probability, the click distribution probability, and the intent-supplementary-information distribution probability of the keyword set in the major-category intent.
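Claim 3 states that the three per-category probabilities are combined but does not state how. As a minimal sketch, assuming a weighted linear combination (the function name, the weights, and the input values below are hypothetical and not taken from the claim), the combination could look like this:

```python
def combine_category_probability(url_prob, click_prob, supplement_prob,
                                 weights=(0.4, 0.4, 0.2)):
    """Combine the three per-category probabilities of one keyword set.

    Assumption: a weighted sum whose weights add up to 1. The claim only
    says the result is "based on" the three inputs.
    """
    w_url, w_click, w_supp = weights
    if abs(w_url + w_click + w_supp - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w_url * url_prob + w_click * click_prob + w_supp * supplement_prob


# Illustrative inputs for one keyword set and one major-category intent.
music_probability = combine_category_probability(
    url_prob=0.5, click_prob=0.75, supplement_prob=0.25,
    weights=(0.5, 0.25, 0.25),
)
```

Because the weights sum to 1 and each input lies in [0, 1], the combined value also stays in [0, 1].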
4. The method of claim 3, characterized in that determining the URL distribution probability between the keyword set and the major-category intent based on the URLs and URL titles corresponding to the major-category intent comprises:
determining the URL matching degree between the keyword set and the major-category intent based on the URLs corresponding to the major-category intent;
determining the title matching degree between the keyword set and the major-category intent based on the URL titles corresponding to the major-category intent;
determining the URL distribution probability between the keyword set and the major-category intent based on the URL matching degree and the title matching degree between the keyword set and the major-category intent.
5. The method of claim 4, characterized in that determining the URL matching degree between the keyword set and the major-category intent based on the URLs corresponding to the major-category intent comprises:
obtaining the URL ranking position, in the search results, of each URL corresponding to the major-category intent;
determining the matching decay index corresponding to each URL ranking position based on an association between URL ranking positions and matching decay indices;
determining the sum of the matching decay indices of the URLs corresponding to the major-category intent as the URL matching degree between the keyword set and the major-category intent.
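The computation in claim 5 can be sketched as follows. The claim only requires some association between ranking position and matching decay index; the logarithmic discount used in `match_decay` below is an assumption borrowed from rank-discounting schemes such as DCG, and the function names are hypothetical.

```python
import math


def match_decay(rank):
    """Hypothetical matching decay index for a 1-based ranking position.

    Assumption: a logarithmic discount, so that results lower in the list
    contribute less. The claim does not specify the decay function.
    """
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return 1.0 / math.log2(rank + 1)


def url_match_degree(ranks):
    """Sum the decay indices of the URLs that belong to one major category."""
    return sum(match_decay(rank) for rank in ranks)


# URLs of the category appear at positions 1 and 3 of the search results.
degree = url_match_degree([1, 3])
```

With this decay function, position 1 contributes 1.0 and position 3 contributes 0.5, so the example yields a matching degree of 1.5.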
6. The method of claim 4, characterized in that determining the title matching degree between the keyword set and the major-category intent based on the URL titles corresponding to the major-category intent comprises:
for each URL title corresponding to the major-category intent, determining the cosine similarity between the URL title and each word vector contained in the major-category intent, and taking the maximum of these cosine similarities as the semantic similarity between the URL title and the major-category intent;
obtaining the title ranking position of each URL title in the search results, and determining the matching decay index corresponding to each title ranking position based on an association between title ranking positions and matching decay indices;
for each URL title corresponding to the major-category intent, determining the matching product of the semantic similarity between the URL title and the major-category intent and the corresponding matching decay index;
determining the sum of the matching products as the title matching degree between the keyword set and the major-category intent.
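A minimal sketch of the title matching degree of claim 6 is given below. It assumes that each URL title has already been converted to a vector of the same dimension as the word vectors of the major-category intent, and it assumes a logarithmic decay by ranking position; neither the vectorization nor the decay function is specified by the claim, and all names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_decay(rank):
    """Hypothetical decay index for a 1-based ranking position (assumption)."""
    return 1.0 / math.log2(rank + 1)


def title_match_degree(titles, category_word_vectors):
    """Sum of (semantic similarity x decay index) over the category's titles.

    titles: list of (rank, title_vector) pairs.
    category_word_vectors: non-empty list of word vectors of the category.
    """
    total = 0.0
    for rank, title_vector in titles:
        # Semantic similarity is the best cosine match among the word vectors.
        semantic_similarity = max(
            cosine(title_vector, word_vector)
            for word_vector in category_word_vectors
        )
        total += semantic_similarity * match_decay(rank)
    return total


# Two toy word vectors for the category and two ranked title vectors.
category_vectors = [(1.0, 0.0), (0.0, 1.0)]
ranked_titles = [(1, (2.0, 0.0)), (3, (0.0, 3.0))]
degree = title_match_degree(ranked_titles, category_vectors)
```

Taking the maximum rather than the average similarity means a title counts as relevant if it matches any single word vector of the category well.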
7. A query intent recognition method, characterized by comprising:
obtaining keyword sets based on the keywords extracted from received query information, and obtaining the distribution probability between each keyword set and each major-category intent as obtained by the method of any one of claims 1 to 6;
for each major-category intent, determining the distribution probability of the query information in the major-category intent according to the distribution probability of each keyword set in the major-category intent;
determining the target major-category intent corresponding to the query information based on the distribution probability of the query information in each major-category intent;
determining the intent type of each keyword according to the determined target major-category intent, and determining the sub-intents of the query information based on the intent type of each keyword, wherein the intent types are obtained by dividing the major-category intent according to resource demand.
8. The method of claim 7, characterized in that determining the distribution probability of the query information in the major-category intent according to the distribution probability of each keyword set in the major-category intent specifically comprises:
for each major-category intent, determining the query matching degree between the query information and the major-category intent as the sum of the distribution probabilities of the keyword sets in the major-category intent, each weighted by its respective weight;
obtaining the maximum distribution probability among the keyword sets in the major-category intent;
determining the distribution probability of the query information in the major-category intent based on the maximum distribution probability and the query matching degree between the query information and the major-category intent.
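The steps of claim 8, together with the target selection of claim 7, can be sketched as follows. The claim does not say how the maximum distribution probability and the query matching degree are combined, so the linear interpolation with the parameter `alpha` is an assumption, as is choosing the highest-probability category as the target; the names and numbers are illustrative only.

```python
def query_category_probability(set_probs, set_weights, alpha=0.5):
    """Distribution probability of a query in one major-category intent.

    set_probs: probability of each keyword set in this category.
    set_weights: weight of each keyword set (same order as set_probs).
    alpha: hypothetical mixing factor between the maximum probability and
        the weighted-sum matching degree (assumption, not from the claim).
    """
    if not set_probs or len(set_probs) != len(set_weights):
        raise ValueError("need one weight per keyword set")
    match_degree = sum(p * w for p, w in zip(set_probs, set_weights))
    max_prob = max(set_probs)
    return alpha * max_prob + (1.0 - alpha) * match_degree


# Probabilities of two keyword sets in each major-category intent.
per_set_probs = {"music": [0.5, 0.25], "video": [0.25, 0.25]}
weights = [0.5, 0.5]

query_probs = {
    category: query_category_probability(probs, weights)
    for category, probs in per_set_probs.items()
}
# Assumption: the target major-category intent is the most probable one.
target = max(query_probs, key=query_probs.get)
```

Including the maximum alongside the weighted sum lets one strongly indicative keyword set influence the result even when the other keyword sets are ambiguous.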
9. The method of claim 7, characterized in that determining the intent type of each keyword according to the determined target major-category intent specifically comprises:
for each keyword, determining the intent type corresponding to the keyword based on an association between keywords and the intent types contained in the target major-category intent;
when more than one intent type is determined, determining the keyword distribution probability of the keyword under each intent type based on the obtained resource distribution probability and user click data of the keyword under each intent type, and determining the intent type with the highest keyword distribution probability as the intent type of the keyword, wherein the keyword distribution probability characterizes how the resources in the search results returned for the keyword are distributed across the intent types.
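The disambiguation step of claim 9 can be illustrated with the short sketch below. The claim does not define how the resource distribution probability and the user click data are merged into the keyword distribution probability; the weighted average with the parameter `beta`, and the representation of click data as a per-type click share, are assumptions made for this example.

```python
def pick_intent_type(candidates, resource_prob, click_share, beta=0.5):
    """Choose one intent type for a keyword.

    candidates: non-empty list of intent types associated with the keyword.
    resource_prob: {intent type: resource distribution probability}.
    click_share: {intent type: share of user clicks} (assumed form of the
        user click data).
    beta: hypothetical weight of the resource distribution (assumption).
    """
    if len(candidates) == 1:
        return candidates[0]  # unambiguous, no scoring needed
    scores = {
        intent_type: beta * resource_prob.get(intent_type, 0.0)
        + (1.0 - beta) * click_share.get(intent_type, 0.0)
        for intent_type in candidates
    }
    return max(scores, key=scores.get)


# A keyword that could denote either a song or a movie (illustrative data).
chosen = pick_intent_type(
    ["song", "movie"],
    resource_prob={"song": 0.75, "movie": 0.25},
    click_share={"song": 0.5, "movie": 0.5},
)
```

Here the song type scores 0.625 against 0.375 for the movie type, so it is selected as the intent type of the keyword.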
10. The method of claim 7, characterized in that the intent types comprise a subject type and a demand type; and
determining the sub-intents of the query information based on the intent type of each keyword specifically comprises:
determining, based on the association between keywords and intent types, that a keyword corresponding to the subject type is a subject word and a keyword corresponding to the demand type is a demand word;
obtaining the associated words corresponding to the intent type of the subject word based on an association between intent types and associated words; and
obtaining the associated words corresponding to the demand word based on an association between demand words and associated words;
determining any combination of the demand word and the obtained associated words as a sub-intent of the query information.
11. The method of claim 10, characterized in that, after determining the sub-intents of the query information based on the intent type of each keyword, the method further comprises:
searching for the query information based on the combination of the subject word with each sub-intent, respectively, to obtain search results.
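Claims 10 and 11 can be illustrated together with the following Python sketch. The two lookup tables, the example words, and the `max_size` parameter that limits how many terms are combined into one sub-intent are assumptions introduced for this example; with the default `max_size=1`, each term is taken singly, which is only one possible reading of "any combination".

```python
from itertools import combinations

# Hypothetical association tables; the entries are invented for illustration.
TYPE_TO_ASSOCIATED = {"song": ["lyrics", "mp3"]}
DEMAND_TO_ASSOCIATED = {"download": ["free download"]}


def sub_intents(subject_type, demand_word, max_size=1):
    """Enumerate sub-intents as combinations of the demand word and the
    associated words of the subject's intent type and of the demand word."""
    terms = [demand_word]
    associated = (
        TYPE_TO_ASSOCIATED.get(subject_type, [])
        + DEMAND_TO_ASSOCIATED.get(demand_word, [])
    )
    for word in associated:
        if word not in terms:  # keep insertion order, drop duplicates
            terms.append(word)
    result = []
    for size in range(1, max_size + 1):
        result.extend(combinations(terms, size))
    return result


def search_queries(subject_word, intents):
    """Build one search string per sub-intent by prefixing the subject word."""
    return [" ".join((subject_word,) + intent) for intent in intents]


intents = sub_intents("song", "download")
queries = search_queries("Hey Jude", intents)
```

Each resulting string would then be submitted to the search engine, which is how the sub-intents broaden the range of results returned for the original query.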
12. A query intent mining apparatus, characterized by comprising:
a search unit, configured to obtain, for any keyword set, intent mining information corresponding to the keyword set;
a first determination unit, configured to determine, based on an association between intent mining information and major-category intents, the major-category intent corresponding to each piece of intent mining information, wherein a major-category intent is a query intent obtained by topic classification;
a second determination unit, configured to determine, based on the major-category intents corresponding to the pieces of intent mining information, the distribution probability of the keyword set in each major-category intent.
13. A query intent recognition apparatus, characterized by comprising:
an acquiring unit, configured to obtain keyword sets based on the keywords extracted from received query information, and to obtain the distribution probability between each keyword set and each major-category intent as obtained by the apparatus of claim 12;
a first determination unit, configured to determine, for each major-category intent, the distribution probability of the query information in the major-category intent according to the distribution probability of each keyword set in the major-category intent;
a second determination unit, configured to determine the target major-category intent corresponding to the query information based on the distribution probability of the query information in each major-category intent;
a third determination unit, configured to determine the intent type of each keyword according to the determined target major-category intent, and to determine the sub-intents of the query information based on the intent type of each keyword, wherein the intent types are obtained by dividing the major-category intent according to resource demand.
14. A terminal device, characterized by comprising at least one processing unit and at least one storage unit, wherein the storage unit stores a computer program which, when executed by the processing unit, causes the processing unit to perform the steps of the method of any one of claims 1-6 or 7-11.
15. A computer-readable medium, characterized in that it stores a computer program executable by a terminal device, wherein the program, when run on the terminal device, causes the terminal device to perform the steps of the method of any one of claims 1-6 or 7-11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810416613.1A CN108804532B (en) | 2018-05-03 | 2018-05-03 | Query intention mining method and device and query intention identification method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108804532A (en) | 2018-11-13 |
CN108804532B (en) | 2020-06-26 |
Family ID=64093548
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810416613.1A (Active; CN108804532B) | 2018-05-03 | 2018-05-03 | Query intention mining method and device and query intention identification method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108804532B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103873601A (en) * | 2012-12-11 | 2014-06-18 | 百度在线网络技术(北京)有限公司 | Addressing class query word mining method and system |
CN106302350A (en) * | 2015-06-01 | 2017-01-04 | 阿里巴巴集团控股有限公司 | URL monitoring method, device and equipment |
CN106649818A (en) * | 2016-12-29 | 2017-05-10 | 北京奇虎科技有限公司 | Recognition method and device for application search intentions and application search method and server |
WO2017107708A1 (en) * | 2015-12-25 | 2017-06-29 | 北京搜狗科技发展有限公司 | User proxy self-adaptation uniform resource locator prefix mining method and device |
CN107958078A (en) * | 2017-12-13 | 2018-04-24 | 北京百度网讯科技有限公司 | Information generating method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||