CN108804532A - Method and device for mining query intent and for recognizing query intent - Google Patents

Method and device for mining query intent and for recognizing query intent

Info

Publication number
CN108804532A
Authority
CN
China
Prior art keywords
intention
major class
intended
keyword
url
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810416613.1A
Other languages
Chinese (zh)
Other versions
CN108804532B (en)
Inventor
谢润泉
连凤宗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201810416613.1A
Publication of CN108804532A
Application granted
Publication of CN108804532B
Legal status: Active
Anticipated expiration

Abstract

This application discloses a method and device for mining query intent and for recognizing query intent, and belongs to the field of communication technology. The method includes: extracting the keywords from the query information to be processed to obtain multiple keyword sets; calculating the distribution probability of the query information over each major-category intent from the previously obtained distribution probabilities between each keyword set and each major-category intent, and thereby determining the target major-category intent; and further determining the sub-intents of the query information from the intent type of each keyword under the target major-category intent. This improves the efficiency and accuracy of query intent recognition, broadens the query scope of the query information, and increases the diversity of its query results.

Description

Method and device for mining query intent and for recognizing query intent
Technical field
This application relates to the field of communication technology, and in particular to a method and device for mining query intent and for recognizing query intent.
Background
This section is intended to provide background or context for the embodiments of the application set out in the claims. The description here is not admitted to be prior art merely because it is included in this section.
Currently, when a search is performed on query information entered by a user, keywords can be extracted from the query information by word segmentation and search results returned by keyword matching. However, the keywords extracted from the query information may be ambiguous in several ways, so the search results obtained may differ greatly from the user's query intent.
To address this, the user's query intent is recognized and the search is performed on that basis, so that the results that best meet the user's needs can be returned more accurately. In the prior art, the user's query intent is generally determined by matching the query information against mined query intent templates. In this approach, on the one hand, mining the query intent templates takes a long time; on the other hand, the mined templates cannot cover all user query intents, so template coverage is low. How to improve the efficiency and accuracy of recognizing users' query intent is therefore a problem worth considering.
Summary of the invention
The embodiments of the application provide a method and device for mining query intent and for recognizing query intent, so as to improve the efficiency and accuracy of mining and recognizing users' query intent.
In a first aspect, a method for mining query intent includes:
for any keyword set, obtaining the intent mining information corresponding to the keyword set;
based on the association between intent mining information and major-category intents, determining the major-category intent corresponding to each piece of intent mining information, where a major-category intent is a query intent obtained by classifying queries by topic;
based on the major-category intent corresponding to each piece of intent mining information, determining the distribution probability of the keyword set in each major-category intent, and determining the query intent of the keyword set according to these distribution probabilities.
Preferably, the intent mining information includes at least one of the following: URLs, URL titles, URL click data and intent supplementary information, where the URLs and URL titles are obtained from the search results of a search performed with the keyword set, and the URL click data is determined for the URLs from click log data.
Preferably, determining the distribution probability of the keyword set in each major-category intent based on the major-category intent corresponding to each piece of intent mining information specifically includes:
for each major-category intent corresponding to the intent mining information, determining the URL distribution probability between the keyword set and the major-category intent based on the URLs and URL titles corresponding to that major-category intent;
determining the click distribution probability of the keyword set in the major-category intent according to the URL click data corresponding to that major-category intent;
determining the intent supplementary information distribution probability of the keyword set in the major-category intent according to the intent supplementary information corresponding to that major-category intent;
determining the distribution probability of the keyword set in the major-category intent based on its URL distribution probability, click distribution probability and intent supplementary information distribution probability in that major-category intent.
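The text states that the three per-category probabilities are combined into one distribution probability but does not give the combining rule. The following is a minimal sketch that assumes a weighted sum followed by normalization; the function name `combine_distributions` and the weight values are hypothetical and are not taken from the application.

```python
def combine_distributions(url_p, click_p, supp_p, weights=(0.5, 0.3, 0.2)):
    """Blend three {category: probability} maps into one distribution.

    The weighted-sum rule and the default weights are assumptions made
    for illustration; the source only says the three values are combined.
    """
    w_url, w_click, w_supp = weights
    categories = set(url_p) | set(click_p) | set(supp_p)
    blended = {
        c: w_url * url_p.get(c, 0.0)
        + w_click * click_p.get(c, 0.0)
        + w_supp * supp_p.get(c, 0.0)
        for c in categories
    }
    total = sum(blended.values())
    # Normalize so the values sum to 1; all zeros if there is no signal.
    return {c: (v / total if total else 0.0) for c, v in blended.items()}
```

A category missing from one of the three inputs simply contributes zero for that signal, which keeps the sketch usable when, for example, no click data exists for a keyword set.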
Preferably, determining the URL distribution probability between the keyword set and the major-category intent based on the URLs and URL titles corresponding to that major-category intent includes:
determining the URL match degree between the keyword set and the major-category intent based on the URLs corresponding to that major-category intent;
determining the title match degree between the keyword set and the major-category intent based on the URL titles corresponding to that major-category intent;
determining the URL distribution probability between the keyword set and the major-category intent based on the URL match degree and the title match degree between them.
Preferably, determining the URL match degree between the keyword set and the major-category intent based on the URLs corresponding to that major-category intent includes:
obtaining the URL position rank, in the search results, of each URL corresponding to the major-category intent;
determining the matching attenuation index corresponding to each URL position rank based on the association between URL position ranks and matching attenuation indices;
determining the sum of the matching attenuation indices of the URLs corresponding to the major-category intent as the URL match degree between the keyword set and the major-category intent.
Preferably, determining the title match degree between the keyword set and the major-category intent based on the URL titles corresponding to that major-category intent includes:
for each URL title corresponding to the major-category intent, determining the cosine similarity between the URL title and each word vector included in the major-category intent, and taking the maximum of these cosine similarities as the semantic similarity between the URL title and the major-category intent;
obtaining the title position rank of each URL title in the search results, and determining the matching attenuation index corresponding to each title position rank based on the association between title position ranks and matching attenuation indices;
determining, for each URL title corresponding to the major-category intent, the matching product of its semantic similarity to the major-category intent and the corresponding matching attenuation index;
determining the sum of the matching products as the title match degree between the keyword set and the major-category intent.
In a second aspect, a method for recognizing query intent includes:
obtaining keyword sets based on the keywords extracted from the received query information, and obtaining the distribution probabilities between each keyword set and each major-category intent obtained using the method of the first aspect;
for each major-category intent, determining the distribution probability of the query information in the major-category intent according to the distribution probability of each keyword set in that major-category intent;
determining the target major-category intent corresponding to the query information based on the distribution probability of the query information in each major-category intent;
determining the intent type of each keyword according to the determined target major-category intent, and determining the sub-intents of the query information based on the intent type of each keyword, where the intent types are obtained by dividing the major-category intent according to resource demand.
Preferably, determining the distribution probability of the query information in the major-category intent according to the distribution probability of each keyword set in that major-category intent specifically includes:
for each major-category intent, determining the query match degree between the query information and the major-category intent according to the sum of the distribution probabilities of the keyword sets in that major-category intent, each multiplied by its corresponding weight;
obtaining the maximum distribution probability of the keyword sets in the major-category intent;
determining the distribution probability of the query information under the major-category intent based on the maximum distribution probability and the query match degree between the query information and each major-category intent.
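The steps above (weighted sum, maximum, then a combined distribution) and the later choice of the target major-category intent can be sketched as follows. How the maximum and the query match degree are combined is not specified in the text, so the linear blend controlled by `alpha`, the final normalization, and both function names are assumptions for illustration only.

```python
def query_distribution(set_probs, set_weights, alpha=0.5):
    """Distribution of a query over categories from its keyword sets.

    set_probs:   {keyword_set: {category: probability}}
    set_weights: {keyword_set: weight}
    The blend alpha * match + (1 - alpha) * peak is an assumed rule.
    """
    categories = {c for probs in set_probs.values() for c in probs}
    scores = {}
    for c in categories:
        # Query match degree: weighted sum over keyword sets.
        match = sum(set_weights[s] * p.get(c, 0.0) for s, p in set_probs.items())
        # Maximum distribution probability of any keyword set in c.
        peak = max(p.get(c, 0.0) for p in set_probs.values())
        scores[c] = alpha * match + (1 - alpha) * peak
    total = sum(scores.values())
    return {c: (v / total if total else 0.0) for c, v in scores.items()}


def target_category(distribution):
    """Pick the category with the highest distribution probability."""
    return max(distribution, key=distribution.get)
```

Keyword sets are represented here as tuples so they can serve as dictionary keys; any hashable representation would do.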
Preferably, determining the intent type of each keyword according to the determined target major-category intent specifically includes:
for each keyword, determining the intent type corresponding to the keyword based on the association between the keyword and each intent type included in the target major-category intent;
when more than one intent type is determined, determining the keyword distribution probability of the keyword under each intent type based on the obtained resource distribution probability of the keyword under each intent type and user click data, and determining the intent type with the highest keyword distribution probability as the intent type of the keyword, where the keyword distribution probability characterizes how the resources in the search results returned for the keyword are distributed over the intent types.
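A minimal sketch of the disambiguation step above, assuming that the resource distribution probability and a click-based probability are blended linearly before the maximum is taken. The blend factor `beta`, the function name `pick_intent_type` and the example type labels are hypothetical; the text only says both signals are used.

```python
def pick_intent_type(candidates, resource_p, click_p, beta=0.5):
    """Choose one intent type for a keyword.

    candidates: list of intent types associated with the keyword
    resource_p: {intent_type: resource distribution probability}
    click_p:    {intent_type: probability derived from user clicks}
    The linear blend with beta is an assumption for illustration.
    """
    if len(candidates) == 1:
        return candidates[0]

    def keyword_distribution(intent_type):
        return (beta * resource_p.get(intent_type, 0.0)
                + (1 - beta) * click_p.get(intent_type, 0.0))

    # The intent type with the highest blended probability wins.
    return max(candidates, key=keyword_distribution)
```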
Preferably, the intent types include a subject type and a demand type; and
determining the sub-intents of the query information based on the intent type of each keyword specifically includes:
based on the association between each keyword and the intent types, determining that the keyword corresponding to the subject type is the subject word and the keyword corresponding to the demand type is the demand word;
based on the association between intent types and associated words, obtaining the associated words corresponding to the intent type of the subject word; and
based on the association between demand words and associated words, obtaining the associated words corresponding to the demand word;
determining that any combination of the demand word and the obtained associated words is a sub-intent of the query information.
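The last step above can be read in more than one way. The sketch below takes one possible reading: the demand word and all obtained associated words form a pool, and every non-empty combination of up to `max_size` pool entries is a candidate sub-intent. The size limit, the function name `sub_intents` and the example words are assumptions, not part of the application.

```python
from itertools import combinations


def sub_intents(demand_word, subject_type_assoc, demand_assoc, max_size=2):
    """Enumerate candidate sub-intents as tuples of words.

    subject_type_assoc: associated words of the subject word's intent type
    demand_assoc:       associated words of the demand word
    Capping combinations at max_size is an assumption for illustration.
    """
    # De-duplicate while preserving order.
    pool = list(dict.fromkeys([demand_word, *subject_type_assoc, *demand_assoc]))
    result = []
    for size in range(1, max_size + 1):
        result.extend(combinations(pool, size))
    return result
```

Each returned tuple could then be paired with the subject word to form an expanded query, in line with the search step described in the text.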
Preferably, after determining the sub-intents of the query information based on the intent type of each keyword, the method further includes:
searching for the query information based on the combination of the subject word with each sub-intent, to obtain search results.
In a third aspect, a device for mining query intent includes:
a search unit, configured to obtain, for any keyword set, the intent mining information corresponding to the keyword set;
a first determination unit, configured to determine the major-category intent corresponding to each piece of intent mining information based on the association between intent mining information and major-category intents, where a major-category intent is a query intent obtained by classifying queries by topic;
a second determination unit, configured to determine the distribution probability of the keyword set in each major-category intent based on the major-category intent corresponding to each piece of intent mining information.
In a fourth aspect, a device for recognizing query intent includes:
an acquiring unit, configured to obtain keyword sets based on the keywords extracted from the received query information, and to obtain the distribution probabilities between each keyword set and each major-category intent obtained using the device of the third aspect;
a first determination unit, configured to determine, for each major-category intent, the distribution probability of the query information in the major-category intent according to the distribution probability of each keyword set in that major-category intent;
a second determination unit, configured to determine the target major-category intent corresponding to the query information based on the distribution probability of the query information in each major-category intent;
a third determination unit, configured to determine the intent type of each keyword according to the determined target major-category intent, and to determine the sub-intents of the query information based on the intent type of each keyword, where the intent types are obtained by dividing the major-category intent according to resource demand.
In a fifth aspect, a terminal device is provided, including at least one processing unit and at least one storage unit, where the storage unit stores a computer program which, when executed by the processing unit, causes the processing unit to perform the steps of any of the above methods for mining query intent and for recognizing query intent.
In a sixth aspect, a computer-readable medium is provided, storing a computer program executable by a terminal device; when the program runs on the terminal device, it causes the terminal device to perform the steps of any of the above methods for mining query intent and for recognizing query intent.
In the method and device for mining query intent and for recognizing query intent provided by the embodiments of the application, keyword sets are obtained based on the keywords extracted from the received query information; for each major-category intent, the distribution probability of the query information in the major-category intent is determined according to the distribution probability of each keyword set in that major-category intent; the target major-category intent corresponding to the query information is determined based on the distribution probability of the query information in each major-category intent; and the intent type of each keyword is determined according to the determined target major-category intent, and the sub-intents of the query information are determined based on the intent type of each keyword, where the intent types are obtained by dividing the major-category intent according to resource demand. In this way, the predetermined distribution probabilities between keyword sets and major-category intents are used to determine the distribution probability of the query information in each major-category intent, and from it the target major-category intent and the sub-intents. This improves the efficiency and accuracy of query intent recognition, broadens the query scope of the query information, and increases the diversity of subsequent search results.
Other features and advantages of the application will be set forth in the following description and will in part become apparent from the description, or be understood by practising the application. The objects and other advantages of the application can be realized and obtained by the structure particularly pointed out in the written description, the claims and the drawings.
Description of the drawings
The drawings described here provide a further understanding of the application and form part of it. The illustrative embodiments of the application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of an application scenario according to an embodiment of the present invention;
Fig. 2a is a schematic diagram of a method for mining query intent in an embodiment of the application;
Fig. 2b is a schematic diagram of search results in an embodiment of the application;
Fig. 3a is an implementation flowchart of a method for recognizing query intent in an embodiment of the application;
Fig. 3b is a distribution probability diagram of query information in an embodiment of the application;
Fig. 3c is a schematic diagram of intent types in an embodiment of the application;
Fig. 3d is a first schematic diagram of the syntactic structure of query information expansion in an embodiment of the application;
Fig. 3e is a second schematic diagram of the syntactic structure of query information expansion in an embodiment of the application;
Fig. 3f is an architecture diagram of the method for recognizing query intent in an embodiment of the application;
Fig. 4 is a structural diagram of a device for mining query intent in an embodiment of the application;
Fig. 5 is a structural diagram of a device for recognizing query intent in an embodiment of the application;
Fig. 6 is a structural diagram of a terminal device in an embodiment of the application.
Detailed description
To improve the efficiency and accuracy of recognizing users' query intent, the embodiments of the application provide a method and device for mining query intent and for recognizing query intent.
First, some of the terms used in the embodiments of the application are explained to aid understanding by those skilled in the art.
1. Terminal device: a device on which various applications can be installed and which can display objects provided by the installed applications. The electronic device may be mobile or stationary, for example a mobile phone, tablet computer, wearable device, in-vehicle device, personal digital assistant (PDA), point-of-sale (POS) terminal, or any other electronic device capable of the above functions.
2. Major-category intent: the result of classifying users' query intents by topic.
The preferred embodiments of the application are described below with reference to the drawings. It should be understood that the preferred embodiments described here only describe and explain the application and do not limit it, and that, where no conflict arises, the embodiments of the application and the features in the embodiments may be combined with one another.
Currently, when a search is performed on query information entered by a user, keywords are usually first extracted from the query information by word segmentation, and Uniform Resource Locator (URL) titles, i.e. document titles, are returned by keyword matching. However, because query information is usually short and the keywords extracted from it may be highly ambiguous, the search results obtained may differ greatly from the user's real query intent.
For example, the query information entered by the user is "rural love 10", and the returned URL title is "Lecture 10: how to deal with a partner who comes from the countryside". The returned search result clearly only matches the keywords contained in the query information and is unrelated to the real intent of the user's query.
In view of this, the embodiments of the application provide a scheme for mining query intent and for recognizing query intent: keyword sets are obtained based on the keywords extracted from the received query information; for each major-category intent, the distribution probability of the query information in the major-category intent is determined according to the distribution probability of each keyword set in that major-category intent; the target major-category intent corresponding to the query information is determined based on the distribution probability of the query information in each major-category intent; and the intent type of each keyword is determined according to the determined target major-category intent, and the sub-intents of the query information are determined based on the intent type of each keyword, where the intent types are obtained by dividing the major-category intent according to resource demand.
The method for mining query intent and for recognizing query intent provided by the embodiments of the application can be applied to a terminal device, which may be a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), or the like.
To improve the efficiency and accuracy of recognizing users' query intent, the embodiments of the present invention give a solution. Referring to the application scenario shown in Fig. 1, a client with a search function is installed on user equipment 11, and user 10 sends a query request to server 12 through the client installed on user equipment 11. After receiving the query request, server 12 obtains keyword sets based on the keywords extracted from the received query information; for each major-category intent, determines the distribution probability of the query information in the major-category intent according to the distribution probability of each keyword set in that major-category intent; determines the target major-category intent corresponding to the query information based on the distribution probability of the query information in each major-category intent; and determines the intent type of each keyword according to the determined target major-category intent and, based on the intent type of each keyword, the sub-intents of the query information, where the intent types are obtained by dividing the major-category intent according to resource demand. In this way, the efficiency and accuracy of recognizing users' query intent are improved.
It should be noted that user equipment 11 and server 12 communicate over a network, which may be a local area network, a wide area network, or the like. User equipment 11 may be a portable device (for example a mobile phone, tablet or laptop) or a personal computer (PC). Server 12 may be any device capable of providing Internet services. The client on user equipment 11 may be any client with a search function, such as WeChat or QQ Browser.
It should be noted that the above application scenario is shown only to aid understanding of the spirit and principles of the present invention, and the embodiments of the present invention are not limited in this respect. On the contrary, the embodiments of the present invention can be applied to any applicable scenario.
Fig. 2a shows an implementation flowchart of a method for mining query intent provided by the application. The specific flow of the method is as follows:
Step 200: For any keyword set, the server obtains the intent mining information corresponding to the keyword set.
Specifically, the intent mining information includes at least one of the following: Uniform Resource Locators (URLs), URL titles, URL click data and intent supplementary information.
The intent supplementary information consists of the entity resources obtained through the keywords, that is, entity resources set in advance for each major-category intent. Because the coverage of URLs and URL titles is incomplete, entity resources are set in advance for each intent category to increase coverage. For example, the entity resources are product names and brand names in the shopping intent category, and singers, songs and albums in the music intent category.
When obtaining the intent mining information, any one or any combination of the following ways may be used:
The first way: the server searches for the keyword set to be mined through a search engine (e.g., Baidu or WeChat) and obtains the intent mining information returned for the keywords, which includes URLs and URL titles. When searching with a keyword set, the keywords need not appear adjacent to one another.
For example, Fig. 2b is a schematic diagram of search results: the results obtained after a Baidu search for the keyword set {credit card, cash withdrawal}, including the URLs and URL titles.
The second way: the server performs a log search for the keyword set in the click log data and obtains the URL click data of each URL.
The third way: the server searches for the keyword set in a set of intent supplementary information configured in advance for entity resources (e.g., product names, singers, songs).
Step 201: Based on the association between intent mining information and major-category intents, the server determines the major-category intent corresponding to each piece of intent mining information.
Specifically, before mining query intent, the server classifies users' query intents by topic in advance to obtain the major-category intents, and establishes the association between major-category intents and intent mining information.
This association includes the association between URLs and major-category intents, between URL titles and major-category intents, between URL click data and major-category intents, and between intent supplementary information and major-category intents.
Optionally, the major-category intents may be divided into: tourism, games, sports, music, video, software, literature, food, medical care, finance, automobiles, real estate, animation, education, science and technology, military, shopping, inspirational essays, entertainment, mother and baby, fashion, official accounts, common queries (weather, logistics, etc.), people, news, pictures, Q&A, and encyclopedia (experience, knowledge).
In this way, the major-category intent corresponding to each piece of intent mining information can be determined from that intent mining information.
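Step 201 relies on a pre-built association between intent mining information and major-category intents, but the text does not say how that association is keyed. As a minimal sketch, the example below assumes the association for URLs is a lookup table from host name to category; the table, the function name `categories_for_urls` and the example host are all hypothetical.

```python
from urllib.parse import urlparse


def categories_for_urls(urls, host_to_category):
    """Map each URL to a major-category intent via its host name.

    Keying the association by host is an assumption for illustration;
    URLs whose host is unknown map to None.
    """
    result = {}
    for url in urls:
        host = urlparse(url).netloc.lower()
        result[url] = host_to_category.get(host)
    return result
```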
Step 202: Based on the major-category intent corresponding to each piece of intent mining information, the server determines the URL distribution probability between the keyword set and each major-category intent.
Specifically, the server performs the following steps for each major-category intent:
based on the major-category intent corresponding to each URL, determine the URL match degree between the keyword set and the major-category intent; based on the major-category intent corresponding to each URL title, determine the title match degree between the keyword set and the major-category intent; and, based on the URL match degree and the title match degree between the keyword set and the major-category intent, determine the URL distribution probability between the keyword set and the major-category intent. The distribution probability is positively correlated with both the URL match degree and the title match degree.
Specifically, the following steps may be performed to determine the URL match degree between the keyword set and the major-category intent:
obtain the URL position rank, in the search results, of each URL corresponding to the major-category intent; determine the matching attenuation index corresponding to each URL position rank based on the association between URL position ranks and matching attenuation indices; and calculate the sum of the matching attenuation indices of the determined URLs as the URL match degree between the keyword set and the major-category intent.
Optionally, the URL match degree may be calculated with the following formula:
$$url_{match} = \sum_{i=1}^{m} indicator(url_i, c) \cdot pos(i)$$
where $url_{match}$ is the URL match degree, $url_i$ is the i-th URL obtained based on the keyword set, c is the major-category intent, m is the total number of URLs corresponding to the keyword set, $indicator(url_i, c)$ is a 0-1 function that equals 1 if the major-category intent corresponding to $url_i$ is c and 0 otherwise, $pos(i)$ is a position penalty function negatively correlated with i, and i and m are positive integers.
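The URL match degree described above sums a position penalty over the ranked URLs associated with a given major-category intent. The sketch below illustrates this; the text only requires pos(i) to decrease with rank i, so the logarithmic form used here is an assumption, as are the function names.

```python
import math


def pos(i):
    """Position penalty for 1-based rank i.

    Any function that decreases with i would satisfy the text;
    1 / log2(i + 1) is an assumed choice for illustration.
    """
    return 1.0 / math.log2(i + 1)


def url_match(url_categories, c):
    """URL match degree between a keyword set and category c.

    url_categories: the category associated with each returned URL,
    listed in search-result order.
    """
    return sum(
        pos(i)
        for i, category in enumerate(url_categories, start=1)
        if category == c  # plays the role of the 0-1 indicator function
    )
```

With this choice a URL in first place contributes 1 and a URL in third place contributes 0.5, so matches near the top of the results dominate the score.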
Specifically, the following steps may be performed to determine the title match degree between the keyword set and the major-category intent:
First, for each URL title of the keyword set, perform the following step: calculate the cosine similarity between the URL title and each word vector of the major-category intent, and take the maximum of these cosine similarities as the semantic similarity between the URL title and the major-category intent.
Then, obtain the title position rank of each URL title in the search results, and determine the matching attenuation index corresponding to each title position rank based on the association between title position ranks and matching attenuation indices; and obtain the title match degree between the keyword set and the major-category intent as the sum of the matching products of each URL title's semantic similarity to the major-category intent and the corresponding matching attenuation index, where the title match degree is positively correlated with the matching products.
Optionally, the title matching degree may be calculated with the following formula:
$$title_{match} = \sum_{l=1}^{n} \max_{1 \le k \le y} cosine(title_l\_vector,\ word_{k\_c}\_vector) \cdot pos(l)$$
where title_match is the title matching degree, cosine() is a function that calculates the cosine similarity between two vectors, title_l_vector is the title vector of the l-th URL title obtained based on the keyword set, word_k_c_vector is the k-th word vector included in the major-class intention c, pos(l) is a position penalty function negatively correlated with l, n is the total number of URL titles, y is the total number of word vectors set for the major-class intention, k is the index of the word vector, and l, n and y are positive integers.
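A small sketch of the title matching degree may help here. It assumes that title vectors and intention word vectors are already available as plain lists of floats, and it reuses an assumed logarithmic position penalty; the names cosine and title_match are illustrative only.

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def title_match(title_vectors, intent_word_vectors) -> float:
    total = 0.0
    for rank, title_vec in enumerate(title_vectors, start=1):
        # Semantic similarity: best cosine against any intention word vector.
        similarity = max(cosine(title_vec, w) for w in intent_word_vectors)
        # Assumed position penalty, decreasing with the title's rank.
        total += similarity * (1.0 / math.log2(rank + 1))
    return total

titles = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical title vectors
intent_words = [[1.0, 0.0]]          # hypothetical word vectors for one intention
print(title_match(titles, intent_words))
```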
Optionally, when the URL distribution probability between the keyword set and the major-class intention is determined based on the URL matching degree and the title matching degree between them, the following formula may be used:
where P_url(c | j) is the URL distribution probability between keyword set j and major-class intention c, url_match is the URL matching degree, title_match is the title matching degree, j_qv is the search popularity set in advance for the keyword set and is a constant, and w1 and w2 are weights.
Step 203: The server determines the click distribution probability of the keyword set in the major-class intention according to the URL click data corresponding to that major-class intention.
The click distribution probability is determined based on the distribution, across the major-class intentions, of the URL click data corresponding to the query information.
Step 204: The server determines the intention supplemental information distribution probability of the keyword set in the major-class intention according to the intention supplemental information corresponding to that major-class intention.
The intention supplemental information distribution probability is determined based on the distribution, across the major-class intentions, of the intention supplemental information corresponding to the query information.
In this way, the user's preferences can be determined from the user's URL click data, and the coverage of the search can be improved by means of the intention supplemental information.
Step 205: The server determines the distribution probability of the keyword set in the major-class intention based on the URL distribution probability, the click distribution probability and the intention supplemental information distribution probability of the keyword set in that major-class intention.
Specifically, based on the determined click distribution probability and intention supplemental information distribution probability, the server executes the following step for each major-class intention:
The click distribution probability, the intention supplemental information distribution probability and the URL distribution probability under the major-class intention are each multiplied by the corresponding weight, and the sum of these products is taken as the distribution probability of the keyword set in that major-class intention.
Optionally, the adjusted distribution probability may be calculated with the following formula:
$$P(c \mid j) = P_d(c \mid j)\, w_d + P_n(c \mid j)\, w_n + P_{url}(c \mid j)\, w_{url}$$
where P(c | j) is the distribution probability of the keyword set in the major-class intention, P_d(c | j) is the click distribution probability, P_n(c | j) is the intention supplemental information distribution probability, P_url(c | j) is the URL distribution probability, j is the keyword set, c is the major-class intention, and w_d, w_n and w_url are weights.
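The weighted combination in step 205 is a direct sum of products, sketched below. The default weight values are placeholders chosen for illustration; the text does not give concrete weights.

```python
def combined_probability(
    p_click: float,
    p_supplement: float,
    p_url: float,
    w_d: float = 0.4,    # assumed weight for the click distribution probability
    w_n: float = 0.2,    # assumed weight for the supplemental-information probability
    w_url: float = 0.4,  # assumed weight for the URL distribution probability
) -> float:
    # P(c|j) = P_d(c|j)*w_d + P_n(c|j)*w_n + P_url(c|j)*w_url
    return p_click * w_d + p_supplement * w_n + p_url * w_url

print(combined_probability(0.7, 0.5, 0.6))
```

If the three weights sum to 1 and each input is a probability, the result also stays within [0, 1].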
In the embodiment of the present application, the determination of the distribution probability between one keyword set and one major-class intention is described only as an example. Based on the same principle, the distribution probability of each keyword set with respect to each major-class intention can be determined, and details are not repeated here.
Step 206: The server determines the query intention of the keyword set based on the distribution probability of the keyword set in each major-class intention.
From the distribution probabilities, the distribution of the keyword set across the major-class intentions can be determined, and thus the corresponding query intention.
In this way, a database of the distribution probabilities between each keyword set and each major-class intention is established, so that the query intention of query information can subsequently be identified based on these distribution probabilities.
Fig. 3a is an implementation flowchart of the query intention recognition method provided by the present application. The specific implementation flow of the method is as follows:
Step 300: The server obtains keyword sets based on the keywords extracted from the received query information.
Specifically, the server performs word segmentation on the query information to extract several keywords, and arranges and combines some or all of the obtained keywords to obtain multiple keyword sets. A keyword set is an ordered combination of any n keywords, where n is an integer not greater than the total number of keywords of the query information.
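Step 300 can be illustrated with the sketch below. It assumes the query has already been segmented into keywords, and it reads "ordered combination" as an order-preserving selection of n keywords; a contiguous n-gram reading would also be consistent with the text. The function name keyword_sets is hypothetical.

```python
from itertools import combinations

def keyword_sets(keywords: list[str]) -> list[tuple[str, ...]]:
    # Enumerate every order-preserving combination of 1..len(keywords) keywords.
    # itertools.combinations keeps elements in their input order.
    result = []
    for n in range(1, len(keywords) + 1):
        result.extend(combinations(keywords, n))
    return result

for ks in keyword_sets(["a", "b", "c"]):
    print(ks)
```

The number of sets grows as 2^k - 1 for k keywords, so a real system would likely cap n or prune sets that have no precomputed distribution probability.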
Step 301: For each major-class intention, the server determines the query matching degree between the query information and the major-class intention based on the distribution probability between each keyword set and that major-class intention.
Specifically, when step 301 is executed, the query matching degree is positively correlated with the distribution probability between the keyword set and the major-class intention.
Optionally, the query matching degree between the query information and a major-class intention may be calculated with the following formula:
$$score(c \mid q) = \sum_{j \in q} w_j \, P(c \mid j)$$
where score(c | q) is the query matching degree between the query information and major-class intention c, q is the collection of all keyword sets of the query information, w_j is the weight of keyword set j, and P(c | j) is the distribution probability between keyword set j and major-class intention c.
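The query matching degree is a weighted sum over the keyword sets of the query, as the following sketch shows. The dictionary layout and the example numbers are assumptions made for illustration.

```python
def query_match(
    set_probs: dict[tuple[str, ...], dict[str, float]],
    set_weights: dict[tuple[str, ...], float],
    c: str,
) -> float:
    # score(c|q): sum over keyword sets j of w_j * P(c|j).
    # A keyword set with no stored probability for c contributes 0.
    return sum(
        set_weights[j] * probs.get(c, 0.0)
        for j, probs in set_probs.items()
    )

# Hypothetical precomputed distribution probabilities P(c|j) and weights w_j.
probs = {
    ("odyssey",): {"video": 0.5, "baike": 0.3},
    ("odyssey", "box"): {"video": 0.8},
}
weights = {("odyssey",): 0.4, ("odyssey", "box"): 0.6}
print(query_match(probs, weights, "video"))
```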
Step 302: The server determines the distribution probability of the query information in each major-class intention based on the query matching degree between the query information and each major-class intention.
Specifically, when step 302 is executed, the following formula may be used:
where P(c | q) is the distribution probability of query information q in major-class intention c, score(c | q) is the query matching degree between query information q and major-class intention c, b is the total number of major-class intentions, x is the index of a major-class intention, P(c | j) is the distribution probability between keyword set j and major-class intention c, and a is a constant.
Step 303: The server determines the target major-class intention of the query information according to the distribution probability of the query information in each major-class intention.
Specifically, the server selects the t major-class intentions with the highest distribution probabilities and determines them as the target major-class intentions, where t is a positive integer.
Fig. 3b shows the distribution probabilities of a piece of query information. For example, for the query information "A Chinese Odyssey: Pandora's Box", the distribution probability is 0.6 in video, 0.12 in encyclopedia (baike), 0.07 in entertainment (ent) and 0.06 in animation (dongman). The major-class intention with the highest distribution probability, video, is therefore the target major-class intention of "A Chinese Odyssey: Pandora's Box". Based on the same principle, the distribution probabilities and target major-class intentions of other query information can be determined, and details are not repeated here.
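Selecting the target major-class intention in step 303 amounts to a top-t selection, sketched here with the probabilities from the Fig. 3b example. The function name target_intentions is illustrative.

```python
def target_intentions(distribution: dict[str, float], t: int = 1) -> list[str]:
    # Return the t major-class intentions with the highest distribution probability.
    ranked = sorted(distribution, key=distribution.get, reverse=True)
    return ranked[:t]

# Distribution probabilities quoted in the Fig. 3b example.
dist = {"video": 0.6, "baike": 0.12, "ent": 0.07, "dongman": 0.06}
print(target_intentions(dist, t=1))
print(target_intentions(dist, t=2))
```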
Step 304: The server determines the intention type of each keyword.
Specifically, for each keyword and each target major-class intention, the server executes the following steps:
First, the server obtains the intention types set for the target major-class intention and determines the candidate intention types of the keyword.
The intention types are obtained by dividing the major-class intention according to resource requirements.
For example, Fig. 3c is a schematic diagram of the intention types. Optionally, the intention types included in the video intention may be movie name, TV series name, variety show name, actor, playback, character, film review, ticketing, music, plot, official account, video, video requirement and so on.
Then, the server judges whether the number of candidate intention types is one. If so, that intention type is determined to be the intention type of the keyword. Otherwise, for each intention type, the obtained resource distribution probability and user click data of the keyword under that intention type are each multiplied by the corresponding weight and summed, giving the keyword distribution probability of the keyword under that intention type, and the intention type with the highest keyword distribution probability is determined to be the final intention type of the keyword. The keyword distribution probability characterizes the resource distribution and user preference, under each intention type, of the search results returned for the keyword.
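The disambiguation in step 304 can be sketched as below. It assumes the user click data has already been normalized into a per-type click share, and the equal weights are placeholders; none of these names or values come from the text.

```python
def keyword_intention_type(
    candidates: list[str],
    resource_prob: dict[str, float],
    click_share: dict[str, float],
    w_resource: float = 0.5,  # assumed weight
    w_click: float = 0.5,     # assumed weight
) -> str:
    # A single candidate needs no scoring.
    if len(candidates) == 1:
        return candidates[0]
    # Keyword distribution probability per type: weighted sum of the
    # resource distribution probability and the (normalized) click share.
    scores = {
        t: resource_prob.get(t, 0.0) * w_resource + click_share.get(t, 0.0) * w_click
        for t in candidates
    }
    return max(scores, key=scores.get)

print(keyword_intention_type(
    ["movie name", "TV series name"],
    resource_prob={"movie name": 0.7, "TV series name": 0.3},
    click_share={"movie name": 0.6, "TV series name": 0.4},
))
```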
Step 305: The server determines the sub-intentions of the query information based on the intention type of each keyword.
Specifically, the intention types can also be divided into subject types and demand types. For example, in the keyword set "Forever Young film review", the intention type of "Forever Young" is movie name, which is also a subject type, while "film review" is a demand type.
Based on the association between keywords and intention types, the server determines that, among the keywords, a keyword corresponding to a subject type is a subject word and a keyword corresponding to a demand type is a demand word; based on the association between intention types and associated words, it obtains the associated words corresponding to the intention type of the subject word; based on the association between demand words and associated words, it obtains the associated words corresponding to the demand word; and it determines any combination of the demand word and the obtained associated words to be the sub-intentions of the query information.
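One possible reading of step 305 is sketched below: the demand words, the associated words of the subject word's intention type and the associated words of each demand word are pooled as candidate sub-intentions. The lookup tables, the word "play" and the function name are hypothetical, and other ways of combining the words would also fit the text.

```python
def sub_intentions(
    subject_type: str,
    demand_words: list[str],
    type_associations: dict[str, list[str]],
    demand_associations: dict[str, list[str]],
) -> list[str]:
    # Start with the demand words found in the query itself.
    candidates = list(demand_words)
    # Add the associated words of the subject word's intention type.
    candidates += type_associations.get(subject_type, [])
    # Add the associated words of each demand word.
    for word in demand_words:
        candidates += demand_associations.get(word, [])
    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(candidates))

print(sub_intentions(
    "movie name",
    ["watch"],
    {"movie name": ["film review", "stills", "director"]},
    {"watch": ["play"]},  # hypothetical association
))
```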
Step 306: The server searches for the query information based on the combinations of the subject word and each sub-intention, and obtains search results.
Specifically, the server combines the sub-intentions with one another by OR, combines the result with the subject word by AND, and then performs the search to obtain the search results.
If the subject word were combined by AND with every sub-intention, the excessive number of search terms would increase the retrieval load and reduce retrieval efficiency. Therefore, in the embodiment of the present application, OR nodes are added to the syntax tree structure of the existing query information so that the query information is expanded while retrieval efficiency is improved.
For example, Fig. 3d is a first schematic diagram of the syntax structure of expanded query information. The query information is "where to watch Brotherhood of Blades". The intention type of the subject word "Brotherhood of Blades" is determined to be movie name, and "watch" is the demand word. The associated words obtained for movie name are "film review", "stills" and "director".
As another example, Fig. 3e is a second schematic diagram of the syntax structure of expanded query information. The query information is "Wolf Warrior 2 highlights". The intention type of the subject word "Wolf Warrior 2" is determined to be movie name, and "highlights" is the demand word. The associated words obtained for movie name are "film review", "stills" and "director".
As a further example, referring to Fig. 3e, the query information is "The Reader variety show". The intention type of the subject word "The Reader" is determined to be variety show name, and "variety show" is the demand word. The associated words obtained for variety show name are "guests", "reviews" and "watch".
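The OR-node expansion described for step 306 can be pictured with a tiny syntax tree, as in the sketch below. The nested-dictionary representation and the rendered query syntax are assumptions for illustration and do not correspond to any particular search engine.

```python
def expand_query(subject: str, sub_intentions: list[str]) -> dict:
    # The subject word is required (AND); the sub-intentions share one OR node,
    # so a single expanded query replaces many subject-plus-sub-intention queries.
    return {"and": [subject, {"or": list(sub_intentions)}]}

def render(node) -> str:
    # Render the tree as a parenthesized boolean expression.
    if isinstance(node, str):
        return node
    (operator, children), = node.items()
    return "(" + f" {operator.upper()} ".join(render(c) for c in children) + ")"

tree = expand_query("Wolf Warrior 2", ["highlights", "film review", "stills", "director"])
print(render(tree))
```

Issuing one query with an OR node keeps the number of retrieval requests constant as the list of sub-intentions grows.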
In this way, the target major-class intention of the query information can be determined quickly from the predetermined distribution probabilities between keyword sets and major-class intentions; the sub-intentions are then determined from the intention type of each keyword; and the query information is expanded by the sub-intentions, which increases the diversity of the search results.
Further, Table 1 shows an evaluation of query intention recognition. The server performs searches using WeChat and Baidu respectively, where "top" denotes query information chosen by popularity and "Random" denotes query information obtained at random. Evidently, the accuracy of the results of searching by major-class intention is lower than the accuracy of the results of searching by the query intention determined after expansion.
Table 1
Search engine | Major-class intention (top) | Major-class intention (Random) | Purposes (top) | Purposes (Random)
WeChat | 84.3% | 80.6% | 88.2% | 75.3%
Baidu | 96.2% | 95.5% | 93.4% | 84.1%
Fig. 3f is an architecture diagram of the query intention recognition method. The server determines in advance the distribution probability of each keyword set in each major-class intention according to the URL distribution probability, the click distribution probability and the intention supplemental information distribution probability between each keyword set and each major-class intention. Then, when identifying the query intention of query information, the server obtains the corresponding distribution probabilities based on the keyword sets of the query information, determines the target major-class intention of the query information from them, determines the sub-intentions of the query information from the target major-class intention and the intention type of each keyword, and finally obtains the query results for the query information.
Based on the same inventive concept, an embodiment of the present application further provides a query intention mining apparatus. Because the principle by which the apparatus and device solve the problem is similar to that of the query intention mining method, the implementation of the apparatus may refer to the implementation of the method, and repeated parts are not described again.
Fig. 4 is a schematic structural diagram of a query intention mining apparatus provided by an embodiment of the present application, which includes:
a search unit 40, configured to obtain, for any keyword set, the intention mining information corresponding to the keyword set;
a first determination unit 41, configured to determine the major-class intention corresponding to each piece of intention mining information based on the association between intention mining information and major-class intentions, where a major-class intention is a query intention obtained by topic classification;
a second determination unit 42, configured to determine the distribution probability of the keyword set in each major-class intention based on the major-class intention corresponding to each piece of intention mining information.
Preferably, the intention mining information includes at least one of the following: a URL, a URL title, URL click data and intention supplemental information, where the URL and the URL title are obtained from the search results of a search performed with the keyword set, and the URL click data is determined for the URL according to click log data.
Preferably, when determining the distribution probability of the keyword set in each major-class intention based on the major-class intention corresponding to each piece of intention mining information, the second determination unit 42 is specifically configured to:
for each major-class intention corresponding to the intention mining information, determine the URL distribution probability between the keyword set and the major-class intention based on the URLs and URL titles corresponding to that major-class intention;
determine the click distribution probability of the keyword set in the major-class intention according to the URL click data corresponding to that major-class intention;
determine the intention supplemental information distribution probability of the keyword set in the major-class intention according to the intention supplemental information corresponding to that major-class intention;
determine the distribution probability of the keyword set in the major-class intention based on the URL distribution probability, the click distribution probability and the intention supplemental information distribution probability of the keyword set in that major-class intention.
Preferably, when determining the URL distribution probability between the keyword set and the major-class intention based on the URLs and URL titles corresponding to that major-class intention, the second determination unit 42 is further configured to:
determine the URL matching degree between the keyword set and the major-class intention based on the URLs corresponding to that major-class intention;
determine the title matching degree between the keyword set and the major-class intention based on the URL titles corresponding to that major-class intention;
determine the URL distribution probability between the keyword set and the major-class intention based on the URL matching degree and the title matching degree between them.
Preferably, when determining the URL matching degree between the keyword set and the major-class intention based on the URLs corresponding to that major-class intention, the second determination unit 42 is further configured to:
obtain the rank position in the search results of each URL corresponding to the major-class intention;
determine the match-decay index corresponding to each URL rank position based on the association between URL rank position and match-decay index;
determine the sum of the match-decay indices of the URLs corresponding to the major-class intention to be the URL matching degree between the keyword set and the major-class intention.
Preferably, when determining the title matching degree between the keyword set and the major-class intention based on the URL titles corresponding to that major-class intention, the second determination unit 42 is further configured to:
for each URL title corresponding to the major-class intention, determine the cosine similarity between the URL title and each word vector included in the major-class intention, and take the maximum of these cosine similarities as the semantic similarity between the URL title and the major-class intention;
obtain the rank position of each URL title in the search results, and determine the match-decay index corresponding to each title rank position based on the association between title rank position and match-decay index;
determine, for each URL title corresponding to the major-class intention, the matching product of the semantic similarity between the URL title and the major-class intention and the corresponding match-decay index;
determine the sum of the matching products to be the title matching degree between the keyword set and the major-class intention.
Based on the same inventive concept, an embodiment of the present application further provides a query intention recognition apparatus. Because the principle by which the apparatus and device solve the problem is similar to that of the query intention recognition method, the implementation of the apparatus may refer to the implementation of the method, and repeated parts are not described again.
Fig. 5 is a schematic structural diagram of a query intention recognition apparatus provided by an embodiment of the present application, which includes:
an acquiring unit 50, configured to obtain keyword sets based on the keywords extracted from the received query information, and to obtain the distribution probabilities between each keyword set and each major-class intention that were obtained by the above query intention mining apparatus;
a first determination unit 51, configured to determine, for each major-class intention, the distribution probability of the query information in the major-class intention according to the distribution probability of each keyword set in that major-class intention;
a second determination unit 52, configured to determine the target major-class intention corresponding to the query information based on the distribution probability of the query information in each major-class intention;
a third determination unit 53, configured to determine the intention type of each keyword according to the determined target major-class intention, and to determine the sub-intentions of the query information based on the intention type of each keyword, where the intention types are obtained by dividing the major-class intention according to resource requirements.
Preferably, when determining the distribution probability of the query information in the major-class intention according to the distribution probability of each keyword set in that major-class intention, the second determination unit 52 is specifically configured to:
for each major-class intention, determine the query matching degree between the query information and the major-class intention according to the sum of the products of the distribution probability of each keyword set in the major-class intention and the corresponding weight;
obtain the maximum distribution probability of the keyword sets in the major-class intention;
determine the distribution probability of the query information under the major-class intention based on the maximum distribution probability and the query matching degrees between the query information and each major-class intention.
Preferably, when determining the intention type of each keyword according to the determined target major-class intention, the third determination unit 53 is specifically configured to:
for each keyword, determine the intention type corresponding to the keyword based on the association between the keyword and the intention types included in the target major-class intention;
when the number of intention types is determined to be more than one, determine the keyword distribution probability of the keyword under each intention type based on the obtained resource distribution probability and user click data of the keyword under each intention type, and determine the intention type with the highest keyword distribution probability to be the intention type of the keyword, where the keyword distribution probability characterizes the resource distribution, under each intention type, of the search results returned for the keyword.
Preferably, the intention types include subject types and demand types; and
when determining the sub-intentions of the query information based on the intention type of each keyword, the third determination unit 53 is specifically configured to:
determine, based on the association between keywords and intention types, that a keyword corresponding to a subject type is a subject word and a keyword corresponding to a demand type is a demand word;
obtain the associated words corresponding to the intention type of the subject word based on the association between intention types and associated words; and
obtain the associated words corresponding to the demand word based on the association between demand words and associated words;
determine any combination of the demand word and the obtained associated words to be the sub-intentions of the query information.
Preferably, after determining the sub-intentions of the query information based on the intention type of each keyword, the third determination unit 53 is specifically configured to:
search for the query information based on the combinations of the subject word and each sub-intention, and obtain search results.
For convenience of description, the above parts are divided by function into modules (or units) and described separately. Of course, when the present application is implemented, the functions of the modules (or units) may be realized in one or more pieces of software or hardware.
Based on the same technical concept, an embodiment of the present application further provides a terminal device 600. Referring to Fig. 6, the terminal device 600 is used to implement the methods recorded in the above method embodiments, for example the embodiment shown in Fig. 3a. The terminal device 600 may include a memory 601, a processor 602, an input unit 603 and a display panel 604.
The memory 601 is configured to store the computer program executed by the processor 602. The memory 601 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system, the application programs required by at least one function and so on, and the data storage area may store data created through use of the terminal device 600 and so on. The processor 602 may be a central processing unit (CPU), a digital processing unit or the like. The input unit 603 may be used to obtain user instructions input by the user. The display panel 604 is used to display information input by the user or information provided to the user; in the embodiment of the present application, the display panel 604 is mainly used to display the display interface of each application program in the terminal and the control objects shown in each display interface. Optionally, the display panel 604 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) or the like.
The embodiment of the present application does not limit the specific connection medium between the memory 601, the processor 602, the input unit 603 and the display panel 604. In Fig. 6, the memory 601, the processor 602, the input unit 603 and the display panel 604 are connected by a bus 605, which is drawn as a thick line; the connections between other components are merely schematic and are not limiting. The bus 605 may be divided into an address bus, a data bus, a control bus and so on. For ease of representation, only one thick line is drawn in Fig. 6, but this does not mean that there is only one bus or only one type of bus.
The memory 601 may be a volatile memory, such as a random-access memory (RAM); the memory 601 may also be a non-volatile memory, such as a read-only memory, a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or the memory 601 may be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 601 may also be a combination of the above memories.
The processor 602 is configured to implement the embodiment shown in Fig. 3a, including:
the processor 602 is configured to call the computer program stored in the memory 601 to execute the embodiment shown in Fig. 3a.
An embodiment of the present application further provides a computer-readable storage medium that stores the computer-executable instructions to be executed by the above processor, including the program that the above processor needs to execute.
In some possible embodiments, aspects of the query intention mining method provided by the present application may also be implemented in the form of a program product that includes program code. When the program product runs on a terminal device, the program code causes the terminal device to execute the steps of the query intention mining method according to the various exemplary embodiments of the present application described above in this specification. For example, the terminal device may execute the embodiment shown in Fig. 3a.
The program product may adopt any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, a random-access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The program product for the query intention mining method according to an embodiment of the present application may adopt a portable compact disc read-only memory (CD-ROM), include program code, and run on a computing device. However, the program product of the present application is not limited thereto. In this document, a readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in connection with an instruction execution system, apparatus or device.
A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device.
The program code contained on a readable medium may be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
The program code for executing the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
It should be noted that although several units or subunits of the apparatus are mentioned in the detailed description above, this division is merely exemplary and not mandatory. In fact, according to embodiments of the present application, the features and functions of two or more of the units described above may be embodied in a single unit. Conversely, the features and functions of one unit described above may be further divided among multiple units.
Furthermore, although the operations of the method of the present application are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that order, or that all of the operations shown must be performed to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, where the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present application.

Claims (15)

1. A method for mining query intents, characterized by comprising:
for any keyword set, obtaining intent mining information corresponding to the keyword set;
determining, based on an association between intent mining information and major-class intents, the major-class intent corresponding to each item of intent mining information, wherein a major-class intent is a query intent obtained by classification according to topic;
determining, based on the major-class intent corresponding to each item of intent mining information, the distribution probability of the keyword set over each major-class intent, and determining the query intent distribution of the keyword set according to the distribution probabilities.
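As a rough illustration of the last two steps of claim 1, the following minimal sketch maps each mined item to a major-class intent and normalises the counts into a distribution. The function name, the lookup table `item_to_intent`, and the equal weighting of every mined item are assumptions made for this example only; the dependent claims describe a richer weighting.

```python
from collections import Counter


def intent_distribution(mined_items, item_to_intent):
    """Return {major_class_intent: probability} for one keyword set.

    mined_items    -- mined information for the keyword set (e.g. URLs)
    item_to_intent -- hypothetical lookup from a mined item to its
                      major-class intent (the "association" in claim 1)
    """
    # Count how many mined items fall under each major-class intent,
    # skipping items that have no known association.
    counts = Counter(
        item_to_intent[item] for item in mined_items if item in item_to_intent
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalise the counts so that the probabilities sum to 1.
    return {intent: n / total for intent, n in counts.items()}
```

For example, four mined URLs of which two map to a "video" intent and one to a "music" intent (the fourth being unmapped) would give "video" a probability of 2/3 and "music" a probability of 1/3.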
2. The method of claim 1, characterized in that the intent mining information includes at least one of the following: uniform resource locators (URLs), URL titles, URL click data, and intent supplemental information, wherein the URLs and URL titles are obtained from the search results of a search performed using the keyword set, and the URL click data is determined for the URLs from click log data.
3. The method of claim 2, characterized in that determining, based on the major-class intent corresponding to each item of intent mining information, the distribution probability of the keyword set over each major-class intent specifically comprises:
for each major-class intent corresponding to the intent mining information, determining the URL distribution probability between the keyword set and the major-class intent based on the URLs and URL titles corresponding to the major-class intent;
determining the click distribution probability of the keyword set in the major-class intent according to the URL click data corresponding to the major-class intent;
determining the intent supplemental information distribution probability of the keyword set in the major-class intent according to the intent supplemental information corresponding to the major-class intent;
determining the distribution probability of the keyword set in the major-class intent based on the URL distribution probability, the click distribution probability, and the intent supplemental information distribution probability of the keyword set in the major-class intent.
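Claim 3 does not fix how the three component probabilities are combined. The sketch below assumes, purely for illustration, a weighted linear combination followed by normalisation across intents; the function name and the default weights are hypothetical.

```python
def combine_components(url_p, click_p, supp_p, weights=(0.4, 0.4, 0.2)):
    """Blend three per-intent probability dicts into one distribution.

    url_p, click_p, supp_p -- {major_class_intent: probability} for the URL,
                              click and supplemental-information components
    weights                -- assumed mixing weights (not from the claims)
    """
    w_url, w_click, w_supp = weights
    intents = set(url_p) | set(click_p) | set(supp_p)
    # Weighted sum per intent; a missing component contributes 0.
    raw = {
        intent: w_url * url_p.get(intent, 0.0)
        + w_click * click_p.get(intent, 0.0)
        + w_supp * supp_p.get(intent, 0.0)
        for intent in intents
    }
    total = sum(raw.values())
    # Normalise so the result is again a probability distribution.
    return {i: v / total for i, v in raw.items()} if total > 0 else raw
```

A linear blend keeps each component's contribution easy to inspect; other combinations, such as a product of the components, would fit the claim wording equally well.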
4. The method of claim 3, characterized in that determining the URL distribution probability between the keyword set and the major-class intent based on the URLs and URL titles corresponding to the major-class intent comprises:
determining the URL matching degree between the keyword set and the major-class intent based on the URLs corresponding to the major-class intent;
determining the title matching degree between the keyword set and the major-class intent based on the URL titles corresponding to the major-class intent;
determining the URL distribution probability between the keyword set and the major-class intent based on the URL matching degree and the title matching degree between the keyword set and the major-class intent.
5. The method of claim 4, characterized in that determining the URL matching degree between the keyword set and the major-class intent based on the URLs corresponding to the major-class intent comprises:
obtaining the URL position rank, in the search results, of each URL corresponding to the major-class intent;
determining, based on an association between URL position rank and matching attenuation index, the matching attenuation index corresponding to each URL position rank;
determining the sum of the matching attenuation indices of the URLs corresponding to the major-class intent as the URL matching degree between the keyword set and the major-class intent.
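The steps of claim 5 can be sketched as follows. The claim only requires some association between position rank and matching attenuation index; the logarithmic decay used here (the same shape as a DCG discount) is an assumption chosen for the example, as are the function names.

```python
import math


def attenuation(rank):
    """Assumed attenuation index for a 1-based position rank."""
    return 1.0 / math.log2(rank + 1)


def url_matching_degree(ranks):
    """Sum the attenuation indices of the URLs belonging to one intent.

    ranks -- 1-based position ranks, in the search results, of the URLs
             that correspond to the major-class intent
    """
    return sum(attenuation(rank) for rank in ranks)
```

With this decay, an intent whose URLs sit at ranks 1 and 3 receives 1.0 + 0.5 = 1.5, so URLs near the top of the results contribute more than URLs further down.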
6. The method of claim 4, characterized in that determining the title matching degree between the keyword set and the major-class intent based on the URL titles corresponding to the major-class intent comprises:
for each URL title corresponding to the major-class intent, determining the cosine similarity between the URL title and each word vector included in the major-class intent, and taking the maximum of the cosine similarities as the semantic similarity between the URL title and the major-class intent;
obtaining the title position rank of each URL title in the search results, and determining, based on an association between title position rank and matching attenuation index, the matching attenuation index corresponding to each title position rank;
determining, for each URL title corresponding to the major-class intent, the matching product of the semantic similarity between the URL title and the major-class intent and the corresponding matching attenuation index;
determining the sum of the matching products as the title matching degree between the keyword set and the major-class intent.
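A minimal sketch of the computation in claim 6, assuming that each URL title has already been embedded as a vector and that the attenuation index decays logarithmically with rank. The embedding step, the decay shape, and all names are assumptions for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def attenuation(rank):
    """Assumed attenuation index for a 1-based position rank."""
    return 1.0 / math.log2(rank + 1)


def title_matching_degree(ranked_titles, intent_word_vectors):
    """Sum of (semantic similarity x attenuation index) over the titles.

    ranked_titles       -- list of (rank, title_vector) pairs for one intent
    intent_word_vectors -- word vectors included in the major-class intent
    """
    total = 0.0
    for rank, title_vec in ranked_titles:
        # Semantic similarity: the best match against any intent word vector.
        semantic = max(
            (cosine(title_vec, w) for w in intent_word_vectors), default=0.0
        )
        total += semantic * attenuation(rank)
    return total
```

Taking the maximum rather than the mean means a title counts as relevant as soon as it is close to any single word of the intent.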
7. A method for identifying query intents, characterized by comprising:
obtaining keyword sets based on the keywords extracted from received query information, and obtaining the distribution probabilities between each keyword set and each major-class intent, the distribution probabilities having been obtained using the method of any one of claims 1 to 6;
for each major-class intent, determining the distribution probability of the query information in the major-class intent according to the distribution probability of each keyword set in the major-class intent;
determining the target major-class intent corresponding to the query information based on the distribution probability of the query information in each major-class intent;
determining the intent type of each keyword according to the determined target major-class intent, and determining the sub-intents of the query information based on the intent type of each keyword, wherein the intent types are obtained by dividing the major-class intent according to resource demand.
8. The method of claim 7, characterized in that determining the distribution probability of the query information in the major-class intent according to the distribution probability of each keyword set in the major-class intent specifically comprises:
for each major-class intent, determining the query matching degree between the query information and the major-class intent as the weighted sum of the distribution probabilities of the keyword sets in the major-class intent and their respective weights;
obtaining the maximum distribution probability of the keyword sets in the major-class intent;
determining the distribution probability of the query information under the major-class intent based on the maximum distribution probability and the query matching degree between the query information and each major-class intent.
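To make the two quantities in claim 8 concrete, the sketch below computes the weighted sum and the maximum for one major-class intent and blends them with a mixing factor. The blend itself, the parameter `alpha`, and the function name are assumptions; the claim does not specify how the two values are combined.

```python
def query_intent_score(set_probs, weights, alpha=0.5):
    """Score one major-class intent for a query.

    set_probs -- distribution probability of each keyword set in the intent
    weights   -- weight of each keyword set (same order as set_probs)
    alpha     -- assumed mixing factor between the peak and the weighted sum
    """
    # Query matching degree: weighted sum over the keyword sets.
    match = sum(p * w for p, w in zip(set_probs, weights))
    # Maximum distribution probability among the keyword sets.
    peak = max(set_probs, default=0.0)
    return alpha * peak + (1.0 - alpha) * match
```

Keeping the maximum in the score lets one strongly indicative keyword set carry the intent even when the remaining keyword sets are uninformative.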
9. The method of claim 7, characterized in that determining the intent type of each keyword according to the determined target major-class intent specifically comprises:
for each keyword, determining the intent type corresponding to the keyword based on an association between keywords and the intent types included in the target major-class intent;
when more than one intent type is determined, determining the keyword distribution probability of the keyword under each intent type based on the obtained resource distribution probability of the keyword under each intent type and on user click data, and determining the intent type with the highest keyword distribution probability as the intent type of the keyword, wherein the keyword distribution probability characterizes the distribution, over the intent types, of the resources in the search results returned for the keyword.
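The disambiguation step of claim 9 might look like the following sketch, which assumes that the resource distribution and the click data have each been reduced to a per-type probability and that they are blended linearly. The blend, the parameter `beta`, and the names are illustrative assumptions.

```python
def pick_intent_type(candidates, resource_p, click_p, beta=0.5):
    """Choose one intent type for a keyword.

    candidates -- intent types associated with the keyword
    resource_p -- {intent_type: resource distribution probability}
    click_p    -- {intent_type: probability derived from user click data}
    beta       -- assumed mixing factor between the two signals
    """
    if not candidates:
        return None
    if len(candidates) == 1:
        # Only one associated type: nothing to disambiguate.
        return candidates[0]

    def keyword_probability(intent_type):
        return beta * resource_p.get(intent_type, 0.0) + (
            1.0 - beta
        ) * click_p.get(intent_type, 0.0)

    # Several candidate types: keep the most probable one.
    return max(candidates, key=keyword_probability)
```

For instance, a keyword that could be either a movie or a novel is resolved to "movie" when the click data strongly favours movie results, even if slightly more of the returned resources are novels.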
10. The method of claim 7, characterized in that the intent types include a subject type and a demand type; and
determining the sub-intents of the query information based on the intent type of each keyword specifically comprises:
determining, based on the association between each keyword and its intent type, that a keyword corresponding to the subject type is a subject word and that a keyword corresponding to the demand type is a demand word;
obtaining, based on an association between intent types and associated words, the associated words corresponding to the intent type of the subject word; and
obtaining, based on an association between demand words and associated words, the associated words corresponding to the demand word;
determining any combination of the demand word and each obtained associated word as a sub-intent of the query information.
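One possible reading of the final step of claim 10 pairs the demand word with each associated word in turn. The sketch below follows that reading; the two lookup tables, the space-joined output format, and the de-duplication are assumptions for illustration.

```python
def sub_intents(demand_word, subject_type, type_to_related, demand_to_related):
    """Build candidate sub-intents for a query.

    demand_word       -- keyword whose intent type is the demand type
    subject_type      -- intent type of the subject word
    type_to_related   -- hypothetical {intent_type: [associated words]}
    demand_to_related -- hypothetical {demand_word: [associated words]}
    """
    related = type_to_related.get(subject_type, []) + demand_to_related.get(
        demand_word, []
    )
    # Drop duplicates while keeping first-seen order.
    unique_related = list(dict.fromkeys(related))
    # Pair the demand word with each associated word.
    return [f"{demand_word} {word}" for word in unique_related]
```

Each returned string is one candidate sub-intent; a fuller implementation could also combine several associated words at once.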
11. The method of claim 10, characterized in that, after determining the sub-intents of the query information based on the intent type of each keyword, the method further comprises:
searching for the query information based on the combination of the subject word with each sub-intent, to obtain search results.
12. An apparatus for mining query intents, characterized by comprising:
a search unit, configured to obtain, for any keyword set, intent mining information corresponding to the keyword set;
a first determination unit, configured to determine, based on an association between intent mining information and major-class intents, the major-class intent corresponding to each item of intent mining information, wherein a major-class intent is a query intent obtained by classification according to topic;
a second determination unit, configured to determine, based on the major-class intent corresponding to each item of intent mining information, the distribution probability of the keyword set over each major-class intent.
13. An apparatus for identifying query intents, characterized by comprising:
an acquiring unit, configured to obtain keyword sets based on the keywords extracted from received query information, and to obtain the distribution probabilities between each keyword set and each major-class intent, the distribution probabilities having been obtained using the apparatus of claim 12;
a first determination unit, configured to determine, for each major-class intent, the distribution probability of the query information in the major-class intent according to the distribution probability of each keyword set in the major-class intent;
a second determination unit, configured to determine the target major-class intent corresponding to the query information based on the distribution probability of the query information in each major-class intent;
a third determination unit, configured to determine the intent type of each keyword according to the determined target major-class intent, and to determine the sub-intents of the query information based on the intent type of each keyword, wherein the intent types are obtained by dividing the major-class intent according to resource demand.
14. A terminal device, characterized by comprising at least one processing unit and at least one storage unit, wherein the storage unit stores a computer program which, when executed by the processing unit, causes the processing unit to perform the steps of the method of any one of claims 1 to 6 or 7 to 11.
15. A computer-readable medium, characterized in that it stores a computer program executable by a terminal device, wherein the program, when run on the terminal device, causes the terminal device to perform the steps of the method of any one of claims 1 to 6 or 7 to 11.
CN201810416613.1A 2018-05-03 2018-05-03 Query intention mining method and device and query intention identification method and device Active CN108804532B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810416613.1A CN108804532B (en) 2018-05-03 2018-05-03 Query intention mining method and device and query intention identification method and device

Publications (2)

Publication Number Publication Date
CN108804532A true CN108804532A (en) 2018-11-13
CN108804532B CN108804532B (en) 2020-06-26

Family

ID=64093548

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810416613.1A Active CN108804532B (en) 2018-05-03 2018-05-03 Query intention mining method and device and query intention identification method and device

Country Status (1)

Country Link
CN (1) CN108804532B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103873601A (en) * 2012-12-11 2014-06-18 百度在线网络技术(北京)有限公司 Addressing class query word mining method and system
CN106302350A (en) * 2015-06-01 2017-01-04 阿里巴巴集团控股有限公司 URL monitoring method, device and equipment
WO2017107708A1 (en) * 2015-12-25 2017-06-29 北京搜狗科技发展有限公司 User proxy self-adaptation uniform resource locator prefix mining method and device
CN106649818A (en) * 2016-12-29 2017-05-10 北京奇虎科技有限公司 Recognition method and device for application search intentions and application search method and server
CN107958078A (en) * 2017-12-13 2018-04-24 北京百度网讯科技有限公司 Information generating method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant