CN105808541A - Information matching processing method and apparatus - Google Patents

Info

Publication number
CN105808541A
Authority
CN
China
Prior art keywords
product information
key word
search key
gear
clicking rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410838112.4A
Other languages
Chinese (zh)
Other versions
CN105808541B (en)
Inventor
王涛
黄鹏
林锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Singapore Holdings Pte Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201410838112.4A priority Critical patent/CN105808541B/en
Priority to PCT/CN2015/098247 priority patent/WO2016107455A1/en
Publication of CN105808541A publication Critical patent/CN105808541A/en
Application granted granted Critical
Publication of CN105808541B publication Critical patent/CN105808541B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of data processing, and particularly relates to an information matching processing method. The method comprises the steps of: acquiring each search keyword and product information, and forming search keyword and product information characteristic pairs from all the search keywords and all the product information in pairs; calculating a correlation of each search keyword and product information characteristic pair, and according to a correlation calculation result, determining a correlation gear of each search keyword and product information characteristic pair; calculating a predicted click rate of each search keyword and product information characteristic pair, and by utilizing a quantile, determining a predicted click rate gear corresponding to the predicted click rate of each search keyword and product information characteristic pair; and according to the correlation gear and the predicted click rate gear, determining a score of each search keyword and product information characteristic pair, wherein the scores are used for representing matching degrees of the search keywords and the corresponding product information.

Description

Information matching processing method and apparatus
Technical Field
The present invention relates to the technical field of data processing, and in particular to an information matching processing method and apparatus.
Background
With the development of computer and Internet technology, e-commerce websites have grown rapidly. An e-commerce website typically stores massive amounts of data or products. To help users find products of interest more efficiently, the website server usually recommends products that match the search word entered by the user. Among the matching products recommended to the user, those that match the search word closely, are of good quality, and have been promoted through advertising are often recommended first. Sellers, in turn, often select high-quality products for advertising promotion in order to increase the sales of those products. When promoting a product through advertising, a seller needs to purchase search keywords for the product information it publishes. The higher the matching degree between the published product information and a search keyword, the more likely the product is to be found by users, and the more likely buyers are to find products that match their search words, so that they can obtain useful information from the sea of information.
Therefore, accurately judging the matching degree between product information and search words not only improves the effectiveness of sellers' product promotion, but also reduces the client-server data interaction caused by buyers repeatedly searching for products, which improves the user experience and the performance of the server.
In the prior art, the matching degree between product information and a search word is usually judged by calculating the correlation between the search word and the advertised product. The matching degree between the search word and the published product information is judged from the resulting correlation score, and sellers are recommended to purchase search keywords with a high matching degree.
However, this prior-art method considers only the correlation between the search word and the advertised product and ignores how much users prefer the advertised product, so the matching degree it calculates is inaccurate. An inaccurate matching result not only prevents sellers from promoting their products effectively, but also means that the products the website recommends to buyers do not fully match their needs and interests. Buyers then have to search repeatedly to find the products they are really interested in, which increases the data interaction between the client and the server, increases the data processing load of the server, reduces its processing performance, and consumes valuable Internet bandwidth.
Summary of the Invention
To solve the above technical problem, the present invention discloses an information matching processing method and apparatus that improve the objectivity and accuracy of information matching, improve the user experience, reduce the data processing load of the server, improve its processing performance, and save valuable Internet bandwidth.
The technical solution is as follows:
According to a first aspect of the embodiments of the present invention, a product information matching processing method is disclosed, the method comprising:
acquiring search keywords and product information, and pairing each search keyword with each piece of product information to form search keyword and product information feature pairs;
calculating the correlation of each search keyword and product information feature pair, and determining the correlation gear of each feature pair according to the correlation calculation result;
calculating the predicted click rate of each search keyword and product information feature pair, and using quantiles to determine the predicted click rate gear corresponding to the predicted click rate of each feature pair;
determining a score for each search keyword and product information feature pair according to the correlation gear and the predicted click rate gear, the score representing the matching degree between the search keyword and the product information.
According to a second aspect of the embodiments of the present invention, a product information matching processing apparatus is disclosed, the apparatus comprising:
an acquiring unit, configured to acquire search keywords and product information, and to pair each search keyword with each piece of product information to form search keyword and product information feature pairs;
a correlation gear determining unit, configured to calculate the correlation of each search keyword and product information feature pair, and to determine the correlation gear of each feature pair according to the correlation calculation result;
a predicted click rate gear determining unit, configured to calculate the predicted click rate of each search keyword and product information feature pair, and to use quantiles to determine the predicted click rate gear corresponding to the predicted click rate of each feature pair;
a matching degree determining unit, configured to determine a score for each search keyword and product information feature pair according to the correlation gear and the predicted click rate gear, the score representing the matching degree between the search keyword and the product information.
One beneficial effect achievable by an aspect of the embodiments of the present invention is as follows. When determining the matching degree between a search keyword and product information, the method and apparatus provided by the invention consider not only the correlation between the search keyword and the product information but also how much users prefer the product. A predicted click rate factor, which objectively reflects user preference for the product, is introduced to calculate the predicted click rate, and the click rate gear corresponding to the probability that the advertised product is clicked by users under the search keyword is determined according to a preset proportion rule (for example, the normal distribution law). The matching degree between the search keyword and the product information is then determined jointly from the correlation gear and the click rate gear, which yields a more accurate matching result. This not only improves the effectiveness of sellers' product promotion, but also reduces the client-server data interaction caused by buyers repeatedly searching for products, improves the user experience, reduces the data processing load of the server, improves its processing performance, and saves valuable Internet bandwidth.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an information matching processing method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a standard normal distribution quantile table according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the predicted click rate gear distribution according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of an information matching processing apparatus according to an embodiment of the present invention.
Detailed Description
To help those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
The present invention discloses an information matching processing method and apparatus that consider not only the correlation between a search keyword and product information but also how much users prefer the product. A predicted click rate factor that reflects user preference for the product is introduced to calculate the predicted click rate, and the click rate gear corresponding to the probability that the advertised product is clicked by users under the search keyword is determined according to the normal distribution law. The matching degree between the search keyword and the product information is determined jointly from the correlation gear and the click rate gear, which yields a more accurate matching result.
In one application scenario of the present invention, sellers on an e-commerce website need to purchase search keywords to promote their advertised products. The method provided by the embodiments can be applied on the website server side to judge the matching degree between search keywords and the product information published by a seller, so that search keywords with a high matching degree can be recommended to the seller for purchase. This improves the effectiveness of the seller's product promotion and further increases the probability that the seller's products are clicked by buyers. It also improves the efficiency of buyers' product searches, reduces the client-server data interaction caused by buyers repeatedly searching for products, improves the user experience, reduces the data processing load of the server, improves its processing performance, and saves valuable Internet bandwidth.
Fig. 1 is a schematic flowchart of an information matching processing method according to an embodiment of the present invention.
S101: acquire search keywords and product information, and pair each search keyword with each piece of product information to form search keyword and product information feature pairs.
A seller usually deals in many products, which may belong to different categories. In that case, the seller's product information can be processed item by item to obtain one or more words that describe each piece of product information, and each of these is paired with each search keyword to form feature pairs. For example, suppose the seller's product information includes MP3 player, iphone6, Note4 and earphone, and the search keyword is mobile phone. The resulting feature pairs then include (mobile phone, MP3 player), (mobile phone, iphone6), (mobile phone, Note4) and (mobile phone, earphone). These are merely illustrative examples and do not limit the invention. The product information may specifically be advertised product information.
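The pairing in step S101 amounts to a Cartesian product of keywords and product descriptions. The following is a minimal sketch using the example values from the text; the function name `build_feature_pairs` is a hypothetical name chosen for illustration and does not come from the patent.

```python
from itertools import product

def build_feature_pairs(keywords, products):
    """Pair every search keyword with every piece of product information."""
    return list(product(keywords, products))

# Example values taken from the description above.
keywords = ["mobile phone"]
products = ["MP3 player", "iphone6", "Note4", "earphone"]

pairs = build_feature_pairs(keywords, products)
for keyword, item in pairs:
    print((keyword, item))
```

With several keywords the same call yields every keyword and product combination, which is what "in pairs" means in step S101.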
It should be noted that, before steps S102 and S103 are performed, the search keywords and product information may be preprocessed. The preprocessing includes extracting the semantic features required for matching the various features. The specific processing may take many forms and is not limited here.
In addition, there is no required order between step S102 and step S103: the two may be performed in parallel or in the reverse order.
S102: calculate the correlation of each search keyword and product information feature pair, and determine the correlation gear of each feature pair according to the correlation calculation result.
The correlation is obtained mainly from the category correlation and the text correlation between the search keyword and the advertised product. Category correlation refers to the matching degree between the categories clicked under the search keyword and the category of the advertised product. Text correlation has several aspects; it mainly refers to the matching degree between the core word of the search keyword and the core word of the advertised product title, and the matching degree between the attributes appearing in the search keyword and the attributes in the advertised product description. The correlation score is obtained by combining category matching and text matching.
In a specific implementation, step S102 may include: performing matching judgments on multiple features of the search keyword and product information feature pair; and determining the correlation gear of the feature pair according to the results of those matching judgments.
In a specific implementation, when the correlation is calculated, the matching judgments on multiple features include at least one of a category feature matching judgment and a text feature matching judgment.
Further, the category feature matching judgment determines whether the search keyword and the product information belong to the same category. In one implementation of the present invention, the category feature matching judgment usually refers to a category judgment based on the meaning of the text. If the category of the search keyword is the same as the category of the published product information, the result of the category feature matching judgment is "yes"; otherwise, the result is "no". One special case in which the result is "no" is a search keyword that has no category. A search keyword without a category is usually a pronounced long-tail keyword, that is, a keyword that users rarely search for. For example, if the search keyword is "mp3" and the published product is "audio player", the two belong to the same category and the result of the category feature matching judgment is "yes". If the search keyword is "mp3" and the published product is "radio", the two do not belong to the same category and the result is "no".
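The category judgment above can be sketched as a lookup followed by an equality test. The two dictionaries and the category names in them are hypothetical stand-ins for whatever category data a real system holds; only the yes/no logic, including the "no" result for a keyword without a category, follows the text.

```python
# Hypothetical category tables, for illustration only.
KEYWORD_CATEGORY = {"mp3": "portable audio"}
PRODUCT_CATEGORY = {"audio player": "portable audio", "radio": "radio equipment"}

def category_match(keyword, product):
    """Return True ("yes") when keyword and product share a category."""
    keyword_category = KEYWORD_CATEGORY.get(keyword)
    if keyword_category is None:
        # Special case from the text: a keyword with no category never matches.
        return False
    return keyword_category == PRODUCT_CATEGORY.get(product)

print(category_match("mp3", "audio player"))
print(category_match("mp3", "radio"))
```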
Further, the text feature matching judgment determines whether the text content of the search keyword is related to that of the published product information. Specifically, the text feature matching judgment of the present invention includes at least one of: a complete matching judgment, a partial matching judgment, a center word matching judgment, a complete center word matching judgment, a hidden word matching judgment, and a reverse preposition matching judgment. The text feature matching judgment may also extract text feature vectors and calculate the similarity of the text vectors with the cosine formula. The present invention does not limit this.
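As an illustration of the cosine-based option mentioned above, the sketch below builds simple word-count vectors and computes their cosine similarity. The whitespace tokenization and the bag-of-words vectors are assumptions made for this example; the patent does not specify how the text feature vectors are extracted.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between word-count vectors of two texts."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Dot product over the words the two texts share.
    dot = sum(vec_a[word] * vec_b[word] for word in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text is treated as unrelated
    return dot / (norm_a * norm_b)

print(cosine_similarity("mp3 player", "mp3 audio player"))
print(cosine_similarity("mp3 player", "radio"))
```

A value near 1 indicates closely related text and 0 indicates no shared words; any threshold for turning the value into a match result would be a further design choice.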
After the matching judgments on multiple features have been performed on the search keyword and product information feature pair, the correlation gear of the feature pair can be determined according to the results of those judgments. In the present invention, the correlation gear is divided into three gears: excellent, good and poor.
Table 1 shows one illustrative division of the correlation gears; other gear division methods may also be used and are not limited here.
Table 1
S103: calculate the predicted click rate of each search keyword and product information feature pair, and use quantiles to determine the predicted click rate gear corresponding to the predicted click rate of each feature pair.
In a specific implementation, step S103 may include: determining the proportion coefficient corresponding to each predicted click rate gear; determining the values of the quantiles according to the proportion coefficients; and determining the gear interval in which each predicted click rate falls according to the predicted click rate of each feature pair and the values of the quantiles.
Preferably, the quantiles are normal distribution quantiles.
This is described in detail below with an example.
First, the standard normal distribution quantile is introduced. The normal distribution is also called the Gaussian distribution. Its probability distribution curve is bell-shaped, low at both ends and high in the middle, and the total area under the curve is 1. It is defined as follows: if a random variable X follows a probability distribution with location parameter μ and scale parameter σ, this is written as:
$$X \sim N(\mu, \sigma^2) \qquad (1)$$
Its probability density function is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \qquad (2)$$
When μ = 0 and σ = 1, X is said to follow the standard normal distribution with mean 0 and standard deviation 1, written N(0, 1).
Normal distribution quantiles characterize the area under the normal curve. The upper α quantile of the standard normal distribution is defined as follows: let X ~ N(0, 1); for any given α (0 < α < 1), the point $Z_\alpha$ satisfying $P(X > Z_\alpha) = \alpha$ is called the upper α quantile of the standard normal distribution. For example, looking up the normal distribution table shown in Fig. 2 with $Z_\alpha = 1$ gives α = 0.158655.
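The table lookup described above can be reproduced numerically. This is a sketch that uses `statistics.NormalDist` from the Python standard library in place of the printed table of Fig. 2; the helper names `upper_tail_probability` and `upper_quantile` are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def upper_tail_probability(z):
    """alpha = P(X > z) for X ~ N(0, 1)."""
    return 1.0 - STD_NORMAL.cdf(z)

def upper_quantile(alpha):
    """Z_alpha such that P(X > Z_alpha) = alpha for X ~ N(0, 1)."""
    return STD_NORMAL.inv_cdf(1.0 - alpha)

print(round(upper_tail_probability(1.0), 6))  # table value for Z = 1
print(round(upper_quantile(0.158655), 3))     # inverse lookup, close to 1
```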
Commonly used normal distribution quantiles follow these rules:
68.268949% of the area under the curve lies within one standard deviation (σ) of the mean.
95.449974% of the area lies within two standard deviations (2σ) of the mean.
99.730020% of the area lies within three standard deviations (3σ) of the mean.
99.993666% of the area lies within four standard deviations (4σ) of the mean.
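These percentages follow from the identity P(|X − μ| < kσ) = erf(k/√2) for a normally distributed X. A short check using only the standard library is sketched below; the function name `area_within` is a hypothetical name for illustration.

```python
import math

def area_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3, 4):
    print(f"{k} sigma: {area_within(k):.6%}")
```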
The present invention applies these normal distribution rules to divide the predicted click rate into gears.
The predicted click rate (eCTR) is obtained by building a probabilistic model from historical exposure and click behavior and using the model to predict whether a future exposure will produce a click. The resulting value is the probability that a given product is clicked by a user after being exposed under a given keyword. It is therefore a value between 0 and 1, and the larger the value, the higher the probability of a click.
The eCTR is estimated with the LR (logistic regression) model, which is standard in the industry. Using the LR model involves two parts: feature extraction and model training. Calculating the predicted click rate of each search keyword and product information feature pair includes: performing feature extraction on the feature pair and obtaining the feature weight corresponding to each feature from the trained model; and calculating the predicted click rate from the extracted features and their corresponding feature weights.
The extracted features include one or any combination of the following: the text information of the search keyword, the category information of the search keyword, the title of the product information, the attributes of the product information, and the correlation between the search keyword and the product information.
Once the feature weights have been obtained through model training, the predicted click rate eCTR of an advertisement pair (Query, offer) can be estimated, where Query is the search keyword and offer is the product information.
The LR model is a generalized linear model, obtained by applying the logistic transformation to a linear model. Its expression is:
$$y = \frac{1}{1 + e^{-\sum_i w_i f_i}} \qquad (3)$$
where $w_i$ is the feature weight, $f_i$ is the feature value, and y is the resulting predicted click rate. The formula confines the result to the interval (0, 1), which matches the range of a click probability.
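Formula (3) can be evaluated directly once the weights are known. The sketch below assumes the weights have already been trained elsewhere; the function name `predict_ctr` and the sample weight and feature values are hypothetical, and the two-branch form is only a standard way to avoid floating-point overflow for large negative inputs.

```python
import math

def predict_ctr(weights, features):
    """Formula (3): y = 1 / (1 + exp(-sum_i w_i * f_i))."""
    z = sum(w * f for w, f in zip(weights, features))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Algebraically identical form that stays finite when z is very negative.
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

# Hypothetical weights and feature values, for illustration only.
weights = [0.8, -1.5, 0.3]
features = [1.0, 0.2, 1.0]
print(predict_ctr(weights, features))
```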
In theory, an accurately estimated eCTR should follow a Gaussian (normal) distribution. The eCTR of advertisement pairs is divided into gears along the keyword dimension and the global dimension. The eCTR of each advertisement pair necessarily falls into some interval of the overall eCTR distribution, and that interval determines the predicted click rate gear of the advertisement pair. The gear division method provided by the invention ensures that the scores of most customers' advertised products are at an average level, while those of a minority are at a better or worse level.
In this embodiment of the present invention, based on analysis of actual business and on experience, the predicted click rate is divided into three gears: good, medium and poor. The proportion coefficients of the gears are 3:4:3; that is, advertised products in the good gear account for 30%, those in the medium gear for 40%, and those in the poor gear for 30%, and the corresponding scores are 5 stars, 4 stars and 3 stars respectively. See Fig. 3, a schematic diagram of the predicted click rate gear division, in which the abscissa is the predicted click rate value, the ordinate is the frequency, and the area under the curve corresponds to probability (that is, the proportion).
In a specific implementation, when the global or keyword-dimension eCTR distribution is divided in the ratio 3:4:3, the area under the curve within a certain range around the mean must be 0.4, and by symmetry the area on each side is 0.3. From the rules for commonly used normal distribution quantiles, the following is obtained approximately:
$$Z_\alpha = \mu \pm \frac{\sigma}{2} \qquad (4)$$
where μ is the mean, σ is the standard deviation, and $Z_\alpha$ is the normal distribution quantile.
That is, once the proportion coefficient of each predicted click rate gear has been determined, the values of the normal distribution quantiles can be determined from the proportion coefficients.
Assume that Fig. 3 follows the standard normal distribution, that is, X ~ N(0, 1). For any given α (0 < α < 1), the point $Z_\alpha$ satisfying $P(X > Z_\alpha) = \alpha$ is called the upper α quantile of the standard normal distribution, and $Z_{1-\alpha}$ is the corresponding lower α quantile.
$Z_\alpha$ is a number: when X ~ N(0, 1), $P(X > Z_\alpha) = \alpha$. In the normal distribution table, $Z_\alpha$ is found from α. For example, to find $Z_{0.025}$, look up the Z value corresponding to 1 − 0.025 = 0.975 in the normal distribution table shown in Fig. 2; the Z value corresponding to 0.9750 is 1.96, so $Z_{0.025} = 1.96$. Conversely, to find the α corresponding to $Z_\alpha = 1.96$, first look up 1.96, which corresponds to 0.975; then 1 − 0.975 = 0.025 is the value of α.
As can be seen from Fig. 3, a1 and a2 correspond to two quantiles of the standard normal distribution. The target proportions in Fig. 3 map to $Z_{\alpha 1}$ and $Z_{\alpha 2}$ respectively, whose values can be obtained by the above method. Under the standard normal distribution, $Z_{\alpha 1}$ corresponds to the upper α quantile and $Z_{\alpha 2}$ to the lower α quantile.
In a specific implementation, when the predicted click rate gears are divided in the ratio 3:4:3, the area under the curve within a certain range on both sides of the mean is 0.4, and by symmetry the area on each of the left and right sides is 0.3. The area to the right of the quantile a2 in the standard normal distribution diagram of Fig. 3 is therefore 0.3, so the value of $Z_{0.3}$ is needed, which means looking up the Z value corresponding to 1 − 0.3 = 0.7. From the normal distribution table shown in Fig. 2, the Z value corresponding to 0.7 is 0.52, so $Z_{0.3} = 0.52$, that is, a2 = 0.52; likewise, a1 = −0.52. a2 and a1 are the two quantiles corresponding to this ratio under the normal distribution. The values of $Z_{\alpha 1}$ and $Z_{\alpha 2}$ can also be calculated from formula (4). Because Fig. 3 follows the standard normal distribution, X ~ N(0, 1), that is, μ = 0 and σ = 1, and formula (4) gives $Z_\alpha = \pm 0.5$, which corresponds in Fig. 3 to a1 = −0.5 and a2 = 0.5.
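The step from proportion coefficients to quantile values can be sketched as follows. The function `gear_cut_points` is a hypothetical helper that accumulates the proportions from the lowest gear upward and inverts the normal cumulative distribution; it uses `statistics.NormalDist` in place of the table of Fig. 2. For 3:4:3 it gives cut points near ±0.52 on the standard normal curve, which formula (4) rounds to ±0.5.

```python
from statistics import NormalDist

def gear_cut_points(ratios, mu=0.0, sigma=1.0):
    """Quantiles separating gears, with ratios listed from lowest to highest gear."""
    dist = NormalDist(mu, sigma)
    total = sum(ratios)
    cuts, cumulative = [], 0.0
    for share in ratios[:-1]:
        cumulative += share / total          # area to the left of this cut point
        cuts.append(dist.inv_cdf(cumulative))
    return cuts

a1, a2 = gear_cut_points((3, 4, 3))
print(round(a1, 2), round(a2, 2))
```

Passing a different tuple, for example `(1, 1)`, gives the cut points for other proportion coefficients in the same way.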
The predicted click rate values follow a general normal distribution. For a general normal distribution (μ ≠ 0, σ ≠ 1), the corresponding quantiles can be obtained approximately from the rules for normal distribution quantiles. For the ratio 3:4:3, the quantiles of the general normal distribution are given by:
$$Z_\alpha = \mu \pm \frac{\sigma}{2} \qquad (4)$$
where μ is the mean and σ is the standard deviation, both of which can be calculated from actual data samples. Specifically, after the predicted click rate values have been obtained, the mean μ and the corresponding standard deviation σ of all predicted click rates can be computed; for the calculation, refer to existing methods. The values of the general normal distribution quantiles are then obtained from μ and σ according to formula (4).
After the values of the general normal distribution quantiles have been determined, the gear interval in which a predicted click rate falls can be determined from the predicted click rate and the quantile values. For example, when the predicted click rate falls in (0, μ − σ/2], its predicted click rate gear is poor; when it falls in (μ − σ/2, μ + σ/2), its gear is medium; and when it falls in [μ + σ/2, 1), its gear is good.
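The interval rule above can be sketched as a small classifier. The sample eCTR values are invented for illustration, the function name `ctr_gear` is hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the text only says that μ and σ are computed from the data sample.

```python
from statistics import mean, pstdev

def ctr_gear(ectr, mu, sigma):
    """Assign a predicted click rate to a gear using the cut points of formula (4)."""
    lower, upper = mu - sigma / 2, mu + sigma / 2
    if ectr <= lower:
        return "poor"      # (0, mu - sigma/2]
    if ectr < upper:
        return "medium"    # (mu - sigma/2, mu + sigma/2)
    return "good"          # [mu + sigma/2, 1)

# Hypothetical predicted click rates for five feature pairs.
sample = [0.01, 0.02, 0.03, 0.04, 0.05]
mu, sigma = mean(sample), pstdev(sample)
gears = [ctr_gear(x, mu, sigma) for x in sample]
print(gears)
```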
It should be noted that the above description uses the proportion coefficients 3:4:3 as an example; when other proportion coefficients are chosen, the calculation can follow the same approach.
S104: determine a score for each search keyword and product information feature pair according to the correlation gear and the predicted click rate gear, the score representing the matching degree between the search keyword and the product information.
In a specific implementation, the score may be calculated in various ways, for example by a weighted average or by other implementations; the present invention does not limit this.
Table 2 shows one implementation of the star rating.
Table 2
Based on analysis of actual business, advertisement pairs whose correlation is excellent may be divided into good, medium and poor in the ratio 3:4:3, corresponding to 5 stars, 4 stars and 3 stars respectively, and advertisement pairs whose correlation is good may be divided into gears in the ratio 1:1, corresponding to 2 stars and 1 star respectively. The division of excellent advertisement pairs is shown in Table 2. Because good advertisement pairs have only two gears, their division is simpler: the mean of the distribution is taken as the dividing point, and among good advertisement pairs those above the mean receive 2 stars and those below the mean receive 1 star.
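The star assignment described above can be sketched as follows. Several points are assumptions for the sake of a runnable example: the function name `star_rating` is hypothetical, a value exactly equal to the mean is given 1 star, and `None` is returned for pairs whose correlation gear is poor, because the text here does not state a rating for that case.

```python
def star_rating(correlation_gear, ectr, mu, sigma):
    """Map a correlation gear and a predicted click rate to a star rating."""
    if correlation_gear == "excellent":
        # 3:4:3 split using the cut points mu +/- sigma/2 from formula (4).
        if ectr >= mu + sigma / 2:
            return 5
        if ectr > mu - sigma / 2:
            return 4
        return 3
    if correlation_gear == "good":
        # 1:1 split at the mean; a tie at the mean is an assumption (1 star).
        return 2 if ectr > mu else 1
    return None  # rating for poor correlation is not specified in the text

print(star_rating("excellent", 0.05, 0.03, 0.014))
print(star_rating("good", 0.02, 0.03, 0.014))
```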
In the embodiments of the present invention, the degree of match between a search keyword and an advertised product is calculated by combining the relevance calculation with the predicted click-through rate. This not only informs the seller of the quality and matching degree of an advertisement, but also objectively reflects the probability that a buyer searching the website for products will click the advertised product. The higher the star score, the higher the ranking and the greater the probability of a click; exposure and feedback increase accordingly, the advertiser's return on investment rises, and the seller's product promotion becomes more effective. For buyers on the website, advertisers' optimization of their advertisements improves product quality, with the direct result that the user experience on the website improves; data interaction between the user's client and the server may decrease, which reduces the data processing load on the server, improves its processing performance, and saves Internet bandwidth.
Referring to Fig. 4, a schematic diagram of the product information matching processing device provided by an embodiment of the present invention is shown.
A product information matching processing device 400, the device including:
Acquiring unit 401, configured to obtain search keywords and product information, and to combine the search keywords and the product information pairwise to form search keyword and product information feature pairs.
Relevance gear determination unit 402, configured to calculate the relevance of each search keyword and product information feature pair, and to determine the relevance gear of each search keyword and product information feature pair according to the relevance calculation result.
Predicted click-through rate gear determination unit 403, configured to calculate the predicted click-through rate of each search keyword and product information feature pair, and to use quantiles to determine the predicted click-through rate gear corresponding to the predicted click-through rate of each search keyword and product information feature pair.
Matching determination unit 404, configured to determine the score of each search keyword and product information feature pair according to the relevance gear and the predicted click-through rate gear, where the score characterizes the degree of match between the search keyword and the product information.
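For illustration only, the cooperation of the four units can be sketched as follows. The sketch reads pairwise combination as pairing every search keyword with every product information item, which is an assumption; the function names and the toy callables standing in for units 402 to 404 are likewise hypothetical.

```python
from itertools import product


def form_feature_pairs(keywords, product_infos):
    """Pair every search keyword with every product information item."""
    return list(product(keywords, product_infos))


def score_feature_pairs(pairs, relevance_gear_of, ctr_gear_of, score_of):
    """Apply the later units, passed in as callables, to each pair."""
    return {
        pair: score_of(relevance_gear_of(pair), ctr_gear_of(pair))
        for pair in pairs
    }


# Toy stand-ins for units 402 to 404; real units would be far richer.
pairs = form_feature_pairs(["red dress", "phone case"], ["Red summer dress"])
scores = score_feature_pairs(
    pairs,
    relevance_gear_of=lambda p: "excellent" if p[0].split()[0] in p[1].lower() else "good",
    ctr_gear_of=lambda p: "medium",
    score_of=lambda rel, ctr: (rel, ctr),
)
print(scores)
```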
Further, the predicted click-through rate gear determination unit includes a predicted click-through rate calculation subunit and a gear determination subunit, where the predicted click-through rate calculation subunit includes:
Model building subunit, configured to perform feature extraction on the search keyword and product information feature pair, and to obtain the feature weight corresponding to each feature from a trained model;
Calculation subunit, configured to calculate the predicted click-through rate using the extracted features and the feature weights corresponding to those features.
Further, the features extracted by the model building subunit include one or any combination of the following: the text information of the search keyword, the category information of the search keyword, the title of the product information, the attributes of the product information, and the relevance between the search keyword and the product information.
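A minimal sketch of the calculation subunit follows, assuming a logistic (sigmoid) model; this passage does not name the model type, so that choice, the function name `predict_ctr`, the feature names, and the weight values are all hypothetical.

```python
import math


def predict_ctr(features, weights, bias=0.0):
    """Combine extracted feature values with trained weights.

    features: mapping of feature name to numeric value for one pair.
    weights: mapping of feature name to weight from a trained model;
        features without a weight contribute nothing.
    Returns a value in (0, 1) via the logistic function (an assumption).
    """
    z = bias + sum(weights.get(name, 0.0) * value
                   for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical features and weights, for illustration only.
features = {"category_match": 1.0, "title_overlap": 0.5, "relevance": 0.8}
weights = {"category_match": 0.9, "title_overlap": 1.2, "relevance": 0.7}
print(predict_ctr(features, weights, bias=-4.0))
```

In practice the weights would come from training on logged impressions and clicks; the values above are placeholders.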
Further, the predicted click-through rate gear determination unit includes a predicted click-through rate calculation subunit and a gear determination subunit, where the gear determination subunit includes:
Proportionality coefficient determination subunit, configured to determine the proportionality coefficient corresponding to each gear of the predicted click-through rate gears;
Quantile determination subunit, configured to determine the values of the quantiles according to the proportionality coefficients;
Gear interval determination subunit, configured to determine the gear interval in which the predicted click-through rate falls according to the predicted click-through rate of each search keyword and product information feature pair and the values of the quantiles.
The quantiles are normal distribution quantiles.
Further, the relevance gear determination unit includes:
Feature matching subunit, configured to perform matching judgments on multiple features of the search keyword and product information feature pair;
Determination subunit, configured to determine the relevance gear of the search keyword and product information feature pair according to the results of the matching judgments on the multiple features.
Further, the matching judgments on multiple features performed by the feature matching subunit include at least one of a category feature matching judgment and a text feature matching judgment;
The category feature matching judgment judges whether the search keyword and the product information belong to the same category;
The text feature matching judgment judges whether the text content of the search keyword is associated with that of the product information.
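The two matching judgments can be illustrated with the sketch below. The token-subset test used for text association and the mapping from the number of successful judgments to a relevance gear are assumptions of the sketch; the description above does not fix either rule.

```python
def category_match(keyword_category, product_category):
    """Judge whether keyword and product belong to the same category."""
    return keyword_category == product_category


def text_match(keyword, title):
    """Judge text association: every keyword token occurs in the title.

    This whitespace-token rule is a deliberately simple stand-in.
    """
    keyword_tokens = set(keyword.lower().split())
    return bool(keyword_tokens) and keyword_tokens <= set(title.lower().split())


def relevance_gear(keyword, keyword_category, title, product_category):
    """Combine both judgments into a gear (mapping is hypothetical)."""
    hits = int(category_match(keyword_category, product_category))
    hits += int(text_match(keyword, title))
    return {2: "excellent", 1: "good", 0: "poor"}[hits]


print(relevance_gear("red dress", "apparel", "Red summer dress", "apparel"))
```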
The functions of the above units correspond to the processing steps of the method described in detail with reference to Fig. 1 and are not repeated here. It should be noted that, because the method embodiments have been explained in detail, the device embodiments are described more briefly; those skilled in the art will understand that the device embodiments of the present invention can be constructed with reference to the method embodiments. Other implementations obtained by those skilled in the art without creative effort fall within the protection scope of the present invention.
Those skilled in the art will understand that the above method and device embodiments are given by way of example and are not intended to limit the present invention; other implementations obtained by those skilled in the art without creative effort fall within the protection scope of the present invention.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element. The present invention may be described in the general context of computer-executable instructions, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present invention may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding parts of the method embodiments. The device embodiments described above are merely schematic: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort. The above are only specific embodiments of the present invention; it should be pointed out that those skilled in the art can make improvements and modifications without departing from the principles of the present invention, and such improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (14)

1. An information matching processing method, characterized in that the method includes:
Obtaining search keywords and product information, and combining the search keywords and the product information pairwise to form search keyword and product information feature pairs;
Calculating the relevance of each search keyword and product information feature pair, and determining the relevance gear of each search keyword and product information feature pair according to the relevance calculation result;
Calculating the predicted click-through rate of each search keyword and product information feature pair, and using quantiles to determine the predicted click-through rate gear corresponding to the predicted click-through rate of each search keyword and product information feature pair;
Determining the score of each search keyword and product information feature pair according to the relevance gear and the predicted click-through rate gear, the score characterizing the degree of match between the search keyword and the product information.
2. The method according to claim 1, characterized in that calculating the predicted click-through rate of each search keyword and product information feature pair includes:
Performing feature extraction on the search keyword and product information feature pair, and obtaining the feature weight corresponding to each feature from a trained model;
Calculating the predicted click-through rate using the extracted features and the feature weights corresponding to those features.
3. The method according to claim 2, characterized in that the extracted features include one or any combination of the following: the text information of the search keyword, the category information of the search keyword, the title of the product information, the attributes of the product information, and the relevance between the search keyword and the product information.
4. The method according to claim 1, characterized in that using quantiles to determine the predicted click-through rate gear corresponding to the predicted click-through rate of each search keyword and product information feature pair includes:
Determining the proportionality coefficient corresponding to each gear of the predicted click-through rate gears;
Determining the values of the quantiles according to the proportionality coefficients;
Determining the gear interval in which the predicted click-through rate falls according to the predicted click-through rate of each search keyword and product information feature pair and the values of the quantiles.
5. The method according to claim 4, characterized in that the quantiles are normal distribution quantiles.
6. The method according to claim 1, characterized in that calculating the relevance of each search keyword and product information feature pair and determining the relevance gear of each search keyword and product information feature pair according to the relevance calculation result includes:
Performing matching judgments on multiple features of the search keyword and product information feature pair;
Determining the relevance gear of the search keyword and product information feature pair according to the results of the matching judgments on the multiple features.
7. The method according to claim 6, characterized in that the matching judgments on multiple features include at least one of a category feature matching judgment and a text feature matching judgment;
The category feature matching judgment judges whether the search keyword and the product information belong to the same category;
The text feature matching judgment judges whether the text content of the search keyword is associated with that of the product information.
8. An information matching processing device, characterized in that the device includes:
Acquiring unit, configured to obtain search keywords and product information, and to combine the search keywords and the product information pairwise to form search keyword and product information feature pairs;
Relevance gear determination unit, configured to calculate the relevance of each search keyword and product information feature pair, and to determine the relevance gear of each search keyword and product information feature pair according to the relevance calculation result;
Predicted click-through rate gear determination unit, configured to calculate the predicted click-through rate of each search keyword and product information feature pair, and to use quantiles to determine the predicted click-through rate gear corresponding to the predicted click-through rate of each search keyword and product information feature pair;
Matching determination unit, configured to determine the score of each search keyword and product information feature pair according to the relevance gear and the predicted click-through rate gear, the score characterizing the degree of match between the search keyword and the product information.
9. The device according to claim 8, characterized in that the predicted click-through rate gear determination unit includes a predicted click-through rate calculation subunit and a gear determination subunit, where the predicted click-through rate calculation subunit includes:
Model building subunit, configured to perform feature extraction on the search keyword and product information feature pair, and to obtain the feature weight corresponding to each feature from a trained model;
Calculation subunit, configured to calculate the predicted click-through rate using the extracted features and the feature weights corresponding to those features.
10. The device according to claim 9, characterized in that the features extracted by the model building subunit include one or any combination of the following: the text information of the search keyword, the category information of the search keyword, the title of the product information, the attributes of the product information, and the relevance between the search keyword and the product information.
11. The device according to claim 8, characterized in that the predicted click-through rate gear determination unit includes a predicted click-through rate calculation subunit and a gear determination subunit, where the gear determination subunit includes:
Proportionality coefficient determination subunit, configured to determine the proportionality coefficient corresponding to each gear of the predicted click-through rate gears;
Quantile determination subunit, configured to determine the values of the quantiles according to the proportionality coefficients;
Gear interval determination subunit, configured to determine the gear interval in which the predicted click-through rate falls according to the predicted click-through rate of each search keyword and product information feature pair and the values of the quantiles.
12. The device according to claim 11, characterized in that the quantiles are normal distribution quantiles.
13. The device according to claim 8, characterized in that the relevance gear determination unit includes:
Feature matching subunit, configured to perform matching judgments on multiple features of the search keyword and product information feature pair;
Determination subunit, configured to determine the relevance gear of the search keyword and product information feature pair according to the results of the matching judgments on the multiple features.
14. The device according to claim 13, characterized in that the matching judgments on multiple features performed by the feature matching subunit include at least one of a category feature matching judgment and a text feature matching judgment;
The category feature matching judgment judges whether the search keyword and the product information belong to the same category;
The text feature matching judgment judges whether the text content of the search keyword is associated with that of the product information.
CN201410838112.4A 2014-12-29 2014-12-29 Information matching processing method and apparatus Active CN105808541B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410838112.4A CN105808541B (en) 2014-12-29 2014-12-29 Information matching processing method and apparatus
PCT/CN2015/098247 WO2016107455A1 (en) 2014-12-29 2015-12-22 Information matching processing method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410838112.4A CN105808541B (en) 2014-12-29 2014-12-29 Information matching processing method and apparatus

Publications (2)

Publication Number Publication Date
CN105808541A true CN105808541A (en) 2016-07-27
CN105808541B CN105808541B (en) 2019-11-08

Family

ID=56284233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410838112.4A Active CN105808541B (en) 2014-12-29 2014-12-29 Information matching processing method and apparatus

Country Status (2)

Country Link
CN (1) CN105808541B (en)
WO (1) WO2016107455A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106649605A (en) * 2016-11-28 2017-05-10 百度在线网络技术(北京)有限公司 Triggering way and device of promoting key words
CN107767172A (en) * 2017-10-12 2018-03-06 百度在线网络技术(北京)有限公司 Information-pushing method, device, server and medium
CN110516033A (en) * 2018-05-04 2019-11-29 北京京东尚科信息技术有限公司 A kind of method and apparatus calculating user preference
CN110633398A (en) * 2018-05-31 2019-12-31 阿里巴巴集团控股有限公司 Method for confirming central word, searching method, device and storage medium
CN110909182A (en) * 2019-11-29 2020-03-24 北京达佳互联信息技术有限公司 Multimedia resource searching method and device, computer equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111047009B (en) * 2019-11-21 2023-05-23 腾讯科技(深圳)有限公司 Event trigger probability prediction model training method and event trigger probability prediction method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020138481A1 (en) * 2001-03-23 2002-09-26 International Business Machines Corporation Searching product catalogs
US20070016491A1 (en) * 2003-09-30 2007-01-18 Xuejun Wang Method and apparatus for search scoring
CN103514178A (en) * 2012-06-18 2014-01-15 阿里巴巴集团控股有限公司 Searching and sorting method and device based on click rate
CN104077306A (en) * 2013-03-28 2014-10-01 阿里巴巴集团控股有限公司 Search engine result sequencing method and search engine result sequencing system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103729365A (en) * 2012-10-12 2014-04-16 阿里巴巴集团控股有限公司 Searching method and system
CN103778548B (en) * 2012-10-19 2018-05-29 阿里巴巴集团控股有限公司 Merchandise news and key word matching method, merchandise news put-on method and device

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020138481A1 (en) * 2001-03-23 2002-09-26 International Business Machines Corporation Searching product catalogs
US20070016491A1 (en) * 2003-09-30 2007-01-18 Xuejun Wang Method and apparatus for search scoring
CN103678481A (en) * 2003-09-30 2014-03-26 雅虎公司 Method and apparatus for search scoring
CN103514178A (en) * 2012-06-18 2014-01-15 阿里巴巴集团控股有限公司 Searching and sorting method and device based on click rate
CN104077306A (en) * 2013-03-28 2014-10-01 阿里巴巴集团控股有限公司 Search engine result sequencing method and search engine result sequencing system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106649605A (en) * 2016-11-28 2017-05-10 百度在线网络技术(北京)有限公司 Triggering way and device of promoting key words
CN106649605B (en) * 2016-11-28 2020-09-29 百度在线网络技术(北京)有限公司 Method and device for triggering promotion keywords
CN107767172A (en) * 2017-10-12 2018-03-06 百度在线网络技术(北京)有限公司 Information-pushing method, device, server and medium
CN110516033A (en) * 2018-05-04 2019-11-29 北京京东尚科信息技术有限公司 A kind of method and apparatus calculating user preference
CN110633398A (en) * 2018-05-31 2019-12-31 阿里巴巴集团控股有限公司 Method for confirming central word, searching method, device and storage medium
CN110909182A (en) * 2019-11-29 2020-03-24 北京达佳互联信息技术有限公司 Multimedia resource searching method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
WO2016107455A1 (en) 2016-07-07
CN105808541B (en) 2019-11-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240226

Address after: # 01-21, Lai Zan Da Building 1, 51 Belarusian Road, Singapore

Patentee after: Alibaba Singapore Holdings Ltd.

Country or region after: Singapore

Address before: Fourth floor, Capital Building, P.O. Box 847, Grand Cayman, Cayman Islands

Patentee before: ALIBABA GROUP HOLDING Ltd.

Country or region before: Cayman Islands