CN103049474A

CN103049474A - Search query and document-related data translation

Info

Publication number: CN103049474A
Application number: CN2012104134805A
Authority: CN
Inventors: 高剑峰; 威廉·多兰; 克里斯托弗·布罗克特; 王正灏; 李玫; 黄学东
Original assignee: Microsoft Corp
Current assignee: Microsoft Technology Licensing LLC
Priority date: 2011-10-25
Filing date: 2012-10-25
Publication date: 2013-04-17

Abstract

The subject disclosure is directed towards developing a translation model for mapping search query terms to document-related data. By processing user logs comprising search histories into word-aligned query-document pairs, the translation model may be trained using data, such as probabilities, corresponding to the word-aligned query-document pairs. After the translation model is incorporated into model data for a search engine, the translation model may be used as a source of features for producing relevance scores for current search queries and for ranking documents/advertisements according to relevance.

Description

Search query and document-related data translation

Cross-reference to related applications

This application claims priority to U.S. Provisional Patent Application No. 61/551,363, filed October 25, 2011, and to U.S. Patent Application No. 13/328,924, filed December 16, 2011.

Background

Searching the Internet to locate relevant documents and advertisements can be challenging, because search queries and web documents/advertisements often use different language styles and vocabularies. Current Internet search technology suffers from a variety of related problems. Often, a query contains terms that are different from, but related to, the terms in the relevant documents, which leads to the well-known information retrieval problem called the lexical gap problem. Sometimes, when a query contains terms whose multiple meanings cause ambiguity, the search engine retrieves many documents that do not match the user's intent, which may be called the noisy proliferation problem. Both problems are more prevalent in Internet search because search queries and web documents are written by a wide variety of people in very different language styles.

Typical information retrieval methods developed by the research community, regardless of their state-of-the-art performance on benchmark datasets (e.g., Text Retrieval Conference (TREC) collections), are based on bag-of-words and exact term matching schemes and cannot handle these problems effectively. Some methods employ ad hoc measures that tend to make the noisy proliferation problem worse. Although several methods have been proposed for determining the relationships between terms in queries and terms in documents, most of these methods rely on inappropriate measures of similarity between the terms in queries and documents (such as cosine similarity). For example, in a paid search system it is desirable to locate documents (which may include advertisements) that are relevant to the search query and of potential interest to the user, so that the user is more likely to click on them. However, because of the lexical gap problem caused by language differences between document content and search queries, and/or the noisy proliferation problem, known techniques often return irrelevant documents.

Summary

This Summary is provided to introduce, in a simplified form, a selection of representative concepts that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used in any way that would limit the scope of the claimed subject matter.

Briefly, various aspects of the subject matter described herein are directed towards a translation model between documents and search queries, which are treated as sublanguages of a common language (e.g., English). In one aspect, developing a translation model for mapping search query terms to document-related data (such as advertisement descriptions) involves building a word-aligned training corpus comprising word-aligned query-document pairs. In one aspect, the training corpus may be generated from recorded search histories that include click events resulting from search queries. For each pair, the given search query may be assumed to translate into the clicked document title or advertisement description, because users are unlikely to select irrelevant documents or advertisements. After a word alignment between document-related words and query terms (e.g., a mapping, such as a one-to-one mapping, between query terms and document-related words/phrases) is determined for each query-document pair, translation probabilities between the specific document-related words and the corresponding query terms in the word alignment are estimated. These translation probabilities may be used by a search engine deployed to the Internet.

In another aspect, a training mechanism of the search engine may generate the word-aligned training corpus and identify query-advertisement bilingual phrases (i.e., bi-phrases). The training mechanism may compute phrase translation probabilities associated with the query-advertisement bi-phrases and generate phrase-based query translation probabilities for advertisements. These phrase-based query translation probabilities are provided to the search engine for ranking documents based on whether the search query can be generated or translated from data related to such documents. In another aspect, the search engine provider may use the phrase-based translation model to support advertisers with information about better keywords, suggested descriptions, and so forth.

Other advantages may become apparent from the following detailed description when taken in conjunction with the drawings.

Brief description of the drawings

The present invention is illustrated by way of example and is not limited to the accompanying figures, in which like reference numerals indicate similar elements and in which:

Fig. 1 is a block diagram illustrating an example system for search query and document-related data translation according to one example embodiment.

Fig. 2 is a block diagram illustrating an example pipeline for translation model training according to one example embodiment.

Fig. 3 is a block diagram illustrating an example run-time data flow for paid advertisement search according to one example embodiment.

Fig. 4 is a flow diagram illustrating example steps for developing a phrase-based translation model that maps search query terms to advertisement-related data according to one example embodiment.

Fig. 5 is a block diagram representing example non-limiting networked environments in which various embodiments described herein can be implemented.

Fig. 6 is a block diagram representing an example non-limiting computing system or operating environment in which one or more aspects of various embodiments described herein can be implemented.

Detailed description

Various aspects of the technology described herein are generally directed towards search query and document-related data translation. The document-related data may include advertisement landing pages, advertisement descriptions, document titles, and the like. After a translation model that captures the semantic similarity between search query portions and document portions is generated, with or without alignment templates, the translation model may be incorporated into the model data of a search engine. When the search engine is deployed, the translation model may serve as a source of feature information when a search query is mapped to one or more relevant documents based on whether the search query translates from the document-related data.

It should be understood that any of the examples herein are non-limiting. As such, the present invention is not limited to any particular embodiments, aspects, concepts, structures, functionalities or examples described herein. Rather, any of the embodiments, aspects, concepts, structures, functionalities or examples described herein are non-limiting, and the present invention may be used in various ways that provide benefits and advantages in computing and search in general.

Fig. 1 is a block diagram illustrating an example system for document and search query translation according to one example embodiment. Components of the example system may include usage data 102, a training mechanism 104, model data 106, a search engine provider 108 and an example user 110. It is appreciated that the example user 110 represents any user in a population of search engine users. When the example user 110 submits a search query via a local computing device, an example search engine uses various models from the model data 106 to respond with search results, as described herein. After the usage data 102 has accumulated over a period of time, the training mechanism 104 analyzes the usage data 102 and generates one or more models, which are subsequently deployed to the search engine provider 108 as an update to the model data 106. Learning how to combine multiple models to identify relevant documents may be performed offline.

According to one embodiment, the usage data 102 may include aggregated search histories that are associated with a plurality of search engine users and collected over a specific time period (e.g., one year). The usage data 102 may include recorded search queries, the associated search results, and click events resulting from the search queries, which correspond to documents (including advertisements) having uniform resource locators (URLs). The usage data 102 may also include document-related data, such as document titles and/or advertisement keywords and descriptions.

The training mechanism 104 may utilize various data, such as alignment templates 112 and/or a word-aligned training corpus 114, to compute translation probabilities between the search query sublanguage and the document/advertisement sublanguage. It is appreciated that although example embodiments of these translation probabilities involve a common language (such as English), each probability refers to the lexical gap between different words or phrases that often occurs in information retrieval systems. A search query term may map to different terms having the same or a similar meaning and/or to multiple meanings conveyed in various documents/advertisements.

For example, in response to a search query for "jogging shoes", an example search engine may fail to identify an advertisement containing the phrase "running shoes" as relevant or, alternatively, may classify the advertisement as having low relevance, even though the two phrases share a semantic relationship. To bridge such a lexical gap, corresponding translation probabilities capture the semantic relationship or similarity between the two phrases. In one embodiment, the corresponding translation probabilities include values representing the probability that the phrase "running shoes" translates from "jogging shoes" and the probability that the phrase "jogging shoes" translates from "running shoes", and thereby indicate how relevant the advertisement is to the search query.

To determine whether words or phrases share a semantic relationship, according to one example embodiment, the training mechanism 104 extracts search query terms and the document-related data associated with click events. After building word alignments, the training mechanism 104 converts the extracted data into the word-aligned training corpus 114, which comprises word-aligned query-document pairs of words or phrases from the search query terms and/or the document-related data. In one embodiment, the training mechanism 104 may use the word-aligned training corpus 114 to produce the alignment templates 112, which may include generalized versions of these words or phrases.

The alignment templates 112 may provide alternative word alignments that use generic word classes (e.g., groupings of words that share a semantic relationship) rather than actual words. In one embodiment, one or more features (functions) associated with the example search engine may use the alignment templates 112 to rank documents/advertisements in response to a search query. Each feature may segment the search query into a subset of the alignment templates 112 that map search query terms to document-related data (such as document/advertisement keywords), and produce a value (such as a relevance score or a vector of relevance scores) that is combined with other values (e.g., as a weighted average) to form feature information. It is appreciated that many other features may be employed to compute the relevance score of a particular document/advertisement, such as linguistic structure (e.g., a value related to the well-formedness of the advertisement title/description), or the number or ordering of the alignment template subsets.

In one embodiment, the training mechanism 104 may build the translation model 116 by generating, based on the word-aligned training corpus 114, mapping information between the previously recorded search query terms and the document-related data in the usage data 102. The mapping information may include various probabilities fitted to the word-aligned training corpus 114, such as query mapping probabilities in addition to word-based translation probabilities and/or phrase-based translation probabilities. The training mechanism 104 may employ expectation-maximization techniques to converge (e.g., train) the word-based or phrase-based translation probabilities so that they substantially fit the query-document pairs and maximize the query mapping probability of each pair. A query translation probability may represent the conditional probability of generating a search query from one or more portions of a given document (such as an advertisement description or a document title). As described herein, the example search engine may use the query translation probability as the likelihood of a correct translation or mapping between a pending search query and a potential search result.

In one example embodiment, the training mechanism 104 may incorporate the translation model 116 into the model data 106 for use by the example search engine. For example, the training mechanism 104 may combine a word-based translation model with a language model (such as a unigram language model) via interpolation (e.g., linear or log-linear interpolation). It is appreciated that the translation model 116 may be combined with any n-gram language model (such as a bigram, trigram or higher-order model). As another example, the training mechanism 104 may incorporate the translation model 116 into a (linear or non-linear) ranking model framework in which phrase-based and/or word-based translation models produce various features for ranking documents/advertisements in response to a search query, as described herein. A linear ranking model framework may also use other models for different features. Alternatively, the training mechanism 104 may store the translation model 116 in the model data 106 for direct use in ranking documents/advertisements (e.g., without being combined with other models).

After the training mechanism 104 incorporates the translation model 116 into the model data 106, an example search engine (such as the search engine 118) may use the translation probabilities to help map search queries to documents. To produce a listing of search results that are likely to be relevant and useful in response to a current search query, the search engine 118 employs various mechanisms (such as a relevance mechanism 120 and/or a prediction mechanism 122) to identify and appropriately rank a set of documents, such as advertisements.

In one embodiment, the relevance mechanism 120 may filter the set of documents using various feature information 124, which may be produced using the model data 106. For example, the relevance mechanism 120 may compute relevance scores/values based on the translation probabilities provided by the translation model 116. The documents having the highest translation probabilities for the current search query may also have the highest likelihood of being relevant. The relevance mechanism 120 may compare these scores with ranking data 126 and remove documents that fall below a threshold.

The prediction mechanism 122 may also use the feature information 124 to determine a click prediction score (such as a click-through rate) for each remaining document. For example, the prediction mechanism 122 may assign the highest click-through rate to the document (such as an advertisement) that has the highest posterior probability of being clicked given the current search query, and/or the highest likelihood of relevance as provided by the translation model 116. As another example, the highest click-through rate may depend on various other features, such as the position of the document on the search results page and the readability of the document-related data (e.g., the advertisement title/description). The prediction mechanism 122 may employ a neural network classifier that integrates a large number of features to predict how likely an advertisement is to be clicked if it is shown on the search results page. The set of documents having click-through rates that exceed a predefined threshold is stored in the ranking data 126 and ultimately presented to the user 110.

In one example embodiment, the search engine provider 108 may also provide one or more software components/tools (such as a suggestion mechanism 128) to assist advertisers in developing advertisements that lead to higher click-through rates. In one example embodiment, the suggestion mechanism 128 may produce a strategy 130 for improving advertisement revenue, which includes one or more keywords/phrases to use in a description or title to improve ranking. In another example embodiment, the strategy 130 may also include one or more search query terms/keywords (e.g., constituting all or part of a search query) on which to bid in order to achieve higher traction for the advertiser's web page.

In another example embodiment, the suggestion mechanism 128 may generate, based on the model data 106 that includes the translation model 116, a metadata stream 132 for an advertisement that comprises translated words and/or phrases. For example, the metadata stream 132 may include landing page information (e.g., URL or title), translated keywords, the advertisement title/description and/or other metadata. The search engine provider 108 may append the metadata stream 132 to the current metadata that accompanies the advertisement. An example format of the metadata stream 132 is shown below:

Advertiser landing page URL/title
Advertisement title
Advertisement description
Translated keywords

Fig. 2 is a block diagram illustrating an example pipeline for translation model training according to one example embodiment. Elements (e.g., steps or processes) of the example pipeline may begin at element 202, at which query-advertisement pairs are extracted from user logs comprising search histories (e.g., advertisement clicks resulting from search queries). It is appreciated that although Fig. 2 shows elements for document and search query translation, advertisement and search query translation may be performed in the same or a similar manner. Accordingly, a training mechanism (such as the training mechanism 104 of Fig. 1) may perform at least some of the elements of the example pipeline.

Element 204 refers to training a word alignment model and/or applying the word alignment model to the query-document pairs. Assuming that the document-related data translates into the search query, the word alignment model generally refers to the joint likelihood of a set of model parameters and a set of search query terms given the document-related data. The set of model parameters may include an arrangement (a_1 ... a_J) of words from the document-related data (such as a document title), which maps to the indices of the search query term positions (1 ... J). This arrangement, referred to herein as a word alignment, may be expressed as a numerical series in which each a_j has a value i between 0 and I (e.g., the length of the document-related data, such as a document title or keywords/tags), such that a_j = i if the word at position j of the search query is connected to the word at position i of the document title, and a_j = 0 if it is not connected to any document word.
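As a non-limiting illustration, the word alignment representation described above may be sketched in Python as follows. The function name `alignment_links`, the `<NULL>` marker and the sample query and title are hypothetical and are introduced here only for illustration; they do not appear in the embodiments described herein.

```python
from typing import List, Tuple

def alignment_links(query: List[str], title: List[str], a: List[int]) -> List[Tuple[str, str]]:
    """Expand an alignment a_1..a_J into (query term, title word) links.

    a[j-1] = i with 1 <= i <= I links query position j to title position i;
    a[j-1] = 0 means the query term is not connected to any title word.
    """
    if len(a) != len(query):
        raise ValueError("the alignment needs exactly one entry per query term")
    links = []
    for j, i in enumerate(a, start=1):
        if not 0 <= i <= len(title):
            raise ValueError(f"a_{j} = {i} is outside the range 0..{len(title)}")
        # "<NULL>" is an assumed marker for an unaligned query term.
        links.append((query[j - 1], title[i - 1] if i > 0 else "<NULL>"))
    return links

query = ["cheap", "jogging", "shoes"]   # J = 3
title = ["running", "shoes", "sale"]    # I = 3
links = alignment_links(query, title, [0, 1, 2])
```

In this sketch the first query term is left unaligned (a_1 = 0), while the second and third query terms are connected to the first and second title words, respectively.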

The word alignment model may be based on dependencies between document words and search query terms. In one embodiment, the word alignment model may assume that every position in the word sequence is equally likely to be assigned to a corresponding word in the search query, or may compute a conditional probability for each document title position. For example, the first word in a document title may have a higher probability of mapping to a search query term than any other word position. Word alignments can provide information beyond the co-occurrence counts between two words/phrases. For example, translation probabilities estimated using word alignments may account for distortion, or consistency, with respect to the position of a word/phrase in the search query that maps to another word/phrase in the document title.

The training mechanism may employ various techniques (e.g., expectation maximization and variants thereof) for generating word alignments. Some of these techniques (e.g., the Viterbi technique/algorithm) may remove some "hidden" words that do not translate into the other sublanguage and/or enable one-to-one mappings between query terms and document title words. In one example embodiment, the training mechanism computes the most likely word sequence for each query-advertisement bilingual word or phrase (i.e., bi-phrase), where a query-advertisement bi-phrase is a consecutive word or phrase that can be translated as a unit from one sublanguage into the other. These word sequences may enable the training mechanism to focus on the refined keywords that form an advertisement, under the assumption that the search query is generated or translated from these keywords.

Element 206 is directed towards extracting word/phrase pairs. Each pair (q, w) comprises one or more search query terms (q) and one or more document-related words (w), such as words in an advertisement title or description. Element 208 refers to computing a translation probability p(q|w) and a translation probability p(w|q) based on the word alignments. In one example embodiment, the translation probability p(q|w) represents the conditional probability (e.g., likelihood) that a particular term q translates from a given word w. In another example embodiment, the translation probability p(w|q) represents the conditional probability (e.g., posterior probability) that a particular word w translates from a given term q.

The word translation probabilities P(q|w) may be obtained using training data derived from user logs (e.g., query-document pairs denoted {(Q_i, D_i), i = 1 ... N}). The training method may follow the standard procedure for training statistical word alignment models. In one embodiment, the model parameters θ are optimized by maximizing the translation probability of generating the queries from the titles in the training data:

\theta^{*} = \arg\max_{\theta} \prod_{i=1}^{N} P(Q_{i} \mid D_{i}, \theta) \qquad (1)

P(Q|D, θ) takes the form of the following well-known word alignment model, where ε is a constant, J is the length of Q, and I is the length of the document-related data D:

P(Q \mid D, \theta) = \frac{\epsilon}{(I + 1)^{J}} \prod_{q \in Q} \sum_{w \in D} P(q \mid w, \theta) \qquad (2)

To find the optimal word translation probabilities, the expectation-maximization (EM) algorithm is used, for example run for only a small number of iterations (e.g., three) over the training data as a means of avoiding over-fitting. An alternative is to decompose P(Q|A), for advertisement-related data A, at the phrase level and train a phrase-based translation model as described herein.
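As a non-limiting illustration, the EM training of the word translation probabilities may be sketched in Python as shown below. The sketch assumes the standard procedure for this family of word alignment models; the function name `train_word_translation`, the `<NULL>` empty word (an assumed reading of the I + 1 term in equation (2)) and the toy query-title pairs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[List[str], List[str]]  # (query terms, document title words)
NULL = "<NULL>"  # assumed empty word, one reading of the (I + 1) term in equation (2)

def train_word_translation(pairs: List[Pair], iterations: int = 3) -> Dict[Tuple[str, str], float]:
    """Estimate P(q | w) by EM; the result is keyed by (q, w)."""
    query_vocab = {q for query, _ in pairs for q in query}
    # Start from a uniform distribution over the query vocabulary.
    t = defaultdict(lambda: 1.0 / len(query_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected number of (q, w) links
        total = defaultdict(float)  # expected number of links involving w
        for query, title in pairs:
            words = [NULL] + title
            for q in query:
                # E-step: share one unit of count for q among the title words.
                norm = sum(t[(q, w)] for w in words)
                for w in words:
                    frac = t[(q, w)] / norm
                    count[(q, w)] += frac
                    total[w] += frac
        # M-step: renormalise the expected counts into P(q | w).
        t = defaultdict(float, {(q, w): c / total[w] for (q, w), c in count.items()})
    return dict(t)

pairs = [
    (["jogging", "shoes"], ["running", "shoes"]),
    (["shoes"], ["shoes"]),
    (["jogging"], ["running"]),
]
t = train_word_translation(pairs, iterations=3)
```

With these toy pairs, the probability mass for the title word "running" shifts towards the query term "jogging" as the iterations proceed, which is the kind of lexical gap the translation probabilities are intended to capture. The default of three iterations mirrors the example given above.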

Element 210 refers to storing the learned translation probabilities in a set of translation models. The models capture, at the word, n-gram and phrase levels, how likely a search query is to map to a document or a document is to map to a search query. Let Q denote a search query and D denote a particular description of a document (e.g., the title of a web page or of an advertisement landing page). As described herein, for each (Q, D) pair, D may be assumed to be relevant to Q if one or more users who entered Q also clicked D. An example translation model may provide translation probabilities (such as P(Q|D) and P(D|Q)) for any (Q, D) pair or, in particular, translation probabilities (e.g., P(Q|A) and P(A|Q)) for any (Q, A) pair in which A denotes advertisement-related data (e.g., the title of an advertisement landing page). Various techniques may be used to decompose and reliably estimate these translation probabilities. As an example, equation (3) illustrates how P(Q|D) may be computed, and the translation model trained, using a parameter estimation technique.

Let Q = q_1 ... q_J be a query and let D = w_1 ... w_I be the title or description of a web document or advertisement page (e.g., a landing document). The word-based translation model assumes that Q and D are bags of words, and the translation probability of Q given D is computed as:

P(Q \mid D) = \prod_{q \in Q} \sum_{w \in D} P(q \mid w) P(w \mid D) \qquad (3)

Here, P(w|D) is the unigram probability of word w in D, and P(q|w) is the probability of translating w into query term q. In general, the translation model allows w to be translated into other semantically related query terms by assigning nonzero probabilities to those terms.
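As a non-limiting illustration, equation (3) may be evaluated as in the following Python sketch. The function name `query_given_doc` and the toy translation table are hypothetical; the table stores P(q|w) keyed by (q, w), and P(w|D) is taken as the relative frequency of w in D.

```python
from collections import Counter
from typing import Dict, List, Tuple

def query_given_doc(query: List[str], doc: List[str],
                    t: Dict[Tuple[str, str], float]) -> float:
    """P(Q | D) = prod over q of sum over w of P(q | w) * P(w | D), per equation (3)."""
    if not doc:
        return 0.0  # assumed convention for an empty title or description
    counts = Counter(doc)
    prob = 1.0
    for q in query:
        # c / len(doc) is the unigram probability P(w | D).
        prob *= sum(t.get((q, w), 0.0) * c / len(doc) for w, c in counts.items())
    return prob

# Hypothetical translation probabilities P(q | w), keyed by (q, w).
t = {("jogging", "running"): 0.6, ("running", "running"): 0.4, ("shoes", "shoes"): 1.0}
score = query_given_doc(["jogging", "shoes"], ["running", "shoes"], t)
```

Because "running" is allowed to translate into "jogging" with nonzero probability, the title receives a nonzero score even though it never contains the query term "jogging".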

Turning to document ranking, the word-based translation model of equation (3) may be smoothed before it is applied. One suitable smoothed model is defined as:

P(Q \mid D) = \prod_{q \in Q} P_{s}(q \mid D) \qquad (4)

Here, P_s(q|D) is a linear interpolation of a background model and the word-based translation model, where α ∈ [0, 1] is an empirically tuned interpolation weight:

P_{s}(q \mid D) = \alpha P(q \mid C) + (1 - \alpha) \sum_{w \in D} P(q \mid w) P(w \mid D) \qquad (5)

P(q|w) is the word-based translation model, which may be estimated using equation (1) or equation (2). P(q|C) and P(w|D) denote the unsmoothed background model and document model, respectively, and are estimated by maximum likelihood in the following equations:

P(q \mid C) = \frac{C(q; C)}{|C|} \qquad (6)

P(w \mid D) = \frac{C(w; D)}{|D|} \qquad (7)

C(q; C) and C(w; D) are the counts of q in the set C of (q, w) pairs and of w in the document D, respectively, and |C| and |D| are the size of the set and the size of the document, respectively. In one embodiment, because the underlying language is the same even though the search query and the document are associated with different sublanguages, each word/phrase has some probability of self-translation (i.e., P(q = w | w) > 0). On the one hand, low self-translation probabilities degrade retrieval performance by giving low weights to matching terms. On the other hand, very high self-translation probabilities fail to exploit the advantage of the translation model. According to one embodiment, equation (5) is modified into equation (8) to adjust the self-translation probabilities by explicitly mixing, in a linear fashion, the translation-based estimate with the maximum likelihood estimate:

P_{s}(q \mid D) = \alpha P(q \mid C) + (1 - \alpha) P_{mx}(q \mid D), \quad \text{where} \qquad (8)

P_{mx}(q \mid D) = \beta P(q \mid D) + (1 - \beta) \sum_{w \in D} P(q \mid w) P(w \mid D) \qquad (9)

In the above equations, β ∈ [0, 1] is a tuning parameter that indicates the degree to which the self-translation probability is adjusted. Setting β = 1 in equation (9) reduces the translation model to a unigram language model with Jelinek-Mercer smoothing. P(q|D) in equation (9) is the unsmoothed document model estimated by equation (7), so that P(q|D) = 0 for q ∉ D.
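As a non-limiting illustration, the smoothed ranking score of equations (4), (8) and (9) may be sketched in Python as follows. The function name `smoothed_log_prob`, the default weights `alpha = 0.3` and `beta = 0.5`, and the toy probability tables are assumptions made only for this sketch; the embodiments described above state only that the weights are tuned empirically. The score is accumulated in log space to avoid underflow for long queries.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

def smoothed_log_prob(query: List[str], doc: List[str],
                      t: Dict[Tuple[str, str], float],
                      background: Dict[str, float],
                      alpha: float = 0.3, beta: float = 0.5) -> float:
    """log P(Q | D) following equations (4), (8) and (9); doc must be non-empty."""
    counts = Counter(doc)
    log_p = 0.0
    for q in query:
        p_ml = counts[q] / len(doc)  # equation (7): zero when q is not in D
        p_tr = sum(t.get((q, w), 0.0) * c / len(doc) for w, c in counts.items())
        p_mx = beta * p_ml + (1.0 - beta) * p_tr                       # equation (9)
        p_s = alpha * background.get(q, 0.0) + (1.0 - alpha) * p_mx    # equation (8)
        log_p += math.log(p_s) if p_s > 0.0 else float("-inf")
    return log_p

# Hypothetical model tables: P(q | w) keyed by (q, w), and P(q | C) keyed by q.
t = {("jogging", "running"): 0.6, ("running", "running"): 0.4, ("shoes", "shoes"): 1.0}
background = {"jogging": 0.01, "shoes": 0.02}
log_p = smoothed_log_prob(["jogging", "shoes"], ["running", "shoes"], t, background)
```

Setting `beta = 0.0` in this sketch recovers equation (5), and setting `beta = 1.0` ignores the translation table entirely, which corresponds to the Jelinek-Mercer smoothed unigram model mentioned above.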

Fig. 3 is a block diagram illustrating an example run-time data flow for paid advertisement search according to one example embodiment. The processing performed during the example run-time data flow begins with search query parsing and enrichment process 302. As illustrated, the search query is segmented into a set of terms Q = {q_1 ... q_J} and enriched into Q'. For example, the enriched search query Q' may include additional/intermediate search terms and/or target categories. The enriched search query is passed to advertisement selection process 304, which identifies a set of advertisements that map to one or more target categories and/or to a portion of the set of terms.

In one embodiment, based on the translation model, relevance filtering process 306 may reduce the set of advertisements to a subset of relevant advertisements having translation probabilities that exceed a predefined threshold. The relevance filtering process 306 may use a word-based translation model to translate each advertisement keyword into query terms independently, or may use a phrase-based translation model to translate sequences of words from advertisement keywords into query terms. In another embodiment, the relevance filtering process 306 may also use other features to reduce the subset of relevant advertisements. For each advertisement, the values of these features may be combined into a relevance score for relevant advertisement refinement.
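As a non-limiting illustration, threshold-based relevance filtering with a word-based translation model may be sketched in Python as follows. The function names `translation_score` and `filter_relevant`, the threshold value and the toy advertisements are hypothetical, and the sketch uses the unsmoothed word-based score as the only feature.

```python
from collections import Counter
from typing import Dict, List, Tuple

Table = Dict[Tuple[str, str], float]  # hypothetical store of P(q | w), keyed by (q, w)

def translation_score(query: List[str], ad_words: List[str], t: Table) -> float:
    """Word-based P(Q | A): product over q of sum over w of P(q | w) * P(w | A)."""
    if not ad_words:
        return 0.0
    counts = Counter(ad_words)
    score = 1.0
    for q in query:
        score *= sum(t.get((q, w), 0.0) * c / len(ad_words) for w, c in counts.items())
    return score

def filter_relevant(query: List[str], ads: List[List[str]],
                    t: Table, threshold: float) -> List[List[str]]:
    """Keep only the advertisements whose translation probability exceeds the threshold."""
    return [ad for ad in ads if translation_score(query, ad, t) > threshold]

t = {("jogging", "running"): 0.6, ("running", "running"): 0.4, ("shoes", "shoes"): 1.0}
ads = [["running", "shoes"], ["garden", "hose"]]  # advertisement keywords as word lists
relevant = filter_relevant(["jogging", "shoes"], ads, t, threshold=0.05)
```

A deployed filter would likely use a smoothed score and combine it with other features, as noted above; the single-feature form is used here only to keep the sketch short.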

In one embodiment, click-through rate prediction process 308 may also use the translation model to compute a probability/value indicating how likely a user is to select/click a particular relevant advertisement. Based on the relevance scores and the predicted click-through rates, ranking and allocation process 310 ranks the subset of relevant advertisements and produces a search results page comprising the subset of relevant advertisements in ranked order.

Fig. 4 is a flow diagram showing exemplary steps for deploying a phrase-based translation model that maps advertisements to search queries according to an illustrative embodiment. The exemplary steps begin at step 402 and proceed to step 404, where a word-aligned training corpus is produced. The search engine instructs the training mechanism to produce word alignments between each word of a search query and the corresponding words in the advertisement-related data (such as a description or title). In one embodiment, a word alignment can refer to a mapping between words in different sublanguages that translate to each other. The training mechanism can use a predetermined word alignment model, or can train a word alignment model using recorded search engine usage data (e.g., click-through data).

In another embodiment, the word alignment can indicate, for each contiguous phrase in the search query (Q), the corresponding phrase in the advertisement title (A) from which that phrase was derived, and vice versa. First, the training mechanism learns two word-based translation models by running expectation-maximization training of a word alignment model over query-advertisement (title) pairs in both directions: a first word-based translation model from search query to advertisement title, and a second word-based translation model from advertisement title to search query. Based on the word alignment model between each search query and each advertisement title (e.g., a "hidden" word alignment), the training mechanism determines the Viterbi word alignment V* = v_1 ... v_J in each direction, where query term position j maps to word position v_j of the advertisement title (A), according to equations (10) to (12):

V^{*} = \arg\max_{V} P(Q, V \mid A) \quad (10)

= \arg\max_{V} \left\{ P(J \mid I) \prod_{j=1}^{J} P(q_{j} \mid w_{v_{j}}) \right\} \quad (11)

= \left[ \arg\max_{v_{j}} P(q_{j} \mid w_{v_{j}}) \right]_{j=1}^{J} \quad (12)

The Viterbi word alignment generally refers to the word alignment for which P(Q, V|A) is maximal. To compute the Viterbi word alignment, for each j the training mechanism selects the v_j that makes the word translation probability P(q_j | w_{v_j}) as large as possible. In one embodiment, the two Viterbi word alignments are combined as follows: starting from the intersection of the two Viterbi word alignments, more alignment links are added progressively according to a set of known heuristic rules.
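A minimal Python sketch of this step follows, assuming word translation probabilities are already available as lookup tables. The names `viterbi_alignment` and `intersect` and all probability values are hypothetical; only the per-position argmax of equation (12) and the intersection starting point are shown, and the heuristic growing rules are not modeled.

```python
from typing import Dict, List, Set, Tuple

# (target_word, source_word) -> P(target_word | source_word); values are made up.
WordTable = Dict[Tuple[str, str], float]


def viterbi_alignment(target: List[str], source: List[str], table: WordTable) -> List[int]:
    """Equation (12): align each target position to the best-scoring source position."""
    return [
        max(range(len(source)), key=lambda i: table.get((t, source[i]), 0.0))
        for t in target
    ]


def intersect(query_to_title: List[int], title_to_query: List[int]) -> Set[Tuple[int, int]]:
    """Keep only the (query position, title position) links found in both directions."""
    forward = set(enumerate(query_to_title))
    backward = {(j, i) for i, j in enumerate(title_to_query)}
    return forward & backward


query = ["stuffy", "nose", "remedies"]
title = ["cold", "remedies"]
p_query_given_title = {("stuffy", "cold"): 0.2, ("nose", "cold"): 0.3,
                       ("remedies", "remedies"): 0.6, ("remedies", "cold"): 0.05}
p_title_given_query = {("cold", "stuffy"): 0.1, ("cold", "nose"): 0.4,
                       ("remedies", "remedies"): 0.7}

forward = viterbi_alignment(query, title, p_query_given_title)   # one v_j per query term
backward = viterbi_alignment(title, query, p_title_given_query)  # one link per title word
links = intersect(forward, backward)
```

The intersection keeps only high-precision links; a heuristic procedure of the kind mentioned above would then add further links from the union of the two directions.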

Step 406 is directed to extracting bi-phrases and estimating phrase translation probabilities. In one illustrative embodiment, the bi-phrases comprise bilingual phrases that are consistent with the combined word alignment and are selected using the set of known heuristic rules. For example, the training mechanism can set a maximum phrase length.

As described herein, the phrase-based translation model can be a generative model that translates advertisement-related data (A) into a search query (Q). Instead of translating single words in isolation, as in the word-based translation model, the phrase model translates sequences of words (i.e., phrases) in A into sequences of words in Q, thereby incorporating contextual information. For example, the model may learn that the phrase "stuffy nose" can be translated from "cold" with relatively high probability, even though neither of the single-word pairs (i.e., "stuffy"/"cold" and "nose"/"cold") has a high word translation probability.

In one embodiment, the advertisement landing page description (A) is segmented into K non-empty word sequences w_1, ..., w_K, each of which is then translated into a new non-empty word sequence q_1, ..., q_K, and these phrases are permuted and concatenated to form the query Q. The variables w and q denote contiguous word sequences.

Table 1 shows the generative process for an exemplary search query Q:

Text	Variable/step
"cold home remedies"	Advertisement (A)
["cold", "home remedies"]	Segmentation (S)
["stuffy nose", "home remedies"]	Translation (T)
(1→2, 2→1)	Permutation (M)
"home remedies stuffy nose"	Search query (Q)

Table 1
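The generative story in Table 1 can be traced with a short Python sketch. The strings are illustrative English phrases standing in for the table entries, and `generate_query` and `phrase_table` are hypothetical names; a real model would choose translations probabilistically rather than from a fixed dictionary.

```python
from typing import Dict, List


def generate_query(segments: List[str], phrase_table: Dict[str, str],
                   permutation: Dict[int, int]) -> str:
    """Translate each segment (T), reorder the phrases (M), and join them into Q."""
    translated = [phrase_table[w] for w in segments]
    reordered = [""] * len(translated)
    for source_pos, target_pos in permutation.items():  # positions are 1-based
        reordered[target_pos - 1] = translated[source_pos - 1]
    return " ".join(reordered)


segments = ["cold", "home remedies"]                                       # S
phrase_table = {"cold": "stuffy nose", "home remedies": "home remedies"}  # hypothetical
query = generate_query(segments, phrase_table, {1: 2, 2: 1})              # M = (1→2, 2→1)
```

The permutation moves the first translated phrase to the second position and the second to the first, which yields the reordered query.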

Let S denote the segmentation of A into K phrases w_1, ..., w_K, and let T denote the K translated phrases q_1, ..., q_K, where each pair (w_i, q_i) is called a bi-phrase. Let M denote the permutation of K elements that represents the final reordering step. Let B(A, Q) denote the set of (S, T, M) triples that translate A into Q. Assuming a uniform probability distribution over segmentations, the phrase-based translation probability can be defined as:

P(Q \mid A) \propto \sum_{(S, T, M) \in B(A, Q)} P(T \mid A, S) \cdot P(M \mid A, S, T) \quad (13)

Applying the maximum approximation to this sum yields the following equation:

P(Q \mid A) \approx \max_{(S, T, M) \in B(A, Q)} P(T \mid A, S) \cdot P(M \mid A, S, T) \quad (14)

Given the Viterbi word alignment V*, when scoring a given query-advertisement pair from the word-aligned training corpus, or during deployment by the search engine, the training mechanism considers only the (S, T, M) triples that are consistent with V*, denoted B(A, Q, V*). In one embodiment, consistency means that if two words are aligned in V*, then these words appear in the same bi-phrase (w_i, q_i). Once the word alignment is fixed, the final permutation is uniquely determined, so that factor can be dropped and equation (14) can be rewritten as:

P(Q \mid A) \approx \max_{(S, T, M) \in B(A, Q, V^{*})} P(T \mid A, S) \quad (15)

For the remaining factor P(T|A, S), assume that the segmented query T = q_1 ... q_K is generated from left to right by translating each phrase w_1 ... w_K independently, as in the following equation, where P(q_k | w_k) is the phrase translation probability:

P(T \mid A, S) = \prod_{k=1}^{K} P(q_{k} \mid w_{k}) \quad (16)

The phrase-based query translation probability P(Q|A) defined by equations (10) to (16) can be computed efficiently using dynamic programming. Let the quantity α_j be the total probability of all sequences of query phrases that cover the first j query terms. P(Q|A) can then be computed using the following recursion:

Initialization: \alpha_{0} = 1 \quad (17)

Induction:

\alpha_{j} = \sum_{j' < j,\; q = q_{j'+1} \ldots q_{j}} \alpha_{j'} \, P(q \mid w_{q}) \quad (18)

Total: P(Q \mid A) = \alpha_{J} \quad (19)
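The recursion of equations (17) to (19) can be sketched in Python as follows. This is a minimal illustration, not the described implementation: `phrase_prob` is a hypothetical lookup that already holds P(q | w_q) for each query phrase consistent with the alignment to a given advertisement, `max_len` is an assumed maximum phrase length, and the probability values are made up.

```python
from typing import Dict, List, Tuple


def phrase_translation_prob(query: List[str],
                            phrase_prob: Dict[Tuple[str, ...], float],
                            max_len: int = 3) -> float:
    """Compute P(Q|A) by summing over all segmentations of the query into phrases."""
    J = len(query)
    alpha = [0.0] * (J + 1)
    alpha[0] = 1.0                                    # equation (17)
    for j in range(1, J + 1):                         # equation (18)
        for j_prev in range(max(0, j - max_len), j):
            q = tuple(query[j_prev:j])                # phrase q_{j'+1} ... q_j
            alpha[j] += alpha[j_prev] * phrase_prob.get(q, 0.0)
    return alpha[J]                                   # equation (19)


query = ["stuffy", "nose", "remedies"]
phrase_prob = {("stuffy",): 0.1, ("nose",): 0.2,
               ("stuffy", "nose"): 0.3, ("remedies",): 0.5}
p = phrase_translation_prob(query, phrase_prob)
```

With these values two segmentations contribute: "stuffy nose" + "remedies" (0.3 × 0.5) and "stuffy" + "nose" + "remedies" (0.1 × 0.2 × 0.5). Replacing the `+=` accumulation with a `max` gives the best single segmentation instead of the total.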

Given the collected bi-phrases, the phrase translation probability P(q | w_q) is estimated using relative counts, where N(w, q) is the number of times w is aligned to q in the training data:

P(q \mid w_{q}) = \frac{N(w, q)}{N(w)} \quad (20)
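A minimal Python sketch of the relative-count estimate in equation (20) is shown below. It assumes that N(w) is the total number of extracted bi-phrases whose title side is w, which the source does not state explicitly; the function name and the bi-phrase list are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def estimate_phrase_probs(bi_phrases: Iterable[Tuple[str, str]]) -> Dict[Tuple[str, str], float]:
    """Estimate P(q | w) = N(w, q) / N(w) from (title phrase w, query phrase q) pairs."""
    pairs = list(bi_phrases)
    pair_counts = Counter(pairs)              # N(w, q)
    w_counts = Counter(w for w, _ in pairs)   # N(w), assumed to be the count of w over all pairs
    return {(w, q): n / w_counts[w] for (w, q), n in pair_counts.items()}


# Hypothetical bi-phrases extracted from a word-aligned corpus.
extracted = [("cold", "stuffy nose"), ("cold", "stuffy nose"),
             ("cold", "cold"), ("home remedies", "home remedies")]
probs = estimate_phrase_probs(extracted)
```

Under this assumption the estimates for a fixed title phrase sum to one over its observed query phrases.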

As an alternative to equation (20), the training mechanism can estimate a quantity called the lexical weight as a smoothed version of the phrase translation probability. Let P(q|w) be the word translation probability described herein for the word-based translation model (e.g., equations (1) to (9)), and let V be the word alignment (e.g., a "hidden" word alignment) between the query term positions i = 1...|q| and the title positions j = 1...|w|. The lexical weight, denoted P_w(q | w, V), can then be computed using the following equation:

P_{w}(q \mid w, V) = \prod_{i=1}^{|q|} \frac{1}{|\{ j \mid (j, i) \in V \}|} \sum_{\forall (i, j) \in V} P(q_{i} \mid w_{j}) \quad (21)
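Equation (21) can be sketched in Python as follows. Alignment links are represented here as zero-based (query position, title position) pairs, the word probabilities are made up, and query words with no link are skipped; that last choice is an assumption, since the equation does not cover the unaligned case.

```python
from typing import Dict, List, Set, Tuple


def lexical_weight(q: List[str], w: List[str], V: Set[Tuple[int, int]],
                   word_prob: Dict[Tuple[str, str], float]) -> float:
    """For each query word, average P(q_i | w_j) over its links, then multiply the averages."""
    weight = 1.0
    for i, q_word in enumerate(q):
        linked = [j for (qi, j) in V if qi == i]
        if not linked:
            continue  # assumption: an unaligned query word contributes a factor of 1
        weight *= sum(word_prob.get((q_word, w[j]), 0.0) for j in linked) / len(linked)
    return weight


word_prob = {("stuffy", "cold"): 0.2, ("nose", "cold"): 0.3,
             ("remedies", "home"): 0.1, ("remedies", "remedies"): 0.7}
lw_one_to_many = lexical_weight(["stuffy", "nose"], ["cold"], {(0, 0), (1, 0)}, word_prob)
lw_averaged = lexical_weight(["remedies"], ["home", "remedies"], {(0, 0), (0, 1)}, word_prob)
```

The first call multiplies two single-link terms; the second averages over two links for one query word, which is how the lexical weight smooths the phrase-level estimate.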

Step 408 is directed to deploying the phrase-based translation model with the search engine. In one embodiment, the search engine can include an information retrieval system that uses the phrase-based translation model as a source of feature information when ranking documents/advertisements; alternatively, the search engine can use the translation model to rank advertisements directly in response to a search query. In an alternative embodiment, a set of phrase-based translation models can be used to compute various feature values, including the exemplary features P(A|Q) and P(Q|A), which refer to translating an advertisement title from a search query and translating a search query from an advertisement title, respectively.

Some embodiments of the information retrieval system use a linear ranking model framework, in which different models, in addition to the one or more translation models, can be combined as features. The linear ranking model takes the form of a set of M features f_m, where m = 1...M. Each feature is an arbitrary function that maps (Q, A) to a real value, f_m(Q, A) \in \mathbb{R}. The model has M parameters λ_m, where m = 1...M, each of which is associated with one feature function.

Step 410 is directed to processing a search query and producing search results that comprise relevant advertisements. The relevance score of an advertisement A associated with a search query Q is computed as:

\mathrm{Score}(Q \mid A) = \sum_{m=1}^{M} \lambda_{m} f_{m}(Q, A) \quad (22)
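The linear combination in equation (22) can be sketched in Python as follows. The feature names, feature values, and weights λ_m are hypothetical; the source does not specify how the weights are learned.

```python
import math
from typing import Dict


def relevance_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Equation (22): weighted sum of feature values f_m(Q, A)."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical weights and per-ad feature values for one query.
weights = {"f_PT": 1.0, "f_UWP": -2.0}
ads = {
    "ad_1": {"f_PT": math.log(0.16), "f_UWP": 0.25},
    "ad_2": {"f_PT": math.log(0.05), "f_UWP": 0.0},
}
ranked = sorted(ads, key=lambda ad: relevance_score(ads[ad], weights), reverse=True)
```

Sorting by descending score gives the ranked order; a minimum-score cutoff could be applied to the same scores before the list is returned.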

According to various embodiments, any combination of the following translation-model-based features can be used in addition to, or instead of, other known features. As one example, the search engine can use a phrase translation feature f_PT(Q, A, V) equal to log P(Q|A), where P(Q|A) is computed by equations (17) to (19) and the phrase translation probability P(q | w_q) is estimated using equation (20). As another example, the search engine can use a lexical weight feature f_LW(Q, A, V) equal to log P(Q|A), where P(Q|A) is computed by equations (17) to (19) and the phrase translation probability P(q | w_q) is computed as the lexical weight of equation (21).

In addition, the search engine can use a phrase alignment feature f_PA(Q, A, B) equal to \sum_{k=2}^{K} |a_{k} - b_{k-1} - 1|, where B is a set of K bi-phrases, a_k is the start position of the title phrase translated into the k-th query phrase, and b_{k-1} is the end position of the title phrase translated into the (k-1)-th query phrase. This feature models the degree to which the query phrases are reordered. Rather than summing over all possible B, the search engine computes the feature value only for the Viterbi alignment B* = argmax_B P(Q, B|A). B* can be computed with a dynamic programming recursion similar to equations (17) to (19), with the sum operator in equation (18) replaced by a max operator.

The search engine can also use an unaligned word penalty feature f_UWP(Q, A, V), defined as the ratio of the number of unaligned query terms to the total number of query terms. The search engine can also use a language model feature f_LM(Q, A) equal to log P(Q|A), where P(Q|A) is a linear model with Jelinek-Mercer smoothing (i.e., as defined by equations (4) to (9), with β = 1). The search engine can also use a word translation feature f_WT(Q, A) equal to log P(Q|A), where P(Q|A) is the word translation model defined by equation (3), with the word translation probabilities estimated using the expectation-maximization training of equation (1).
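The unaligned word penalty is simple enough to state directly in code. The following Python sketch assumes the alignment is given as zero-based (query position, title position) links; the function name and the example alignment are hypothetical.

```python
from typing import Set, Tuple


def unaligned_word_penalty(num_query_terms: int, alignment: Set[Tuple[int, int]]) -> float:
    """Ratio of query terms with no alignment link to the total number of query terms."""
    if num_query_terms <= 0:
        raise ValueError("the query must contain at least one term")
    aligned_positions = {query_pos for query_pos, _ in alignment}
    return (num_query_terms - len(aligned_positions)) / num_query_terms


# Four query terms; position 2 has no link to any title word.
penalty = unaligned_word_penalty(4, {(0, 0), (1, 0), (3, 1)})
```

A value of 0 means every query term is aligned, and a value of 1 means none are.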

After a relevance score has been computed for each associated advertisement, step 410 also ranks the associated advertisements according to their relevance scores. This ranking yields search results that list the associated advertisements in ranked order. Some of the associated advertisements can be removed for failing to reach a minimum relevance score.

Step 412 is directed to producing a strategy for one or more advertisers associated with the ranked advertisements. A suggestion mechanism of the search engine can use the translation model to produce candidate keywords for improving keyword bidding. The suggestion mechanism can also produce candidate advertisement descriptions, based on a number of preselected keywords, for improving an advertisement web page or landing page. The suggestion mechanism can also produce information indicating an improved budget allocation based on the predicted click likelihood (i.e., click-through rate). Step 414 determines whether to process a next search query. If there are no more search queries, step 414 proceeds to step 416. If there are more search queries, step 414 returns to step 410. Step 416 terminates the exemplary steps.

Exemplary Networked and Distributed Environments

One of ordinary skill in the art can appreciate that the various embodiments and methods described herein can be implemented in connection with any computer or other client or server device, which can be deployed as part of a computer network or in a distributed computing environment, and can be connected to any kind of data store. In this regard, the various embodiments described herein can be implemented in any computer system or environment having any number of memory or storage units, and any number of applications and processes occurring across any number of storage units. This includes, but is not limited to, an environment with server computers and client computers deployed in a network environment or a distributed computing environment, having remote or local storage.

Distributed computing provides sharing of computer resources and services by communicative exchange among computing devices and systems. These resources and services include the exchange of information, cache storage, and disk storage for objects such as files. These resources and services also include the sharing of processing power across multiple processing units for load balancing, expansion of resources, specialization of processing, and the like. Distributed computing takes advantage of network connectivity, allowing clients to leverage their collective power to benefit the entire enterprise. In this regard, a variety of devices may have applications, objects, or resources that can participate in the resource management mechanisms as described for various embodiments of the subject disclosure.

Fig. 5 provides a schematic diagram of an exemplary networked or distributed computing environment. The distributed computing environment comprises computing objects 510, 512, etc., and computing objects or devices 520, 522, 524, 526, 528, etc., which may include programs, methods, data stores, programmable logic, etc., as represented by example applications 530, 532, 534, 536, 538. It can be appreciated that computing objects 510, 512, etc. and computing objects or devices 520, 522, 524, 526, 528, etc. may comprise different devices, such as personal digital assistants (PDAs), audio/video devices, mobile phones, MP3 players, personal computers, laptops, etc.

Each computing object 510, 512, etc. and computing object or device 520, 522, 524, 526, 528, etc. can communicate with one or more other computing objects 510, 512, etc. and computing objects or devices 520, 522, 524, 526, 528, etc. by way of the communications network 540, either directly or indirectly. Even though illustrated as a single element in Fig. 5, communications network 540 may comprise other computing objects and computing devices that provide services to the system of Fig. 5, and/or may represent multiple interconnected networks, which are not shown. Each computing object 510, 512, etc. or computing object or device 520, 522, 524, 526, 528, etc. can also contain an application, such as applications 530, 532, 534, 536, 538, that might make use of an API, or other object, software, firmware and/or hardware, suitable for communication with or implementation of the application provided in accordance with various embodiments of the subject disclosure.

There are a variety of systems, components, and network configurations that support distributed computing environments. For example, computing systems can be connected together by wired or wireless systems, by local networks or by widely distributed networks. Currently, many networks are coupled to the Internet, which provides an infrastructure for widely distributed computing and encompasses many different networks, though any network infrastructure can be used for exemplary communications made incident to the systems as described in various embodiments.

Thus, a host of network topologies and network infrastructures, such as client/server, peer-to-peer, or hybrid architectures, can be utilized. A "client" is a member of a class or group that uses the services of another class or group to which it is not related. A client can be a process, e.g., roughly a set of instructions or tasks, that requests a service provided by another program or process. The client process utilizes the requested service without having to "know" any working details about the other program or the service itself.

In a client/server architecture, particularly a networked system, a client is usually a computer that accesses shared network resources provided by another computer, e.g., a server. In the illustration of Fig. 5, as a non-limiting example, computing objects or devices 520, 522, 524, 526, 528, etc. can be thought of as clients and computing objects 510, 512, etc. can be thought of as servers, where computing objects 510, 512, etc., acting as servers, provide data services, such as receiving data from client computing objects or devices 520, 522, 524, 526, 528, etc., storing data, processing data, and transmitting data to client computing objects or devices 520, 522, 524, 526, 528, etc., although any computer can be considered a client, a server, or both, depending on the circumstances.

A server is typically a remote computer system accessible over a remote or local network, such as the Internet or a wireless network infrastructure. The client process may be active in a first computer system, and the server process may be active in a second computer system, communicating with one another over a communications medium, thus providing distributed functionality and allowing multiple clients to take advantage of the information-gathering capabilities of the server.

In a network environment in which the communications network 540 or bus is the Internet, for example, the computing objects 510, 512, etc. can be Web servers with which other computing objects or devices 520, 522, 524, 526, 528, etc. communicate via any of a number of known protocols, such as the Hypertext Transfer Protocol (HTTP). Computing objects 510, 512, etc. acting as servers may also serve as clients, e.g., computing objects or devices 520, 522, 524, 526, 528, etc., as may be characteristic of a distributed computing environment.

Exemplary Computing Device

As mentioned, advantageously, the techniques described herein can be applied to any device. It can be understood, therefore, that handheld, portable, and other computing devices and computing objects of all kinds are contemplated for use in connection with the various embodiments. Accordingly, the general purpose remote computer described below in Fig. 6 is but one example of a computing device.

Embodiments can partly be implemented via an operating system, for use by a developer of services for a device or object, and/or included within application software that operates to perform one or more functional aspects of the various embodiments described herein. Software may be described in the general context of computer-executable instructions, such as program modules, being executed by one or more computers, such as client workstations, servers, or other devices. Those skilled in the art will appreciate that computer systems have a variety of configurations and protocols that can be used to communicate data, and thus no particular configuration or protocol is to be considered limiting.

Fig. 6 thus illustrates an example of a suitable computing system environment 600 in which one or more aspects of the embodiments described herein can be implemented, although as made clear above, the computing system environment 600 is only one example of a suitable computing environment and is not intended to suggest any limitation as to scope of use or functionality. In addition, the computing system environment 600 is not intended to be interpreted as having any dependency relating to any one or combination of components illustrated in the exemplary computing system environment 600.

With reference to Fig. 6, an exemplary remote device for implementing one or more embodiments includes a general purpose computing device in the form of a computer 610. Components of computer 610 may include, but are not limited to, a processing unit 620, a system memory 630, and a system bus 622 that couples various system components, including the system memory, to the processing unit 620.

Computer 610 typically includes a variety of computer-readable media, which can be any available media that can be accessed by computer 610. The system memory 630 may include computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) and/or random access memory (RAM). By way of example, and not limitation, system memory 630 may also include an operating system, application programs, other program modules, and program data.

A user can enter commands and information into the computer 610 through input devices 640. A monitor or other type of display device is also connected to the system bus 622 via an interface, such as output interface 650. In addition to a monitor, computers can also include other peripheral output devices, such as speakers and a printer, which may be connected through output interface 650.

The computer 610 may operate in a networked or distributed environment using logical connections to one or more other remote computers, such as remote computer 670. The remote computer 670 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, or any other remote media consumption or transmission device, and may include any or all of the elements described above relative to the computer 610. The logical connections depicted in Fig. 6 include a network 672, such as a local area network (LAN) or a wide area network (WAN), but may also include other networks/buses. Such networking environments are commonplace in homes, offices, enterprise-wide computer networks, intranets, and the Internet.

As mentioned above, while exemplary embodiments have been described in connection with various computing devices and network architectures, the underlying concepts may be applied to any network system and any computing device or system in which it is desirable to improve the efficiency of resource usage.

Also, there are multiple ways to implement the same or similar functionality, e.g., an appropriate API, tool kit, driver code, operating system, control, standalone or downloadable software object, etc., that enables applications and services to take advantage of the techniques provided herein. Thus, embodiments herein are contemplated from the standpoint of an API (or other software object), as well as from the standpoint of a software or hardware object that implements one or more embodiments as described herein. Thus, various embodiments described herein can have aspects that are wholly in hardware, partly in hardware and partly in software, as well as wholly in software.

The word "exemplary" is used herein to mean serving as an example, instance, or illustration. For the avoidance of doubt, the subject matter disclosed herein is not limited by such examples. In addition, any aspect or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects or designs, nor is it meant to preclude equivalent exemplary structures and techniques known to those of ordinary skill in the art. Furthermore, to the extent that the terms "includes," "has," "contains," and other similar words are used, for the avoidance of doubt, such terms are intended to be inclusive in a manner similar to the term "comprising" as an open transition word, without precluding any additional or other elements when employed in a claim.

As mentioned, the various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination of both. As used herein, the terms "component," "module," "system," and the like are likewise intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computer and the computer can be a component. One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer and/or distributed between two or more computers.

The aforementioned systems have been described with respect to interaction between several components. It can be appreciated that such systems and components can include those components or specified sub-components, some of the specified components or sub-components, and/or additional components, according to various permutations and combinations of the foregoing. Sub-components can also be implemented as components communicatively coupled to other components rather than included within parent components (hierarchical). Additionally, it can be noted that one or more components may be combined into a single component providing aggregate functionality or divided into several separate sub-components, and that any one or more middle layers, such as a management layer, may be provided to communicatively couple to such sub-components in order to provide integrated functionality. Any components described herein may also interact with one or more other components not specifically described herein but generally known by those of skill in the art.

In view of the exemplary systems described herein, methodologies that may be implemented in accordance with the described subject matter can also be appreciated with reference to the flowcharts of the various figures. While, for purposes of simplicity of explanation, the methodologies are shown and described as a series of blocks, it is to be understood and appreciated that the various embodiments are not limited by the order of the blocks, as some blocks may occur in different orders and/or concurrently with other blocks from what is depicted and described herein. Where non-sequential, or branched, flow is illustrated via a flowchart, it can be appreciated that various other branches, flow paths, and orders of the blocks may be implemented that achieve the same or a similar result. Moreover, some illustrated blocks are optional in implementing the methodologies described herein.

Conclusion

While the invention is susceptible to various modifications and alternative constructions, certain illustrated embodiments thereof are shown in the drawings and have been described above in detail. It should be understood, however, that there is no intention to limit the invention to the specific forms disclosed; on the contrary, the intention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of the invention.

In addition to the various embodiments described herein, it is to be understood that other similar embodiments can be used, or modifications and additions can be made to the described embodiments, for performing the same or equivalent function of the corresponding embodiments without deviating therefrom. Still further, multiple processing chips or multiple devices can share the performance of one or more functions described herein, and similarly, storage can be effected across a plurality of devices. Accordingly, the invention is not to be limited to any single embodiment, but rather is to be construed in breadth, spirit, and scope in accordance with the appended claims.

Claims

1. A method in a computing environment, performed at least in part on at least one processor, comprising applying a translation model (116) for mapping (204) one or more search query terms to document-related data, the applying comprising: processing the translation model (116), which comprises data corresponding to word-aligned query-document pairs (114); incorporating (408) the translation model (116) into an information retrieval model (106); and using (410) the information retrieval model (106) to produce search results comprising relevant documents in response to a search query.

2. The method of claim 1, wherein processing the translation model further comprises: processing search engine usage data to identify the word-aligned query-document pairs, so as to train the translation model using a posterior distribution and a likelihood distribution associated with each query-document pair.

3. The method of claim 1, wherein processing the translation model further comprises estimating translation probabilities that represent semantic relationships between a search query sublanguage and a document sublanguage, and wherein estimating the translation probabilities further comprises at least one of: adjusting self-translation probabilities, or computing a query translation probability for an advertisement.

4. The method of claim 1, further comprising: generating at least one of a metastream associated with an advertisement or suggested keywords.

5. The method of claim 1, further comprising at least one of: computing a relevance score for each potential document based on the search query, or computing a click prediction score for each relevant document based on the search results.

6. A system in a computing environment, comprising a training mechanism (104) configured to process a word-aligned training corpus (114) and identify (406) query-advertisement bi-phrases, wherein the training mechanism (104) is further configured to compute (208) phrase translation probabilities associated with the query-advertisement bi-phrases, to produce (406) phrase-based query translation probabilities for advertisements, and to provide (408) the phrase-based query translation probabilities to a search engine.

7. The system of claim 6, wherein the search engine further comprises a ranking mechanism configured to compute a score for each advertisement given a search query according to the phrase-based query translation probabilities, and wherein the ranking mechanism is further configured to perform at least one of: filtering search results for the search query based on a set of scores for the advertisements, or computing feature information of a phrase translation model that comprises the phrase translation probabilities.

8. The system of claim 6, wherein the system further comprises a suggestion mechanism configured to produce a strategy for maximizing revenue for a set of advertisements associated with an advertiser.

9. One or more computer-readable media having computer-executable instructions, which when executed perform steps comprising:

accessing (302) a translation model (116) that captures semantic similarity between search query portions and advertisement portions;

mapping (304, 306) a search query to one or more relevant advertisements;

ranking (308, 310) the one or more relevant advertisements based on the translation model (116); and

producing (410) search results comprising the one or more relevant advertisements in ranked order for the search query.

10. The one or more computer-readable media of claim 9, having further computer-executable instructions comprising:

generating phrase-based feature information, based on alignment templates, for ranking the one or more relevant documents.