CN103136310B - Method for transforming a new query based on the historical queries of an individual website - Google Patents

Method for transforming a new query based on the historical queries of an individual website

Info

Publication number
CN103136310B
Authority
CN
Grant status
Grant
Patent type
Prior art keywords
query
site
original
search
candidate
Prior art date
Application number
CN 201110413826
Other languages
Chinese (zh)
Other versions
CN103136310A (en)
Inventor
秦涛
徐亮
王怡青
刘铁岩
林威良
Original Assignee
微软技术许可有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Grant date

Abstract

The present invention discloses a method for transforming a new query based on the historical queries of an individual website. Each website has its own content preferences and its own user base. When a user submits a query on a website, the query is in most cases related to that website's content. The invention uses the website's historical search information to transform an original query that may be ambiguous. A search is then performed in the advertising engine based on the resulting replacement query, and the ads found are returned to the user, so that the query terms actually used for the ad search better match the website's target business and the ad results better match the user's search intent.

Description

Method for transforming a new query based on the historical queries of an individual website

Technical Field

[0001] The present invention relates to advertisement search technology and, more specifically, to techniques for transforming an original query according to information about a specific website in a syndication network.

Background

[0002] Commercial advertising engines (such as Bing and Google) typically provide advertising support to many other websites. These websites have their own pages, search input interfaces, and search engines, but they all use the same advertising engine for back-end ad searches. When a user searches on one of these websites, the site returns two kinds of results: search results, provided by the website itself or by a third-party search engine, and ad results, provided by the syndication network's advertising engine.

[0003] Different websites have different topical leanings and target businesses, so they value the ads supplied by the advertising engine differently. A website wants the ads related to the target business it cares about most to be ranked near the top, where users are more likely to click them, and user clicks generate revenue for the website. The more clicks there are on ads related to its target business, the higher the website's revenue.

[0004] In existing syndication networks, individual websites already apply some processing, according to their own target business, to the ad results provided by the shared advertising engine. For example, the website "jobs.yahoo.com" shares an advertising engine with "bing", but the two have different target businesses. The target business of "jobs.yahoo.com" is job opportunities and career ads, while the target business of "bing" is general-purpose search. Consequently, if the same keyword is searched on the pages of these two websites, the search results differ, and the ad results should differ accordingly.

[0005] For example, entering "computer" as the search on both "jobs.yahoo.com" and "bing" yields different search results.

[0006] For "jobs.yahoo.com", the top-ranked search results for "computer" all relate to job positions, which follows from the nature of "jobs.yahoo.com" itself. Because "jobs.yahoo.com" is a website whose target business is job positions, users expect to see search results and ad results related to job positions, and they will click only on pages or ads related to job positions. Only ads can generate revenue for "jobs.yahoo.com"; users will not click ads unrelated to job positions, so such ads generate no revenue for "jobs.yahoo.com".

[0007] For "bing", pages about computer product sales occupy the top positions in the search results for "computer". The target business of "bing" is general-purpose search, and its revenue model differs from that of a specialized website such as "jobs.yahoo.com". "bing" therefore selects the ad results most favorable to its own revenue, and these differ greatly from the results for "jobs.yahoo.com".

[0008] A website in a syndication network that uses a shared advertising engine needs to select, from the ad results returned by the advertising engine, the ads that fit its own target business; only then can it effectively raise its click-through rate and its revenue. Although some processing of ad results already exists, it is neither very efficient nor widely applicable. It achieves some effect for certain keywords, but for a considerable number of keywords current approaches fail, and a given website will still rank ads unrelated to its own target business near the top of its pages.

Summary

[0009] The present invention aims to provide a method that transforms a new query based on the historical queries of a specific website in a syndication network, thereby providing more accurate ad search results.

[0010] In one embodiment, the present invention discloses a method for transforming a new query based on the historical queries of an individual website. The method first receives a user's original query at an individual website in a syndication network; the syndication network comprises several independent websites, and the original query is related to the subject of the website. The method then obtains the historical search information of all users from the website and transforms the original query based on that information to obtain a transformed replacement query. Finally, a search is performed in the advertising engine based on the replacement query, and the ads found are displayed to the user.

[0011] In one embodiment, the present invention discloses a method for transforming a new query based on the historical queries of an individual website. The method first receives an original query at an individual website in a syndication network; the syndication network comprises several independent websites, and the original query is related to the subject of the website. The method then obtains from the website a tendency query term, related to the original query, drawn from all users over a past period of time. The tendency query term is merged into the original query to obtain a transformed replacement query. Finally, a search is performed in the advertising engine based on the replacement query, and the ads found are displayed to the user.

[0012] In one embodiment, the present invention discloses a method for transforming a new query based on the historical queries of an individual website. The method first receives an original query at an individual website in a syndication network; the syndication network comprises several independent websites, and the original query is related to the subject of the website. Candidate queries are then obtained from the website and filtered. The filtering is based on an attribute comparison between each candidate query and the original query, on their similarity, and on the candidate query's usage frequency, and it yields the candidate query that matches the original query. The matching candidate query replaces the original query and becomes the transformed replacement query. Finally, a search is performed in the advertising engine based on the replacement query, and the ads found are displayed to the user.

[0013] The present invention can transform query terms in a targeted way, using the characteristics and historical data of a specific website in a syndication network, so that the query terms actually used for the ad search better match the website's target business and yield more valuable ad search results.

Brief Description of the Drawings

[0014] The above and other features, properties, and advantages of the present invention will become more apparent from the following description taken in conjunction with the drawings and embodiments, in which the same reference numerals denote the same features throughout:

[0015] FIG. 1 is a flowchart of a method for transforming a new query based on the historical queries of an individual website according to a first embodiment of the present invention.

[0016] FIG. 2 is a flowchart of a method for transforming a new query based on the historical queries of an individual website according to a second embodiment of the present invention.

[0017] FIG. 3 is a flowchart of a method for transforming a new query based on the historical queries of an individual website according to a third embodiment of the present invention.

Detailed Description

[0018] The present invention provides a method for transforming a new query based on the historical queries of an individual website. FIG. 1 is a flowchart of the method according to the first embodiment. The main idea is as follows: when a specific website in a syndication network receives an original query, for example when a user visits a page of the website and enters a query, the website is analyzed to obtain information related to its target business. This information is used to transform the original query: the transformation increases the weight of content related to the website's target business, yielding a query that better matches that business. The transformed query is then used for the search in the advertising engine, so the results obtained are more relevant to the website's target business.

[0019] The method is applicable to any website with distinctive business characteristics. Referring to FIG. 1, the method 100 comprises:

[0020] 102. Receive a user's original query at an individual website in the syndication network. The syndication network comprises several independent websites, each with its own business characteristics, and the original query is related to this specific website. In one implementation of step 102, a query that a user enters on a specific website is treated as an original query related to that website. For example, if a user visits "jobs.yahoo.com" and enters "computer", the user is considered to have entered an original query, "computer", related to "jobs.yahoo.com".

[0021] 104. Obtain the historical search information of all users from the website and transform the original query based on it to obtain a transformed replacement query. The historical search information is related to the specific website and usually comes from that website's historical data, such as the queries previously issued on the site, the search results returned, and the pages viewed. The historical search information can be used in two ways. In the first, it is merged into the original query to obtain the replacement query; here the historical search information takes the form of a tendency query term. The second embodiment uses this approach. In the second, the historical search information replaces the original query and serves as the replacement query; here the historical search information is itself a query. The third embodiment uses this approach.

[0022] 106. Search the advertising engine based on the replacement query and display the ads found to the user.
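
Steps 102 to 106 can be pictured as a small pipeline. The following is only a minimal sketch: `serve_ads`, `append_job`, and `fake_ad_engine` are hypothetical names invented for illustration, and the patent does not specify any programming interface for the website or the advertising engine.

```python
from typing import Callable, List


def serve_ads(original_query: str,
              site_history: List[str],
              transform: Callable[[str, List[str]], str],
              ad_engine: Callable[[str], List[str]]) -> List[str]:
    """Transform the query received in step 102 and search for ads."""
    # Step 104: transform the original query using the site's history.
    replacement_query = transform(original_query, site_history)
    # Step 106: search the advertising engine with the replacement query.
    return ad_engine(replacement_query)


# Hypothetical stand-ins, used only to make the sketch runnable.
def append_job(query: str, history: List[str]) -> str:
    """Toy transform: add 'job' if it occurs anywhere in the site history."""
    if any("job" in past.split() for past in history):
        return query + " job"
    return query


def fake_ad_engine(query: str) -> List[str]:
    """Toy ad engine: returns one placeholder ad naming the query."""
    return [f"ad for '{query}'"]


ads = serve_ads("computer", ["java job", "nurse job"], append_job, fake_ad_engine)
```

Passing the transform as a parameter mirrors the text: the second and third embodiments differ only in how step 104 builds the replacement query.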

[0023] FIG. 2 shows a method according to the second embodiment of the present invention. In method 200, the historical search information takes the form of a tendency query term, which is merged into the original query. As shown in FIG. 2, the method 200 comprises:

[0024] 202. Receive an original query at an individual website in the syndication network. The syndication network comprises several independent websites, and the original query is related to the subject of the website. Step 202 is similar to step 102 and is not described again here.

[0025] 204. Obtain from the website a tendency query term, related to the original query, drawn from all users over a past period of time. The tendency query term can be obtained in either of two ways:

[0026] 1) From the historical queries of the specific website. For example, the query term that occurs most frequently in the website's historical queries is taken as the tendency query term. Returning to the earlier example of entering "computer" on "jobs.yahoo.com", the tendency query term is derived by analyzing the website, that is, "jobs.yahoo.com": analyzing its historical query log, meaning the record of all queries entered on "jobs.yahoo.com", shows that the most frequent query term, or keyword, is "job", so "job" is selected as the tendency query term.

[0027] 2) From the pages returned by searches on the specific website. For example, the query term that occurs most frequently in those pages is taken as the tendency query term. In the "jobs.yahoo.com" example, the search results, that is, the pages found by queries issued through "jobs.yahoo.com", are analyzed for the term that appears most often on them; that term turns out to be "job", so "job" is selected as the tendency query term.
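
Both options come down to counting terms and taking the most frequent one. The sketch below assumes whitespace-separated, case-insensitive terms and a small made-up query log; `tendency_term` is a hypothetical helper name, and the patent does not prescribe any tokenization.

```python
from collections import Counter
from typing import Iterable


def tendency_term(texts: Iterable[str]) -> str:
    """Return the most frequent whitespace-separated term across the texts.

    `texts` may be historical queries (option 1) or the text of pages
    returned by the site's searches (option 2).
    """
    counts = Counter(term for text in texts for term in text.lower().split())
    if not counts:
        raise ValueError("no terms found in the supplied texts")
    # most_common(1) returns a list holding the single (term, count) pair.
    return counts.most_common(1)[0][0]


# Made-up query log for illustration only.
history = ["java job", "nurse job", "computer job", "java developer"]
term = tendency_term(history)
```

A production system would likely filter stop words first, since otherwise a word such as "the" could win when counting page text; that filtering is outside what the text describes.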

[0028] 206. Merge the tendency query term into the original query to obtain the transformed replacement query. The tendency query term is treated as an additional condition on the original query that improves the precision of the query. It is merged with the original query to produce the replacement query. In the earlier example, the original query "computer" entered through "jobs.yahoo.com" is merged with the tendency query term "job" to give the replacement query "computer+job".

[0029] 208. Search the advertising engine based on the replacement query and display the ads found to the user.
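
The merge in step 206 can be sketched as appending the tendency query term when the query does not already contain it. This is an illustration under assumptions: `merge_tendency_term` is a hypothetical name, and joining with a space is a guess, since the text writes the result as "computer+job" without saying what syntax the advertising engine expects.

```python
def merge_tendency_term(original_query: str, term: str) -> str:
    """Append the tendency term to the query unless it is already present."""
    terms = original_query.split()
    if term.lower() in (t.lower() for t in terms):
        # The query already carries the site's bias; leave it unchanged.
        return original_query
    return " ".join(terms + [term])


replacement = merge_tendency_term("computer", "job")
```

Skipping the append when the term is already present avoids producing queries such as "computer job job"; the text does not discuss this case.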

[0030] FIG. 3 shows a method according to the third embodiment of the present invention. In method 300, the historical search information is itself a query and is used to replace the original query. As shown in FIG. 3, the method 300 comprises:

[0031] 302. Receive an original query at an individual website in the syndication network. The syndication network comprises several independent websites, and the original query is related to the subject of the website. Step 302 is similar to steps 102 and 202 and is not described again here.

[0032] 304. Obtain candidate queries from the website. Candidate queries are taken from the historical queries of the specific website. To ensure that a candidate query is related to the original query, the two must share at least one query term. Taking again the query "computer" entered through "jobs.yahoo.com", the original query is "computer", and the candidate queries are those historical queries of "jobs.yahoo.com" that share at least one query term with it, that is, all queries containing the query term "computer". Here, a query term is one term of the query; for a text query, a query term is a word. More specifically, a query term is one character in Chinese or one word in English.
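
Step 304 amounts to a set-intersection test on query terms. The sketch below assumes whitespace-separated English terms and an invented history list; `candidate_queries` is a hypothetical helper, and splitting Chinese text into single characters is not handled here.

```python
from typing import Iterable, List


def candidate_queries(original_query: str, history: Iterable[str]) -> List[str]:
    """Return distinct historical queries sharing a term with the original."""
    original_terms = set(original_query.lower().split())
    seen = set()
    candidates = []
    for past in history:
        normalized = " ".join(past.lower().split())
        if normalized in seen:
            continue  # keep each candidate once, in first-seen order
        if original_terms & set(normalized.split()):
            seen.add(normalized)
            candidates.append(normalized)
    return candidates


# Invented history for illustration only.
found = candidate_queries(
    "computer",
    ["computer job", "nurse job", "Computer engineer", "computer job"],
)
```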

[0033] 306. Filter the candidate queries. The filtering is based on an attribute comparison between each candidate query and the original query, on their similarity, and on the candidate query's usage frequency, and it yields the candidate query that matches the original query. The filtering takes three aspects into account:

[0034] 1) how closely the candidate query agrees with the original query in its attributes;

[0035] 2) the similarity between the candidate query and the original query;

[0036] 3) the usage frequency of the candidate query.

[0037] Based on these three considerations, the following constraints are proposed:

[0038] The first constraint concerns the attribute comparison between the candidate query and the original query. Both consist of a set of query terms, and each query term counts as one term. For a text query, a query term is a word; for example, the query "computer device" has two query terms, "computer" and "device", so its term length is 2. The first constraint requires that the candidate query differ from the original query in no more than one of the following ways:

[0039] > the candidate query has the same term length as the original query and exactly one different query term;

[0040] > the candidate query has one term fewer than the original query;

[0041] > the candidate query has one term more than the original query.
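
The first constraint can be sketched as a check on term-level differences. The text leaves some details open, so this sketch makes two assumptions that are not in the source: terms are compared position by position when the lengths are equal, and when the lengths differ by one, the shorter query must equal the longer one with a single term removed. `passes_term_constraint` is a hypothetical name.

```python
def passes_term_constraint(original_query: str, candidate_query: str) -> bool:
    """Check the one-term-difference constraint between two queries."""
    a = original_query.lower().split()
    b = candidate_query.lower().split()
    if len(a) == len(b):
        # Same term length: exactly one position may differ (assumption:
        # positional comparison).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) == 1:
        shorter, longer = sorted((a, b), key=len)
        # Assumption: removing one term from the longer query must give
        # the shorter query.
        return any(longer[:i] + longer[i + 1:] == shorter
                   for i in range(len(longer)))
    return False
```

Under these assumptions, "computer job" passes against the original query "computer" (one term more), while "nurse job" fails against "computer device" (two terms differ).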

[0042] The second constraint concerns the similarity between the candidate query and the original query, which is computed from the inverse document frequency (IDF). Specifically:

[0043] Compute the inverse document frequency (IDF) for the candidate query and the original query. For a given query term from a given website, the document frequency (DF) is the search frequency of that query term over a period of time, and the inverse document frequency (IDF) can be computed as follows:

[Equation image CN103136310BD00071, not reproduced in this text: IDF expressed in terms of DF and maxDF]

[0045] where maxDF is the search frequency of the most frequently searched query term.

[0046] The similarity between the candidate query and the original query is then computed from the inverse document frequency. Let q1 and q2 be two queries, for example the original query and a candidate query; the similarity between q1 and q2 is computed as:

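[Equation image CN103136310BD00081, not reproduced in this text: similarity between q1 and q2 expressed in terms of S1 and S2]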

[0048] where S1 and S2 are the inverse document frequencies of the original query and the candidate query, respectively.

[0049] Candidate queries whose similarity exceeds a predetermined threshold are then selected. The second constraint requires the similarity between the original query and the candidate query to be above a set threshold, which can be adjusted to suit the application.
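
The thresholding step can be sketched independently of the similarity formula, which appears only as an image in this text. In the sketch below the similarity function is passed in as a parameter; the `jaccard` function is a plain term-overlap stand-in chosen for illustration and is not the IDF-based formula of the patent. All names and the 0.5 threshold are hypothetical.

```python
from typing import Callable, Iterable, List


def filter_by_similarity(original_query: str,
                         candidates: Iterable[str],
                         similarity: Callable[[str, str], float],
                         threshold: float) -> List[str]:
    """Keep candidates whose similarity to the original exceeds the threshold."""
    return [c for c in candidates if similarity(original_query, c) > threshold]


def jaccard(q1: str, q2: str) -> float:
    """Stand-in similarity: shared terms divided by all distinct terms."""
    a, b = set(q1.lower().split()), set(q2.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


kept = filter_by_similarity(
    "computer job",
    ["computer jobs", "computer job opening", "nurse"],
    jaccard,
    0.5,  # hypothetical threshold; the text says it is application-specific
)
```

Keeping the similarity function pluggable lets the IDF-based measure be swapped in once its exact form is known.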

[0050] The third constraint concerns the usage frequency of the candidate query. More than one candidate query can be expected to satisfy the first and second constraints, so the third constraint selects among them by usage frequency. Typically, the candidate query with the highest usage frequency is chosen from those satisfying the first and second constraints. In one implementation, the candidate query with the highest click-through rate is chosen.
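
Selecting by usage frequency is a simple maximum over the surviving candidates. The sketch assumes the usage figures (search counts or click-through rates) are already available in a mapping; `pick_most_used` and the numbers are invented for illustration.

```python
from typing import Iterable, Mapping, Optional


def pick_most_used(candidates: Iterable[str],
                   usage: Mapping[str, float]) -> Optional[str]:
    """Return the candidate with the highest usage figure, or None if empty."""
    pool = list(candidates)
    if not pool:
        return None
    # Candidates missing from the mapping are treated as never used.
    return max(pool, key=lambda c: usage.get(c, 0))


# Invented usage figures for illustration only.
best = pick_most_used(
    ["computer job", "computer engineer"],
    {"computer job": 120, "computer engineer": 45},
)
```

Returning None when no candidate survives lets the caller fall back to the original query; the text does not say what happens in that case.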

[0051] The first and second constraints ensure semantic similarity between the candidate query and the original query, so that the result reflects the website's characteristics as far as possible while still meeting the user's query. The third constraint ensures that the candidate query is frequently used; a frequently used query usually returns more search results and can generate better revenue.

[0052] 308. Replace the original query with the matching candidate query to obtain the transformed replacement query.

[0053] 310. Search the advertising engine based on the replacement query and display the ads found to the user.

[0054] The present invention can transform query terms in a targeted way, using the characteristics and historical data of a specific website in a syndication network, so that the query terms actually used for the ad search better match the website's target business and yield more valuable ad search results.

Claims (16)

  1. A method for transforming a new query based on the historical queries of an individual website, characterized in that the method comprises: receiving a user's original query at an individual website in a syndication network, the syndication network comprising several independent websites that each have different business characteristics, the original query being related to the subject of the website; obtaining the historical search information of all users from the website and transforming the original query based on the historical search information to obtain a transformed replacement query that matches the business characteristics of the website more closely than the original query does; and searching an advertising engine based on the replacement query and displaying the ads found to the user.
  2. The method of claim 1, characterized in that the historical search information is related to the website.
  3. The method of claim 2, characterized in that the historical search information is merged into the original query to obtain the transformed replacement query.
  4. The method of claim 2, characterized in that the historical search information is used to replace the original query and serves as the transformed replacement query.
  5. A method for transforming a new query based on the historical queries of an individual website, characterized in that the method comprises: receiving an original query at an individual website in a syndication network, the syndication network comprising several independent websites that each have different business characteristics, the original query being related to the subject of the website; obtaining from the website a tendency query term, related to the original query, drawn from all users over a past period of time; merging the tendency query term into the original query to obtain a transformed replacement query that matches the business characteristics of the website more closely than the original query does; and searching an advertising engine based on the replacement query and displaying the ads found to the user.
  6. The method of claim 5, characterized in that obtaining the tendency query term comprises obtaining the tendency query term from the historical queries of the website.
  7. The method of claim 6, characterized in that obtaining the tendency query term comprises finding the query term that occurs most frequently in the historical queries of the website and using it as the tendency query term.
  8. The method of claim 5, characterized in that obtaining the tendency query term comprises obtaining the tendency query term from the pages returned by searches on the website.
  9. The method of claim 8, characterized in that obtaining the tendency query term comprises finding the query term that occurs most frequently in the pages returned by searches on the website and using it as the tendency query term.
  10. 10. —种基于个别网站的历史查询对新的查询进行转换的方法,其特征在于,该方法包括: 在网站联盟中的一个个别的网站接收原始查询,该网站联盟包含数个独立的网站,该数个独立的网站各自具有不同的业务特征,该原始查询与该网站的主题相关; 从该网站获取候选查询; 对候选查询进行筛选,该筛选是基于候选查询与原始查询的属性比对和相似度、以及候选查询的使用频率而进行,筛选得到与原始查询匹配的候选查询; 使用该匹配的候选查询替换所述原始查询,得到经转换的替代查询,该替代查询相较于所述原始查询而言与该网站的业务特征更加匹配; 基于该替代查询在广告引擎中进行搜索,将搜索到的广告显示给用户。 10. - kind of query method to convert a new query based on the history of individual sites, wherein the method comprises: a separate site in the Affiliate receives the original query, the site contains several independent alliance website, the number of independent sites each have different business characteristics, the original query related to the topic of the site; acquisition candidate queries from the site; the candidate query filter, the filter is based on an attribute candidate query with the original query alignments and similarity, and queries the candidate frequency is performed, and the screened candidate query matches the original query; the use of the matched candidate query replacing the original query to obtain the translated query alternative, this alternative compared to the original query for more inquiries and business matching characteristics of the site; a search engine-based advertising in the alternative query, the search advertising displayed to the user.
  11. 11. 如权利要求10所述的方法,其特征在于,所述候选查询是从该网站的历史查询中获得。 11. The method according to claim 10, characterized in that, the candidate query is obtained from the site's historical query.
  12. 12. 如权利要求11所述的方法,其特征在于,所述候选查询与原始查询具有至少一个相同的查询项。 12. The method according to claim 11, wherein the candidate query and the original query having at least one same query terms.
  13. 13. 如权利要求12所述的方法,其特征在于,所述查询项是汉语中的一个字或者英语中的一个单词。 13. The method of claim 12, wherein the query term is a word or a Chinese word in English.
  14. 14. 如权利要求11所述的方法,其特征在于,基于候选查询与原始查询的属性比对进行筛选包括筛选符合下述条件之一的候选查询: 候选查询与原始查询具有相同的字节(term)长度以及一项不同的查询项; 候选查询比原始查询少一个字节; 候选查询比原始查询多一个字节。 14. The method according to claim 11, wherein the screening comprises screening candidate queries meet the following conditions based on the attributes of one of the candidate query and the original query ratio of: the candidate query with the same query as the original byte ( term) lengths and a different query terms; candidate queries less than a byte of the original query; original query multiple candidate query than one byte.
  15. 15. 如权利要求11所述的方法,其特征在于,基于候选查询与原始查询的相似度进行筛选包括: 计算候选查询与原始查询的倒置文本频率(IDF); 基于倒置文本频率计算候选查询与原始查询的相似度; 筛选相似度大于预定门限的候选查询。 15. The method according to claim 11, wherein the filter comprises a candidate queries based on the similarity with the original query: calculating an inverted document frequency of the candidate query and the original query (the IDF); calculated based on the candidate query and inverted document frequency the similarity of the original query; screening candidate similarity is larger than a predetermined threshold inquiry.
  16. The method according to claim 11, characterized in that filtering based on the usage frequency of the candidate queries comprises: selecting the candidate query with the highest click-through rate.
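
The filtering steps of claims 12 and 14 to 16 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claims do not fix a similarity formula, so the IDF-weighted Jaccard overlap, the smoothed IDF, the default threshold of 0.3, the whitespace tokenizer, and all function names (`terms`, `attribute_match`, `idf_table`, `similarity`, `rewrite_query`) are assumptions made for illustration. The site history is assumed to be a mapping from each past query to a `(clicks, impressions)` pair.

```python
import math
from collections import Counter


def terms(query):
    # Assumption: whitespace-separated words are the query terms. Claim 13 also
    # allows single Chinese characters, which this sketch does not handle.
    return query.lower().split()


def attribute_match(orig, cand):
    """Claims 12 and 14: at least one shared term, and either the same length
    with exactly one differing term, or a length that differs by one term."""
    if not set(orig) & set(cand):
        return False
    if len(orig) == len(cand):
        return sum((Counter(orig) - Counter(cand)).values()) == 1
    return abs(len(orig) - len(cand)) == 1


def idf_table(queries):
    """Smoothed IDF over the site's historical queries, each treated as a document."""
    n = len(queries)
    df = Counter(t for q in queries for t in set(terms(q)))
    return lambda t: math.log((n + 1) / (df[t] + 1)) + 1.0


def similarity(orig, cand, idf):
    """Assumed measure: IDF-weighted Jaccard overlap of the two term sets."""
    a, b = set(orig), set(cand)
    union = sum(idf(t) for t in a | b)
    return sum(idf(t) for t in a & b) / union if union else 0.0


def rewrite_query(original, history, threshold=0.3):
    """Return the substitute query, or the original if no candidate survives.

    history maps a past query on this site to (clicks, impressions).
    """
    orig = terms(original)
    idf = idf_table(list(history))
    best, best_ctr = original, -1.0
    for cand_text, (clicks, impressions) in history.items():
        cand = terms(cand_text)
        if not attribute_match(orig, cand):
            continue  # fails the attribute comparison of claim 14
        if similarity(orig, cand, idf) <= threshold:
            continue  # fails the similarity threshold of claim 15
        ctr = clicks / impressions if impressions else 0.0
        if ctr > best_ctr:  # claim 16: keep the highest click-through rate
            best, best_ctr = cand_text, ctr
    return best


# Hypothetical history for a site that sells phones.
history = {
    "apple phone": (30, 100),
    "apple phone case": (5, 100),
    "apple pie": (1, 100),
    "car rental": (50, 100),
}
print(rewrite_query("apple", history))
```

In this hypothetical history the ambiguous query "apple" is replaced by "apple phone": "car rental" has the highest click-through rate but shares no term with the original, "apple phone case" is two terms longer, and "apple pie" passes both filters but has a lower click-through rate. The substitute query would then be sent to the advertisement engine in place of the original.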
CN 201110413826 2011-12-02 2011-12-02 Method for converting a new query based on the historical queries of an individual website CN103136310B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110413826 CN103136310B (en) 2011-12-02 2011-12-02 Method for converting a new query based on the historical queries of an individual website

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110413826 CN103136310B (en) 2011-12-02 2011-12-02 Method for converting a new query based on the historical queries of an individual website

Publications (2)

Publication Number Publication Date
CN103136310A (en) 2013-06-05
CN103136310B (en) 2017-11-28

Family

ID=48496141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110413826 CN103136310B (en) 2011-12-02 2011-12-02 Method for converting a new query based on the historical queries of an individual website

Country Status (1)

Country Link
CN (1) CN103136310B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1761972A (en) * 2003-03-18 2006-04-19 Nhn株式会社 A method of determining an intention of internet user, and a method of advertising via internet by using the determining method and a system thereof
CN1877581A (en) * 2006-07-12 2006-12-13 百度在线网络技术(北京)有限公司 Advertisement display system and method used for Internet search engine
CN1961316A (en) * 2004-05-29 2007-05-09 Nhn株式会社 Method and system for managing the impressing of the search listing based on advertisement group
CN102067105A (en) * 2008-03-18 2011-05-18 雅虎公司 Personalizing sponsored search advertising layout using user behavior history
CN102096882A (en) * 2002-09-24 2011-06-15 Google公司 Methods and apparatus for serving relevant advertisements

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8943043B2 (en) * 2010-01-24 2015-01-27 Microsoft Corporation Dynamic community-based cache for mobile search

Also Published As

Publication number Publication date Type
CN103136310A (en) 2013-06-05 application

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
ASS Succession or assignment of patent right

Owner name: MICROSOFT TECHNOLOGY LICENSING LLC

Free format text: FORMER OWNER: MICROSOFT CORP.

Effective date: 20150727

C41 Transfer of patent application or patent right or utility model
GR01