CN102004782A - Search result sequencing method and search result sequencer - Google Patents

Publication number
CN102004782A
Authority
CN
China
Prior art keywords
search
search results
weight
search engine
ordering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201010559233
Other languages
Chinese (zh)
Inventor
吴明达
冯鑫
张超旭
张雷刚
佟子健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd filed Critical Beijing Sogou Technology Development Co Ltd
Priority to CN 201010559233 priority Critical patent/CN102004782A/en
Publication of CN102004782A publication Critical patent/CN102004782A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a search result ordering method and an ordering apparatus, where the search results come from a plurality of search engines. The method comprises: performing a base ordering of the search results from the plurality of search engines; and correcting and adjusting the base ordering to obtain a final ordering of the search results. The apparatus comprises an ordering module, which performs the base ordering of the search results from the plurality of search engines, and an adjusting module, which corrects and adjusts the base ordering to obtain the final ordering. The invention first orders the search results according to the weights of the search engines and the weights of the ranking positions on each search engine, and then adjusts the base ordering according to co-occurrence information and similar conditions to obtain a second-pass ordering. This makes the basis for ordering more reasonable, so that users receive more accurate search results, the quality of the results improves, and user operation is simplified.

Description

A search result ordering method and a search result ordering apparatus
Technical field
The present invention relates to a method and tool for searching Internet information, and in particular to a search result ordering method and a search result ordering apparatus.
Background
With the development of Internet technology, search engines have steadily improved, and all kinds of information can be obtained from the Internet through them. Search engines are currently one of the main ways for Internet users to obtain information quickly. A user submits a query word (Query) to a search engine, and the search engine returns the search results related to that query word, arranged from most to least relevant.
Existing search engine technology includes using web crawlers to fetch web pages from the Internet, building an index, and providing a query service to users. It also includes crawling, indexing, and searching the data of a specific field and providing a query service over it, for example search engines for specialized fields such as news, music, pictures, video, shopping, and maps.
Traditional search engine technology generally consists of several parts, such as web page crawling, web page processing, and the search service. No search engine can crawl the entire content of the Internet, so each one usually indexes only a subset of it. In addition, traditional web crawlers follow the links between pages, so pages that nothing links to are hard to crawl. Finally, a traditional search engine needs a certain cycle to go from crawling, through index building, to serving queries, so most content cannot be updated in real time.
Meanwhile, a single search engine may not satisfy every search need of every person, or even of one person. In some cases, to obtain comprehensive and accurate search results, people have to search with several search engines and then compare and screen the results themselves. This is cumbersome, lowers search efficiency, and makes searching harder.
Furthermore, each search engine orders its results in its own way, generally by relevance to the search word. Because different search engines have different emphases, both the results they find and the way they judge relevance can differ, so their orderings may differ as well. When the same search word yields substantially different orderings in different search engines, the user has to compare and judge the results of several search engines, which is inconvenient.
Summary of the invention
The technical problem to be solved by the present invention is to provide a search result ordering method and a search result ordering apparatus that can integrate the search results of a plurality of search engines and order them.
To solve the above problem, the invention discloses a search result ordering method in which the search results come from a plurality of search engines, comprising the following steps: performing a base ordering of the search results from the plurality of search engines; and correcting and adjusting the base ordering to obtain a final ordering of the search results.
Preferably, performing the base ordering of the search results from the plurality of search engines comprises: determining the weight of each search engine; determining the ranking position weights on each search engine; and performing the base ordering of the search results according to the search engine weights and the ranking position weights.
Preferably, the step of determining the weight of each search engine further comprises: determining the basic weight of each search engine; and analyzing the category of the query word submitted by the user and, according to the analysis result, adjusting the basic weight of each search engine to obtain the weight of each search engine.
Preferably, the step of determining the weight of each search engine further comprises: determining the basic weight of each search engine; and adjusting the basic weight of each search engine according to the relevance between the query word and that search engine, to obtain the weight of each search engine.
Preferably, correcting and adjusting the base ordering comprises: correcting and adjusting the base ordering according to co-occurrence information of the search results.
Preferably, the co-occurrence information comprises: search result items with the same URL appearing in a plurality of search engines; and/or search result items with identical or similar titles and summaries appearing in a plurality of search engines; and/or search result items belonging to the same website appearing in a plurality of search engines; and/or search result items belonging to the same domain appearing in a plurality of search engines.
Preferably, correcting and adjusting the base ordering further comprises deduplication, which is performed by comparing the similarity of the titles and/or summaries of the search results and/or their URLs.
Preferably, the correction and adjustment further comprises identifying and filtering low-quality search results, where the filtering comprises demotion (weight reduction) or deletion.
Preferably, the quality of a search result is assessed by how well its text covers the query word.
Preferably, the method may further comprise: sending the corrected and adjusted ordering information of the search results to a particular search engine, for use in improving that search engine's ordering of search results.
According to another embodiment of the present invention, a search result ordering apparatus is also disclosed, comprising:
An ordering module, for performing a base ordering of the search results from a plurality of search engines;
An adjusting module, for correcting and adjusting the base ordering to obtain a final ordering of the search results.
Preferably, the search result ordering apparatus further comprises:
A first weight determination module, for determining the weight of each search engine;
A second weight determination module, for determining the ranking position weights on each search engine;
The ordering module performs the base ordering of the search results from the plurality of search engines according to the search engine weights and the ranking position weights.
Preferably, the first weight determination module specifically comprises:
A basic weight acquisition module, for determining the basic weight of each search engine;
A category analysis module, for analyzing the category of the query word submitted by the user and, according to the analysis result, adjusting the basic weight of each search engine to obtain the weight of each search engine.
Preferably, the first weight determination module specifically comprises:
A basic weight acquisition module, for determining the basic weight of each search engine;
A relevance analysis module, for adjusting the basic weight of each search engine according to the relevance between the query word and that search engine, to obtain the weight of each search engine.
Preferably, the search result ordering apparatus further comprises a judging module, for judging the co-occurrence information of the search results and passing the judgment result to the adjusting module.
Preferably, the co-occurrence information comprises: search result items with the same URL appearing in a plurality of search engines; and/or search result items with identical or similar titles and summaries appearing in a plurality of search engines; and/or search result items belonging to the same website appearing in a plurality of search engines; and/or search result items belonging to the same domain appearing in a plurality of search engines.
Preferably, the search result ordering apparatus further comprises:
A deduplication module, for deduplicating the search results; the deduplication is performed by comparing the similarity of the titles and/or summaries of the search results and/or their URLs.
Preferably, the search result ordering apparatus further comprises:
A filtering module, for identifying and filtering low-quality search results, where the filtering comprises demotion (weight reduction) or deletion; the quality of a search result is assessed by how well its text covers the query word.
Preferably, the search result ordering apparatus further comprises:
An ordering information sending module, for sending the corrected and adjusted ordering information of the search results to a particular search engine, for use in improving that search engine's ordering of search results.
Compared with prior art, the present invention has the following advantages:
The search result ordering method and apparatus of the present invention can integrate and order the search results of a plurality of search engines. The invention first performs a base ordering of the search results according to the search engine weights and the ranking position weights on each search engine, and then adjusts the base ordering according to co-occurrence information and similar conditions to obtain a second-pass ordering. This makes the basis for ordering more reasonable, provides the user with more accurate search results, improves result quality, and simplifies user operation.
Further, the invention can first determine a basic weight for each search engine and then adjust that basic weight according to the category or relevance of the user's current query word. This yields more accurate search engine weights for the current query, and therefore more accurately integrated search results for the user.
Description of the drawings
Fig. 1 is a flowchart of embodiment one of the search result ordering method of the present invention;
Fig. 2 is a flowchart of embodiment two of the search result ordering method of the present invention;
Fig. 3 is a schematic diagram of embodiment one of the search result ordering apparatus of the present invention;
Fig. 4 is a schematic diagram of embodiment two of the search result ordering apparatus of the present invention.
Embodiments
To make the above objects, features, and advantages of the present invention more apparent, the invention is described in further detail below with reference to the drawings and specific embodiments.
The search result ordering method of the present invention orders results from a plurality of search engines, so as to provide the user with more accurately integrated search results.
Referring to Fig. 1, embodiment one of the search result ordering method of the present invention is shown, which may comprise the following steps:
Step 101: generate a base ordering of the search results of a plurality of search engines.
Step 102: correct and adjust the base ordering to obtain the final ordering.
The purpose of step 102 is to adjust the base ordering according to the situations that may arise in the search results, so as to obtain a better ordering to offer the user. The simplest examples are operations such as deduplication and demotion.
There are many ways to generate the base ordering. For example, the results can be ordered by the overall ranking of the search engines, such as Baidu > Google > Sogou > Zhongsou, with the results of these engines listed in that sequence. Alternatively, the search engines can be classified, and if the user's query falls into the same category as a search engine, that engine's results are ranked first; for example, if the user searches for music, the order might be Sogou > Baidu > Google. In short, the base ordering can be obtained in various ways, and the invention does not limit this. The embodiments below give more preferred implementations.
As another example, the base ordering can be performed directly on the search results, for instance by keyword hit rate (or hit accuracy): results that hit all keywords exactly are ranked first, and the remaining results are ordered by search engine ranking.
Referring to Fig. 2, embodiment two of the search result ordering method of the present invention is shown. It is more preferred than embodiment one and may specifically comprise the following steps:
Step 1011: determine the weight of each search engine.
At present there are many kinds of search engines, and their search quality differs. The basic weight of each search engine is determined by collecting statistics on, and evaluating, the search results of the different search engines.
In addition to general web search engines, there are search engines for news, music, pictures, video, shopping, and so on, and the result quality of each search engine differs across categories of query word. Some search engines handle English query words better, some handle shopping query words better, and some handle long query words better. For example, searching for a product on a shopping search engine may work well, while searching for a song on the same engine may work poorly.
Therefore, a query word category table can be set up in advance, together with the relevance between each category and each search engine. After a query word is entered, text analysis is performed on it and it is classified using the category table. The weight of each search engine is then adjusted according to the relevance between the query word's category and that search engine, giving the final weight of the search engine.
Note that a query word can belong to more than one category. For example, a query may be both an English query and a long query (one with many words). In that case, the relevance of each search engine to the query word's several categories is combined as a weighted sum.
For example, suppose a query word belongs to two categories, A and B, with weights 40% and 60%. For category A the relevance is 0.8 for engine 1 and 0.4 for engine 2; for category B it is 0.9 for engine 2 and 0.6 for engine 1.
The relevance of this query word to each search engine is then:
Engine 1: 0.8 × 40% + 0.6 × 60% = 0.68
Engine 2: 0.4 × 40% + 0.9 × 60% = 0.70
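The weighted sum above can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the function name `combined_relevance` are assumptions, while the numbers are those of the worked example.

```python
# Category weights and per-category relevance values from the worked example.
# The data layout and names are hypothetical, chosen for illustration.
category_weights = {"A": 0.4, "B": 0.6}
relevance_by_category = {
    "A": {"engine1": 0.8, "engine2": 0.4},
    "B": {"engine1": 0.6, "engine2": 0.9},
}


def combined_relevance(category_weights, relevance_by_category):
    """Weighted sum of per-category relevance for every engine."""
    totals = {}
    for category, weight in category_weights.items():
        for engine, relevance in relevance_by_category[category].items():
            totals[engine] = totals.get(engine, 0.0) + weight * relevance
    return totals


result = combined_relevance(category_weights, relevance_by_category)
for engine, value in sorted(result.items()):
    # Rounded for display, since floating-point sums are not exact.
    print(engine, round(value, 2))
```

Rounded to two decimals, the values agree with the 0.68 and 0.70 computed by hand.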
In other application scenarios of the invention, the user may enter several query words, in which case the query must also be segmented into words. After segmentation, the relevance of each query word to each search engine is obtained first, and these values are then accumulated, by weight or in some other way, into the relevance of each search engine to the whole query. The computation is similar to the weighting above and is well known to those skilled in the art, so it is not repeated here.
In a preferred embodiment of the invention, a more precise engine attribute table can also be used. The engine attribute table characterizes the relevance between each individual query word and each search engine, which is one step more specific than category-level relevance.
Table 1 below gives a concrete example.
Table 1
Query word | Engine 1 (relevance) | Engine 2 (relevance)
Mobile phone | search.taobao.com (0.8) | so.youku.com (0.4)
Thinking in C++ | search.dangdang.com (0.7) | search.taobao.com (0.4)
Sogou input method | www.gougou.com (0.7) | www.skycn.com (0.5)
For example, the query word "mobile phone" has a high relevance of 0.8 to the Taobao search engine and a lower relevance of 0.4 to Youku search. The basic weight of each search engine is then adjusted according to the relevance between the current query word and that search engine, giving the weight of each search engine.
The relevance parameters above can be obtained from statistics. In a concrete implementation, the query word distribution of each search engine is collected first, then the click distribution of each query word over the search engines; from these statistics a relevance vector between each query word and each search engine can be computed. For example, first collect the query words that users search for on www.taobao.com, www.gougou.com, and so on. Then, for a particular full-text search engine, collect the users' query words and the corresponding click logs, such as a user who searches for "clothes" on www.sogou.com and clicks a link to a page under the www.taobao.com website. This yields, for each query word, the query distribution statistics on each target search engine, and the click distribution statistics, for queries issued on the particular search engine, over the target search engines or websites.
Note that www.gougou.com is itself a search engine, whereas www.taobao.com, www.skycn.com, and the like are websites that contain a search engine. For convenience of description, this document uses www.taobao.com, www.skycn.com, and so on directly to denote those search engines.
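As a rough illustration of deriving relevance from click statistics, the Python sketch below takes the relevance of a site to a query word to be that site's share of the clicks recorded for the query. The text does not specify the formula, so the click-share definition, the log format, and the function name `relevance_from_clicks` are all assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical click log: (query word, site whose result the user clicked).
click_log = [
    ("clothes", "www.taobao.com"),
    ("clothes", "www.taobao.com"),
    ("clothes", "www.gougou.com"),
    ("mobile phone", "www.taobao.com"),
]


def relevance_from_clicks(click_log):
    """Return {query: {site: share of that query's clicks}}.

    Using the click share as the relevance is an assumption of this sketch.
    """
    clicks = defaultdict(Counter)
    for query, site in click_log:
        clicks[query][site] += 1
    relevance = {}
    for query, counter in clicks.items():
        total = sum(counter.values())
        relevance[query] = {site: n / total for site, n in counter.items()}
    return relevance


relevance = relevance_from_clicks(click_log)
```

A production system would presumably combine this with the query distribution statistics as well; the sketch covers only the click side.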
To summarize: first, the search quality of the search sources varies. The basic weight of a search source (SEBW, SE-Base-Weight) measures the relative quality of that source. It is obtained in advance by some method of evaluating search source quality and serves as one of the input parameters of the ordering apparatus.
Second, within the series of results returned by a search source, results at different positions differ in quality. On the whole, results ranked nearer the top are of higher quality than those ranked lower. The ranking position weight (RPW, Rank-Position-Weight) measures the relative expected quality of the result at each ranking position. It is obtained in advance by some method of evaluating search result quality and serves as one of the input parameters of the ordering apparatus.
As the description above shows, these two weights are independent of the particular query word the user submits. In the preferred implementations of the invention, however, the basic weight of each search engine must be adjusted according to the query word the user submits. Referring to Fig. 3, which shows the design idea of the invention: the search results obtained from each search source are integrated using the query word, the basic weight of each search source, and the ranking position weights, and the reordered search results are offered to the user.
Step 1012: determine the ranking position weights on each search engine.
Because search engines have different emphases, they weigh different factors when ordering results. In general, results ranked near the top are of higher quality than those ranked lower. For example, a shopping search engine increases the weight of product-related results and reduces the weight of others, so in the results it returns, those at the top are usually product-related. The ranking position weights on each search engine are therefore determined by collecting statistics on, and evaluating, the ordering of each search engine's results.
For example, the ranking position weights of search engine A are, in order: 1st: 0.98; 2nd: 0.89; 3rd: 0.89; 4th: 0.80; 5th: 0.60; 6th: 0.30.
The ranking position weights of search engine B are, in order: 1st: 0.98; 2nd: 0.96; 3rd: 0.90; 4th: 0.85; 5th: 0.85; 6th: 0.85.
It will be appreciated that the basic weight of each search engine and the ranking position weights on it are derived in advance by statistics and evaluation of the search engines, and need not be recomputed in subsequent operation.
Step 1013: generate the base ordering according to the search engine weights and the ranking position weights.
Every search engine returns search results and orders them, so each search result item carries the following information: which search engine it came from, and its ranking position in that engine's results. Using the search engine weights and the ranking position weights determined in the preceding steps, all search results can be arranged to obtain the base ordering.
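A minimal Python sketch of step 1013 follows. It assumes the score of a result is the product of its engine's weight and the weight of its ranking position, uses the example position weights given for engines A and B, and invents the engine weights and URLs purely for illustration.

```python
# Position weights are the example values given for engines A and B.
position_weights = {
    "A": [0.98, 0.89, 0.89, 0.80, 0.60, 0.30],
    "B": [0.98, 0.96, 0.90, 0.85, 0.85, 0.85],
}
# Hypothetical engine weights, chosen only for illustration.
engine_weights = {"A": 1.0, "B": 0.95}


def base_ordering(results, engine_weights, position_weights):
    """results: list of (engine, rank_pos, url), with rank_pos starting at 1."""
    scored = []
    for engine, rank_pos, url in results:
        score = engine_weights[engine] * position_weights[engine][rank_pos - 1]
        scored.append((score, engine, rank_pos, url))
    # Highest score first; the sort is stable, so ties keep their input order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


# Hypothetical result items from the two engines.
results = [
    ("A", 1, "http://example.com/a1"),
    ("A", 2, "http://example.com/a2"),
    ("B", 1, "http://example.com/b1"),
    ("B", 2, "http://example.com/b2"),
]
ordered = base_ordering(results, engine_weights, position_weights)
```

With these illustrative weights the results of the two engines interleave: both of B's results land between A's first and second result, because B's position weights decay more slowly.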
Step 102: correct and adjust the base ordering to obtain the final ordering.
The purpose of step 102 is to adjust the base ordering according to the situations that may arise in the search results, so as to obtain a better ordering to offer the user.
There are many ways to correct and adjust. The invention describes several feasible preferred examples below: adjustment based on co-occurrence information and adjustment based on text coverage.
The same search result item may appear in the results of several search engines. This information about simultaneous appearance is defined here as co-occurrence information, and it can be used to adjust ranking positions. For instance, when the same search result item appears in two or more search engines, the confidence in that item can be raised accordingly.
Whether search result items are the same can be determined by one of the following criteria or a combination of several.
For example, if search results with the same URL appear in several search engines, the result is determined to be a co-occurrence item. Alternatively, if search results with identical or similar titles and summaries appear in several search engines, the result can also be determined to be a co-occurrence item. Alternatively, if search results belonging to the same website or the same domain appear in several search engines, the result can likewise be determined to be a co-occurrence item. According to the co-occurrence information in the search results, the score of each search result item is adjusted and the results are rearranged to obtain the final ordering.
Co-occurrence information indirectly indicates how reliable each search result is. The richer a result's co-occurrence with other results, the more reliable its quality can be considered, and the score of such a result can be raised appropriately.
Those skilled in the art will appreciate that combining the various kinds of co-occurrence information above can further improve the ordering.
When the same search result appears in several search engines, the identical results occupy several ranking positions without giving the user any additional information. They therefore need to be deduplicated, usually keeping only one result for identical content. Each search source generally deduplicates its own results; the deduplication here mainly targets results from different search sources.
Further, to improve search quality, this method embodiment also identifies and filters out the lower-quality part of the search results.
In general, a search engine marks (for example, highlights in red) the words or characters of the user's query word that appear in the title and summary of a search result. This can be used to determine how well the text of a result covers the query word, and thus to judge the result's quality. Results with low text coverage are demoted or filtered accordingly.
Further, because the ordering that the invention computes for a given query word draws on several search engines and is a better ordering, once the results for a query word have been ordered, the ordering information for that query word can be sent to one or more particular search engines for their use.
When such a search engine searches in response to a user request, it can adjust its own search results by referring to the adjusted ordering information it received, thereby optimizing the ordering of a single search engine's results and improving search quality.
The method is described in detail below with a concrete example.
Suppose there are M search engines, whose basic weights SEBW are represented as a one-dimensional array:
$$SEBW = [\,se_1 = 1.0,\ se_2 = 1.15,\ se_3 = 1.10,\ \ldots,\ se_M = 1.0\,]$$
The category set QueryClassSet of query words is defined as:
$$QueryClassSet = \{\text{Class-A},\ \text{Class-B},\ \text{Class-C},\ \ldots,\ \text{Class-N}\}$$
Suppose that analyzing the query word query against the category set yields the vector:
$$ClassVector(query) = [\,0,\ 0.6,\ 0,\ \ldots,\ 0.4\,]$$
Suppose the adaptation matrix $CM_{M \times N}$ of the M search engines to the N query word categories, that is, how well each search engine handles each category, is:
$$CM_{M \times N} = \begin{bmatrix} 1.0 & 1.5 & 1.2 & \cdots & 1.0 \\ 1.3 & 1.2 & 1.0 & \cdots & 1.0 \\ 1.0 & 1.1 & 1.0 & \cdots & 1.2 \end{bmatrix}$$
The adjustment weight vector SEAW of the search engine weights is then:
$$SEAW(query) = ClassVector(query) \times CM_{M \times N}^{T}$$
The final weight SEW of each search engine is the sum of its basic weight and its adjustment weight:
$$SEW = SEBW + SEAW$$
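The two formulas can be checked with a small Python sketch. Because the example arrays are elided, the instance below truncates them to M = 3 engines and N = 4 categories using only the values that are printed; this truncation and the function names are assumptions made for illustration.

```python
# Truncated, hypothetical instance: M = 3 engines, N = 4 categories,
# built from the values visible in the example (elided entries dropped).
SEBW = [1.0, 1.15, 1.10]
class_vector = [0.0, 0.6, 0.0, 0.4]
CM = [
    [1.0, 1.5, 1.2, 1.0],
    [1.3, 1.2, 1.0, 1.0],
    [1.0, 1.1, 1.0, 1.2],
]


def adjustment_weights(class_vector, cm):
    """SEAW = ClassVector x CM^T, i.e. one dot product per engine row."""
    return [sum(c * a for c, a in zip(class_vector, row)) for row in cm]


def final_weights(sebw, seaw):
    """SEW = SEBW + SEAW, element by element."""
    return [base + adjust for base, adjust in zip(sebw, seaw)]


SEAW = adjustment_weights(class_vector, CM)
SEW = final_weights(SEBW, SEAW)
print([round(w, 2) for w in SEAW])
print([round(w, 2) for w in SEW])
```

Multiplying the 1 × N class vector by the transpose of the M × N matrix gives a 1 × M vector, one adjustment weight per engine, which is why it can be added to SEBW element by element.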
Suppose the ranking position weights RPW of the M search engines, obtained by statistics and evaluation, are:
$$RPW = [\,1.0,\ 0.95,\ 0.92,\ 0.90,\ 0.88,\ 0.86,\ \ldots\,]$$
Each search result carries the following information: the search engine it came from (se), and its ranking position on that search engine (rank_pos).
The score Score of each search result is then calculated as:
$$Score(Snippet_{se,\,rank\_pos}) = SEW[se] \times RPW[rank\_pos]$$
Computing a score for every search result gives a preliminary estimate of its quality. The search results are then arranged from highest to lowest score to form the base ordering.
Suppose the search result for a certain URL appears in K search engines at the same time, with the following ranking positions in each search engine:
$$rank\_pos_{se_1},\ rank\_pos_{se_2},\ \ldots,\ rank\_pos_{se_K}$$
The score of the search result for this URL can be adjusted to:
$$Score(URL) = \frac{\sum_{i=1}^{K} rank\_pos_{se_i}}{K} + d$$
Here d weights the reliability of the search result: the larger K is, the larger d is. Usually only one of several search results with the same URL is kept. After the scores of these search results have been calculated, all search results are adjusted and rearranged to obtain the second-pass ordering.
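The following Python sketch illustrates the co-occurrence adjustment under stated assumptions. It assumes, as an interpretation rather than something the text states, that the quantity averaged over the K engines is the result's base score on each engine (SEW × RPW) and not the raw position index, and that d grows linearly with K. The function name and the `bonus_per_extra_engine` parameter are hypothetical.

```python
from collections import defaultdict


def merge_cooccurring(scored_results, bonus_per_extra_engine=0.05):
    """Merge results that share a URL and boost them by co-occurrence.

    scored_results: list of (url, engine, base_score); each engine is
    assumed to list a given URL at most once.
    Returns [(url, adjusted_score)], best first, one entry per URL.
    """
    by_url = defaultdict(list)
    for url, _engine, score in scored_results:
        by_url[url].append(score)
    merged = []
    for url, scores in by_url.items():
        k = len(scores)
        # d grows with K; the linear form is an assumption of this sketch.
        d = bonus_per_extra_engine * (k - 1)
        merged.append((url, sum(scores) / k + d))
    merged.sort(key=lambda item: item[1], reverse=True)
    return merged


# Hypothetical base scores: u1 is returned by two engines, u2 by one.
scored = [("u1", "A", 0.90), ("u1", "B", 0.80), ("u2", "A", 0.88)]
merged = merge_cooccurring(scored)
```

In this example the result seen by two engines overtakes a result with a higher single-engine score, and only one entry per URL survives, which also covers the "keep only one" rule for identical URLs.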
In addition, in some cases the text of the titles and summaries of the search results is also needed to merge and deduplicate them. Suppose there are two search results a and b; their similarity is computed from their titles and summaries as follows:
$$Similarity(snippet_a, snippet_b) = \big(ED(title_a, title_b) \times \alpha + ED(summary_a, summary_b) \times \beta\big) \times same\_site\_weight$$
Here $ED(text_a, text_b)$ denotes the edit distance between $text_a$ and $text_b$, normalized to the interval [0, 1]; same_site_weight is the weight coefficient for whether the two results come from the same website; and α and β are the weights of the title similarity and the summary similarity respectively.
When the Similarity of two search results reaches a certain threshold, they are considered duplicates, and one of them can be deleted to obtain the final search results.
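A small Python sketch of this duplicate test is given below. Since a larger raw edit distance means less similar text, the sketch assumes that the normalized term is used as a similarity, namely 1 minus the edit distance divided by the longer length. That conversion, the default values of alpha, beta, same_site_weight, and the threshold, and all function names are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed with two rolling rows."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def text_similarity(a, b):
    """1 - normalized edit distance, so identical texts give 1.0."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def similarity(a, b, alpha=0.5, beta=0.5, same_site_weight=1.0):
    """Weighted title and summary similarity, scaled by the site weight."""
    return (text_similarity(a["title"], b["title"]) * alpha
            + text_similarity(a["summary"], b["summary"]) * beta
            ) * same_site_weight


def is_duplicate(a, b, threshold=0.9, **weights):
    """Two results count as duplicates once Similarity reaches the threshold."""
    return similarity(a, b, **weights) >= threshold
```

A caller would typically pass a larger `same_site_weight` when both results come from the same website, so that near-identical pages on one site are merged more readily.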
Further, the quality of a search result can be assessed by how well its text covers the query terms. Lower-quality results are demoted or filtered, that is, moved down in the ranking or deleted. Demotion lowers the score of the result. In general, the lower the text coverage, the poorer the quality.
The sets of web pages indexed by the individual search sources differ. For some queries, a given search source may index essentially no pages containing relevant information, so its overall search quality for that query is very poor. A search source may also return a few results of particularly low quality for some queries because of defects in its ranking algorithm. Identifying and filtering low-quality search results therefore helps improve the overall quality of the results.
The highlighting information of a search result reflects, to some extent, how well its text covers the query terms, and can provide evidence for identifying low-quality results. Highlighting means that the search engine usually displays, in a red font, the terms of the user's query that appear in the title and summary of a search result. For example, for the query "song of Zhou Jielun", if the two terms "Zhou Jielun" and "song" appear in the title and summary of a result, they are usually shown in red.
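As a rough illustration of coverage-based demotion and filtering, the sketch below measures coverage as the fraction of query terms found in a result's title and summary, which approximates the highlighting signal described above. The thresholds `drop_below` and `demote_below`, the `penalty` factor, and the dictionary layout of a result are assumptions made for the example, not values from the text.

```python
def coverage(query_terms, title, summary):
    """Fraction of query terms that occur in the title or summary."""
    if not query_terms:
        return 0.0
    text = f"{title} {summary}".lower()
    hit = sum(1 for term in query_terms if term.lower() in text)
    return hit / len(query_terms)


def demote_or_filter(results, query_terms,
                     drop_below=0.2, demote_below=0.6, penalty=0.5):
    """Delete results with very low coverage, lower the score of results
    with mediocre coverage, then re-sort by score (highest first)."""
    out = []
    for r in results:
        c = coverage(query_terms, r["title"], r["summary"])
        if c < drop_below:
            continue  # filtered out entirely
        score = r["score"] * penalty if c < demote_below else r["score"]
        out.append({**r, "score": score, "coverage": c})
    return sorted(out, key=lambda r: r["score"], reverse=True)
```

A real implementation would more likely read the highlighted spans returned by each engine instead of re-matching terms, but the demotion logic would be the same.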
Applying both the co-occurrence-based adjustment and the text-coverage-based adjustment yields better ranking accuracy.
The present invention also provides a search result sequencer comprising a sorting module and an adjustment module. The sorting module arranges the search results to obtain the base ranking; the adjustment module adjusts the search results according to other factors to obtain the final ranking. These other factors include, for example, raising the weight of related search results according to co-occurrence information.
Referring to Fig. 4, a preferred embodiment of the search result sequencer of the present invention is shown. It may comprise a search engine weight determination module 10, a search engine rank position weight determination module 20, a sorting module 30, and an adjustment module 40.
The search engine weight determination module 10 determines the weight of each search engine. In a preferred implementation, the base weight of each search engine is adjusted in real time according to the query submitted by the user, giving the weight of each search engine.
The base weight of a search engine is derived from statistics on, and evaluation of, the quality of that engine's search results, and can be supplied to the search engine weight determination module as a preset parameter. The search engine weight determination module 10 can also recalculate the weight of each search engine according to the relevance between the user's query and that search engine, giving the final weight of the search engine.
The search engine rank position weight determination module 20 determines the rank position weights on each search engine. These weights are likewise derived from statistics on, and evaluation of, each engine's search results, and are supplied to module 20 as preset parameters.
The attribute information of each search result retrieved from a search engine can include the search engine it came from and its rank position on that engine. Using this information together with the final weight of each search engine and the rank position weights on that engine, the sorting module 30 arranges the search results to obtain the base ranking.
The adjustment module 40 adjusts the search results according to other factors to obtain the final ranking. These factors include, for example, raising the weight of related search results according to co-occurrence information.
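The role of the sorting module can be sketched as follows. This passage does not state how the engine weight and the rank position weight are combined, so the product used here is purely an assumption for illustration; the `Hit` class, the `base_ranking` function, and the weight tables are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    url: str
    engine: str    # search engine the result came from
    position: int  # 1-based rank position on that engine


def base_ranking(hits, engine_weight, position_weight):
    """Score each hit as engine weight times rank position weight
    (an assumed combination) and sort from highest to lowest score.
    engine_weight: {engine: weight}
    position_weight: {engine: [weight for position 1, position 2, ...]}"""
    scored = [
        (engine_weight[h.engine] * position_weight[h.engine][h.position - 1], h)
        for h in hits
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

The list of (score, hit) pairs produced here is what an adjustment step would then modify, for example by boosting URLs that several engines returned.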
In some preferred embodiments of the present invention, the search engine weight determination module 10 may further comprise: a base weight acquisition module, which determines the base weight of each search engine; and a category analysis module, which analyzes the category of the query submitted by the user and, according to the analysis result, adjusts the base weight of each search engine to obtain the weight of each search engine.
Alternatively, the search engine weight determination module 10 may further comprise: a base weight acquisition module, which determines the base weight of each search engine; and a relevance analysis module, which adjusts the base weight of each search engine according to the relevance between the query and that search engine to obtain the weight of each search engine. Since this content has been described in detail above, it is not repeated here.
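One way to picture the category analysis variant is a lookup of per-category boost factors applied to the base weights. Everything specific in this sketch is assumed for illustration: the keyword-based `classify_query`, the multiplicative `boost` table, and the category and engine names.

```python
def classify_query(query, keywords_by_category):
    """Toy category analysis: return the first category with a matching keyword."""
    q = query.lower()
    for category, keywords in keywords_by_category.items():
        if any(k in q for k in keywords):
            return category
    return None


def engine_weights(query, base_weight, boost, keywords_by_category):
    """Adjust each engine's base weight by a per-category factor.
    base_weight: {engine: weight}
    boost: {category: {engine: factor}}; missing entries mean factor 1.0."""
    category = classify_query(query, keywords_by_category)
    factors = boost.get(category, {})
    return {engine: w * factors.get(engine, 1.0)
            for engine, w in base_weight.items()}
```

A query that matches no category leaves the base weights unchanged, so the adjustment is an optional refinement on top of the preset parameters.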
When multiple search engines are searched, search results with co-occurrence information may appear in several of them. In general, therefore, a judgment module can also be added to the search result sequencer 100. It judges whether search results with the same URL, website, or domain, or with identical or similar titles and summaries, appear in multiple search engines, and passes the judgment to the adjustment module 40.
Further, multiple search results with co-occurrence information can occupy several rank positions without helping the user obtain information, so the search result sequencer 100 can also comprise a deduplication module. By comparing the textual similarity of titles and summaries, whether the URLs are equal, and so on, it merges and deduplicates identical search results that appear in multiple search engines, usually keeping only one.
In a preferred embodiment of the present invention, the search result sequencer 100 can also comprise a filtering module for identifying and filtering low-quality search results, where filtering comprises demotion or deletion; the quality of a search result is assessed by the degree to which its text covers the query terms.
The search result sequencing method and search result sequencer of the present invention integrate and rank the search results of multiple search engines. After the search results are given a base ranking according to the search engine weights and the rank position weights on each engine, the base ranking is adjusted according to weights such as co-occurrence information to obtain a secondary ranking. This makes the basis for ranking more reasonable, provides the user with more accurate search results, improves the quality of the search results, and simplifies the user's operations.
Further, the search result sequencing method and search result sequencer of the present invention can be implemented on the server side; preferably, however, the invention is implemented on the client, where its effect is more pronounced.
In the client-side implementation, the client directly initiates the search request, connects to multiple search engines, obtains the search results, and performs operations such as ranking and adjustment.
Moving the complex ranking computation and the bandwidth consumption to the user's terminal makes effective use of the user's terminal resources, reduces the load on server resources, and improves search speed and efficiency. Moreover, the server side is limited in the number of concurrent user query requests it can handle, whereas client-side processing serves only the current user, that is, it is dedicated to each user, which fully solves the concurrency problem.
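The client-side flow of querying several engines at once can be sketched with a thread pool. The `fetch_all` function and the idea of passing each engine as a callable are assumptions for illustration; a real client would wrap each engine's actual request and result parsing in such a callable.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(query, engines):
    """Query every engine concurrently from the client.
    engines: {name: callable(query) -> list of results}
    Returns {name: results} once every engine has answered."""
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in engines.items()}
        return {name: future.result() for name, future in futures.items()}
```

The per-engine result lists returned here would then feed the base ranking and adjustment steps on the same client, so no server-side aggregation is needed.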
Because the device embodiment is substantially similar to the method embodiment, it is described only briefly; for the relevant details, refer to the corresponding parts of the description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are identical or similar the embodiments can be referred to one another.
The search result sequencing method and search result sequencer provided by the present invention have been described in detail above. Specific examples have been used to explain the principle and embodiments of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of the invention, make changes to the specific embodiments and the scope of application. In summary, the content of this description should not be construed as limiting the present invention.

Claims (19)

1. A search result sequencing method, characterized in that the search results come from a plurality of search engines, the method comprising the following steps:
performing a base ranking of the search results that come from the plurality of search engines;
correcting and adjusting the base ranking to obtain a final ranking of the search results.
2. The method of claim 1, characterized in that performing the base ranking of the search results that come from the plurality of search engines comprises:
determining the weight of each search engine;
determining the rank position weights on the search engines;
performing the base ranking of the search results that come from the plurality of search engines according to the search engine weights and the rank position weights.
3. The method of claim 2, characterized in that the step of determining the weight of each search engine further comprises:
determining the base weight of each search engine;
analyzing the category of the query submitted by the user and, according to the analysis result, adjusting the base weight of each search engine to obtain the weight of each search engine.
4. The method of claim 2, characterized in that the step of determining the weight of each search engine further comprises:
determining the base weight of each search engine;
adjusting the base weight of each search engine according to the relevance between the query and each search engine to obtain the weight of each search engine.
5. The method of claim 1, characterized in that correcting and adjusting the base ranking comprises:
correcting and adjusting the base ranking according to co-occurrence information of the search results.
6. The method of claim 5, characterized in that the co-occurrence information comprises:
search result items with the same URL appearing in a plurality of search engines;
and/or search result items with identical or similar titles and summaries appearing in a plurality of search engines;
and/or search result items belonging to the same website appearing in a plurality of search engines;
and/or search result items belonging to the same domain appearing in a plurality of search engines.
7. The method of any one of claims 3 to 5, characterized in that correcting and adjusting the base ranking further comprises:
performing deduplication, the deduplication being carried out by comparing the similarity of the titles and/or summaries of the search results and/or their URLs.
8. The method of claim 1, characterized in that the correcting and adjusting further comprises:
identifying and filtering low-quality search results, the filtering comprising demotion or deletion.
9. The method of claim 8, characterized in that the quality of a search result is assessed by the degree to which its text covers the query terms.
10. The method of claim 1, characterized in that it further comprises:
sending the ranking information of the corrected and adjusted search results to a particular search engine, for use in improving the search result ranking of that search engine.
11. A search result sequencer, characterized in that it comprises:
a sorting module, which performs a base ranking of search results that come from a plurality of search engines;
an adjustment module, which corrects and adjusts the base ranking to obtain a final ranking of the search results.
12. The search result sequencer of claim 11, characterized in that the search result sequencer further comprises:
a first weight determination module, which determines the weight of each search engine;
a second weight determination module, which determines the rank position weights on each search engine,
wherein the sorting module performs the base ranking of the search results that come from the plurality of search engines according to the search engine weights and the rank position weights.
13. The search result sequencer of claim 12, characterized in that the first weight determination module specifically comprises:
a base weight acquisition module, which determines the base weight of each search engine;
a category analysis module, which analyzes the category of the query submitted by the user and, according to the analysis result, adjusts the base weight of each search engine to obtain the weight of each search engine.
14. The search result sequencer of claim 12, characterized in that the first weight determination module specifically comprises:
a base weight acquisition module, which determines the base weight of each search engine;
a relevance analysis module, which adjusts the base weight of each search engine according to the relevance between the query and each search engine to obtain the weight of each search engine.
15. The search result sequencer of claim 11, characterized in that the search result sequencer further comprises a judgment module, which judges the co-occurrence information of the search results and sends the result of the judgment to the adjustment module.
16. The search result sequencer of claim 15, characterized in that the co-occurrence information comprises:
search result items with the same URL appearing in a plurality of search engines;
and/or search result items with identical or similar titles and summaries appearing in a plurality of search engines;
and/or search result items belonging to the same website appearing in a plurality of search engines;
and/or search result items belonging to the same domain appearing in a plurality of search engines.
17. The search result sequencer of any one of claims 13 to 15, characterized in that the search result sequencer further comprises:
a deduplication module, which deduplicates the search results, the deduplication being carried out by comparing the similarity of the titles and/or summaries of the search results and/or their URLs.
18. The search result sequencer of claim 11, characterized in that the search result sequencer further comprises:
a filtering module, which identifies and filters low-quality search results, the filtering comprising demotion or deletion; the quality of a search result is assessed by the degree to which its text covers the query terms.
19. The search result sequencer of claim 11, characterized in that it further comprises:
a ranking information sending module, which sends the ranking information of the corrected and adjusted search results to a particular search engine, for use in improving the search result ranking of that search engine.
CN 201010559233 2010-11-25 2010-11-25 Search result sequencing method and search result sequencer Pending CN102004782A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010559233 CN102004782A (en) 2010-11-25 2010-11-25 Search result sequencing method and search result sequencer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010559233 CN102004782A (en) 2010-11-25 2010-11-25 Search result sequencing method and search result sequencer

Publications (1)

Publication Number Publication Date
CN102004782A true CN102004782A (en) 2011-04-06

Family

ID=43812144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010559233 Pending CN102004782A (en) 2010-11-25 2010-11-25 Search result sequencing method and search result sequencer

Country Status (1)

Country Link
CN (1) CN102004782A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100153357A1 (en) * 2003-06-27 2010-06-17 At&T Intellectual Property I, L.P. Rank-based estimate of relevance values
CN101233513A (en) * 2005-07-29 2008-07-30 雅虎公司 System and method for reordering a result set
CN101751434A (en) * 2008-12-16 2010-06-23 北大方正集团有限公司 Meta search engine ranking method and Meta search engine

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102841904A (en) * 2011-06-24 2012-12-26 阿里巴巴集团控股有限公司 Searching method and searching device
CN102841904B (en) * 2011-06-24 2016-05-04 阿里巴巴集团控股有限公司 A kind of searching method and equipment
US9262513B2 (en) 2011-06-24 2016-02-16 Alibaba Group Holding Limited Search method and apparatus
CN103092839A (en) * 2011-10-28 2013-05-08 腾讯科技(深圳)有限公司 Management method and device for recording historical information
CN102902755A (en) * 2012-09-21 2013-01-30 北京百度网讯科技有限公司 Method and device for adjusting sequencing of search result items
CN102902806A (en) * 2012-10-17 2013-01-30 深圳市宜搜科技发展有限公司 Method and system for performing inquiry expansion by using search engine
CN102902806B (en) * 2012-10-17 2016-02-10 深圳市宜搜科技发展有限公司 A kind of method and system utilizing search engine to carry out query expansion
CN102890725B (en) * 2012-11-02 2015-08-19 瑞庭网络技术(上海)有限公司 The result ordering method of search engine
CN102890725A (en) * 2012-11-02 2013-01-23 瑞庭网络技术(上海)有限公司 Result ranking method for search engine
CN104077306B (en) * 2013-03-28 2018-05-11 阿里巴巴集团控股有限公司 The result ordering method and system of a kind of search engine
CN104077306A (en) * 2013-03-28 2014-10-01 阿里巴巴集团控股有限公司 Search engine result sequencing method and search engine result sequencing system
CN105247517A (en) * 2013-04-23 2016-01-13 谷歌公司 Ranking signals in mixed corpora environments
CN105247517B (en) * 2013-04-23 2019-05-14 谷歌有限责任公司 Mix the ranking signal in corpus lab environment
CN104516887B (en) * 2013-09-27 2019-08-30 腾讯科技(深圳)有限公司 A kind of web data searching method, device and system
CN104516887A (en) * 2013-09-27 2015-04-15 腾讯科技(深圳)有限公司 Webpage data search method, device and system
CN104572717A (en) * 2013-10-18 2015-04-29 腾讯科技(深圳)有限公司 Information searching method and device
CN104636383A (en) * 2013-11-14 2015-05-20 腾讯科技(深圳)有限公司 Method and device for achieving comparison searching
CN104636383B (en) * 2013-11-14 2019-09-20 腾讯科技(深圳)有限公司 A kind of method and apparatus for realizing comparison search
WO2015154679A1 (en) * 2014-04-08 2015-10-15 北京奇虎科技有限公司 Method and device for ranking search results of multiple search engines
CN103870607A (en) * 2014-04-08 2014-06-18 北京奇虎科技有限公司 Sequencing method and device of search results of multiple search engines
CN105335373A (en) * 2014-06-17 2016-02-17 阿里巴巴集团控股有限公司 Information searching method and apparatus
CN108140029A (en) * 2015-09-18 2018-06-08 三星电子株式会社 The automatic depth that stacks checks card
CN105302898A (en) * 2015-10-23 2016-02-03 天津车之家科技有限公司 Click model-based searching and ranking method and device
CN105302898B (en) * 2015-10-23 2019-02-19 车智互联(北京)科技有限公司 A kind of search ordering method and device based on click model
CN105849730A (en) * 2016-03-25 2016-08-10 马岩 Data capture method and system
WO2017161578A1 (en) * 2016-03-25 2017-09-28 马岩 Method and system for data capturing
WO2018027927A1 (en) * 2016-08-12 2018-02-15 深圳市博信诺达经贸咨询有限公司 Webpage data searching method and system
WO2018032254A1 (en) * 2016-08-15 2018-02-22 马岩 Method and system for fetching trusted video in big data
WO2018032251A1 (en) * 2016-08-15 2018-02-22 马岩 Method and system for applying security level to data fetching of big data
WO2018032249A1 (en) * 2016-08-15 2018-02-22 马岩 Audio data fetching method and system
WO2018032253A1 (en) * 2016-08-15 2018-02-22 马岩 Secure search method and system for big data of images
WO2018032247A1 (en) * 2016-08-15 2018-02-22 马岩 Search method and system for big data of videos
WO2018032246A1 (en) * 2016-08-15 2018-02-22 马岩 Search method and system for big data in local area network
CN106294807A (en) * 2016-08-15 2017-01-04 马岩 The searching method of big data and system in LAN
CN106709353A (en) * 2016-10-27 2017-05-24 腾讯科技(深圳)有限公司 Safety detection method and device of search engine
CN108009235A (en) * 2017-11-29 2018-05-08 福建中金在线信息科技有限公司 Data capture method and device
CN108334575A (en) * 2018-01-23 2018-07-27 北京三快在线科技有限公司 A kind of recommendation results sequence modification method and device, electronic equipment
CN108334575B (en) * 2018-01-23 2022-04-26 北京三快在线科技有限公司 Recommendation result sorting correction method and device and electronic equipment
CN108573067A (en) * 2018-04-27 2018-09-25 福建江夏学院 A kind of the matching search system and method for merchandise news
CN110413763A (en) * 2018-04-30 2019-11-05 国际商业机器公司 Automatic selection of search ranker
CN109474832A (en) * 2018-11-28 2019-03-15 深圳市酷开网络科技有限公司 A kind of information search sort method, intelligent terminal and storage medium
CN109474832B (en) * 2018-11-28 2021-02-02 深圳市酷开网络科技有限公司 Information searching and sorting method, intelligent terminal and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20110406