CN105808601A - Calculation method and device for evaluating recording loss of search engine - Google Patents

Info

Publication number
CN105808601A
Authority
CN
China
Prior art keywords
search engine
query result
query
marking
page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410854198.XA
Other languages
Chinese (zh)
Other versions
CN105808601B (en)
Inventor
陶哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201410854198.XA priority Critical patent/CN105808601B/en
Publication of CN105808601A publication Critical patent/CN105808601A/en
Application granted granted Critical
Publication of CN105808601B publication Critical patent/CN105808601B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a calculation method for evaluating the resource recording loss of a search engine. The method comprises the following steps: extracting one or more query strings in a target search engine; performing relevance scoring on the query results of each query string, and determining a first theoretical maximum score value S2 of the first-page query results of each query string; determining, according to the relevance scores of the query results of each query string in M reference search engines, a second theoretical maximum score value S2' of the first-page query results of each query string; and calculating, according to the first theoretical maximum score value S2 and the second theoretical maximum score value S2', the resource recording loss S1 of each query string in the target search engine. The method evaluates the resource recording loss of the target search engine more realistically, objectively and accurately, which helps guide subsequent work, improve the resource recording of the search engine and provide a better search engine.

Description

Calculation method and device for evaluating the resource recording loss of a search engine
Technical field
The present invention relates to the field of search technology, and in particular to a calculation method and device for evaluating the resource recording loss of a search engine.
Background art
With the rapid development of Internet technology, Internet data has grown explosively, and for ordinary users finding the information they need on the Internet is like looking for a needle in a haystack. Search services arose in response. By improving search engine technology, web pages are ranked as a whole according to factors such as page relevance and site authority, so that when a user searches for information, relevant, high-quality search results are presented first.
In traditional search engine relevance evaluation, the top N results of each sampled query string (Query) are usually evaluated. For images, for example, the first 21 images are evaluated and scored individually, and the score of each query string is compared with its score from the previous evaluation or with the score of a competing product. For the 21 images evaluated for a query string, three reasons why the score falls short of full marks usually receive attention, as shown in Fig. 1:
First, the query string itself has no resources. For example, a Query with no obvious meaning such as "XXXsdfdsf123" may simply have no relevant resources; this is the resource absence S0 of the Query itself shown in Fig. 1.
Second, the query string suffers a recording loss: not all valid data has been indexed. This is the recording loss S1 shown in Fig. 1.
Third, the valid data recorded for the query string is not ranked high enough; this is the ranking loss S2.
The resources of the search engine can be evaluated to some extent according to S1 and S2, yielding parameters such as the Rank score R1 or the Rank loss R0.
Existing evaluation methods generally evaluate the relevance of a search engine as a whole. Although this allows the quality of search engines to be compared to some extent, it does not evaluate a search engine in its own right. Moreover, because the quantity of resources is uncertain, it usually cannot be determined how many resources should exist for a given query string. As a result, there is currently no comparatively effective academic method for assessing recording loss, the recording loss of a search engine cannot be evaluated well, and the evaluation offers little guidance for follow-up work.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a calculation method and device for evaluating the resource recording loss of a search engine that overcome the above problems or at least partially solve them.
The present invention provides a calculation method for evaluating the resource recording loss of a search engine, comprising:
Extracting one or more query strings in a target search engine;
Performing relevance scoring on the query results of each query string, and determining a first theoretical maximum score value S2 of the first-page query results of each query string;
Determining, according to the relevance scores of the query results of each query string in M reference search engines, a second theoretical maximum score value S2' of the first-page query results of each query string;
Calculating, according to the first theoretical maximum score value S2 and the second theoretical maximum score value S2', the resource recording loss S1 of each query string in the target search engine.
In some optional embodiments, determining the first theoretical maximum score value S2 comprises:
Obtaining a first set quantity of query results from the first-page query results of the target search engine, and obtaining a score list from the relevance scores of the first set quantity of query results;
Adjusting the score list according to the relevance scores of the query results on pages other than the first page, to obtain the adjusted score list and the first theoretical maximum score value S2.
In some optional embodiments, adjusting the score list according to the relevance scores of the query results on pages other than the first page, to obtain the adjusted score list and the first theoretical maximum score value S2, comprises:
Capturing the first set quantity of query results from each of the other pages, and performing relevance scoring on the query results to obtain a score list for each page;
When the relevance score of a query result on another page is greater than the relevance score of a query result on the first page, replacing the first-page query result with the query result from the other page, thereby adjusting the score list and obtaining the adjusted score list and the first theoretical maximum score value S2.
In some optional embodiments, determining the second theoretical maximum score value S2' comprises:
Obtaining the query results of each query string in the M reference search engines, and adjusting the adjusted score list according to the relevance scores of those query results, to obtain the second theoretical maximum score value S2'.
In some optional embodiments, adjusting the adjusted score list comprises:
When the relevance score of a query result in a reference search engine is greater than the relevance score of a query result in the target search engine, replacing the query result in the target search engine with the query result in the reference search engine, thereby adjusting the adjusted score list.
In some optional embodiments, calculating the resource recording loss S1 of each query string in the target search engine comprises:
Calculating S2'-S2 to obtain the resource recording loss S1 of each query string in the target search engine.
In some optional embodiments, the above method further comprises:
Calculating, according to the first theoretical maximum score value S2 and the second theoretical maximum score value S2', the resource efficiency index of each query string in the target search engine.
In some optional embodiments, calculating the resource efficiency index of each query string in the target search engine comprises:
Calculating S2/S2' to obtain the resource efficiency index of each query string in the target search engine.
The present invention provides a calculation device for evaluating the resource recording loss of a search engine, comprising:
An extraction module, configured to extract one or more query strings in a target search engine;
A first determination module, configured to perform relevance scoring on the query results of each query string and determine the first theoretical maximum score value S2 of the first-page query results of each query string;
A second determination module, configured to determine, according to the relevance scores of the query results of each query string in M reference search engines, the second theoretical maximum score value S2' of the first-page query results of each query string;
A third determination module, configured to calculate, according to the first theoretical maximum score value S2 and the second theoretical maximum score value S2', the resource recording loss S1 of each query string in the target search engine.
In some optional embodiments, the first determination module is specifically configured to:
Obtain a first set quantity of query results from the first-page query results of the target search engine, and obtain a score list from the relevance scores of the first set quantity of query results;
Adjust the score list according to the relevance scores of the query results on pages other than the first page, to obtain the adjusted score list and the first theoretical maximum score value S2.
In some optional embodiments, the first determination module is specifically configured to:
Capture the first set quantity of query results from each of the other pages, and perform relevance scoring on the query results to obtain a score list for each page;
When the relevance score of a query result on another page is greater than the relevance score of a query result on the first page, replace the first-page query result with the query result from the other page, thereby adjusting the score list and obtaining the adjusted score list and the first theoretical maximum score value S2.
In some optional embodiments, the second determination module is specifically configured to:
Obtain the query results of each query string in the M reference search engines, and adjust the adjusted score list according to the relevance scores of those query results, to obtain the second theoretical maximum score value S2'.
In some optional embodiments, the second determination module is specifically configured to:
When the relevance score of a query result in a reference search engine is greater than the relevance score of a query result in the target search engine, replace the query result in the target search engine with the query result in the reference search engine, thereby adjusting the adjusted score list.
In some optional embodiments, the third determination module is further configured to:
Calculate, according to the first theoretical maximum score value S2 and the second theoretical maximum score value S2', the resource efficiency index of each query string in the target search engine.
The calculation method for evaluating the resource recording loss of a search engine provided by the embodiments of the present invention extracts at least one query string from the target search engine to be evaluated; determines, according to the query results of each query string, the first theoretical maximum score value S2 of the first-page query results of each query string; determines, according to the query results of each query string in M reference search engines, the second theoretical maximum score value S2' of the first-page query results of each query string; and determines the resource recording loss S1 of each query string in the target search engine from the query results of the target search engine itself together with the query results of the M reference search engines. The method takes multiple search engines as a reference when evaluating the resources recorded by the target search engine, so the loss of recorded resources in the target search engine is evaluated better, with better accuracy and authenticity. It can better guide follow-up work, improve the resource recording of the search engine and provide users with better search results.
Further, when obtaining the first theoretical maximum score value S2, the method adjusts the maximum score of the first-page query results according to the query results of several subsequent pages in the target search engine, so that S2 truly reflects the maximum score achievable with the resources the target search engine has actually recorded. Likewise, the second theoretical maximum score value S2' is obtained by adjustment according to the query results of the M reference search engines, which brings S2' closer to the true maximum score, so that the evaluated degree of resource recording loss is closer to the actual loss.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented according to the content of the description, and so that the above and other objects, features and advantages of the present invention become more apparent, specific embodiments of the present invention are set out below.
From the following detailed description of specific embodiments of the invention in conjunction with the accompanying drawings, those skilled in the art will better understand the above and other objects, advantages and features of the present invention.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 is a schematic diagram of the factors of concern in search engine relevance evaluation in the prior art;
Fig. 2 is a flow chart of the calculation method for evaluating the resource recording loss of a search engine in Embodiment one of the present invention;
Fig. 3 is a flow chart of the calculation method for evaluating the resource recording loss of a search engine in Embodiment two of the present invention;
Fig. 4 is a schematic structural diagram of the calculation device for evaluating the resource recording loss of a search engine in an embodiment of the present invention.
Detailed description of the invention
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope fully conveyed to those skilled in the art.
To solve the prior-art problem that the resource recording loss of a search engine cannot be evaluated well, the embodiments of the present invention provide a calculation method for evaluating the resource recording loss of a search engine. The method weighs the resources recorded by the search engine to be evaluated against those recorded by other reference search engines to determine the resource recording loss of the search engine under evaluation. It is highly accurate and analyzes the degree of resource recording loss well enough to guide subsequent improvement.
Embodiment one
The flow of the calculation method for evaluating the resource recording loss of a search engine provided by Embodiment one of the present invention is shown in Fig. 2 and comprises the following steps:
Step S201: extract one or more query strings in the target search engine.
At least one query string is randomly drawn from the target search engine to be evaluated; one or more may be drawn from each of the high-, medium- and low-frequency query strings.
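The random draw in step S201 could be sketched as follows. This is only an illustration: the function name `sample_queries`, the band labels and the example queries are hypothetical, and the text does not say how queries are assigned to the high, medium and low frequency bands, so the sketch assumes they arrive already grouped.

```python
import random


def sample_queries(queries_by_band, n_per_band, seed=0):
    """Randomly draw up to n_per_band query strings from each frequency band."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = {}
    for band, queries in queries_by_band.items():
        k = min(n_per_band, len(queries))  # a band may hold fewer than n queries
        sample[band] = rng.sample(queries, k)
    return sample


# Hypothetical query log, already grouped by frequency band.
query_log = {
    "high": ["weather", "news", "maps", "music"],
    "medium": ["blue heron photo", "tea ceremony", "old town square"],
    "low": ["rare moth wing pattern", "1920s tram ticket"],
}
picked = sample_queries(query_log, n_per_band=2)
```

Sampling each band separately keeps rare queries in the evaluation set, which a single draw over the whole log would tend to miss.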
Step S202: perform relevance scoring on the query results of each query string, and determine the first theoretical maximum score value S2 of the first-page query results of each query string.
This step uses the target search engine's own query results to optimize the scores of the first-page query results up to their theoretical maximum, which gives the first theoretical maximum score value S2. It is determined from the query results obtained for each query string in the target search engine: a first set quantity of query results is obtained from the first-page query results of the target search engine, and a score list is obtained from their relevance scores; the score list is then adjusted according to the relevance scores of the query results on pages other than the first page, yielding the adjusted score list and the first theoretical maximum score value S2.
In other words, the first theoretical maximum score value S2 is the theoretical maximum score of the query results that the target search engine could show on the first page. It is obtained by starting from the scores of the first-page query results and adjusting them with the query results of other pages, and it is the true maximum score of the query results corresponding to the query string in the target search engine. For example, the score list can be adjusted according to the relevance scores of the query results on pages 2 to K, where K is an integer greater than or equal to 2; the K pages may be all pages of query results or only the first several pages.
The score list may be adjusted according to all query results on each of the other pages, or according to a selected number of query results on each of those pages; that number may be the same as or different from the number selected from the first page. Adjusting the score list may include: capturing the first set quantity of query results from each of the other pages and performing relevance scoring on them to obtain a score list for each page; and, when the relevance score of a query result on another page is greater than the relevance score of a query result on the first page, replacing the first-page query result with the query result from the other page. This adjusts the score list and yields the adjusted score list and the first theoretical maximum score value S2.
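A minimal sketch of this replacement step is given below. It makes two assumptions that the text leaves open: the score statistic is taken to be a plain sum of the list, and each replacement swaps out the weakest score currently kept. The function name `adjust_score_list` and the example scores are hypothetical.

```python
import heapq


def adjust_score_list(first_page_scores, other_page_scores):
    """Replace weaker first-page scores with stronger scores from other pages."""
    adjusted = list(first_page_scores)
    heapq.heapify(adjusted)  # min-heap: adjusted[0] is the weakest score kept
    for score in other_page_scores:
        # Replace only when the other-page result scores strictly higher.
        if adjusted and score > adjusted[0]:
            heapq.heapreplace(adjusted, score)
    return sorted(adjusted, reverse=True)


page1 = [2, 1, 0, 1, 0]          # scores of the results shown on page 1
pages_2_to_k = [2, 0, 1, 2, 1]   # scores of results captured from pages 2..K
adjusted = adjust_score_list(page1, pages_2_to_k)
s2 = sum(adjusted)               # assumption: the statistic is a plain sum
```

In this example the first page as shown scores 4, while the adjusted list is [2, 2, 2, 1, 1] and S2 is 8, so the engine already holds enough resources to double its first-page score.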
Step S203: determine, according to the relevance scores of the query results of each query string in M reference search engines, the second theoretical maximum score value S2' of the first-page query results of each query string.
This step uses the query results corresponding to each query string in several reference search engines to adjust the score list already adjusted in the previous step, further optimizing the query result scores in the list so that they are maximized.
To determine the second theoretical maximum score value S2', the query results of each query string in the M reference search engines are obtained, and the score list adjusted in the above step is adjusted again according to the relevance scores of those query results, which gives the second theoretical maximum score value S2'.
When the score list adjusted in the above step is adjusted again, the adjustment likewise uses the scores of the higher-scoring query results in the reference search engines. Specifically, when the relevance score of a query result in a reference search engine is greater than the relevance score of a query result in the target search engine, the query result in the target search engine is replaced with the query result in the reference search engine, thereby adjusting the adjusted score list.
Step S204: calculate, according to the determined first theoretical maximum score value S2 and second theoretical maximum score value S2', the resource recording loss S1 of each query string in the target search engine.
From the S2 and S2' obtained above, the indexes for evaluating the resources of the target search engine can be determined, for example the resource recording loss S1. The resource recording loss S1 is the difference between the two: S2'-S2 is calculated to obtain the resource recording loss S1 of each query string in the target search engine. The difference between the theoretical maximum score within the target search engine and the theoretical maximum score obtained when several search engines are considered together is thus the recording loss of the target search engine.
Optionally, other indexes for evaluating the resources of the target search engine can also be determined from the S2 and S2' obtained above, for example the resource efficiency index. That is, the resource efficiency index of each query string in the target search engine is calculated according to the first theoretical maximum score value S2 and the second theoretical maximum score value S2'. The resource efficiency index is the ratio of the two: S2/S2' is calculated to obtain the resource efficiency index of each query string in the target search engine.
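The two indexes of step S204 reduce to a difference and a ratio, as the sketch below shows. The function names are hypothetical, and returning `None` when S2' is zero is an assumption made here for the case where no engine holds any relevant result; the text does not discuss that case.

```python
def recording_loss(s2, s2_prime):
    """Resource recording loss S1 = S2' - S2."""
    return s2_prime - s2


def resource_efficiency(s2, s2_prime):
    """Resource efficiency index S2 / S2'."""
    if s2_prime == 0:
        # Assumption: the ratio is left undefined when no engine has resources.
        return None
    return s2 / s2_prime


# Hypothetical values for one query string.
s1 = recording_loss(30, 36)
efficiency = resource_efficiency(30, 36)
```

With S2 = 30 and S2' = 36, the recording loss S1 is 6 and the efficiency index is 30/36, about 0.83.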
Embodiment two
Embodiment two of the present invention describes in detail, with a specific example, the implementation of the method of Embodiment one. Its flow is shown in Fig. 3 and comprises the following steps:
Step S301: extract at least one query string in the target search engine, and perform the following steps for each query string.
For example, Queries are randomly drawn from the target search engine to be evaluated, such as an image search engine; assume N Queries are drawn from each of the high-, medium- and low-frequency bands.
Step S302: obtain the first set quantity of query results from the first-page query results of each query string in the target search engine.
For each Query, capture a number of query results on the first page, for example 21 query results.
Step S303: obtain the first-page score list from the relevance scores of the first set quantity of query results obtained.
Relevance scoring is performed on the captured query results. For example, each query result may receive one of three scores, 0, 1 or 2, which gives the score list (s1, s2, s3, ..., s21) of the Query; the score statistic R1 of the first-page query results can be obtained at the same time.
Step S304: obtain the query results on pages 2 to K of each query string in the target search engine, and perform relevance scoring on them to obtain the score list of pages 2 to K.
For each Query, capture the query results on pages 2 to K (taking K = 3 and 21 results per page as an example) and perform relevance scoring on them to obtain the Query's score list for pages 2 to K (s22, s23, ..., s63).
Step S305: adjust the first-page score list according to the relevance scores of the query results on pages 2 to K, obtaining the adjusted score list and, as its score statistic, the first theoretical maximum score value S2.
This step adjusts the score list according to the relevance scores of the query results on pages other than the first page, yielding the adjusted score list and the first theoretical maximum score value S2. The other pages may cover some or all of the query results beyond the first page. In theory, using all query results on the other pages gives the best effect, but when there are too many result pages or too many query results per page, a certain number of other pages may be selected, or a certain number of query results on those pages, to adjust the first-page score list.
This step performs score replacement: for a first-page query result, if its score is lower than the score of a query result on the following pages, the result is replaced. The replacement yields the theoretical maximum ranking score, that is, S2. By replacing the poor query results on the first page, the 21 query results selected on the first page become the 21 best query results.
Once this step has produced the first theoretical maximum score value S2 of the first-page query results of the target search engine, evaluation indexes for the ranking algorithm can be determined from R1 and S2, such as an efficiency index or an improvement index of the ranking algorithm. For example, R1/S2 gives the efficiency index, which measures how effective the ranking algorithm is, and S2-R1 gives the improvement index, which measures how much room the ranking algorithm has to improve.
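The ranking indexes mentioned above can be sketched as follows, assuming that the score statistic R1 is the sum of the scores of the results actually shown on the first page (the text does not fix the statistic). The function name `ranking_indicators` and the example values are hypothetical.

```python
def ranking_indicators(shown_scores, s2):
    """Return (R1, efficiency index R1/S2, improvement index S2 - R1)."""
    r1 = sum(shown_scores)  # assumption: R1 is the sum of first-page scores
    efficiency = r1 / s2 if s2 else None  # undefined when S2 is zero
    improvement = s2 - r1
    return r1, efficiency, improvement


# First page as shown, scored 0/1/2, with a hypothetical S2 of 8.
r1, rank_efficiency, rank_improvement = ranking_indicators([2, 1, 0, 1, 0], s2=8)
```

Here R1 is 4, so the ranking algorithm reaches half of what the recorded resources allow (R1/S2 = 0.5) and has 4 points of room to improve (S2 - R1 = 4).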
The first theoretical maximum score value S2 estimated in this step for the first-page query results of the target search engine can then be used to calculate the indexes used in the subsequent evaluation of the target search engine's resources.
Step S306: select M reference search engines, and perform relevance scoring on the query results of each query string in the M reference search engines.
Because the total resources S0 are unknown, that is, it is not actually known how many resources exist on the Internet for a given Query, the resources in other search engines are used here as a reference for evaluating the resources of the target search engine.
M reference search engines are selected (for example, M = 3) and steps S302 to S305 are carried out for each of them, which gives the first theoretical maximum score value Sm2 of each reference search engine's own first-page query results given its current resources. For a reference search engine, steps S302 to S305 may be performed so that its best query results are screened out first for use in the subsequent replacement. Alternatively, relevance scoring may be performed directly on the query results of each page of the reference search engine, without first screening its best query results onto the first page, and the relevance score of each query result then decides directly whether it is used for replacement.
Step S307: adjust the score list adjusted in the above step according to the relevance scores of the query results of the M reference search engines, obtaining the second theoretical maximum score value S2'.
The score list (s1', s2', ..., s21') adjusted in the above step, whose statistic is S2, has its scores replaced using the data of the several reference search engines, in the same way as in step S305.
This step performs score replacement on the score list of the target search engine's first-page query results after the earlier replacement: if a score is lower than the score of a query result in a reference search engine, the result is replaced. The replacement yields the theoretical maximum ranking score relative to multiple search engines, that is, S2'. By replacing the poor query results on the first page of the target search engine, the 21 query results selected on the first page become the 21 best query results across the multiple search engines.
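Step S307 might be sketched as below. The sketch keeps the n highest scores after merging in each reference engine, which is the end state of repeatedly replacing a lower score with a higher one. It assumes a plain sum as the score statistic and treats every result as distinct, so a result returned by several engines is not deduplicated; the text does not address either point. The function name `second_theoretical_max` and the scores are hypothetical.

```python
import heapq


def second_theoretical_max(adjusted_target, reference_engine_scores):
    """Merge reference-engine scores into the target's adjusted score list."""
    n = len(adjusted_target)  # the list keeps its length, e.g. 21 results
    best = list(adjusted_target)
    for ref_scores in reference_engine_scores:
        # Keep the n highest scores seen so far across the engines.
        best = heapq.nlargest(n, best + list(ref_scores))
    return best, sum(best)  # assumption: the statistic is a plain sum


target_adjusted = [2, 2, 2, 1, 1]  # adjusted list of the target, S2 = 8
references = [                     # M = 3 reference engines
    [2, 2, 0, 1, 0],
    [1, 2, 1, 0, 0],
    [0, 1, 1, 1, 0],
]
best, s2_prime = second_theoretical_max(target_adjusted, references)
```

In this example the merged list is [2, 2, 2, 2, 2], so S2' is 10 against an S2 of 8.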
Step S308: calculate, according to the determined first theoretical maximum score value S2 and second theoretical maximum score value S2', the resource recording loss S1 of each query string in the target search engine.
Once the second theoretical maximum score value S2' of each Query has been obtained, the resource recording loss S1 of the target search engine can be derived from S2' and S2, because S2' is equivalent to the sum of the S2 obtained in the target search engine and the target search engine's resource recording loss S1. That is, the difference S2'-S2 between the score of the 21 best query results across all engines and the score of the 21 best query results in the target search engine under evaluation is the amount of resource recording loss of the target search engine, and also its room for improvement on the data side.
Once the second theoretical maximum score value S2' of each Query has been obtained, S2/S2' can also be calculated. It gives the ratio of the score of the 21 best query results in the target search engine under evaluation to the score of the 21 best query results across all engines, which is the degree of effectiveness of the resources in the target search engine.
The above method can evaluate parameters such as the resource recording loss and the resource efficiency index separately for each query string. It can also combine the analysis results of the individual query strings, for example by weighting and accumulating the S2' and S2 of each query string, to obtain evaluation parameters such as the resource recording loss and the resource efficiency index for the whole search engine.
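One possible reading of the weighting and accumulation is sketched below: the S2 and S2' of every query string are multiplied by a per-query weight, summed, and the engine-level loss and efficiency are computed from the two totals. The weighting scheme, the function name `engine_level_metrics` and the example numbers are assumptions for illustration; the text does not specify how the weights are chosen.

```python
def engine_level_metrics(per_query, weights=None):
    """Aggregate per-query (S2, S2') pairs into engine-level loss and efficiency."""
    if weights is None:
        weights = [1] * len(per_query)  # assumption: equal weights by default
    total_s2 = sum(w * s2 for w, (s2, _) in zip(weights, per_query))
    total_s2_prime = sum(w * s2p for w, (_, s2p) in zip(weights, per_query))
    loss = total_s2_prime - total_s2
    efficiency = total_s2 / total_s2_prime if total_s2_prime else None
    return loss, efficiency


# Hypothetical (S2, S2') pairs for three query strings, weighted 3:2:1.
per_query = [(8, 10), (30, 36), (12, 12)]
engine_loss, engine_efficiency = engine_level_metrics(per_query, weights=[3, 2, 1])
```

With these numbers the weighted totals are 96 and 114, giving an engine-level recording loss of 18 and an efficiency index of 96/114, about 0.84.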
Based on the same inventive concept, an embodiment of the present invention further provides a calculation device for evaluating the resource inclusion loss of a search engine. The structure of the device is shown in Figure 4 and includes an extraction module 401, a first determining module 402, a second determining module 403 and a third determining module 404.
The extraction module 401 is configured to extract one or more query strings in the target search engine.
The first determining module 402 is configured to perform relevance scoring on the query results of each query string and to determine the first theoretical maximum score S2 of the first-page query results of each query string.
The second determining module 403 is configured to determine the second theoretical maximum score S2' of the first-page query results of each query string according to the relevance scores of the query results of each query string in M reference search engines.
The third determining module 404 is configured to calculate the resource inclusion loss S1 of each query string in the target search engine according to the first theoretical maximum score S2 and the second theoretical maximum score S2'.
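The division of work among the four modules can be sketched as below. This is only an illustrative arrangement, assuming each module is supplied as a callable; the class name `LossCalculationDevice` and its field names are hypothetical and are not taken from the original text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LossCalculationDevice:
    # Extraction module 401: returns the query strings to evaluate.
    extract_queries: Callable[[], List[str]]
    # First determining module 402: returns S2 for a query string.
    first_max_score: Callable[[str], float]
    # Second determining module 403: returns S2' for a query string.
    second_max_score: Callable[[str], float]

    def run(self) -> Dict[str, float]:
        """Third determining module 404: S1 = S2' - S2 per query string."""
        losses = {}
        for query in self.extract_queries():
            s2 = self.first_max_score(query)
            s2_prime = self.second_max_score(query)
            losses[query] = s2_prime - s2
        return losses


# Hypothetical stand-ins for the scoring modules.
s2_table = {"q1": 50.0, "q2": 30.0}
s2_prime_table = {"q1": 60.0, "q2": 40.0}
device = LossCalculationDevice(
    extract_queries=lambda: ["q1", "q2"],
    first_max_score=lambda q: s2_table[q],
    second_max_score=lambda q: s2_prime_table[q],
)
result = device.run()
```

Injecting the modules as callables keeps the sketch independent of how relevance scoring is actually performed.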
Preferably, the first determining module 402 is specifically configured to obtain a first set quantity of query results from the first-page query results of the target search engine and to obtain a score list according to the relevance scores of the first set quantity of query results; and to adjust the score list according to the relevance scores of query results on pages other than the first page, obtaining the adjusted score list and the first theoretical maximum score S2.
Preferably, the first determining module 402 is specifically configured to crawl the first set quantity of query results from each of the other pages and to perform relevance scoring on those query results, obtaining a score list for each page; and, when the relevance score of a query result on another page is greater than the relevance score of a query result on the first page, to replace the query result on the first page with the query result on the other page, thereby adjusting the score list and obtaining the adjusted score list and the first theoretical maximum score S2.
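The replacement rule above can be sketched as follows. Two points are assumptions, since the passage does not state them: the entry replaced is the currently lowest-scored one in the list, and S2 is taken as the plain sum of the adjusted list. The function name `adjust_score_list` is likewise hypothetical.

```python
import heapq
from typing import Iterable, List, Tuple


def adjust_score_list(
    page_one: List[float], other_pages: Iterable[List[float]]
) -> Tuple[List[float], float]:
    """Adjust the first-page score list using results from other pages.

    Returns (adjusted score list in descending order, S2).
    """
    best = list(page_one)
    heapq.heapify(best)  # min-heap: best[0] is the lowest score kept so far
    for page in other_pages:
        for score in page:
            # Replace the lowest first-page score when another page
            # offers a strictly higher relevance score.
            if best and score > best[0]:
                heapq.heapreplace(best, score)
    adjusted = sorted(best, reverse=True)
    # Assumption: S2 is the sum of the scores in the adjusted list.
    return adjusted, sum(adjusted)


# Hypothetical scores: three first-page results and two further pages.
adjusted, s2 = adjust_score_list([3.0, 1.0, 2.0], [[4.0, 0.0], [2.0, 5.0]])
```

In this example the scores 1 and 2 are displaced by 4 and 5, so the adjusted list is [5, 4, 3] and S2 is 12 under the sum assumption.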
Preferably, the second determining module 403 is specifically configured to obtain the query results of each query string in the M reference search engines and to adjust the adjusted score list according to the relevance scores of those query results, obtaining the second theoretical maximum score S2'.
Preferably, the second determining module 403 is specifically configured to, when the relevance score of a query result in a reference search engine is greater than the relevance score of a query result in the target search engine, replace the query result in the target search engine with the query result in the reference search engine, thereby adjusting the adjusted score list.
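A sketch of this second adjustment is given below, again under stated assumptions: results are represented as (URL, relevance score) pairs, the lowest-scored entry is the one replaced, S2' is the sum of the final list, and a URL already seen is skipped so that the same page is not counted twice. The de-duplication step in particular is not described in the text and is added only as an illustrative choice.

```python
from typing import Dict, List, Tuple

Result = Tuple[str, float]  # (url, relevance score); representation is assumed


def second_max_score(
    adjusted: List[Result], reference_results: Dict[str, List[Result]]
) -> Tuple[List[Result], float]:
    """Adjust the target engine's list with reference-engine results.

    Returns (final list in descending score order, S2').
    """
    best = sorted(adjusted, key=lambda r: r[1], reverse=True)
    seen = {url for url, _ in best}
    for _engine, results in reference_results.items():
        for url, score in results:
            if not best or url in seen:
                continue  # assumed de-duplication by URL
            # best is sorted descending, so best[-1] has the lowest score.
            if score > best[-1][1]:
                best[-1] = (url, score)
                seen.add(url)
                best.sort(key=lambda r: r[1], reverse=True)
    # Assumption: S2' is the sum of the scores in the final list.
    return best, sum(score for _, score in best)


# Hypothetical data: two target results and two reference engines.
final_list, s2_prime = second_max_score(
    [("a", 3.0), ("b", 2.0)],
    {"ref1": [("a", 3.0), ("c", 4.0)], "ref2": [("d", 1.0)]},
)
```

Here result "c" from the first reference engine displaces "b", so the final list is [("c", 4.0), ("a", 3.0)] and S2' is 7, compared with an S2 of 5 for the target engine alone.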
Preferably, the third determining module 404 is further configured to calculate the resource efficiency index of each query string in the target search engine according to the first theoretical maximum score S2 and the second theoretical maximum score S2'.
The calculation method and device for evaluating search engine resource inclusion loss provided by the embodiments of the present invention evaluate the data and the ranking algorithm separately, so the resulting indicators can better guide relevance improvement work, with better accuracy and better guidance for follow-up work. The effectiveness and room for improvement of the ranking algorithm can be evaluated, and the effectiveness and inclusion loss of the resources can be evaluated as well. Through horizontal comparison with other search engines, the shortcomings of the target search engine under evaluation can be identified in greater depth, which in turn guides subsequent relevance work and helps improve the search engine.
Numerous specific details are set forth in the description provided here. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure or description thereof in the above description of exemplary embodiments. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components in the embodiments may be combined into one module or unit or component, and they may furthermore be divided into multiple sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all functions of some or all components of the calculation for evaluating search engine resource inclusion loss according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any order. These words may be interpreted as names.
Thus, those skilled in the art will recognize that, although multiple exemplary embodiments of the invention have been shown and described in detail herein, many other variations or modifications consistent with the principles of the invention can be directly determined or derived from the present disclosure without departing from the spirit and scope of the invention. The scope of the invention should therefore be understood and deemed to cover all such other variations or modifications.
The invention discloses A1. A calculation method for evaluating the resource inclusion loss of a search engine, comprising:
extracting one or more query strings in a target search engine;
performing relevance scoring on the query results of each query string, and determining a first theoretical maximum score S2 of the first-page query results of each query string;
determining a second theoretical maximum score S2' of the first-page query results of each query string according to the relevance scores of the query results of each query string in M reference search engines;
calculating the resource inclusion loss S1 of each query string in the target search engine according to the first theoretical maximum score S2 and the second theoretical maximum score S2'.
A2. The method according to A1, wherein determining the first theoretical maximum score S2 comprises:
obtaining a first set quantity of query results from the first-page query results of the target search engine, and obtaining a score list according to the relevance scores of the first set quantity of query results;
adjusting the score list according to the relevance scores of query results on pages other than the first page, to obtain an adjusted score list and the first theoretical maximum score S2.
A3. The method according to A2, wherein adjusting the score list according to the relevance scores of query results on pages other than the first page, to obtain the adjusted score list and the first theoretical maximum score S2, comprises:
crawling the first set quantity of query results from each of the other pages, and performing relevance scoring on the query results to obtain a score list for each page;
when the relevance score of a query result on another page is greater than the relevance score of a query result on the first page, replacing the query result on the first page with the query result on the other page, thereby adjusting the score list and obtaining the adjusted score list and the first theoretical maximum score S2.
A4. The method according to A2, wherein determining the second theoretical maximum score S2' comprises:
obtaining the query results of each query string in the M reference search engines, and adjusting the adjusted score list according to the relevance scores of those query results, to obtain the second theoretical maximum score S2'.
A5. The method according to A4, wherein adjusting the adjusted score list comprises:
when the relevance score of a query result in a reference search engine is greater than the relevance score of a query result in the target search engine, replacing the query result in the target search engine with the query result in the reference search engine, thereby adjusting the adjusted score list.
A6. The method according to A1, wherein calculating the resource inclusion loss S1 of each query string in the target search engine comprises:
calculating S2' - S2 to obtain the resource inclusion loss S1 of each query string in the target search engine.
A7. The method according to any one of A1-A6, further comprising:
calculating the resource efficiency index of each query string in the target search engine according to the first theoretical maximum score S2 and the second theoretical maximum score S2'.
A8. The method according to A7, wherein calculating the resource efficiency index of each query string in the target search engine comprises:
calculating S2/S2' to obtain the resource efficiency index of each query string in the target search engine.
The invention also discloses B9. A calculation device for evaluating the resource inclusion loss of a search engine, comprising:
an extraction module, configured to extract one or more query strings in a target search engine;
a first determining module, configured to perform relevance scoring on the query results of each query string and to determine a first theoretical maximum score S2 of the first-page query results of each query string;
a second determining module, configured to determine a second theoretical maximum score S2' of the first-page query results of each query string according to the relevance scores of the query results of each query string in M reference search engines;
a third determining module, configured to calculate the resource inclusion loss S1 of each query string in the target search engine according to the first theoretical maximum score S2 and the second theoretical maximum score S2'.
B10. The device according to B9, wherein the first determining module is specifically configured to:
obtain a first set quantity of query results from the first-page query results of the target search engine, and obtain a score list according to the relevance scores of the first set quantity of query results;
adjust the score list according to the relevance scores of query results on pages other than the first page, to obtain an adjusted score list and the first theoretical maximum score S2.
B11. The device according to B10, wherein the first determining module is specifically configured to:
crawl the first set quantity of query results from each of the other pages, and perform relevance scoring on the query results to obtain a score list for each page;
when the relevance score of a query result on another page is greater than the relevance score of a query result on the first page, replace the query result on the first page with the query result on the other page, thereby adjusting the score list and obtaining the adjusted score list and the first theoretical maximum score S2.
B12. The device according to B10, wherein the second determining module is specifically configured to:
obtain the query results of each query string in the M reference search engines, and adjust the adjusted score list according to the relevance scores of those query results, to obtain the second theoretical maximum score S2'.
B13. The device according to B12, wherein the second determining module is specifically configured to:
when the relevance score of a query result in a reference search engine is greater than the relevance score of a query result in the target search engine, replace the query result in the target search engine with the query result in the reference search engine, thereby adjusting the adjusted score list.
B14. The device according to any one of B10-B13, wherein the third determining module is further configured to:
calculate the resource efficiency index of each query string in the target search engine according to the first theoretical maximum score S2 and the second theoretical maximum score S2'.

Claims (10)

1. A calculation method for evaluating the resource inclusion loss of a search engine, comprising:
extracting one or more query strings in a target search engine;
performing relevance scoring on the query results of each query string, and determining a first theoretical maximum score S2 of the first-page query results of each query string;
determining a second theoretical maximum score S2' of the first-page query results of each query string according to the relevance scores of the query results of each query string in M reference search engines;
calculating the resource inclusion loss S1 of each query string in the target search engine according to the first theoretical maximum score S2 and the second theoretical maximum score S2'.
2. The method according to claim 1, wherein determining the first theoretical maximum score S2 comprises:
obtaining a first set quantity of query results from the first-page query results of the target search engine, and obtaining a score list according to the relevance scores of the first set quantity of query results;
adjusting the score list according to the relevance scores of query results on pages other than the first page, to obtain an adjusted score list and the first theoretical maximum score S2.
3. The method according to any one of claims 1-2, wherein adjusting the score list according to the relevance scores of query results on pages other than the first page, to obtain the adjusted score list and the first theoretical maximum score S2, comprises:
crawling the first set quantity of query results from each of the other pages, and performing relevance scoring on the query results to obtain a score list for each page;
when the relevance score of a query result on another page is greater than the relevance score of a query result on the first page, replacing the query result on the first page with the query result on the other page, thereby adjusting the score list and obtaining the adjusted score list and the first theoretical maximum score S2.
4. The method according to any one of claims 1-3, wherein determining the second theoretical maximum score S2' comprises:
obtaining the query results of each query string in the M reference search engines, and adjusting the adjusted score list according to the relevance scores of those query results, to obtain the second theoretical maximum score S2'.
5. The method according to any one of claims 1-4, wherein adjusting the adjusted score list comprises:
when the relevance score of a query result in a reference search engine is greater than the relevance score of a query result in the target search engine, replacing the query result in the target search engine with the query result in the reference search engine, thereby adjusting the adjusted score list.
6. The method according to any one of claims 1-5, wherein calculating the resource inclusion loss S1 of each query string in the target search engine comprises:
calculating S2' - S2 to obtain the resource inclusion loss S1 of each query string in the target search engine.
7. The method according to any one of claims 1-6, further comprising:
calculating the resource efficiency index of each query string in the target search engine according to the first theoretical maximum score S2 and the second theoretical maximum score S2'.
8. The method according to any one of claims 1-7, wherein calculating the resource efficiency index of each query string in the target search engine comprises:
calculating S2/S2' to obtain the resource efficiency index of each query string in the target search engine.
9. A calculation device for evaluating the resource inclusion loss of a search engine, comprising:
an extraction module, configured to extract one or more query strings in a target search engine;
a first determining module, configured to perform relevance scoring on the query results of each query string and to determine a first theoretical maximum score S2 of the first-page query results of each query string;
a second determining module, configured to determine a second theoretical maximum score S2' of the first-page query results of each query string according to the relevance scores of the query results of each query string in M reference search engines;
a third determining module, configured to calculate the resource inclusion loss S1 of each query string in the target search engine according to the first theoretical maximum score S2 and the second theoretical maximum score S2'.
10. The device according to claim 9, wherein the first determining module is specifically configured to:
obtain a first set quantity of query results from the first-page query results of the target search engine, and obtain a score list according to the relevance scores of the first set quantity of query results;
adjust the score list according to the relevance scores of query results on pages other than the first page, to obtain an adjusted score list and the first theoretical maximum score S2.
CN201410854198.XA 2014-12-31 2014-12-31 Calculation method and device for evaluating the resource inclusion loss of a search engine Expired - Fee Related CN105808601B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410854198.XA CN105808601B (en) 2014-12-31 2014-12-31 Calculation method and device for evaluating the resource inclusion loss of a search engine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410854198.XA CN105808601B (en) 2014-12-31 2014-12-31 Calculation method and device for evaluating the resource inclusion loss of a search engine

Publications (2)

Publication Number Publication Date
CN105808601A true CN105808601A (en) 2016-07-27
CN105808601B CN105808601B (en) 2019-07-23

Family

ID=56464880

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410854198.XA Expired - Fee Related CN105808601B (en) 2014-12-31 2014-12-31 Calculation method and device for evaluating the resource inclusion loss of a search engine

Country Status (1)

Country Link
CN (1) CN105808601B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101355457A (en) * 2008-06-19 2009-01-28 腾讯科技(北京)有限公司 Test method and test equipment
CN103544307A (en) * 2013-11-04 2014-01-29 北京中搜网络技术股份有限公司 Multi-search-engine automatic comparison and evaluation method independent of document library
CN103593411A (en) * 2013-10-23 2014-02-19 江苏大学 Method for testing combination properties of evaluation indexes of search engines and testing device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101355457A (en) * 2008-06-19 2009-01-28 腾讯科技(北京)有限公司 Test method and test equipment
CN103593411A (en) * 2013-10-23 2014-02-19 江苏大学 Method for testing combination properties of evaluation indexes of search engines and testing device
CN103544307A (en) * 2013-11-04 2014-01-29 北京中搜网络技术股份有限公司 Multi-search-engine automatic comparison and evaluation method independent of document library

Also Published As

Publication number Publication date
CN105808601B (en) 2019-07-23


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190723

Termination date: 20211231