CN102156746A - Method for evaluating performance of search engine - Google Patents

Publication number
CN102156746A
Authority
CN
China
Prior art keywords
inquiry
user
value
click
ratio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011100983786A
Other languages
Chinese (zh)
Inventor
朱彤
刘奕群
马少平
张敏
金奕江
张阔
茹立云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Beijing Sogou Technology Development Co Ltd
Original Assignee
Tsinghua University
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University and Beijing Sogou Technology Development Co Ltd
Priority to CN2011100983786A
Publication of CN102156746A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for evaluating the performance of a search engine. The method comprises the following steps of: pre-processing a user log, and acquiring a search set to be evaluated from the user log; aiming at the search set, extracting corresponding search classification characteristics from the user log; classifying the search set into a navigation type search set and an information work type search set according to the search classification characteristics; acquiring user behavior characteristics of the classified search sets; and performing user satisfaction judgment on the navigation type search set and the information work type search set respectively according to the user behavior characteristics. In the method for evaluating the performance of the search engine, the model structure and parameters are simple, the algorithm complexity is low, the data is comprehensive and objective, and the evaluation is real and reliable.

Description

Method for evaluating the performance of a search engine
Technical field
The present invention relates to the field of communication technology, and in particular to a method for evaluating the performance of a search engine.
Background art
A search engine is a system that uses specific computer programs to collect information from the Internet according to certain strategies, organizes and processes that information, and presents the processed information to users, thereby providing them with a retrieval service.
At present, most search engines still provide their service through keyword queries: the user visits the search engine website and submits a query related to his or her need (usually a few characters or words), and the search engine then returns a list of results relevant to that query, drawn from the Internet information it has crawled. Each results page generally contains ten regular results, that is, a series of web pages ordered by their relevance to the query, with the most relevant resources placed first.
A sound, correct, comprehensive, and objective evaluation of search engine performance provides strong guidance and can help further improve the quality of the retrieval service, so the performance evaluation of search engines has long received wide attention.
Because a search engine is to a large extent a networked information retrieval system, traditional information retrieval evaluation methods are currently the main means of evaluating search engine performance. In these methods, the evaluation query set and the set of standard answers for those queries are two indispensable elements. In existing evaluation methods, however, determining both requires a great deal of manual work, and the subjective bias that manual annotation introduces is hard to avoid. To address these problems, the idea of evaluating search engine performance by user satisfaction, that is, from the user's perspective, has recently been proposed, but no reasonable automatic evaluation procedure has yet been put forward.
Summary of the invention
The purpose of the present invention is to solve at least one of the technical deficiencies described above.
To achieve this purpose, the present invention proposes a method for evaluating the performance of a search engine, comprising the following steps: A: preprocessing a user log and obtaining a query set to be evaluated from the preprocessed user log; B: for the query set, extracting corresponding query classification features from the user log; C: classifying the query set, according to the query classification features, into a navigational query set and an informational/transactional query set; D: obtaining user behavior features of the classified query sets; and E: determining user satisfaction for the navigational query set and the informational/transactional query set respectively, according to the user behavior features.
In one embodiment of the invention, step A further comprises: converting the encoding of the user log, so that the encoding format recorded by the server is converted into the national-standard Chinese character encoding format; tidying the converted user log to remove information outside predetermined content items, where the predetermined content items comprise the user ID, the current query submitted by the user, the result clicked by the user, the user behavior content, and the user behavior event; filtering noise out of the current queries submitted by users; and automatically selecting the query set from the preprocessed user log according to user query frequency.
In one embodiment of the invention, the query classification features comprise: the first-N-click satisfaction rate (the rate at which the first N clicks satisfy the user's need), the user click concentration, the link information, and the representative URLs of the query.
Where:
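$$\text{first-}N\text{-click satisfaction rate} = \frac{\text{number of users who clicked at most } N \text{ times when querying } Q}{\text{total number of users who queried } Q}$$
$$\text{user click concentration} = \frac{\text{number of clicks on the most-clicked result for } Q}{\text{total number of user clicks for } Q}$$
$$\text{link information} = \frac{\text{number of entries for the web page with the most link text } Q}{\text{total number of entries in which } Q \text{ appears as link text}}$$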
The representative URLs of the query are the URLs whose share exceeds 10%.
According to one embodiment of the invention, step C further comprises:
C1: determining whether the query has exactly one representative URL and that URL is of the website type; if so, judging the query to be a navigational query; otherwise, proceeding to step C2;
C2: determining whether the link information is not greater than a first link information value; if it is not greater, proceeding to step C3; if it is greater, proceeding to step C5;
C3: determining whether the user click concentration is not greater than a first concentration value; if it is not greater, judging the query to be an informational/transactional query; if it is greater, proceeding to step C4;
C4: determining whether the link information is not greater than a second link information value; if it is not greater, judging the query to be an informational/transactional query; if it is greater, judging the query to be a navigational query;
C5: determining whether the user click concentration is greater than a second concentration value; if it is greater, judging the query to be a navigational query; if it is not greater, proceeding to step C6;
C6: determining whether the first-N-click satisfaction rate is greater than a predetermined demand rate value; if it is not greater, judging the query to be an informational/transactional query; if it is greater, proceeding to step C7; and
C7: determining whether the link information is greater than a third link information value; if it is greater, judging the query to be a navigational query; if it is not greater, judging the query to be an informational/transactional query.
In one embodiment of the invention, the user behavior features comprise: the average first click position, the ratio of clicks on query recommendations, the average last click position, the average number of clicks, the average number of log entries, and the ratio of clicks on search-again.
Where:
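$$\text{average first click position} = \frac{\text{sum of first click positions}}{\text{number of queries}}$$
$$\text{ratio of clicks on query recommendations} = \frac{\text{number of users who clicked a query recommendation}}{\text{number of all users who queried } Q}$$
$$\text{average last click position} = \frac{\text{sum of last click positions}}{\text{number of queries}}$$
$$\text{average number of clicks} = \frac{\text{sum of click counts}}{\text{number of queries}}$$
$$\text{average number of log entries} = \frac{\text{sum of log entry counts}}{\text{number of queries}}$$
$$\text{ratio of clicks on search-again} = \frac{\text{number of users who clicked search-again}}{\text{number of all users who queried } Q}$$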
According to one embodiment of the invention, if the query is a navigational query, step E further comprises:
E11: determining whether the average first click position is greater than a first predetermined click position value; if it is greater, proceeding to step E12; if it is not greater, proceeding to step E13;
E12: determining whether the average number of clicks is greater than a predetermined click count value; if it is greater, judging the user to be unsatisfied; if it is not greater, judging the user to be satisfied;
E13: determining whether the ratio of clicks on query recommendations is greater than a first ratio value; if it is greater, judging the user to be unsatisfied; if it is not greater, judging the user to be satisfied.
According to one embodiment of the invention, if the query is an informational/transactional query, step E further comprises:
E21: determining whether the ratio of clicks on query recommendations is greater than a second ratio value; if it is greater, proceeding to step E22; if it is not greater, proceeding to step E23;
E22: determining whether the average last click position is greater than a second predetermined click position value; if it is greater, judging the user to be unsatisfied; if it is not greater, judging the user to be satisfied;
E23: determining whether the average number of log entries is greater than a predetermined entry count value; if it is not greater, judging the user to be satisfied; if it is greater, proceeding to step E24;
E24: determining whether the ratio of clicks on search-again is greater than a predetermined search-again ratio value; if it is greater, judging the user to be unsatisfied; if it is not greater, judging the user to be satisfied.
The method for evaluating search engine performance according to embodiments of the invention automatically extracts the query set to be evaluated from search engine log data and extracts a series of features related to user queries in order to classify those queries and evaluate user satisfaction. The evaluation is objective, comprehensive, and reliable, the model structure and parameters are simple, and the algorithmic complexity is low.
Additional aspects and advantages of the invention are set out in part in the description below; in part they will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the drawings, in which:
Fig. 1 is a flowchart of the method for evaluating search engine performance according to an embodiment of the invention;
Fig. 2 is a flowchart of the query classification method of one embodiment of the invention;
Fig. 3 is a flowchart of the evaluation method for navigational queries of one embodiment of the invention; and
Fig. 4 is a flowchart of the evaluation method for informational/transactional queries of one embodiment of the invention.
Detailed description of the embodiments
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings, where the same or similar reference numerals throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the invention, and are not to be construed as limiting it.
Fig. 1 shows a flowchart of the method for evaluating search engine performance according to an embodiment of the invention, which comprises the following steps:
Step S11: preprocess the user log and obtain the query set to be evaluated from the preprocessed user log.
The query set used to evaluate a search engine comes from the search engine's user log. The user log of a given search engine must contain at least the content shown in Table 1 before it can be used to extract the query set and the corresponding features:
Log item | Recorded content
User ID | Unique identifier assigned to the user by the system
Query | The current query submitted by the user
Result | The result clicked by the user
Action | User behavior content, including click position, clicking "next page", modifying the query, etc.
Time | Time of the user behavior
Table 1: Content contained in a search engine user log
A search engine service provider can generally obtain the information shown in Table 1 easily from the search engine's web servers, which ensures the feasibility of the method. Preprocessing the user log can comprise the following steps:
First, the encoding of the user log is converted: the encoding format recorded by the server (generally the Uniform Resource Identifier, or URI, format) is converted to GBK, the national-standard Chinese character encoding.
Then, the user log is tidied using the content items listed in Table 1: information outside those items is removed, and each log record is organized as a string of the above content items.
Finally, noise is filtered out of the current queries submitted by users, for example prohibited query terms and query terms used for online product promotion, so that only content that directly reflects the query needs and behavior of ordinary search engine users is kept.
After data preprocessing, the content of Table 1 can be extracted from the raw search engine user log, and simple statistics yield the set of all user queries. A number of queries drawn at random from this set can serve as the query set to be evaluated; preferably, the queries with the highest user query frequency are selected as the query set to be evaluated. Once the query set has been generated, a series of features must be extracted to classify the queries by need, as described in the following steps.
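The preprocessing steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the record layout (dicts keyed by `user_id`, `query`, `result`, `action`, `time`), the substring-based `banned_terms` noise filter, and the `top_k` cutoff are all hypothetical choices made for the example.

```python
from collections import Counter
from urllib.parse import quote, unquote

# Content items from Table 1; the key names are hypothetical.
FIELDS = ("user_id", "query", "result", "action", "time")


def preprocess(raw_records, banned_terms=(), top_k=2):
    """Return (cleaned log, query set to be evaluated)."""
    cleaned = []
    for rec in raw_records:
        # Decode the percent-encoded query into text, reading the bytes as GBK.
        query = unquote(rec["query"], encoding="gbk")
        # Drop noise queries (assumed here to be simple substring matches).
        if any(term in query for term in banned_terms):
            continue
        # Keep only the Table 1 content items.
        row = {name: rec.get(name) for name in FIELDS}
        row["query"] = query
        cleaned.append(row)
    # Select the most frequently submitted queries for evaluation.
    freq = Counter(row["query"] for row in cleaned)
    return cleaned, [q for q, _ in freq.most_common(top_k)]


encoded = quote("搜狗", encoding="gbk")  # sample percent-encoded GBK query
raw = [
    {"user_id": "u1", "query": encoded, "result": "a", "action": "click", "time": 1, "extra": "x"},
    {"user_id": "u2", "query": encoded, "result": "a", "action": "click", "time": 2},
    {"user_id": "u3", "query": "spam%20offer", "result": "b", "action": "click", "time": 3},
    {"user_id": "u4", "query": "news", "result": "c", "action": "click", "time": 4},
]
log, query_set = preprocess(raw, banned_terms=("spam",), top_k=1)
```

Counting frequency per log record is one possible reading of "user query frequency"; counting distinct users per query would be an equally plausible alternative.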
Step S12: for the query set to be evaluated, extract the corresponding query classification features from the user log.
In one embodiment of the invention, the query classification features can comprise: the first-N-click satisfaction rate, the user click concentration, the link information, and the representative URLs of the query.
Specifically, from the user query and click information provided in Table 1, the "first-N-click satisfaction rate" for a query Q can be calculated, that is, the proportion of users whose information need was satisfied with at most N clicks on the results returned by the search engine. The formula is as follows:
$$\text{first-}N\text{-click satisfaction rate} = \frac{\text{number of users who clicked at most } N \text{ times when querying } Q}{\text{total number of users who queried } Q}$$
Here, "total number of users who queried Q" can be obtained by counting the distinct user IDs that issued query Q, and the number of clicks of each user when querying Q can be obtained by counting the clicks associated with each distinct user ID for query Q, from which "number of users who clicked at most N times when querying Q" can be tallied. By definition, the users who clicked at most N times when querying Q are necessarily a subset of the users who queried Q, so the first-N-click satisfaction rate necessarily lies between 0 and 1.
Similarly to the first-N-click satisfaction rate, the "user click concentration" for a query Q can be calculated from the user query and click information provided in Table 1. It measures how concentrated users' clicks are on the results the search engine returns for Q. For a query Q, first define the "most-clicked result" as the result URL that is clicked the most times by different users among all queries for Q. The user click concentration of Q is then computed as:
$$\text{user click concentration} = \frac{\text{number of clicks on the most-clicked result for } Q}{\text{total number of user clicks for } Q}$$
Here, "total number of user clicks for Q" can be obtained by counting the user clicks for query Q, and "number of clicks on the most-clicked result for Q" can be obtained by counting the user clicks on the most-clicked result when Q is queried. By definition, the number of clicks on the most-clicked result is necessarily less than or equal to the total number of user clicks for Q, so the user click concentration necessarily lies between 0 and 1.
Similarly, the URLs clicked by users for a query Q can be tallied from the user query and click information provided in Table 1. Each query corresponds to several URLs, and in one example of the invention the URLs whose share exceeds 10% are extracted as the representative URLs of Q.
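A possible way to extract representative URLs is sketched below. It assumes that the 10% share is measured over the clicks recorded for the query and that the boundary is exclusive; both points are interpretations, since the text does not spell them out. The `(query, clicked_url)` input format is likewise hypothetical.

```python
from collections import Counter


def representative_urls(clicks, query, min_share=0.10):
    """URLs whose share of the query's clicks exceeds min_share."""
    per_url = Counter(url for q, url in clicks if q == query)
    total = sum(per_url.values())
    return sorted(url for url, count in per_url.items() if count / total > min_share)


sample_clicks = [("q", "a")] * 15 + [("q", "b")] * 4 + [("q", "c")]
representatives = representative_urls(sample_clicks, "q")
```

Here `a` and `b` hold 75% and 20% of the clicks and are kept, while `c` at 5% is dropped.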
In addition, the "link information" can be computed from data provided by the search engine according to the following formula, where the data comprise source web pages, target web pages, and the corresponding link text:
$$\text{link information} = \frac{\text{number of entries for the web page with the most link text } Q}{\text{total number of entries in which } Q \text{ appears as link text}}$$
Here, "total number of entries in which Q appears as link text" can be obtained by counting the web pages that correspond to link text Q; if a web page occurs several times, each occurrence is counted cumulatively. "Number of entries for the web page with the most link text Q" is then found by checking which web page has the highest cumulative count. By definition, the number of entries for the web page with the most link text Q is necessarily less than or equal to the total number of entries in which Q appears as link text, so the link information necessarily lies between 0 and 1.
Step S13: according to the query classification features, classify the query set into a navigational query set and an informational/transactional query set.
Fig. 2 shows a flowchart of the query classification method of one embodiment of the invention, which comprises the following steps:
Step S201: make a preliminary judgment based on the representative URLs of the query. Determine whether query Q has exactly one representative URL and that URL is of the website type; if so, judge Q to be a navigational query; otherwise proceed to step S202.
Step S202: determine whether the link information of query Q is not greater than the first link information value; if so, proceed to step S203; if not, proceed to step S205.
In one example of the invention, the first link information value is set to 0.9.
Step S203: determine whether the user click concentration of query Q is not greater than the first concentration value; if so, judge Q to be an informational/transactional query; if not, proceed to step S204.
In one example of the invention, the first concentration value is set to 0.85.
Step S204: determine whether the link information of query Q is not greater than the second link information value; if so, judge Q to be an informational/transactional query; if not, judge Q to be a navigational query.
In one example of the invention, the second link information value is set to 0.6.
Step S205: determine whether the user click concentration of query Q is greater than the second concentration value; if so, judge Q to be a navigational query; if not, proceed to step S206.
In some examples of the invention, the second concentration value is set to 0.75.
Step S206: determine whether the first-N-click satisfaction rate of query Q is greater than the predetermined demand rate value; if not, judge Q to be an informational/transactional query; if so, proceed to step S207.
In one embodiment of the invention, the predetermined demand rate value is 0.815.
Step S207: determine whether the link information of query Q is greater than the third link information value; if so, judge Q to be a navigational query; if not, judge Q to be an informational/transactional query.
In some embodiments of the invention, the third link information value is set to 0.977.
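Steps S201 to S207 form a small decision tree, which can be sketched as below. The thresholds are the example values given in the text; the `ClassificationFeatures` container, its field names, and the two label strings are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

NAVIGATIONAL = "navigational"
INFORMATIONAL = "informational/transactional"


@dataclass
class ClassificationFeatures:
    first_n_rate: float               # first-N-click satisfaction rate
    click_concentration: float        # user click concentration
    link_information: float           # link information
    single_site_representative: bool  # exactly one representative URL, of website type


def classify_query(f):
    """Classify one query following steps S201 to S207."""
    if f.single_site_representative:                # S201
        return NAVIGATIONAL
    if f.link_information <= 0.9:                   # S202
        if f.click_concentration <= 0.85:           # S203
            return INFORMATIONAL
        if f.link_information <= 0.6:               # S204
            return INFORMATIONAL
        return NAVIGATIONAL
    if f.click_concentration > 0.75:                # S205
        return NAVIGATIONAL
    if f.first_n_rate <= 0.815:                     # S206
        return INFORMATIONAL
    if f.link_information > 0.977:                  # S207
        return NAVIGATIONAL
    return INFORMATIONAL


example = ClassificationFeatures(0.9, 0.9, 0.7, False)
label = classify_query(example)
```

For the example, S202 passes (0.7 is at most 0.9), S203 fails (0.9 exceeds 0.85), and S204 fails (0.7 exceeds 0.6), so the query is labeled navigational.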
Step S14: obtain the user behavior features of the classified query sets.
User search behavior is analyzed in units of sessions: search content submitted by the same user ID from the same IP address is defined as the same query. Similarly to step S12, user behavior features are extracted for all classified queries using the information in the search engine user log.
In one embodiment of the invention, the user behavior features can comprise: the average first click position, the ratio of clicks on query recommendations, the average last click position, the average number of clicks, the average number of log entries, and the ratio of clicks on search-again.
Specifically, from the user query and click information provided in Table 1, the "average first click position" for a query Q can be calculated, that is, the average position of users' first click for Q. The formula is as follows:
$$\text{average first click position} = \frac{\text{sum of first click positions}}{\text{number of queries}}$$
Here, "sum of first click positions" and "number of queries" can be obtained by counting all click positions for query Q. By definition, the average first click position is a number greater than or equal to 1.
Similarly, the user query and click information of Table 1 can be used to calculate the "ratio of clicks on query recommendations" for a query Q, that is, the proportion of all users who queried Q that clicked a query recommendation. The formula is as follows:
$$\text{ratio of clicks on query recommendations} = \frac{\text{number of users who clicked a query recommendation}}{\text{number of all users who queried } Q}$$
Here, "number of users who clicked a query recommendation" can be obtained from statistics on user click information, and "number of all users who queried Q" can be obtained from the user query information. By definition, the number of users who clicked a query recommendation is necessarily less than or equal to the number of all users who queried Q, so the ratio of clicks on query recommendations necessarily lies between 0 and 1.
Similarly, the "average last click position" for a query Q can be calculated, that is, the average position of users' last click for Q. The formula is as follows:
$$\text{average last click position} = \frac{\text{sum of last click positions}}{\text{number of queries}}$$
Here, "sum of last click positions" and "number of queries" can be obtained by counting the click position information for query Q. By definition, the average last click position is a number greater than or equal to 1.
Similarly, the "average number of clicks" for a query Q can be calculated, that is, the average number of clicks per user session for Q. The formula is as follows:
$$\text{average number of clicks} = \frac{\text{sum of click counts}}{\text{number of queries}}$$
Here, "sum of click counts" can be obtained from statistics on user click information, and "number of queries" can be obtained from statistics on user query information. By definition, the average number of clicks is a number greater than or equal to 1.
Similarly, the "average number of log entries" for a query Q can be calculated, that is, the average number of log entries, covering all information, per user session for Q. The formula is as follows:
$$\text{average number of log entries} = \frac{\text{sum of log entry counts}}{\text{number of queries}}$$
Here, "sum of log entry counts" can be obtained from statistics on user log information, and "number of queries" can be obtained from statistics on user query information. By definition, the average number of log entries is a number greater than or equal to 1.
Similarly, the "ratio of clicks on search-again" for a query Q can be calculated, that is, the proportion of all users who queried Q that clicked search-again. The formula is as follows:
$$\text{ratio of clicks on search-again} = \frac{\text{number of users who clicked search-again}}{\text{number of all users who queried } Q}$$
Here, "number of users who clicked search-again" can be obtained from statistics on user click information, and "number of all users who queried Q" can be obtained from statistics on user query information. By definition, the number of users who clicked search-again is necessarily less than or equal to the number of all users who queried Q, so the ratio of clicks on search-again necessarily lies between 0 and 1.
Step S15: according to the user behavior features, determine user satisfaction for the navigational query set and the informational/transactional query set respectively.
After the user behavior features have been extracted, the corresponding user satisfaction judgment is made according to the class of query Q.
Fig. 3 shows a flowchart of the evaluation method for navigational queries of one embodiment of the invention, which comprises the following steps:
Step S301: determine whether the average first click position of a navigational query is greater than the first predetermined click position value; if so, proceed to step S302; if not, proceed to step S303.
In one example of the invention, the first predetermined click position value is set to 2.
Step S302: determine whether the average number of clicks of the navigational query is greater than the predetermined click count value; if so, judge the user to be unsatisfied; if not, judge the user to be satisfied.
In one example of the invention, the predetermined click count value is set to 1.73.
Step S303: determine whether the ratio of clicks on query recommendations of the navigational query is greater than the first ratio value; if so, judge the user to be unsatisfied; if not, judge the user to be satisfied.
In one example of the invention, the first ratio value is set to 0.27.
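Steps S301 to S303 can be written as a short function. The thresholds are the example values from the text, while the function and parameter names are hypothetical.

```python
def navigational_user_satisfied(avg_first_click_position,
                                avg_click_count,
                                recommendation_click_ratio):
    """Return True if users are judged satisfied with a navigational query."""
    if avg_first_click_position > 2:            # S301
        return avg_click_count <= 1.73          # S302
    return recommendation_click_ratio <= 0.27   # S303


satisfied = navigational_user_satisfied(1.2, 1.1, 0.05)
```

The example models users who click near the top of the list and rarely fall back on query recommendations, so S303 judges them satisfied.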
Fig. 4 shows a flowchart of the evaluation method for informational/transactional queries of one embodiment of the invention, which comprises the following steps:
Step S401: determine whether the ratio of clicks on query recommendations of an informational/transactional query is greater than the second ratio value; if so, proceed to step S402; if not, proceed to step S403.
In one example of the invention, the second ratio value is set to 0.29.
Step S402: determine whether the average last click position of the informational/transactional query is greater than the second predetermined click position value; if so, judge the user to be unsatisfied; if not, judge the user to be satisfied.
In one example of the invention, the second predetermined click position value is 2.3.
Step S403: determine whether the average number of log entries of the informational/transactional query is not greater than the predetermined entry count value; if so, judge the user to be satisfied; if not, proceed to step S404.
In an example of the present invention, predetermined bar numerical value is 4.7.
Whether step S404 judges ratio that the click of certain information transaction class inquiry searches for again greater than the predetermined ratio value of search again, if it is dissatisfied then to be judged as the user, if not, it is satisfied then to be judged as the user.
In an example of the present invention, the predetermined ratio value of search again is 0.00097.
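The decision flow of steps S401 to S404 can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and field names are hypothetical, the thresholds are the example values quoted above, and the features are assumed to have been computed already. The branch on the average number of log entries follows step S403 as worded in this description; claim 10 (E23) words that comparison in the opposite direction, so that branch in particular should be treated as an assumption.

```python
from dataclasses import dataclass


@dataclass
class InfoQueryBehavior:
    """Behavior features of one information-transaction query (names are illustrative)."""
    recommendation_click_ratio: float  # query-recommendation click ratio
    avg_last_click_position: float     # average last-click position
    avg_log_entries: float             # average number of log entries
    search_again_click_ratio: float    # search-again click ratio


# Example threshold values quoted in the description.
SECOND_RATIO = 0.29
SECOND_CLICK_POSITION = 2.3
ENTRY_COUNT = 4.7
SEARCH_AGAIN_RATIO = 0.00097


def info_query_satisfied(b: InfoQueryBehavior) -> bool:
    """Return True if the user is judged satisfied, following S401 to S404."""
    # S401: branch on the query-recommendation click ratio.
    if b.recommendation_click_ratio > SECOND_RATIO:
        # S402: a late average last click means dissatisfied.
        return not (b.avg_last_click_position > SECOND_CLICK_POSITION)
    # S403: many log entries means satisfied (as worded in the description).
    if b.avg_log_entries > ENTRY_COUNT:
        return True
    # S404: a high search-again click ratio means dissatisfied.
    return not (b.search_again_click_ratio > SEARCH_AGAIN_RATIO)
```

For instance, a query with a recommendation click ratio of 0.5 and an average last-click position of 3.0 is judged dissatisfied at S402.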
User satisfaction can then be obtained by counting the number of queries judged satisfied and the number of queries judged dissatisfied.
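The text does not state how the two counts are combined. One plausible reading, shown here purely as an assumption, is the fraction of judged queries that were judged satisfied; the function name is hypothetical.

```python
def satisfaction_rate(judgments):
    """Fraction of queries judged satisfied.

    `judgments` is an iterable of booleans, one per evaluated query
    (True = user judged satisfied). Combining the counts as a ratio is
    an assumption; the description only says the two counts are tallied.
    """
    judgments = list(judgments)
    if not judgments:
        raise ValueError("no judged queries")
    satisfied = sum(1 for j in judgments if j)
    return satisfied / len(judgments)
```

With three satisfied queries out of four, the function returns 0.75.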
It should be noted that the example values given above for the first click-position value, the first ratio value, the first link-information value, the first concentration value and so on are illustrative and do not limit the present invention. Those skilled in the art may change the setting of each parameter value according to the practical application, and all such changes fall within the protection scope of the present invention.
To verify the effect of the present invention, performance-evaluation experiments were also carried out. Comparison with manually annotated results shows that the evaluation method of the present invention judges whether users are satisfied with high accuracy.
The method for evaluating the performance of a search engine according to the embodiments of the present invention automatically extracts the query set to be evaluated from search engine log data, and extracts a series of features related to user queries in order to classify those queries and evaluate user satisfaction. The evaluation is objective, comprehensive, true and reliable; the model structure and parameters are simple; and the algorithmic complexity is low. The present invention therefore generalizes and adapts well.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will appreciate that various changes, modifications, substitutions and variations may be made to these embodiments without departing from the principles and spirit of the present invention; the scope of the present invention is defined by the appended claims and their equivalents.

Claims (10)

1. the method for evaluating performance of a search engine is characterized in that, may further comprise the steps:
A: user journal is carried out pre-service, and from described pretreated user journal, obtain query set to be evaluated;
B:, in described user journal, extract corresponding inquiry characteristic of division at described query set;
C:, described query set is categorized into navigation type query set and information transaction class query set according to described inquiry characteristic of division;
D: the user behavior feature of obtaining described sorted query set; And
E:, respectively described navigation type query set and information transaction class query set are carried out the satisfied judgement of user according to described user behavior feature.
2. the method for evaluating performance of search engine according to claim 1 is characterized in that, described steps A further comprises:
Carry out the user journal code conversion and convert the Chinese characters of the national standard coded format to coded format with server record;
User journal after the described conversion is put in order to remove the information outside the predetermined content item, and wherein said predetermined content item comprises the current inquiry of user ID, user's submission, result, user behavior content, the user behavior incident that the user clicks;
Filter the noise information in the current inquiry that described user submits to; And
According to user query frequency, Automatic sieve is selected described query set from described pretreated user journal.
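The tidying and frequency-based selection in claim 2 might look like the sketch below. It rests on several assumptions that the claim does not settle: log records are taken to be dictionaries, the field names are hypothetical stand-ins for the predetermined content items, noise filtering is reduced to dropping blank queries, and "according to user query frequency" is read as a minimum-count cutoff (`min_frequency` is a made-up parameter). Encoding conversion is omitted.

```python
from collections import Counter

# Hypothetical field names standing in for the predetermined content items.
KEPT_FIELDS = ("user_id", "query", "clicked_result",
               "behavior_content", "behavior_event")


def select_query_set(records, min_frequency=2):
    """Return the set of queries submitted at least `min_frequency` times."""
    counts = Counter()
    for record in records:
        # Keep only the predetermined content items; drop everything else.
        kept = {field: record.get(field) for field in KEPT_FIELDS}
        # Stand-in for noise filtering: ignore missing or blank queries.
        query = (kept["query"] or "").strip()
        if query:
            counts[query] += 1
    return {q for q, n in counts.items() if n >= min_frequency}
```

A real pipeline would replace the blank-query check with whatever noise rules the log format requires.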
3. the method for evaluating performance of search engine according to claim 1 is characterized in that, described inquiry characteristic of division comprises:
Click the rate of meeting consumers' demand preceding N time;
The user clicks concentration degree;
Link information; With
The URL representative that inquiry is corresponding.
4. the method for evaluating performance of search engine according to claim 3 is characterized in that, wherein,
N the click rate of meeting consumers' demand obtains by following formula before described,
Figure FDA0000056183720000011
Described user clicks concentration degree and obtains by following formula,
Figure FDA0000056183720000012
Described link information obtains by following formula,
Figure FDA0000056183720000013
Wherein, Q is certain inquiry.
5. the method for evaluating performance of search engine according to claim 3 is characterized in that, the corresponding URL of described inquiry is represented as ratio and accounts for URL more than 10%.
6. according to the method for evaluating performance of each described search engine in the claim 1 to 5, it is characterized in that described step C further comprises:
C1: judge whether the corresponding URL representative of described inquiry only is one and is the Type of website, if the corresponding URL representative of described inquiry only be one and be the Type of website, judge that then described inquiry is the navigation type inquiry, otherwise continue step C2;
C2: judge whether described link information is not more than the first link information value,, then continue step C3, if described link information then continues step C5 greater than the described first link information value if described link information is not more than the described first link information value;
C3: judge that described user clicks concentration degree and whether is not more than the first concentration degree value, if described user clicks concentration degree and is not more than the described first concentration degree value, judge that then described inquiry is the inquiry of information transaction class, if described user clicks concentration degree greater than the described first concentration degree value, then continue step C4;
C4: judge whether described link information is not more than the second link information value, if described link information is not more than the described second link information value, judge that then described inquiry is the inquiry of information transaction class, if it is the navigation type inquiry that described link information, is then judged described inquiry greater than the described second link information value;
C5: judge that whether described user clicks concentration degree greater than the second concentration degree value, if described user clicks concentration degree greater than the described second concentration degree value, judge that then described inquiry is the navigation type inquiry, be not more than the described second concentration degree value, then continue step C6 if described user clicks concentration degree;
C6: judge described before N click whether meet consumers' demand rate greater than predetermined demand factor value, if N the click rate of meeting consumers' demand is not more than described predetermined demand factor value before described, judge that then described inquiry is the inquiry of information transaction class, if N click met consumers' demand rate greater than described predetermined demand factor value before described, then continue step C7; And
C7: judge that whether described link information is greater than the 3rd link information value, if described link information is greater than described the 3rd link information value, judge that then described inquiry is the navigation type inquiry, if described link information is not more than described the 3rd link information value, judge that then described inquiry is the inquiry of information transaction class.
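Steps C1 to C7 form a small decision tree, sketched below in Python. The class, field and label names are hypothetical, and the threshold values are left as required inputs because this claim does not give them.

```python
from dataclasses import dataclass

NAVIGATION = "navigation"
INFO_TRANSACTION = "information-transaction"


@dataclass
class ClassificationFeatures:
    """Classification features of one query (names are illustrative)."""
    url_representative_count: int      # number of URL representatives
    sole_representative_is_site: bool  # the single representative is a website
    link_info: float                   # link information
    click_concentration: float         # user click concentration
    top_n_satisfaction_rate: float     # rate at which the first N clicks satisfy the need


@dataclass
class ClassificationThresholds:
    """Threshold values; none are specified in the claim."""
    link_1: float
    link_2: float
    link_3: float
    concentration_1: float
    concentration_2: float
    demand_rate: float


def classify_query(f: ClassificationFeatures, t: ClassificationThresholds) -> str:
    # C1: exactly one URL representative, and it is a website.
    if f.url_representative_count == 1 and f.sole_representative_is_site:
        return NAVIGATION
    # C2: low link information leads to C3/C4, otherwise to C5.
    if f.link_info <= t.link_1:
        # C3: low click concentration means information-transaction.
        if f.click_concentration <= t.concentration_1:
            return INFO_TRANSACTION
        # C4: second link-information threshold decides.
        return INFO_TRANSACTION if f.link_info <= t.link_2 else NAVIGATION
    # C5: high click concentration means navigation.
    if f.click_concentration > t.concentration_2:
        return NAVIGATION
    # C6: low first-N satisfaction rate means information-transaction.
    if f.top_n_satisfaction_rate <= t.demand_rate:
        return INFO_TRANSACTION
    # C7: third link-information threshold decides.
    return NAVIGATION if f.link_info > t.link_3 else INFO_TRANSACTION
```

Passing the thresholds as an object keeps the tree itself fixed while the values are tuned to a particular log.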
7. the method for evaluating performance of search engine according to claim 1 is characterized in that, described user behavior feature comprises:
Average click information for the first time;
Click the ratio that inquiry is recommended;
Average last click information;
Average number of clicks;
Average log bar number; With
Click the ratio of search again.
8. the method for evaluating performance of search engine according to claim 7 is characterized in that, wherein,
Described average first time, click information obtained by following formula,
Figure FDA0000056183720000031
The ratio that described click inquiry is recommended obtains by following formula,
Figure FDA0000056183720000032
Described average last click information obtains by following formula,
Described average number of clicks obtains by following formula,
Described average log bar number obtains by following formula,
The ratio that described click is searched for again obtains by following formula,
Figure FDA0000056183720000036
9. according to the method for evaluating performance of claim 7 or 8 described search engines, it is characterized in that if described inquiry is the navigation type inquiry, then described step e further comprises:
E11: judge that whether described average first time click location is greater than the first predetermined click location value, if described average first time, click location was greater than described predetermined click location value, then continue step e 12, if average click location for the first time is not more than described predetermined click location value, then continue step e 13;
E12: judge that whether described average number of clicks is greater than predetermined number of clicks value, if described average number of clicks is greater than described predetermined times value, it is dissatisfied then to be judged as the user, if described average number of clicks is not more than described predetermined times value, then is judged as user's satisfaction;
E13: whether the ratio of judging described click inquiry recommendation is greater than first ratio value, if the ratio that described click inquiry is recommended is greater than described first ratio value, it is dissatisfied then to be judged as the user, if the ratio that described click inquiry is recommended is not more than described first ratio value, then is judged as user's satisfaction.
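A minimal Python sketch of steps E11 to E13 follows. The function and parameter names are hypothetical. The default click-count and ratio thresholds are the example values 1.73 and 0.27 quoted in the description for steps S302 and S303; the first predetermined click-position value is not given in this section, so it is a required argument.

```python
def navigation_query_satisfied(avg_first_click_position: float,
                               avg_clicks: float,
                               recommendation_click_ratio: float,
                               *,
                               first_click_position_threshold: float,
                               click_count_threshold: float = 1.73,
                               first_ratio: float = 0.27) -> bool:
    """Return True if the user is judged satisfied with a navigation-type query."""
    # E11: branch on the average first-click position.
    if avg_first_click_position > first_click_position_threshold:
        # E12: many clicks on average means dissatisfied.
        return not (avg_clicks > click_count_threshold)
    # E13: frequent clicks on query recommendations means dissatisfied.
    return not (recommendation_click_ratio > first_ratio)
```

The thresholds are keyword-only so that a caller cannot confuse them with the feature values.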
10. according to the method for evaluating performance of claim 7 or 8 described search engines, it is characterized in that if described inquiry is the inquiry of information transaction class, then described step e further comprises:
E21: whether the ratio of judging described click inquiry recommendation is greater than second ratio value, if the ratio that described click inquiry is recommended is greater than described second ratio value, then continue step e 22,, then continue step e 23 if the ratio that described click inquiry is recommended is not more than described second ratio value;
E22: judge that whether described average last click location is greater than the second predetermined click location value, if described average last click location is greater than the described second predetermined click location value, it is dissatisfied then to be judged as the user, if described average last click location is not more than the described second predetermined click location value, then be judged as user's satisfaction;
E23: judge that whether described average log bar number is greater than predetermined bar numerical value, if described average log bar number is not more than described predetermined bar numerical value, it is satisfied then to be judged as the user, if described average log bar number then continues step e 24 greater than described predetermined bar numerical value;
E24: judge that whether ratio that described click searches for again is greater than the predetermined ratio value of search again, if the ratio that described click is searched for again is greater than the described predetermined ratio value of search again, it is dissatisfied then to be judged as the user, if the ratio that described click is searched for again is not more than the described predetermined ratio value of search again, then be judged as user's satisfaction.
CN2011100983786A 2011-04-19 2011-04-19 Method for evaluating performance of search engine Pending CN102156746A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011100983786A CN102156746A (en) 2011-04-19 2011-04-19 Method for evaluating performance of search engine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011100983786A CN102156746A (en) 2011-04-19 2011-04-19 Method for evaluating performance of search engine

Publications (1)

Publication Number Publication Date
CN102156746A true CN102156746A (en) 2011-08-17

Family

ID=44438245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011100983786A Pending CN102156746A (en) 2011-04-19 2011-04-19 Method for evaluating performance of search engine

Country Status (1)

Country Link
CN (1) CN102156746A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103294727A (en) * 2012-03-05 2013-09-11 阿里巴巴集团控股有限公司 Filtering method and system for recommended objects
CN103577464A (en) * 2012-08-02 2014-02-12 百度在线网络技术(北京)有限公司 Method and device for excavating badcase of search engine
CN103577464B (en) * 2012-08-02 2018-07-10 百度在线网络技术(北京)有限公司 A kind of method for digging and device of search engine bad example
CN103020289B (en) * 2012-12-25 2015-08-05 浙江鸿程计算机系统有限公司 A kind of search engine user individual demand supplying method based on Web log mining
CN103020289A (en) * 2012-12-25 2013-04-03 浙江鸿程计算机系统有限公司 Method for providing individual needs of search engine user based on log mining
CN104077555B (en) * 2013-03-29 2019-01-15 百度在线网络技术(北京)有限公司 The method and apparatus of bad example in a kind of identification picture searching
CN104077555A (en) * 2013-03-29 2014-10-01 百度在线网络技术(北京)有限公司 Method and device for identifying badcase in image search
CN103593411A (en) * 2013-10-23 2014-02-19 江苏大学 Method for testing combination properties of evaluation indexes of search engines and testing device
CN105095334A (en) * 2014-05-06 2015-11-25 雅虎公司 Method and system for evaluating user satisfaction with respect to a user session
US10599659B2 (en) 2014-05-06 2020-03-24 Oath Inc. Method and system for evaluating user satisfaction with respect to a user session
CN109582744A (en) * 2017-09-29 2019-04-05 高德信息技术有限公司 A kind of user satisfaction methods of marking and device
CN109582744B (en) * 2017-09-29 2021-08-10 阿里巴巴(中国)有限公司 User satisfaction scoring method and device
CN114254179A (en) * 2020-09-23 2022-03-29 北京达佳互联信息技术有限公司 Search request processing method and device and search platform

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20110817