CN107704467B - Search quality evaluation method and device - Google Patents

Info

Publication number
CN107704467B
Authority
CN
China
Prior art keywords
search
search result
result
click
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610645103.2A
Other languages
Chinese (zh)
Other versions
CN107704467A (en)
Inventor
曹皓
张亮
齐志宏
贾晋康
覃安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201610645103.2A (CN107704467B)
Priority to PCT/CN2016/108422 (WO2018028099A1)
Publication of CN107704467A
Application granted
Publication of CN107704467B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a method and a device for evaluating search quality, wherein the method comprises the following steps: constructing a search quality assessment database based on user historical search record data, wherein the search quality assessment database comprises at least one query keyword, at least one corresponding search result item and user operation characteristic data aiming at the search result item; for each query keyword in the at least one query keyword, respectively sorting corresponding search result items in the search quality evaluation database based on a baseline search strategy and a search strategy to be evaluated to obtain a first sorting result and a second sorting result; and evaluating the search quality of the search strategy to be evaluated based on the user historical operation characteristic data, the first sorting result and the second sorting result. The embodiment of the invention fully automates the whole evaluation process so as to reduce the labor cost and the manual interference degree and improve the evaluation accuracy of the search quality.

Description

Search quality evaluation method and device
Technical Field
The embodiment of the invention relates to the technical field of information search, in particular to a search quality evaluation method and a search quality evaluation device.
Background
In the current internet era, the volume of information on the network is expanding rapidly, and for a user, directly finding the needed information is like looking for a needle in a haystack. A search engine helps people obtain the required information from massive network data accurately and quickly. Optimizing the ranking of a search engine lets users find what they need faster. Therefore, how to quickly evaluate the search quality of a search engine offline has become a popular research field and plays an important role in the research and development of search technology.
Whether search engine results are good or not is referred to in the industry as relevance. In the narrow sense, relevance is the degree to which the search results match the user query. Offline evaluation metrics commonly used in the industry include recall rate and recall accuracy computed from manual annotations, the search engine quality index DCG (discounted cumulative gain), and the ranking quality index NDCG (normalized discounted cumulative gain). Because these metrics are all computed from manually labeled data and are affected by subjective factors and by the amount of data, the evaluation result often differs considerably from the actual online effect, and the labor cost is high.
Disclosure of Invention
The embodiment of the invention provides a method and a device for evaluating search quality, which enable the whole evaluation process to be completely automatic, so as to reduce labor cost and manual interference degree and improve the accuracy of evaluation of search quality.
In a first aspect, an embodiment of the present invention provides a search quality evaluation method, including:
constructing a search quality assessment database based on user historical search record data, wherein the search quality assessment database comprises at least one query keyword, at least one corresponding search result item and user operation characteristic data aiming at the search result item;
for each query keyword in the at least one query keyword, respectively sorting corresponding search result items in the search quality evaluation database based on a baseline search strategy and a search strategy to be evaluated to obtain a first sorting result and a second sorting result;
and evaluating the search quality of the search strategy to be evaluated based on the user historical operation characteristic data, the first sorting result and the second sorting result.
In a second aspect, an embodiment of the present invention further provides a search quality evaluation apparatus, including:
an evaluation database construction module, configured to construct a search quality evaluation database based on user historical search record data, wherein the search quality evaluation database comprises at least one query keyword, at least one corresponding search result item and user operation characteristic data for the search result item;
a ranking module, configured to, for each query keyword in the at least one query keyword, rank the corresponding search result items in the search quality evaluation database based on a baseline search strategy and a search strategy to be evaluated, respectively, to obtain a first sorting result and a second sorting result;
and an evaluation module, configured to evaluate the search quality of the search strategy to be evaluated based on the user historical operation characteristic data, the first sorting result and the second sorting result.
According to the method and the device, a search quality evaluation database comprising at least one query keyword, at least one corresponding search result item and user operation characteristic data aiming at the search result item is constructed based on user historical search record data, the corresponding search result items in the search quality evaluation database are respectively sorted based on a baseline search strategy and a search strategy to be evaluated according to each query keyword in the at least one query keyword to obtain a first sorting result and a second sorting result, the search quality of the search strategy to be evaluated is evaluated based on the user historical operation characteristic data, the first sorting result and the second sorting result, the whole evaluation process is fully automated, the manual cost and the manual interference degree are reduced, and the search quality evaluation accuracy is improved.
Drawings
FIG. 1 is a flowchart of a method for evaluating search quality according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a search quality evaluation method according to a second embodiment of the present invention;
FIG. 3 is a flowchart of a search quality evaluation method according to a third embodiment of the present invention;
FIG. 4 is a flowchart of a search quality evaluation method according to a fourth embodiment of the present invention;
FIG. 5 is a block diagram of a search quality evaluation apparatus in a fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a search quality evaluation method according to an embodiment of the present invention, where the present embodiment is applicable to a case where search quality of a search policy to be evaluated is evaluated when a search engine is developed, and the method may be executed by a search quality evaluation device according to an embodiment of the present invention, where the device may be integrated in a fixed terminal (e.g., a desktop computer or a laptop computer) or a server, as shown in fig. 1, and specifically includes:
S101, constructing a search quality evaluation database based on historical search record data of a user, wherein the search quality evaluation database comprises at least one query keyword, at least one corresponding search result item and user operation characteristic data for the search result item.
The user historical search record data includes, but is not limited to, a user click log. The user click log records how users clicked the search result items under each query keyword, including but not limited to the number of clicks. The user operation characteristic data includes, but is not limited to, user click data (e.g., number of clicks) for the search result item. The search result item may be, but is not limited to, any of the following: a link address (URL) extracted from a web page, a browsed web page, or other web page features extracted from a web page.
Specifically, crawler technology may be used to fetch web pages from the internet, and a historical click log of users on the fetched pages is obtained from the users' historical search behavior. Query keywords accounting for a preset share of traffic (for example, 2%) are randomly sampled from the user click log, the user operation characteristic data of the search result items contained in the web pages corresponding to those query keywords is recalled, and the feature information of the search result items is extracted from the web pages. The search quality evaluation database is then constructed from the correspondence among the extracted query keywords, the search result items, the feature information of the search result items and the user operation characteristic data. The feature information of a search result item can be obtained in advance through machine learning and serves as one of the reference factors when the search result items are ranked.
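The database construction described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the click log is taken to be an in-memory list of `(query, url, clicked)` tuples, the 2% sample is drawn over distinct query keywords rather than over raw traffic, and the names `build_eval_db`, `sample_rate` and `seed` are hypothetical and do not come from the patent.

```python
import random
from collections import defaultdict


def build_eval_db(click_log, sample_rate=0.02, seed=0):
    """Build {query: {url: click_count}} from a list of (query, url, clicked) tuples.

    Assumption: sampling is done over distinct queries; the patent only says
    that a preset share of traffic (e.g. 2%) is randomly extracted.
    """
    rng = random.Random(seed)
    queries = sorted({query for query, _, _ in click_log})
    # Keep roughly `sample_rate` of the distinct queries, but at least one.
    keep = max(1, round(len(queries) * sample_rate))
    sampled = set(rng.sample(queries, keep))

    db = defaultdict(lambda: defaultdict(int))
    for query, url, clicked in click_log:
        if query in sampled:
            # Unclicked impressions still create an entry with count 0.
            db[query][url] += int(clicked)
    return {query: dict(urls) for query, urls in db.items()}


log = [("q", "url1", True), ("q", "url1", True), ("q", "url2", False)]
db = build_eval_db(log, sample_rate=1.0)
```

With `sample_rate=1.0` every query is kept, so the toy log above yields one query with two clicks on `url1` and none on `url2`.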
S102, aiming at each query keyword in the at least one query keyword, respectively sorting corresponding search result items in the search quality evaluation database based on a baseline search strategy and a search strategy to be evaluated to obtain a first sorting result and a second sorting result.
The baseline search strategy is the search strategy currently running online, and the search strategy to be evaluated is a new search strategy, derived from the one currently running online, whose search quality needs to be evaluated.
Specifically, firstly, a search is performed in a search engine based on each query keyword in the at least one query keyword, a search result item corresponding to each query keyword and feature information of the search result item are obtained from the search quality evaluation database, then the search result items are ranked based on the feature information of the search result item and the baseline search strategy to obtain a first ranking result, and the search result items are ranked based on the feature information of the search result item and the search strategy to be evaluated to obtain a second ranking result.
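The two rankings in S102 can be illustrated as follows. This is a minimal sketch, assuming each strategy can be reduced to a scoring function over the feature information of a search result item; the feature names `text_match` and `freshness` and both scoring functions are invented for illustration and are not taken from the patent.

```python
def rank(items, score):
    """Return item ids sorted by descending score; ties keep input order."""
    ordered = sorted(items.items(), key=lambda kv: score(kv[1]), reverse=True)
    return [item_id for item_id, _ in ordered]


# Hypothetical feature information for the items of one query keyword.
features = {
    "url1": {"text_match": 0.9, "freshness": 0.1},
    "url2": {"text_match": 0.6, "freshness": 0.8},
    "url3": {"text_match": 0.3, "freshness": 0.2},
}


def baseline_score(f):
    # Assumed baseline strategy: text match only.
    return f["text_match"]


def candidate_score(f):
    # Assumed strategy to be evaluated: text match blended with freshness.
    return 0.5 * f["text_match"] + 0.5 * f["freshness"]


first_ranking = rank(features, baseline_score)
second_ranking = rank(features, candidate_score)
```

Both rankings are computed over the same recalled items, so any difference between them comes from the strategies alone.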
S103, evaluating the searching quality of the searching strategy to be evaluated based on the user historical operation characteristic data, the first sorting result and the second sorting result.
Specifically, when the first sorting result is consistent with the second sorting result, it is indicated that the search quality of the search strategy to be evaluated is not improved. And when the first sequencing result is inconsistent with the second sequencing result, evaluating the searching quality of the searching strategy to be evaluated further according to the historical operation characteristic data of the user.
In this embodiment, a search quality evaluation database comprising at least one query keyword, at least one corresponding search result item and user operation characteristic data for the search result item is constructed based on user historical search record data. For each query keyword in the at least one query keyword, the corresponding search result items in the search quality evaluation database are ranked based on a baseline search strategy and a search strategy to be evaluated, respectively, to obtain a first ranking result and a second ranking result. The search quality of the search strategy to be evaluated is then evaluated based on the user historical operation characteristic data, the first ranking result and the second ranking result. The whole evaluation process is thereby fully automated, which reduces the labor cost and the degree of manual interference and improves the accuracy of the search quality evaluation.
On the basis of the above-described embodiment, constructing a search quality assessment database based on user history search record data includes:
extracting the at least one query keyword from user historical search record data;
extracting user operation characteristic data of a search result item corresponding to the extracted query keyword from the historical search record data of the user;
and taking the at least one query keyword, the corresponding search result item and the user operation characteristic data of the search result item as the search quality evaluation data.
Wherein the user historical search record data includes, but is not limited to, a user click log, and the user operation characteristic data includes, but is not limited to, user historical click data.
Specifically, the search quality evaluation database is constructed by mining the user historical search record data and extracting the users' browsing behaviors from it. For example, query keywords accounting for 2% of traffic are randomly sampled, the user click logs corresponding to those query keywords are filtered, at least one search result item under each query keyword and the user operation characteristic data of that search result item are recalled, and the query keywords, their corresponding search result items and the user operation characteristic data of the search result items are used as the search quality evaluation database. When search quality evaluation is carried out, a search is first performed in a search engine based on a query keyword, and at least one search result item corresponding to the query keyword is obtained from the search quality evaluation database. The corresponding search result items in the search quality evaluation database are then ranked based on the baseline search strategy and the search strategy to be evaluated, respectively, to obtain a first sorting result and a second sorting result.
On the basis of the above embodiment, after extracting the user operation feature data of the search result item corresponding to the extracted query keyword from the user history search record data, the method further includes:
and screening the search result items based on a preset search result item screening strategy, wherein the search result item screening strategy is associated with the user operation characteristic data.
Wherein the predetermined search result item screening policy is preferably a recall policy.
Specifically, the present embodiment will be described by taking the user operation characteristic data as the user history click data as an example. Extracting query keywords with a preset proportion (for example, 2%) from the historical search record data of the user, and obtaining a user click log, where for example, the historical click behavior of the user is recorded in the click log as follows:
query keyword (query) - search result item (url) - whether clicked
query-url1-click
query-url2-no_click
query-url3-click
query-url4-click
query-url5-no_click
query-url6-no_click
Historical click data for position difference search result pairs is then obtained by counting the recorded user click behaviors according to the recall strategy. Here un denotes the search result item at position n, and ua > ub means that ua is preferred over ub. The recall strategy includes, but is not limited to, at least one of the following: (1) when only the first result is clicked, recall u1 > u2 and u1 > u3; (2) when the nth result is clicked and the (n+1)th result is not, recall un > un+1; (3) a click beats a skip: when the nth result is clicked and the (n-1)th result is not, recall un > un-1, and when the nth result is clicked, the (n+1)th result is not clicked and the (n+2)th result is clicked, recall un+2 > un+1; (4) consecutive clicks or consecutive skips count as ties: when results n, n-1 and n-2 are all clicked, or results n, n+1 and n+2 are all skipped, the adjacent results among them are recalled as tied pairs.
At least one search result item corresponding to the query keyword is obtained according to the recall strategy, and the click behavior and click count of each search result item are counted in turn, for example whether the URL was clicked and how many times. Position difference search result pairs are then constructed from the click counts of the search result items. For example, if search result item 1 was clicked 10 times and search result item 2 was clicked 20 times, the constructed position difference search result pair records that search result item 2 was clicked 10 more times than search result item 1.
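Part of the recall strategy can be sketched on the example click log listed earlier. The sketch assumes one reading of rules (2) and (3): a clicked result is preferred over an unclicked result directly below it, and over an unclicked result directly above it. Rules (1) and (4) are left out, and the function name `preference_pairs` is hypothetical.

```python
def preference_pairs(session):
    """Derive (preferred_url, other_url) pairs from one result list.

    `session` is a list of (url, clicked) tuples in displayed order.
    Only the assumed reading of rules (2) and (3) is implemented.
    """
    pairs = set()
    for i, (url, clicked) in enumerate(session):
        if not clicked:
            continue
        # Rule (2): clicked at n, not clicked at n+1  ->  u_n > u_{n+1}.
        if i + 1 < len(session) and not session[i + 1][1]:
            pairs.add((url, session[i + 1][0]))
        # Rule (3): clicked at n, not clicked at n-1  ->  u_n > u_{n-1}.
        if i > 0 and not session[i - 1][1]:
            pairs.add((url, session[i - 1][0]))
    return pairs


# The example click log from the text above.
session = [
    ("url1", True), ("url2", False), ("url3", True),
    ("url4", True), ("url5", False), ("url6", False),
]
pairs = preference_pairs(session)
```

On this log the sketch recalls three pairs: url1 over url2, url3 over url2, and url4 over url5. The second half of rule (3), where results n and n+2 are clicked and n+1 is skipped, is already covered because result n+2 is a clicked result whose upper neighbour was skipped.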
Alternatively, position difference search result pairs are constructed first, and three totals are counted for each pair: awin, the total number of times the user historical click count of one search result item in the pair is greater than that of the other; bwin, the total number of times it is less than that of the other; and sum, the total number of times the two click counts are equal.
For example, the final constructed search quality assessment database is shown in table one below:
Table 1
Query keyword      Position difference search result pair         [awin bwin sum]
Query keyword 1    Search result item 1 - search result item 2    [0 0 1]
Query keyword 1    Search result item 2 - search result item 3    [1 0 0]
Query keyword 1    Search result item 3 - search result item 4    [0 1 0]
Query keyword 1    Search result item 4 - search result item 5    [0 0 1]
……                 ……                                             ……
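The [awin bwin sum] triple stored for each pair can be tallied as in the sketch below. It assumes each observation is a pair of click counts `(clicks_a, clicks_b)` for the two items of one position difference pair; the function name `tally_pair` and the input layout are illustrative assumptions, and the third counter is called `ties` here because the patent's `sum` counts the equal cases.

```python
def tally_pair(observations):
    """Return (awin, bwin, ties) for one position difference pair.

    `observations` is an iterable of (clicks_a, clicks_b) tuples.
    `ties` corresponds to the value the patent calls `sum`.
    """
    awin = bwin = ties = 0
    for clicks_a, clicks_b in observations:
        if clicks_a > clicks_b:
            awin += 1
        elif clicks_a < clicks_b:
            bwin += 1
        else:
            ties += 1
    return awin, bwin, ties


triple = tally_pair([(3, 1), (0, 2), (1, 1), (5, 0)])
```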
On the basis of the above embodiment, for each query keyword of the at least one query keyword, respectively ranking the corresponding search result items in the search quality evaluation database based on the baseline search policy and the search policy to be evaluated to obtain a first ranking result and a second ranking result, including:
extracting page features corresponding to the search result items aiming at each query keyword in the at least one query keyword;
acquiring a baseline policy feature and a policy feature to be evaluated from the extracted page feature;
ranking the search result items to obtain the first ranking result and the second ranking result based on the baseline policy features and the policy features to be evaluated and the corresponding baseline policy models and policy models to be evaluated.
On the basis of the above embodiment, evaluating the search quality of the search policy to be evaluated based on the user historical operation feature data, the first ranking result, and the second ranking result includes:
determining, based on the first sorting result and the second sorting result, the sorting positions at which the two results contain different search result items;
forming different search result items on the sorting positions into position difference search result pairs;
and evaluating the search strategy to be evaluated based on the position difference search result pair and the corresponding user operation characteristic data.
Specifically, if all search result items at the same ranking position in the first ranking result and the second ranking result are the same, it is determined that the search quality of the search strategy to be evaluated is not improved. And if the search result items in at least one same sorting position are different in the first sorting result and the second sorting result, forming a position difference search result pair by the different search result items in the sorting positions, and evaluating the search strategy to be evaluated based on the position difference search result pair and corresponding user operation characteristic data.
For example, if a certain ranking position in the first ranking result holds search result item 1 and the same ranking position in the second ranking result holds search result item 2, search result item 1 and search result item 2 are combined into a position difference search result pair, and the historical click behaviors of users on search result item 1 and on search result item 2 are counted separately. If search result item 2, which comes from the search strategy to be evaluated, has more historical clicks than search result item 1, the search quality of the search strategy to be evaluated is improved; otherwise it is not improved.
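Forming position difference search result pairs from the two rankings can be sketched as a position-by-position comparison. The function name and the 1-based position numbering are assumptions made for illustration.

```python
def position_difference_pairs(first_ranking, second_ranking):
    """Return (position, first_item, second_item) wherever the rankings differ.

    Positions are 1-based. An empty list means the two rankings are identical,
    which the text treats as "search quality not improved".
    """
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(first_ranking, second_ranking), start=1)
        if a != b
    ]


diff = position_difference_pairs(
    ["url1", "url2", "url3"],
    ["url2", "url1", "url3"],
)
```

Here positions 1 and 2 differ while position 3 holds the same item in both rankings, so only two pairs are produced.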
On the basis of the above embodiment, evaluating the search policy to be evaluated based on the difference search result pair and the corresponding user operation feature data includes:
determining hit search results in the difference search result pairs at each click in a simulated search environment based on each position difference search result pair and corresponding user operation characteristic data;
counting click results of each search result in each position difference search result pair in the simulated search environment;
and evaluating the search strategy to be evaluated based on the counted click results.
One implementation manner of this embodiment is to count the click times of each search result in each position difference search result pair in a simulated search environment, and in the same ranking position, if the click times of the search result items in the first ranking result are less than the click times of the search result items in the second ranking result, it is determined that the search quality of the search policy to be evaluated is improved, otherwise, it is determined that the search quality of the search policy to be evaluated is not improved.
Another implementation manner of this embodiment is to count the total number of times G that the search result item in the second ranking result is clicked more often than the search result item in the first ranking result, the total number of times B that the search result item in the second ranking result is clicked less often than the search result item in the first ranking result, and the total number of times S that the two are clicked equally often. An evaluation coefficient of the search strategy to be evaluated is then calculated based on the counted totals G, B and S.
Specifically, the evaluation coefficient P may be calculated by the formula P = 0.5 × (G - B) / (G + S + B). If P is positive, the search quality of the search strategy to be evaluated is improved; otherwise, it is not improved.
And when the number of the query keywords is multiple, respectively summing the total times G, the total times S and the total times B of the same position difference search result pairs under all the query keywords, obtaining an evaluation coefficient P based on the summed total times G, the summed total times S and the summed total times B, and evaluating the search quality of the search strategy to be evaluated based on the obtained evaluation coefficient P.
And when the sorting positions containing different search result items are multiple, respectively summing the total times G, the total times S and the total times B of the position difference search result pairs at all the sorting positions, obtaining an evaluation coefficient P based on the summed total times G, the summed total times S and the summed total times B, and evaluating the search quality of the search strategy to be evaluated based on the obtained evaluation coefficient P.
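The evaluation coefficient and the summation over query keywords or ranking positions can be sketched as follows. The code only evaluates the formula P = 0.5 × (G - B) / (G + S + B) on whatever totals are supplied. The tuple layout `(g, b, s)`, the function names, and the choice to return 0.0 when there are no observations are assumptions for illustration.

```python
def aggregate(tallies):
    """Sum per-pair (g, b, s) tallies across query keywords or positions."""
    g = sum(t[0] for t in tallies)
    b = sum(t[1] for t in tallies)
    s = sum(t[2] for t in tallies)
    return g, b, s


def evaluation_coefficient(g, b, s):
    """Compute P = 0.5 * (G - B) / (G + S + B).

    Returns 0.0 when there are no observations (an assumption; the patent
    does not say how an empty tally is handled).
    """
    total = g + s + b
    if total == 0:
        return 0.0
    return 0.5 * (g - b) / total


g, b, s = aggregate([(10, 5, 20), (20, 5, 40)])
p = evaluation_coefficient(g, b, s)
improved = p > 0
```

Because of the 0.5 factor, P always lies between -0.5 and 0.5, and only its sign decides whether the strategy to be evaluated counts as improved.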
On the basis of the foregoing embodiment, determining, based on each position difference search result pair and corresponding user operation feature data, a hit search result in the position difference search result pair at each click in the simulated search environment includes:
respectively counting the total times awin, bwin and sum of the occurrences of the situations that the user historical click times of one search result item in the position difference search result pair are more than, less than and equal to the user historical click times of the other search result item on the basis of the user operation characteristic data;
calculating the probability of clicking on the search result in the first sequencing result and the probability of clicking on the search result in the second sequencing result based on the awin, the bwin and the sum;
and determining the hit search result in the position difference search result pair at each click in the simulated search environment by utilizing a random number generation method based on the probability of clicking the search result in the first sequencing result and the probability of clicking the search result in the second sequencing result.
On the basis of the above embodiment, before determining a hit search result in each position difference search result pair at each click in a simulated search environment based on each position difference search result pair and corresponding user operation feature data, the method further includes:
and determining whether the position difference search result pair has click behavior or not by using a random number generation method based on the preset position click probability, and determining the hit search result in the position difference search result pair at each click in the simulated search environment based on each position difference search result pair and the corresponding user operation characteristic data when the click behavior occurs.
Specifically, if search result item 1 is at the second ranking position in the first ranking result and search result item 2 is at the second ranking position in the second ranking result, a position difference search result pair is formed from search result item 1 and search result item 2, and awin, bwin and sum of the pair are counted from the search quality evaluation data. If awin, bwin and sum are 50, 150 and 200 respectively, the probability of clicking search result item 1 is 50/(50+150+200) = 12.5% and the probability of clicking search result item 2 is 150/(50+150+200) = 37.5%. A set of random numbers is then generated, for example 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and the numbers are divided between the two items according to the click probabilities of 12.5% and 37.5%, whose ratio is 1:3. Here 1, 2, 3 are assigned to search result item 1 and 4, 5, 6, 7, 8, 9, 10 are assigned to search result item 2, which approximates that ratio. A number is then drawn at random from the set. If it is one of the numbers assigned to search result item 1, the click lands on search result item 1 in the first ranking result; if it is one of the numbers assigned to search result item 2, the click lands on search result item 2 in the second ranking result. Finally, the total clicks of the first ranking result and of the second ranking result are counted, and the search quality of the search strategy to be evaluated is evaluated according to the total clicks. For example, if the second ranking result receives more total clicks than the first ranking result, the search quality of the search strategy to be evaluated is improved; otherwise, it is not improved.
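The simulated clicking can be sketched with a seeded random number generator. This sketch swaps the integer grouping of the worked example for a uniform draw in [0, 1), which follows the stated probabilities exactly. It also assumes that the share of draws corresponding to the equal cases counts for neither ranking, which the patent does not spell out. All names (`click_probabilities`, `simulate_clicks`, `position_click_prob`) are hypothetical.

```python
import random


def click_probabilities(awin, bwin, ties):
    """Return (p_first, p_second) as awin/total and bwin/total."""
    total = awin + bwin + ties
    return awin / total, bwin / total


def simulate_clicks(awin, bwin, ties, n_trials, position_click_prob=1.0, seed=0):
    """Simulate clicks on one position difference pair.

    Returns (clicks on the first-ranking item, clicks on the second-ranking item).
    """
    rng = random.Random(seed)
    p_first, p_second = click_probabilities(awin, bwin, ties)
    first_clicks = second_clicks = 0
    for _ in range(n_trials):
        # First decide whether this position is clicked at all.
        if rng.random() >= position_click_prob:
            continue
        # Then decide which item of the pair receives the click.
        r = rng.random()
        if r < p_first:
            first_clicks += 1
        elif r < p_first + p_second:
            second_clicks += 1
        # Remaining draws are the equal cases: counted for neither side (assumption).
    return first_clicks, second_clicks


p_first, p_second = click_probabilities(50, 150, 200)
first_clicks, second_clicks = simulate_clicks(50, 150, 200, n_trials=10000)
```

With awin, bwin and sum of 50, 150 and 200, the two probabilities are 0.125 and 0.375, so over many trials the item from the second ranking result collects roughly three times as many simulated clicks as the item from the first.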
In the embodiment, the search quality assessment database comprising at least one query keyword, at least one corresponding search result item and user operation characteristic data for the search result item is constructed based on the user historical search record data, the corresponding search result items in the search quality assessment database are respectively ranked based on the baseline search strategy and the search strategy to be assessed for each query keyword in the at least one query keyword to obtain a first ranking result and a second ranking result, and the search quality of the search strategy to be assessed is assessed based on the user historical operation characteristic data, the first ranking result and the second ranking result, so that the whole assessment process is fully automated, the labor cost and the manual interference degree are reduced, and the search quality assessment accuracy is improved.
Example two
Fig. 2 is a flowchart of a search quality evaluation method according to a second embodiment of the present invention. On the basis of the above embodiment, in this embodiment, constructing the search quality evaluation database based on user historical search record data is preferably refined as: extracting the at least one query keyword from the user historical search record data; extracting user operation characteristic data of a search result item corresponding to the extracted query keyword from the user historical search record data; and taking the at least one query keyword, the corresponding search result item and the user operation characteristic data of the search result item as the search quality evaluation data. After extracting the user operation characteristic data of the search result item corresponding to the extracted query keyword from the user historical search record data, the method preferably further includes: screening the search result items based on a preset search result item screening strategy, wherein the search result item screening strategy is associated with the user operation characteristic data. As shown in fig. 2, the method specifically includes:
s201, extracting the at least one query keyword from the historical search record data of the user, and extracting the user operation characteristic data of the search result item corresponding to the extracted query keyword.
Wherein the user historical search record data includes, but is not limited to, a user click log, and the user operation characteristic data includes, but is not limited to, user historical click data.
S202, based on a preset search result item screening strategy, screening the search result item, and taking the at least one query keyword, the corresponding search result item and the user operation characteristic data of the search result item as the search quality evaluation data.
Wherein the search result item screening policy is associated with user operational characteristic data.
S203, aiming at each query keyword in the at least one query keyword, respectively sorting corresponding search result items in the search quality evaluation database based on a baseline search strategy and a search strategy to be evaluated to obtain a first sorting result and a second sorting result.
S204, evaluating the searching quality of the searching strategy to be evaluated based on the user historical operation characteristic data, the first sorting result and the second sorting result.
For detailed description of each step of this embodiment, refer to the above embodiments, which are not described herein again.
In this embodiment, a search quality evaluation database is constructed based on user historical search record data; the corresponding search result items in the search quality evaluation database are ranked under a baseline search strategy and under a search strategy to be evaluated to obtain a first sorting result and a second sorting result; and the search quality of the search strategy to be evaluated is evaluated based on the user historical operation characteristic data, the first sorting result and the second sorting result. The whole evaluation process is thereby fully automated, which reduces labor cost and the degree of manual interference and improves the accuracy of search quality evaluation.
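As an illustration of steps S201 and S202, the following is a minimal sketch, not the patented implementation. It assumes a hypothetical click-log schema of `(query, result_url, clicked)` tuples and a hypothetical screening policy that keeps only result items with at least `min_clicks` historical clicks; neither the schema nor the threshold is specified in the text.

```python
from collections import defaultdict


def build_evaluation_db(click_log, min_clicks=1):
    """Build {query: {result_url: click_count}} from a click log.

    click_log is an iterable of (query, result_url, clicked) tuples
    (hypothetical schema, assumed for illustration).
    """
    clicks = defaultdict(lambda: defaultdict(int))
    for query, url, clicked in click_log:
        # Register every displayed result item; count only actual clicks.
        clicks[query][url] += 1 if clicked else 0

    db = {}
    for query, per_url in clicks.items():
        # Assumed screening policy: drop items with too few historical clicks.
        kept = {url: n for url, n in per_url.items() if n >= min_clicks}
        if kept:
            db[query] = kept
    return db


log = [
    ("q", "a", True),
    ("q", "a", True),
    ("q", "b", False),
    ("r", "c", False),
]
evaluation_db = build_evaluation_db(log)
```

Here the screening policy depends only on the click counts, which is one way of making it "associated with the user operation characteristic data"; other policies would fit the same structure.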
Example three
Fig. 3 is a flowchart of a search quality evaluation method according to a third embodiment of the present invention. On the basis of the foregoing embodiments, this embodiment preferably refines the step of ranking, for each query keyword in the at least one query keyword, the corresponding search result items in the search quality evaluation database based on a baseline search strategy and a search strategy to be evaluated to obtain a first ranking result and a second ranking result as: extracting, for each query keyword in the at least one query keyword, page features of the corresponding search result items; acquiring baseline policy features and policy features to be evaluated from the extracted page features; and ranking the search result items based on the baseline policy features, the policy features to be evaluated, and the corresponding baseline policy model and policy model to be evaluated, to obtain the first ranking result and the second ranking result. As shown in fig. 3, the method specifically includes:
S301, constructing a search quality evaluation database based on the user historical search record data, wherein the search quality evaluation database comprises at least one query keyword, at least one corresponding search result item and user operation characteristic data for the search result item.
S302, aiming at each query keyword in the at least one query keyword, extracting page features corresponding to the search result items, and acquiring baseline policy features and policy features to be evaluated from the extracted page features.
S303, ranking the search result items to obtain the first ranking result and the second ranking result based on the baseline policy characteristic and the policy characteristic to be evaluated and the corresponding baseline policy model and policy model to be evaluated.
S304, evaluating the searching quality of the searching strategy to be evaluated based on the user historical operation characteristic data, the first sorting result and the second sorting result.
In this embodiment, based on the search quality assessment database, for each query keyword in at least one query keyword, a page feature corresponding to a search result item is extracted, a baseline policy feature and a policy feature to be assessed are obtained from the extracted page feature, and based on the baseline policy feature, the policy feature to be assessed, and a corresponding baseline policy model and a policy model to be assessed, the search result items are ranked to obtain the first ranking result and the second ranking result, so that the whole assessment process is fully automated, the labor cost and the degree of manual interference are reduced, and the accuracy of search quality assessment is improved.
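Steps S302 and S303 can be sketched as follows, under stated assumptions. The feature names (`title_match`, `freshness`), the two scoring functions, and the helper `rank_items` are all hypothetical; the text does not say which page features or model types the baseline and to-be-evaluated strategies use.

```python
def rank_items(items, features, feature_names, model):
    """Sort items by descending model score over the selected page features."""
    def score(item):
        vector = [features[item][name] for name in feature_names]
        return model(vector)
    return sorted(items, key=score, reverse=True)


# Hypothetical page features for the result items of one query keyword.
page_features = {
    "a": {"title_match": 0.9, "freshness": 0.1},
    "b": {"title_match": 0.5, "freshness": 0.8},
    "c": {"title_match": 0.2, "freshness": 0.4},
}
items = list(page_features)


def baseline_model(v):
    # Assumed baseline policy model: scores on title_match only.
    return v[0]


def candidate_model(v):
    # Assumed policy model to be evaluated: also rewards freshness.
    return 0.5 * v[0] + 1.0 * v[1]


first_ranking = rank_items(items, page_features, ["title_match"], baseline_model)
second_ranking = rank_items(
    items, page_features, ["title_match", "freshness"], candidate_model
)
```

Each strategy selects its own subset of the extracted page features and applies its own model, so both rankings are produced offline from the same stored result items.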
Example four
Fig. 4 is a flowchart of a search quality evaluation method according to a fourth embodiment of the present invention. On the basis of the foregoing embodiments, this embodiment preferably refines the step of evaluating the search quality of the search strategy to be evaluated based on the user historical operation characteristic data, the first ranking result and the second ranking result as: determining, based on the first ranking result and the second ranking result, the ranking positions at which the two results contain different search result items; forming the different search result items at each such ranking position into a position difference search result pair; and evaluating the search strategy to be evaluated based on the position difference search result pairs and the corresponding user operation characteristic data. Further, evaluating the search strategy to be evaluated based on the position difference search result pairs and the corresponding user operation characteristic data preferably comprises: determining, based on each position difference search result pair and the corresponding user operation characteristic data, the hit search result in the position difference search result pair at each click in a simulated search environment; counting the click results of each search result in each position difference search result pair in the simulated search environment; and evaluating the search strategy to be evaluated based on the counted click results. As shown in fig. 4, the method specifically includes:
S401, constructing a search quality evaluation database based on the user historical search record data, wherein the search quality evaluation database comprises at least one query keyword, at least one corresponding search result item and user operation characteristic data for the search result item.
S402, aiming at each query keyword in the at least one query keyword, respectively sorting corresponding search result items in the search quality evaluation database based on a baseline search strategy and a search strategy to be evaluated to obtain a first sorting result and a second sorting result.
S403, determining, based on the first sorting result and the second sorting result, the sorting positions at which the two results contain different search result items, and forming the different search result items at each such sorting position into a position difference search result pair.
S404, determining the hit search result in each position difference search result pair in the simulated search environment when the user clicks each time based on each position difference search result pair and the corresponding user operation characteristic data.
Specifically, based on the user operation characteristic data, the total numbers of occurrences awin, bwin and sum are counted, in which the user historical click count of one search result item in the position difference search result pair is, respectively, greater than, less than, or equal to the user historical click count of the other search result item; the probability of clicking the search result from the first sorting result and the probability of clicking the search result from the second sorting result are calculated based on awin, bwin and sum; and, based on these two probabilities, a random number generation method is used to determine the hit search result in the position difference search result pair at each click in the simulated search environment.
S405, counting click results of each search result in each position difference search result pair in the simulated search environment.
S406, evaluating the search strategy to be evaluated based on the counted click results.
Specifically, the following may be counted: the total number of times G that the search result item in the first sorting result is clicked more often than the search result item in the second sorting result; the total number of times B that the search result item in the first sorting result is clicked less often than the search result item in the second sorting result; and the total number of times S that the search result item in the first sorting result is clicked as often as the search result item in the second sorting result. An evaluation coefficient of the search strategy to be evaluated is calculated based on the counted G, B and S, and the search strategy to be evaluated is evaluated according to the evaluation coefficient.
In the embodiment, the click results of each search result in each position difference search result pair in the simulated search environment are counted by simulating the search environment. And evaluating the search strategy to be evaluated based on the counted click results. The whole evaluation process is fully automated, so that the labor cost and the manual interference degree are reduced, and the accuracy of the search quality evaluation is improved.
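The hit determination of step S404 can be sketched as below. This is only an illustration: the text does not give the formula that turns awin, bwin and sum into click probabilities, so the sketch assumes that ties are split evenly between the two items. The input format (a list of per-occurrence click-count pairs) and all function names are likewise hypothetical.

```python
import random


def count_outcomes(observations):
    """observations: list of (clicks_a, clicks_b), one per logged occurrence."""
    awin = sum(1 for a, b in observations if a > b)
    bwin = sum(1 for a, b in observations if a < b)
    ties = sum(1 for a, b in observations if a == b)  # named `sum` in the text
    return awin, bwin, ties


def click_probabilities(awin, bwin, ties):
    """Assumed formula: each item gets its wins plus half of the ties."""
    total = awin + bwin + ties
    if total == 0:
        return 0.5, 0.5  # no evidence: treat both items as equally likely
    p_a = (awin + 0.5 * ties) / total
    return p_a, 1.0 - p_a


def simulate_hit(p_a, rng):
    """Draw one simulated click: 'a' with probability p_a, otherwise 'b'."""
    return "a" if rng.random() < p_a else "b"


awin, bwin, ties = count_outcomes([(3, 1), (0, 2), (1, 1), (5, 0)])
p_a, p_b = click_probabilities(awin, bwin, ties)
hit = simulate_hit(p_a, random.Random(0))
```

Passing an explicit `random.Random` instance keeps a simulation run reproducible, which is convenient when comparing several candidate strategies against the same baseline.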
Example five
Fig. 5 is a schematic structural diagram of a search quality evaluation apparatus according to a fifth embodiment of the present invention, where the apparatus may be implemented in a software or hardware manner, and the apparatus may be integrated in a fixed terminal or a server, and as shown in fig. 5, the specific structure of the apparatus is as follows: an evaluation database construction module 51, a sorting module 52, and an evaluation module 53;
the evaluation database construction module 51 is configured to construct a search quality evaluation database based on the user history search record data, where the search quality evaluation database includes at least one query keyword, at least one corresponding search result item, and user operation feature data for the search result item;
the ranking module 52 is configured to, for each query keyword of the at least one query keyword, rank, based on a baseline search policy and a search policy to be evaluated, corresponding search result items in the search quality evaluation database respectively to obtain a first ranking result and a second ranking result;
the evaluation module 53 is configured to evaluate the search quality of the search policy to be evaluated based on the user historical operation feature data, the first sorting result, and the second sorting result.
The search quality evaluation device described in this embodiment is used to execute the search quality evaluation method described in each of the above embodiments, and the technical principle and the generated technical effect are similar, which are not described herein again.
On the basis of the above embodiment, the evaluation database construction module 51 is specifically configured to extract the at least one query keyword from the user history search record data; extracting user operation characteristic data of a search result item corresponding to the extracted query keyword from the historical search record data of the user; and taking the at least one query keyword, the corresponding search result item and the user operation characteristic data of the search result item as the search quality evaluation data.
On the basis of the above embodiment, the apparatus further includes: a screening module 54;
the screening module 54 is configured to, after the evaluation database construction module 51 extracts the user operation feature data of the search result item corresponding to the extracted query keyword from the user history search record data, perform screening processing on the search result item based on a predetermined search result item screening policy, where the search result item screening policy is associated with the user operation feature data.
On the basis of the above embodiment, the user history search record data includes a user click log, and the user operation feature data includes user history click data.
On the basis of the foregoing embodiment, the sorting module 52 is specifically configured to, for each query keyword in the at least one query keyword, extract a page feature of a corresponding search result item; acquiring a baseline policy feature and a policy feature to be evaluated from the extracted page feature; ranking the search result items to obtain the first ranking result and the second ranking result based on the baseline policy features and the policy features to be evaluated and the corresponding baseline policy models and policy models to be evaluated.
On the basis of the above embodiment, the evaluation module 53 includes: a position determination unit 531, a pairing unit 532, and an evaluation unit 533;
the position determination unit 531 is configured to determine, based on the first ranking result and the second ranking result, a ranking position where the same ranking position corresponds to a situation of different search result items;
the pairing unit 532 is configured to form the different search result items at the sorting positions into position difference search result pairs;
the evaluating unit 533 is configured to evaluate the search policy to be evaluated based on the pair of location difference search results and the corresponding user operation feature data.
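The work of the position determination unit 531 and the pairing unit 532 can be sketched as a single pass over the two rankings. The function name and the `(position, first_item, second_item)` tuple layout are assumptions made for illustration.

```python
def position_difference_pairs(first_ranking, second_ranking):
    """Return (position, first_item, second_item) wherever the rankings differ.

    Positions are 1-based; only positions present in both rankings are compared.
    """
    pairs = []
    for position, (a, b) in enumerate(zip(first_ranking, second_ranking), start=1):
        if a != b:
            pairs.append((position, a, b))
    return pairs


pairs = position_difference_pairs(["a", "b", "c"], ["b", "a", "c"])
```

Positions where both strategies place the same item carry no information about which strategy is better, so they are skipped.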
On the basis of the above embodiment, the evaluation unit 533 includes: a hit determination subunit 5331, a statistics subunit 5332, and an evaluation subunit 5333;
the hit determining subunit 5331 is configured to determine, based on each position difference search result pair and corresponding user operation feature data, a hit search result in the difference search result pair at each click in a simulated search environment;
the statistics subunit 5332 is configured to count click results of each search result in each position difference search result pair in the simulated search environment;
the evaluation subunit 5333 is configured to evaluate the search policy to be evaluated based on the counted click result.
On the basis of the foregoing embodiment, the statistics subunit 5332 is specifically configured to count the total number of times G that the search result item in the first ranking result is clicked more often than the search result item in the second ranking result; count the total number of times B that the search result item in the first ranking result is clicked less often than the search result item in the second ranking result; and count the total number of times S that the search result item in the first ranking result is clicked as often as the search result item in the second ranking result.
On the basis of the foregoing embodiment, the statistics subunit 5332 is specifically configured to calculate an evaluation coefficient of the search policy to be evaluated based on the counted times G, B and S.
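The counting of G, B and S and the evaluation coefficient can be illustrated as follows. The passage above does not state the formula for the coefficient, so the normalized difference `(G - B) / (G + B + S)` used here is purely an assumption, as are the function names and the input format of per-pair simulated click counts.

```python
def tally(pair_clicks):
    """pair_clicks: list of (clicks_first, clicks_second), one per pair."""
    g = sum(1 for first, second in pair_clicks if first > second)
    b = sum(1 for first, second in pair_clicks if first < second)
    s = sum(1 for first, second in pair_clicks if first == second)
    return g, b, s


def evaluation_coefficient(g, b, s):
    """Assumed coefficient: normalized difference in [-1, 1]."""
    total = g + b + s
    if total == 0:
        return 0.0  # no position difference pairs: the rankings are identical
    return (g - b) / total


g, b, s = tally([(5, 2), (1, 4), (3, 3), (7, 0)])
coefficient = evaluation_coefficient(g, b, s)
```

Under this assumed formula a positive value means that items from the first ranking result were clicked more often in the simulation; how that maps to "better" or "worse" depends on which strategy produced the first ranking.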
On the basis of the foregoing embodiment, the hit determining subunit 5331 is specifically configured to separately count total times awin, bwin, and sum of occurrences in which the user history click number of one search result item in the location difference search result pair is greater than, less than, and equal to the user history click number of another search result item, based on the user operation feature data; calculating the probability of clicking on the search result in the first sequencing result and the probability of clicking on the search result in the second sequencing result based on the awin, the bwin and the sum; and determining the hit search result in the position difference search result pair at each click in the simulated search environment by utilizing a random number generation method based on the probability of clicking the search result in the first sequencing result and the probability of clicking the search result in the second sequencing result.
On the basis of the above embodiment, the apparatus further includes: a click determination module 55;
the click determining module 55 is configured to determine whether a click action occurs to the position difference search result pair by using a random number generation method based on a preset position click probability before the hit determining subunit 5331 determines the hit search result in the position difference search result pair at each click in the simulated search environment based on each position difference search result pair and the corresponding user operation characteristic data, and determine the hit search result in the position difference search result pair at each click in the simulated search environment based on each position difference search result pair and the corresponding user operation characteristic data when the click action occurs.
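The two-stage simulation described for the click determination module 55 and the hit determining subunit 5331 can be sketched as below. The per-position click probabilities, the per-pair probability `p_first`, the number of trials, and the function name are hypothetical inputs chosen for illustration.

```python
import random


def simulate_pair_clicks(pairs, position_click_prob, p_first, trials, seed=0):
    """Simulate clicks on position difference pairs.

    pairs: list of (position, first_item, second_item).
    position_click_prob: {position: probability that the position is clicked}.
    p_first: {(first_item, second_item): probability the first item is hit}.
    Returns {(first_item, second_item): (clicks_first, clicks_second)}.
    """
    rng = random.Random(seed)
    counts = {}
    for position, first_item, second_item in pairs:
        clicks_first = clicks_second = 0
        for _ in range(trials):
            # Stage 1: does any click occur at this ranking position?
            if rng.random() >= position_click_prob.get(position, 0.0):
                continue
            # Stage 2: which item of the pair receives the click?
            if rng.random() < p_first[(first_item, second_item)]:
                clicks_first += 1
            else:
                clicks_second += 1
        counts[(first_item, second_item)] = (clicks_first, clicks_second)
    return counts


result = simulate_pair_clicks(
    pairs=[(1, "a", "b"), (2, "b", "a")],
    position_click_prob={1: 1.0, 2: 0.0},
    p_first={("a", "b"): 1.0, ("b", "a"): 0.5},
    trials=10,
)
```

The first stage models position bias (lower positions are clicked less often), and the hit determination runs only when a click actually occurs, which matches the order of operations described above.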
The search quality evaluation device according to each of the above embodiments is used to execute the search quality evaluation method according to each of the above embodiments, and the technical principle and the generated technical effect are similar, and are not described herein again.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (20)

1. A search quality evaluation method, comprising:
constructing a search quality assessment database based on user historical search record data, wherein the search quality assessment database comprises at least one query keyword, at least one corresponding search result item and user operation characteristic data aiming at the search result item;
for each query keyword in the at least one query keyword, respectively sorting corresponding search result items in the search quality evaluation database based on a baseline search strategy and a search strategy to be evaluated to obtain a first sorting result and a second sorting result;
evaluating the search quality of the search strategy to be evaluated based on the user historical operation characteristic data, the first sorting result and the second sorting result;
wherein evaluating the search quality of the search strategy to be evaluated based on the user historical operation feature data, the first ranking result and the second ranking result comprises:
determining the sorting positions of the situations of the same sorting position corresponding to different search result items based on the first sorting result and the second sorting result;
forming different search result items on the sorting positions into position difference search result pairs;
and evaluating the search strategy to be evaluated based on the position difference search result pair and the corresponding user operation characteristic data.
2. The method of claim 1, wherein constructing a search quality assessment database based on user historical search record data comprises:
extracting the at least one query keyword from user historical search record data;
extracting user operation characteristic data of a search result item corresponding to the extracted query keyword from the historical search record data of the user;
and taking the at least one query keyword, the corresponding search result item and the user operation characteristic data of the search result item as the search quality evaluation data.
3. The method of claim 2, wherein after extracting user operation feature data of the search result item corresponding to the extracted query keyword from the user history search log data, the method further comprises:
and screening the search result items based on a preset search result item screening strategy, wherein the search result item screening strategy is associated with the user operation characteristic data.
4. The method of any of claims 1-3, wherein the user historical search record data comprises a user click log and the user operational characteristic data comprises user historical click data.
5. The method of claim 1, wherein for each query keyword of the at least one query keyword, ranking the corresponding search result items in the search quality assessment database based on a baseline search policy and a search policy to be assessed, respectively, to obtain a first ranking result and a second ranking result comprises:
extracting page features corresponding to the search result items aiming at each query keyword in the at least one query keyword;
acquiring a baseline policy feature and a policy feature to be evaluated from the extracted page feature;
ranking the search result items to obtain the first ranking result and the second ranking result based on the baseline policy features and the policy features to be evaluated and the corresponding baseline policy models and policy models to be evaluated.
6. The method of claim 1, wherein evaluating the search strategy to be evaluated based on the position difference search result pair and the corresponding user operation characteristic data comprises:
determining hit search results in the difference search result pairs at each click in a simulated search environment based on each position difference search result pair and corresponding user operation characteristic data;
counting click results of each search result in each position difference search result pair in the simulated search environment;
and evaluating the search strategy to be evaluated based on the counted click results.
7. The method of claim 6, wherein counting click results of each search result in each position difference search result pair in the simulated search environment comprises:
counting the total times G that the times of clicking the search result items in the first sequencing result are more than the times of clicking the search result items in the second sequencing result;
counting the total times B that the times of clicking the search result items in the first sequencing result are less than the times of clicking the search result items in the second sequencing result;
counting the total times S that the times of clicking the search result items in the first sequencing result are equal to the times of clicking the search result items in the second sequencing result.
8. The method of claim 7, wherein evaluating the search strategy to be evaluated based on the counted click results comprises:
and calculating an evaluation coefficient of the search strategy to be evaluated based on the counted times G, B and S.
9. The method of claim 6, wherein determining the hit search result in the differential search result pair at each click in the simulated search environment based on each location differential search result pair and corresponding user operational characteristic data comprises:
respectively counting the total times awin, bwin and sum of the occurrences of the situations that the user historical click times of one search result item in the position difference search result pair are more than, less than and equal to the user historical click times of the other search result item on the basis of the user operation characteristic data;
calculating the probability of clicking on the search result in the first sequencing result and the probability of clicking on the search result in the second sequencing result based on the awin, the bwin and the sum;
and determining the hit search result in the position difference search result pair at each click in the simulated search environment by utilizing a random number generation method based on the probability of clicking the search result in the first sequencing result and the probability of clicking the search result in the second sequencing result.
10. The method of claim 6, wherein prior to determining the hit search result in the pair of location differentiated search results per click in the simulated search environment based on each pair of location differentiated search results and corresponding user-manipulated feature data, the method further comprises:
and determining whether the position difference search result pair has click behavior or not by using a random number generation method based on the preset position click probability, and determining the hit search result in the position difference search result pair at each click in the simulated search environment based on each position difference search result pair and the corresponding user operation characteristic data when the click behavior occurs.
11. A search quality evaluation apparatus, characterized by comprising:
the evaluation database construction module is used for constructing a search quality evaluation database based on user historical search record data, wherein the search quality evaluation database comprises at least one query keyword, at least one corresponding search result item and user operation characteristic data aiming at the search result item;
the ranking module is used for ranking corresponding search result items in the search quality assessment database respectively to obtain a first ranking result and a second ranking result aiming at each query keyword in the at least one query keyword based on a baseline search strategy and a search strategy to be assessed;
the evaluation module is used for evaluating the search quality of the search strategy to be evaluated based on the user historical operation characteristic data, the first sorting result and the second sorting result;
wherein the evaluation module comprises:
the position determining unit is used for determining the sorting positions of the situations that the same sorting position corresponds to different search result items on the basis of the first sorting result and the second sorting result;
the matching unit is used for forming different search result items on the sorting positions into position difference search result pairs;
and the evaluation unit is used for evaluating the search strategy to be evaluated based on the position difference search result pair and the corresponding user operation characteristic data.
12. The apparatus of claim 11, wherein the evaluation database construction module is specifically configured to extract the at least one query keyword from user historical search record data; extracting user operation characteristic data of a search result item corresponding to the extracted query keyword from the historical search record data of the user; and taking the at least one query keyword, the corresponding search result item and the user operation characteristic data of the search result item as the search quality evaluation data.
13. The apparatus of claim 12, further comprising:
and the screening module is used for screening the search result items based on a preset search result item screening strategy after the evaluation database construction module extracts the user operation characteristic data of the search result items corresponding to the extracted query key words from the user historical search record data, wherein the search result item screening strategy is associated with the user operation characteristic data.
14. The apparatus of any of claims 11-13, wherein the user historical search record data comprises a user click log and the user operational characteristic data comprises user historical click data.
15. The apparatus according to claim 11, wherein the ranking module is specifically configured to, for each query keyword of the at least one query keyword, extract page features of a corresponding search result item; acquiring a baseline policy feature and a policy feature to be evaluated from the extracted page feature; ranking the search result items to obtain the first ranking result and the second ranking result based on the baseline policy features and the policy features to be evaluated and the corresponding baseline policy models and policy models to be evaluated.
16. The apparatus of claim 11, wherein the evaluation unit comprises:
a hit determining subunit, configured to determine, based on each position difference search result pair and corresponding user operation feature data, a hit search result in the difference search result pair at each click in a simulated search environment;
the statistical subunit is used for counting the click results of each search result in each position difference search result pair in the simulated search environment;
and the evaluation subunit is used for evaluating the search strategy to be evaluated based on the counted click results.
17. The apparatus according to claim 16, wherein the statistics subunit is specifically configured to count the total number of times G that the search result item in the first ranking result is clicked more often than the search result item in the second ranking result; count the total number of times B that the search result item in the first ranking result is clicked less often than the search result item in the second ranking result; and count the total number of times S that the search result item in the first ranking result is clicked as often as the search result item in the second ranking result.
18. The apparatus according to claim 17, wherein the statistics subunit is specifically configured to calculate an evaluation coefficient of the search policy to be evaluated based on the counted times G, B and S.
19. The apparatus according to claim 16, wherein the hit determining subunit is configured to count, based on the user operation feature data, total times awin, bwin, and sum of occurrences in which the number of user historical clicks of one search result item in the location difference search result pair is greater than, less than, and equal to the number of user historical clicks of another search result item, respectively; calculating the probability of clicking on the search result in the first sequencing result and the probability of clicking on the search result in the second sequencing result based on the awin, the bwin and the sum; and determining the hit search result in the position difference search result pair at each click in the simulated search environment by utilizing a random number generation method based on the probability of clicking the search result in the first sequencing result and the probability of clicking the search result in the second sequencing result.
20. The apparatus of claim 16, further comprising:
a click determining module, configured to determine, by a random number generation method based on a preset position click probability, whether a click behavior occurs on the position difference search result pair, before the hit determining subunit determines the hit search result in the position difference search result pair at each click in the simulated search environment based on each position difference search result pair and the corresponding user operation characteristic data; wherein that determination of the hit search result is performed when the click behavior occurs.
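
The simulation and counting steps recited in claims 17 to 20 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation. The claims do not give the formulas, so two are assumed here: the click probability splits ties evenly, and the evaluation coefficient is taken as (G - B) / (G + B + S). The names `PairStats`, `click_probabilities`, `simulate_pair`, `count_gbs`, and `evaluation_coefficient` are hypothetical, and the field `tie` stands in for the quantity the claims call `sum`.

```python
import random
from dataclasses import dataclass


@dataclass
class PairStats:
    """Historical click comparison for one position difference search result pair.

    awin / bwin / tie mirror the claims' awin / bwin / sum: how often the item
    from the first sequencing result had more, fewer, or equally many
    historical clicks than the item from the second sequencing result.
    """
    awin: int
    bwin: int
    tie: int


def click_probabilities(stats):
    """Return (p_first, p_second). ASSUMPTION: ties are split evenly."""
    total = stats.awin + stats.bwin + stats.tie
    if total == 0:
        return 0.5, 0.5
    p_first = (stats.awin + 0.5 * stats.tie) / total
    return p_first, 1.0 - p_first


def simulate_pair(stats, position_click_prob, n_impressions, rng):
    """Simulate impressions of one pair; return (clicks_first, clicks_second)."""
    p_first, _ = click_probabilities(stats)
    clicks_first = clicks_second = 0
    for _ in range(n_impressions):
        # Claim 20: a random draw against the preset position click
        # probability decides whether a click happens at all.
        if rng.random() >= position_click_prob:
            continue
        # Claim 19: a second random draw picks which item is hit.
        if rng.random() < p_first:
            clicks_first += 1
        else:
            clicks_second += 1
    return clicks_first, clicks_second


def count_gbs(click_results):
    """Claim 17: count pairs where the first result got more (G), fewer (B), or equal (S) clicks."""
    g = b = s = 0
    for first, second in click_results:
        if first > second:
            g += 1
        elif first < second:
            b += 1
        else:
            s += 1
    return g, b, s


def evaluation_coefficient(g, b, s):
    """Claim 18. ASSUMPTION: (G - B) / (G + B + S); the claims do not fix the formula."""
    total = g + b + s
    return 0.0 if total == 0 else (g - b) / total


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    pairs = [PairStats(8, 2, 0), PairStats(1, 6, 3), PairStats(4, 4, 2)]  # made-up data
    results = [simulate_pair(p, 0.6, 100, rng) for p in pairs]
    g, b, s = count_gbs(results)
    print(g, b, s, evaluation_coefficient(g, b, s))
```

Under this assumed formula, a positive coefficient would suggest that the search strategy producing the first sequencing result drew more simulated clicks than the one producing the second, and a negative coefficient the opposite.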

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610645103.2A CN107704467B (en) 2016-08-09 2016-08-09 Search quality evaluation method and device
PCT/CN2016/108422 WO2018028099A1 (en) 2016-08-09 2016-12-02 Method and device for search quality assessment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610645103.2A CN107704467B (en) 2016-08-09 2016-08-09 Search quality evaluation method and device

Publications (2)

Publication Number Publication Date
CN107704467A (en) 2018-02-16
CN107704467B (en) 2021-08-24

Family

ID=61161645

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
CN201610645103.2A Active CN107704467B (en) 2016-08-09 2016-08-09 Search quality evaluation method and device

Country Status (2)

Country Link
CN (1) CN107704467B (en)
WO (1) WO2018028099A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108897685B (en) * 2018-06-28 2022-02-25 Baidu Online Network Technology (Beijing) Co., Ltd. Method, device, server and medium for evaluating quality of search result
CN109543940B (en) * 2018-10-12 2024-04-09 Ping An Life Insurance Company of China, Ltd. Activity evaluation method, activity evaluation device, electronic equipment and storage medium
CN109582751B (en) * 2018-11-29 2021-01-01 Baidu Online Network Technology (Beijing) Co., Ltd. Retrieval effect measuring method and server
CN110020209B (en) * 2019-04-18 2022-03-22 Beijing QIYI Century Science & Technology Co., Ltd. Method and system for determining correlation between content and search word and method and system for displaying correlation
CN112115344B (en) * 2019-06-20 2024-07-09 Baidu (China) Co., Ltd. Automatic evaluation method, device and system for search results and storage medium
CN110727865A (en) * 2019-10-09 2020-01-24 Beijing Baidu Netcom Science and Technology Co., Ltd. Problem positioning method and device of retrieval strategy, electronic equipment and storage medium
CN112784141B (en) * 2019-10-23 2023-10-31 Tencent Technology (Shenzhen) Co., Ltd. Search result quality determination method, apparatus, storage medium and computer device
CN111367778B (en) * 2020-03-13 2023-07-07 Baidu Online Network Technology (Beijing) Co., Ltd. Data analysis method and device for evaluating search strategy
CN112115340A (en) * 2020-09-14 2020-12-22 Shenzhen Huantai Technology Co., Ltd. Search strategy selection method, mobile terminal and readable storage medium
CN113781146A (en) * 2020-11-17 2021-12-10 Beijing Wodong Tianjun Information Technology Co., Ltd. Recommendation method, device, equipment and storage medium of product information
CN113626715A (en) * 2021-08-26 2021-11-09 Beijing Zitiao Network Technology Co., Ltd. Query result display method, device, medium and electronic equipment
CN113849417A (en) * 2021-11-08 2021-12-28 Hangzhou NetEase Cloud Music Technology Co., Ltd. Test method, medium, device and computing equipment

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020143759A1 (en) * 2001-03-27 2002-10-03 Yu Allen Kai-Lang Computer searches with results prioritized using histories restricted by query context and user community
WO2007130716A2 (en) * 2006-01-31 2007-11-15 Intellext, Inc. Methods and apparatus for computerized searching
CN100440224C (en) * 2006-12-01 2008-12-03 Tsinghua University Automatization processing method of rating of merit of search engine
CN102760138B (en) * 2011-04-26 2015-03-11 Beijing Baidu Netcom Science and Technology Co., Ltd. Classification method and device for user network behaviors and search method and device for user network behaviors
US20140059062A1 (en) * 2012-08-24 2014-02-27 Google Inc. Incremental updating of query-to-resource mapping
CN103593411A (en) * 2013-10-23 2014-02-19 Jiangsu University Method for testing combination properties of evaluation indexes of search engines and testing device
CN103544307B (en) * 2013-11-04 2017-08-08 Beijing Zhongsou Yunshang Network Technology Co., Ltd. A kind of multiple search engine automation contrast evaluating method independent of document library
CN103870607A (en) * 2014-04-08 2014-06-18 Beijing Qihoo Technology Co., Ltd. Sequencing method and device of search results of multiple search engines

Also Published As

Publication number Publication date
WO2018028099A1 (en) 2018-02-15
CN107704467A (en) 2018-02-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant