WO2020135059A1 - Search engine evaluation method, apparatus, device, and readable storage medium - Google Patents

Info

Publication number
WO2020135059A1
Authority
WO
WIPO (PCT)
Prior art keywords
value
search
test
recommendation
search engine
Prior art date
Application number
PCT/CN2019/124637
Other languages
English (en)
French (fr)
Inventor
徐永泽
赖长明
Original Assignee
深圳Tcl新技术有限公司
Priority date
Filing date
Publication date
Application filed by 深圳Tcl新技术有限公司
Publication of WO2020135059A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Definitions

  • The present application relates to the technical field of search engines, and in particular to a search engine evaluation method, apparatus, device, and computer-readable storage medium.
  • Search engines have become an essential part of people's lives.
  • In addition to the web search engines built by large Internet companies such as Google and Baidu, thematic search engines aimed at specific industries have received more and more attention. As a result, a large number of enterprises now need search engines built specifically for their own businesses.
  • The main purpose of the present application is to provide a search engine evaluation method, apparatus, device, and computer storage medium, aiming to solve the technical problem that quantitative testing of search engines in the prior art has low accuracy.
  • the present application provides a search engine evaluation method, which includes the following steps:
  • the present application also provides a search engine evaluation device, the search engine evaluation device includes:
  • An obtaining module configured to obtain a test question in the search engine to be tested, and obtain a search result list based on the test question;
  • a generating module used to obtain the relevant result set corresponding to the test question and the historical data of the preset search account, and input the relevant result set and the historical data into a recommendation algorithm, and generate a recommendation order table based on the recommendation algorithm;
  • a comparison module for comparing the search result list and the recommendation order table to obtain test values
  • a target value module used to obtain a preset number of preset search accounts, obtain the average of the test values based on the preset number, and use the average as the evaluation value of the search engine to be tested.
  • the present application also provides a mobile terminal
  • the mobile terminal includes: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein:
  • this application also provides a computer storage medium
  • a computer program is stored on the computer storage medium, and when the computer program is executed by the processor, the steps of the search engine evaluation method described above are implemented.
  • This application obtains a test question in the search engine to be tested and obtains a search result list based on the test question; obtains the relevant result set corresponding to the test question and the historical data of a preset search account, inputs the relevant result set and the historical data into a recommendation algorithm, and generates a recommendation order table based on the recommendation algorithm; performs a comparison test on the search result list and the recommendation order table to obtain test values; and obtains a preset number of preset search accounts, obtains the average of the test values based on the preset number, and uses the average as the evaluation value of the search engine to be tested.
  • This solution simulates real user behavior by inputting the historical data of the preset search account and the relevant results corresponding to the test question into the recommendation algorithm, instead of relying on expert evaluation, which saves cost and improves confidence in the test. In addition, multiple preset search accounts are used for testing, and the average of the values corresponding to these accounts is used as the target value. This avoids the situation, which arises when industry experts are hired for manual testing, in which the accuracy of the test depends too heavily on the testers' professional competence and work quality. The solution therefore improves the accuracy of the search engine evaluation, makes the evaluation more objective, and solves the technical problem that quantitative testing of search engines in the prior art has low accuracy.
  • FIG. 1 is a schematic diagram of the terminal device structure of a hardware operating environment involved in an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a first embodiment of the search engine evaluation method of the present application.
  • FIG. 3 is a schematic flowchart of a second embodiment of the search engine evaluation method of the present application.
  • FIG. 4 is a schematic diagram of the functional modules of the search engine evaluation device of the present application.
  • FIG. 5 is a test flowchart of the search engine evaluation method of the present application.
  • FIG. 1 is a schematic diagram of a terminal structure of a hardware operating environment involved in a solution of an embodiment of the present application.
  • the terminal in the embodiment of the present application is a search engine evaluation device.
  • the terminal may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002.
  • the communication bus 1002 is used to implement connection communication between these components.
  • The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may further include a standard wired interface and a wireless interface.
  • the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the memory 1005 may be a high-speed RAM memory or a stable memory (non-volatile memory), such as disk storage.
  • the memory 1005 may optionally be a storage device independent of the foregoing processor 1001.
  • the terminal may further include a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, a WiFi module, and so on.
  • The sensors include, for example, light sensors, motion sensors, and other sensors.
  • The light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor may adjust the brightness of the display screen according to the brightness of the ambient light, and the proximity sensor may turn off the display screen and/or the backlight when the terminal device is moved to the ear.
  • The terminal device can also be configured with other sensors such as gyroscopes, barometers, hygrometers, thermometers, and infrared sensors, which will not be described further here.
  • The terminal structure shown in FIG. 1 does not constitute a limitation on the terminal, which may include more or fewer components than those illustrated, combine certain components, or arrange the components differently.
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and computer-readable instructions.
  • the network interface 1004 is mainly used to connect to the background server and perform data communication with the background server;
  • the user interface 1003 is mainly used to connect to the client (user side) and perform data communication with the client; and the processor 1001 can be used to call computer-readable instructions stored in the memory 1005 and perform the following operations:
  • the present application provides a search engine evaluation method.
  • the search engine evaluation method includes the following steps:
  • Step S10 Obtain a test question in the search engine to be tested, and obtain a search result list based on the test question;
  • A search engine is a system that uses specific computer programs to collect information from the Internet according to certain strategies, organizes and processes that information, provides retrieval services to users, and presents the information relevant to a user's query to that user. Examples include "Baidu" and "Google".
  • The test question in the search engine is obtained first. The test question is entered at random by the tester, so its content may come from any technical field. The search engine then extracts the keywords from the test question, retrieves search results according to those keywords, and sorts the search results to obtain the search result list.
  • One way to sort the search results is to first determine how many times each keyword has been searched. For example, suppose the received keywords are "mobile phone" and "pencil". If the keyword "mobile phone" has not been received before, that is, no user has searched for "mobile phone", the keyword is recorded. If the keyword "pencil" has been received before, that is, users have already searched for "pencil", its search count is accumulated: if "pencil" had been searched 6 times before, 1 is added and the search count for "pencil" is recorded as 7. A keyword with a lower search count indicates that fewer users follow it, and a keyword with a higher search count indicates that more users follow it.
  • The search results corresponding to the keywords can then be sorted according to the ranking of the keyword search counts: search results for keywords with higher counts are placed toward the front of the search result list, and search results for keywords with lower counts are placed toward the back.
  • The search result list is the ordered set of search results that the search engine to be tested returns for the test question.
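The count-based ordering described above can be sketched as follows. This is a minimal illustration, not the method of any particular search engine: the function names, the `results_by_keyword` mapping, and the sample result titles are all hypothetical.

```python
from collections import Counter

def record_search(counts: Counter, keyword: str) -> None:
    """Add one to the stored search count (a new keyword starts at 1)."""
    counts[keyword] += 1

def rank_by_search_count(results_by_keyword, counts):
    """Place results for more frequently searched keywords first."""
    ordered_keywords = sorted(
        results_by_keyword, key=lambda keyword: counts[keyword], reverse=True
    )
    ranked = []
    for keyword in ordered_keywords:
        ranked.extend(results_by_keyword[keyword])
    return ranked

# "pencil" was searched 6 times before; "mobile phone" is new.
counts = Counter({"pencil": 6})
record_search(counts, "mobile phone")  # recorded for the first time
record_search(counts, "pencil")        # 6 becomes 7

# Hypothetical search results per keyword.
search_result_list = rank_by_search_count(
    {"mobile phone": ["phone review"], "pencil": ["pencil shop", "pencil history"]},
    counts,
)
```

Here the results for "pencil" come before those for "mobile phone" because "pencil" has the higher search count.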
  • Step S20 Obtain the relevant result set corresponding to the test question and the historical data of the preset search account, and input the relevant result set and the historical data into a recommendation algorithm, and generate a recommendation order table based on the recommendation algorithm;
  • The recommendation algorithm is the core algorithm of a recommendation system. Recommendation systems fall into two main categories: schemes based on collaborative filtering and schemes based on product content.
  • A recommendation system based on collaborative filtering takes the user's historical behavior (clicks, purchases, ratings, and so on) as the input of the model and calculates the similarity between different users to produce its output.
  • A recommendation system based on product content takes the attribute information of products as input and tries to recommend products whose attributes are similar to those of the user's favorite products. Whatever the type of recommendation system, the output data takes the same form: a prediction of how a specified user will rate a specified product. This prediction is usually a numerical value whose magnitude reflects how much the recommendation algorithm believes the user likes the product.
  • The relevant result set consists of all results related to the test question. For example, when the test question is a search for war films, the relevant results are all war films. Note that the relevant result set may include the search result list but is not limited to it: the relevant results can also be gathered by means other than the search engine (such as offline collection or acquisition from authoritative websites). The results in the relevant result set are also unsorted.
  • The preset search account is a search account set in advance. For example, when an offline test is conducted on a search engine in the film and television field, if a user has watched a movie on the film and television platform corresponding to the search engine, that user's account can be used as a preset search account. After the preset search account has been obtained, its historical data must also be obtained. The relevant result set and the historical data are then input into the recommendation algorithm, which outputs, for every result, a predicted score of how attractive that result is to the input user (that is, the preset search account). The predicted scores of only the relevant results corresponding to the test question are selected, and a recommendation order table of all relevant results is produced according to the magnitude of the scores.
  • The historical data of the preset search account may be records of the user's behavior toward the target results that were collected without the user using the search engine.
  • For example, a film search engine returns qualifying results (movies) for the user's search question, but in fact the user has already watched some movies, or even rated them, without using the search engine.
  • Similarly, when a search engine in a social network is tested, the search engine returns qualifying results (other users in the network) for the user's search question, but in fact the user has already established particular connections with some other users without using the search engine.
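The selection and sorting of predicted scores in step S20 can be sketched as below. The scores are made-up numbers standing in for the output of a recommendation algorithm, and the function name `build_recommendation_order` is hypothetical.

```python
def build_recommendation_order(predicted_scores, relevant_results):
    """Keep only the scores of relevant results, then sort them highest first.

    Relevant results that the algorithm did not score are left out here;
    that handling is an assumption of this sketch.
    """
    relevant_scores = {
        item: score
        for item, score in predicted_scores.items()
        if item in relevant_results
    }
    return sorted(relevant_scores, key=relevant_scores.get, reverse=True)

# Hypothetical predicted scores for one preset search account.
predicted_scores = {
    "war film A": 4.5,
    "comedy B": 4.9,
    "war film C": 3.1,
    "war film D": 4.8,
}
relevant_results = {"war film A", "war film C", "war film D"}

recommendation_order = build_recommendation_order(predicted_scores, relevant_results)
```

"comedy B" has the highest predicted score but is not relevant to the test question, so it does not appear in the recommendation order table.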
  • Step S30 Perform a comparison test on the search result list and the recommendation order table to obtain test values;
  • The test value can include the relevant quality evaluation value and the order quality evaluation value.
  • The relevant quality evaluation mainly considers how relevant the results given by the search engine are to the user's question, that is, how many of the results given by the search engine are related to the question, and what share of all relevant results they represent. Accuracy and recall are generally used to quantify this.
  • Accuracy (precision, in information-retrieval terms) is a measure of the search system's ability to exclude irrelevant documents.
  • Recall is a measure of the ability of a query to retrieve all relevant documents.
  • The order quality evaluation mainly considers the ranking quality of the results given by the search engine. When users are faced with a list of search results, they hardly ever look at the entire list; they usually care only about the results in the top positions or in certain special positions. The ranking order of the results given by the search engine is therefore also an important part of the comparison test.
  • Step S40 Obtain a preset number of preset search accounts, obtain the average of the test values based on the preset number, and use the average as the evaluation value of the search engine to be tested.
  • To avoid inaccurate results caused by special cases, a search engine is usually tested not with the account information of a single user but with multiple accounts, which improves the accuracy of the test. In other words, each time the historical data of a different preset search account is input into the recommendation algorithm, a different test value is produced, so the number of test values equals the number of preset search accounts. Once enough test values have been obtained, their average is calculated and used as the final evaluation value of the search engine to be tested, which serves as the evaluation result of the search engine.
  • The preset number may be any number of preset search accounts set in advance by the tester.
  • For example, as shown in FIG. 5, the designed test question is first input into the target search engine to be tested, and the target search engine outputs the search result list corresponding to the test question. Next, all relevant results corresponding to the test question and the historical data of a specific user are obtained and input into the recommendation algorithm. The recommendation algorithm outputs a predicted score of how attractive each result is to the input user; the predicted scores of only the relevant results corresponding to the test question are selected, and a recommendation order table of all relevant results is produced according to the magnitude of the scores. Finally, the search result list is compared with the recommendation order table of the relevant results to obtain the test result. This process is repeated until enough user data has been input into the recommendation algorithm, and the average of the test values is taken. The whole procedure continues until all test questions have been tested.
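The overall loop of steps S20 to S40 for a single test question can be sketched as follows. The `recommend` and `compare` callables are placeholders: `toy_recommend` and `toy_compare` are hypothetical stand-ins chosen only to make the sketch runnable, and are not the recommendation algorithm or comparison test of the application.

```python
from statistics import mean

def evaluate_engine(search_result_list, relevant_results, account_histories,
                    recommend, compare):
    """Average the per-account test values into one evaluation value."""
    test_values = []
    for history in account_histories:
        # One recommendation order table per preset search account.
        recommendation_order = recommend(relevant_results, history)
        test_values.append(compare(search_result_list, recommendation_order))
    # mean() raises StatisticsError if no accounts were supplied.
    return mean(test_values)

def toy_recommend(relevant_results, history):
    """Hypothetical stand-in: rank relevant results by the account's past rating."""
    return sorted(relevant_results, key=lambda item: history.get(item, 0), reverse=True)

def toy_compare(search_result_list, recommendation_order):
    """Hypothetical stand-in: 1.0 if both lists agree on the top result, else 0.0."""
    return 1.0 if search_result_list[0] == recommendation_order[0] else 0.0

evaluation_value = evaluate_engine(
    search_result_list=["A", "C"],
    relevant_results=["A", "B", "C"],
    account_histories=[{"A": 5, "B": 1}, {"B": 5}],  # two preset search accounts
    recommend=toy_recommend,
    compare=toy_compare,
)
```

With two accounts, one test value is produced per account and the evaluation value is their average. Passing the recommendation and comparison steps in as functions keeps the averaging logic independent of which recommendation scheme or test metric is used.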
  • In this embodiment, a test question in the search engine to be tested is obtained, and a search result list is obtained based on the test question; the relevant result set corresponding to the test question and the historical data of a preset search account are obtained, the relevant result set and the historical data are input into a recommendation algorithm, and a recommendation order table is generated based on the recommendation algorithm; the search result list and the recommendation order table are compared to obtain test values; and a preset number of preset search accounts is obtained, the average of the test values is obtained based on the preset number, and the average is used as the evaluation value of the search engine to be tested.
  • This solution simulates real user behavior by inputting the historical data of the preset search account and the relevant results corresponding to the test question into the recommendation algorithm, instead of relying on expert evaluation, which saves cost and improves confidence in the test. In addition, multiple preset search accounts are used for testing, and the average of the values corresponding to these accounts is used as the target value. This avoids the situation, which arises when industry experts are hired for manual testing, in which the accuracy of the test depends too heavily on the testers' professional competence and work quality. The solution therefore improves the accuracy of the search engine evaluation, makes the evaluation more objective, and solves the technical problem that quantitative testing of search engines in the prior art has low accuracy.
  • A second embodiment of the search engine evaluation method of the present application is proposed. This embodiment refines step S30 of the first embodiment. Referring to FIG. 3, step S30 includes:
  • Step S31 Obtain each recommendation result in the recommendation order table and each search result in the search result list;
  • The test value includes the relevant quality evaluation value.
  • A recommendation result is a relevant result corresponding to the test question; unlike the unsorted relevant results, the recommendation results have been sorted.
  • Each recommendation result is obtained from the recommendation order table, and each search result is obtained from the search result list. The search results do not necessarily include all relevant results of the test question.
  • Step S32 Perform a comparison test on the recommendation results and the search results, and count the first quantity of search results that match the recommendation results;
  • The recommendation results and the search results are compared with each other to determine whether any search result matches a recommendation result. Each match found is counted automatically, yielding the first quantity (that is, the number of search results that match recommendation results). For example, when there are four search results (A, B, C, and D) and five recommendation results (A, D, E, R, and T), each search result, beginning with A, is compared in turn with the recommendation results.
  • Step S33 Obtain the second quantity of the search results and the third quantity of the recommendation results, determine the proportion of the first quantity in the second quantity and in the third quantity respectively, and use the proportion values as the relevant quality evaluation value.
  • The second quantity is the total number of search results.
  • The third quantity is the total number of recommendation results. First, the second quantity and the third quantity are obtained. The proportion of the first quantity in the second quantity is then determined, that is, the first quantity is divided by the second quantity to obtain a proportion value. The proportion of the first quantity in the third quantity is determined in the same way, and the two proportion values together are used as the relevant quality evaluation value.
  • The search result list and the recommendation order table are compared to obtain the relevant quality evaluation value of the search engine, which ensures the accuracy of the test. Because the comparison is performed without manual work, the testing of the search engine is also made more intelligent.
  • The step of determining the proportion of the first quantity in the second quantity and in the third quantity, and using the proportion values as the relevant quality evaluation value, includes:
  • Step S331 Obtain the first proportion value of the first quantity to the second quantity, and use the first proportion value as the accuracy rate value;
  • The relevant quality evaluation value includes the accuracy rate value and the recall rate value; the proportion value includes the first proportion value and the second proportion value.
  • The second quantity is the number of search results in the search result list.
  • The first proportion value can be used as the accuracy rate value.
  • The accuracy rate value is the proportion of accurate results among all results returned by the search engine.
  • Step S332 Obtain the second proportion value of the first quantity to the third quantity, and use the second proportion value as the recall rate value.
  • The third quantity is the number of recommendation results in the recommendation order table.
  • The recall rate value is the ratio of the number of relevant documents retrieved to the number of all relevant documents in the document library.
  • The search efficiency of the search engine is determined from its accuracy rate value and recall rate value. Because the calculation is performed without manual work, the accuracy of the test is also ensured.
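Steps S32, S331, and S332 can be sketched together as below, using the four search results and five recommendation results from the example in the text. The function name and the choice to return 0.0 for an empty list are assumptions of this sketch.

```python
def relevance_quality(search_results, recommendation_results):
    """Return (accuracy rate value, recall rate value) for one comparison."""
    recommended = set(recommendation_results)
    # First quantity: search results that match a recommendation result.
    first = sum(1 for result in search_results if result in recommended)
    second = len(search_results)          # second quantity: all search results
    third = len(recommendation_results)   # third quantity: all recommendation results
    # Assumption: an empty list yields 0.0 rather than a division error.
    accuracy = first / second if second else 0.0
    recall = first / third if third else 0.0
    return accuracy, recall

accuracy, recall = relevance_quality(
    ["A", "B", "C", "D"],        # search results
    ["A", "D", "E", "R", "T"],   # recommendation results
)
```

Two search results (A and D) match, so the accuracy rate value is 2/4 and the recall rate value is 2/5.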
  • The step of comparing the search result list and the recommendation order table to obtain the test value further includes:
  • Step S34 Obtain the first sorting position of each primary result in the search result list;
  • The test value includes an order quality evaluation value.
  • The first sorting position is the position of each primary result in the search result list. Because the search results in the search result list are already sorted, once each primary result has been obtained, its first sorting position in the search result list must also be determined.
  • Step S35 Obtain the second sorting position of each primary result in the recommendation order table, and compare the first sorting position with the second sorting position to obtain the order quality evaluation value.
  • The second sorting position is the position of each primary result in the recommendation order table. Because the recommendation results in the recommendation order table are already sorted, once each primary result has been obtained, its second sorting position in the recommendation order table must also be determined. The two positions are then compared to see whether they are the same, and the order quality evaluation value is obtained.
  • The order quality evaluation value mainly considers the ranking quality of the search results given by the search engine. When users are faced with a search result list, they hardly ever look at the entire list; they usually care only about the search results in the top positions or in certain special positions. The ranking order of the search results given by the search engine is therefore also an important part of its quality.
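The text does not give a formula for the order quality evaluation value, so the following is only one possible reading. It assumes that the primary results are the results present in both lists, and it scores the share of them whose first and second sorting positions are identical; both the assumption and the function name are hypothetical.

```python
def order_quality(search_result_list, recommendation_order):
    """Share of shared results that sit at the same position in both lists."""
    second_position = {item: index for index, item in enumerate(recommendation_order)}
    # (first sorting position, second sorting position) for each shared result.
    position_pairs = [
        (index, second_position[item])
        for index, item in enumerate(search_result_list)
        if item in second_position
    ]
    if not position_pairs:
        return 0.0  # assumption: no shared results means the lowest score
    same = sum(1 for first, second in position_pairs if first == second)
    return same / len(position_pairs)

score = order_quality(["A", "B", "C", "D"], ["A", "D", "C", "E"])
```

In this example A and C hold the same position in both lists while D does not, and B is ignored because it is absent from the recommendation order table. A rank-correlation measure would be a natural alternative if partial credit for near misses is wanted.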
  • A third embodiment of the search engine evaluation method of the present application is proposed.
  • This embodiment refines the part of step S20 of the first embodiment in which the relevant result set and the historical data are input into a recommendation algorithm and a recommendation order table is generated based on the recommendation algorithm. It includes:
  • Step S21 Input the relevant result set and the historical data into a recommendation algorithm, and obtain the content information of the relevant result set;
  • Step S22 Obtain the application scenario corresponding to the content information and the data condition corresponding to the historical data, determine a recommendation algorithm scheme based on the application scenario and the data condition, and generate a recommendation order table based on the recommendation algorithm scheme.
  • The recommendation algorithm scheme may be a collaborative filtering scheme, a scheme based on product content, or a hybrid scheme combining collaborative filtering and product content recommendation.
  • The data condition may be that the data contains a large amount of multi-dimensional product information. This information can include historical user behavior, such as a user's record of opening a web page; product attributes, such as the length or director of a movie; or the user's own account information, such as gender and age. Note that the data condition must include at least the user's historical behavior information.
  • The corresponding application scenario is determined according to the content information.
  • The recommendation algorithm scheme is determined according to the application scenario and the data condition, which improves the accuracy of the search engine evaluation and improves the user experience.
  • The step of determining a recommendation algorithm scheme based on the application scenario and the data condition, and generating a recommendation order table based on the recommendation algorithm scheme, includes:
  • Step S221 Determine whether the data condition satisfies the preset condition;
  • Step S222 If the data condition does not satisfy the preset condition, obtain the collaborative filtering scheme in the recommendation algorithm based on the application scenario and the data condition, and sort the relevant result set based on the collaborative filtering scheme to generate a recommendation order table.
  • The preset condition is a condition set by the user in advance. Determining whether the data condition satisfies the preset condition may consist of checking whether the data contains a large amount of multi-dimensional product information. If it does not, the collaborative filtering scheme can be tried; if it does, the hybrid scheme combining collaborative filtering and product content recommendation can be used. If the collaborative filtering scheme turns out to perform poorly, the hybrid scheme can be tried instead, and if the hybrid scheme still performs poorly, the scheme based on product content can be tried.
  • Whether a scheme performs well or poorly depends on the metrics of the evaluation scheme used for the recommendation system. If the original data does not support building either a collaborative filtering model or a model based on product content, there is no need to choose among recommendation algorithm schemes.
  • In that case, the model that the data can support is used.
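The selection and fallback order described above can be sketched as follows. The `Scheme` enum, the function names, and the `performs_well` callback (standing in for the recommendation system's own evaluation metrics) are hypothetical, and returning the last candidate when none performs well is an assumption of this sketch.

```python
from enum import Enum

class Scheme(Enum):
    COLLABORATIVE_FILTERING = "collaborative filtering"
    HYBRID = "hybrid"
    CONTENT_BASED = "product content"

def candidate_schemes(has_rich_product_info):
    """Schemes in the order they are tried, given the data condition."""
    if has_rich_product_info:
        # Preset condition satisfied: start with the hybrid scheme.
        return [Scheme.HYBRID, Scheme.CONTENT_BASED]
    # Preset condition not satisfied: start with collaborative filtering.
    return [Scheme.COLLABORATIVE_FILTERING, Scheme.HYBRID, Scheme.CONTENT_BASED]

def choose_scheme(has_rich_product_info, performs_well):
    """Return the first candidate that performs well, else the last one tried."""
    candidates = candidate_schemes(has_rich_product_info)
    for scheme in candidates:
        if performs_well(scheme):
            return scheme
    return candidates[-1]

# Sparse data, and collaborative filtering is judged good enough.
chosen = choose_scheme(False, lambda scheme: True)
```

Keeping the candidate order in one function makes the fallback chain explicit and easy to change if a different preset condition is used.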
  • The recommendation algorithm scheme is determined by checking whether the data condition satisfies the preset condition, which improves the accuracy of the recommendation order table and improves the user experience.
  • The step of determining whether the data condition satisfies the preset condition includes:
  • Step S223 If the data condition satisfies the preset condition, obtain the hybrid scheme in the recommendation algorithm based on the application scenario and the data condition, and sort the relevant result set based on the hybrid scheme to generate a recommendation order table.
  • The hybrid scheme is a scheme combining collaborative filtering and product content recommendation.
  • The hybrid scheme of collaborative filtering and product content can be used for the experiment: the historical data and the relevant result set are input into the hybrid scheme, and the relevant result set is then sorted according to the output of the hybrid scheme to generate a recommendation order table. That is, the recommendation algorithm outputs a predicted score of how attractive each result is to the input user; the predicted scores of only the relevant results corresponding to the test question are selected, and a recommendation order table of all relevant results is produced according to the magnitude of the scores.
  • When the data condition satisfies the preset condition, the hybrid scheme in the recommendation algorithm is selected, which improves the accuracy of the search engine evaluation method and improves the user experience.
  • an embodiment of the present application also provides a search engine evaluation device, and the search engine evaluation device includes:
  • an obtaining module, configured to obtain a test question in a search engine to be tested, and obtain a search result list based on the test question;
  • a generating module, configured to obtain the relevant result set corresponding to the test question and the historical data of a preset search account, input the relevant result set and the historical data into a recommendation algorithm, and generate a recommendation order table based on the recommendation algorithm;
  • a comparison module, configured to compare the search result list with the recommendation order table to obtain a test value;
  • a target value module, configured to obtain a preset number of the preset search accounts, obtain the average of the test values based on the preset number, and use the average as the evaluation value of the search engine to be tested.
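  • Optionally, the test value includes a relevant quality evaluation value, and the comparison module is further configured to:
  • obtain each recommendation result in the recommendation order table and each search result in the search result list;
  • compare the recommendation results with the search results, and count the first number of search results that match the recommendation results;
  • obtain the second number of search results and the third number of recommendation results, determine the ratios of the first number to the second number and to the third number, and use the ratios as the relevant quality evaluation value.
  • Optionally, the relevant quality evaluation value includes a precision value and a recall value, the ratios include a first ratio and a second ratio, and the comparison module is further configured to:
  • obtain the first ratio of the first number to the second number, and use the first ratio as the precision value;
  • obtain the second ratio of the first number to the third number, and use the second ratio as the recall value.
  • Optionally, the test value includes a sequential quality evaluation value, and the comparison module is further configured to:
  • obtain the first ranking position of each primary result in the search result list;
  • obtain the second ranking position of each primary result in the recommendation order table, and compare the first ranking position with the second ranking position to obtain the sequential quality evaluation value.
  • Optionally, the generating module is further configured to:
  • input the relevant result set and the historical data into the recommendation algorithm, and obtain the content information of the relevant result set;
  • obtain the application scenario corresponding to the content information and the data condition corresponding to the historical data, determine a recommendation algorithm scheme based on the application scenario and the data condition, and generate a recommendation order table based on that scheme.
  • Optionally, the generating module is further configured to:
  • determine whether the data condition satisfies a preset condition;
  • if the data condition does not satisfy the preset condition, obtain the collaborative filtering scheme of the recommendation algorithm based on the application scenario and the data condition, and sort the relevant results based on the collaborative filtering scheme to generate a recommendation order table.
  • Optionally, the generating module is further configured to:
  • if the data condition satisfies the preset condition, obtain the hybrid scheme of the recommendation algorithm based on the application scenario and the data condition, and sort the relevant result set based on the hybrid scheme to generate a recommendation order table.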
  • the present application also provides a terminal, which includes: a memory, a processor, a communication bus, and computer-readable instructions stored on the memory:
  • the communication bus is used to implement connection communication between the processor and the memory
  • the processor is configured to execute the computer-readable instructions to implement the steps of the above embodiments of the search engine evaluation method.
  • the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be a non-volatile readable storage medium.
  • the computer-readable storage medium stores one or more programs.
  • the one or more programs may be executed by one or more processors to implement the steps of the above embodiments of the search engine evaluation method.
  • From the description of the above embodiments, those skilled in the art can clearly understand that the methods in the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to perform the methods described in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

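A search engine evaluation method, device, equipment, and computer storage medium. The method includes: obtaining a test question in a search engine to be tested, and obtaining a search result list based on the test question (S10); obtaining a relevant result set corresponding to the test question and historical data of a preset search account, inputting the relevant result set and the historical data into a recommendation algorithm, and generating a recommendation order table based on the recommendation algorithm (S20); comparing the search result list with the recommendation order table to obtain a test value (S30); and obtaining a preset number of the preset search accounts, obtaining the average of the test values based on the preset number, and using the average as the evaluation value of the search engine to be tested (S40).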

Description

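Search engine evaluation method, device, equipment, and readable storage medium
This application claims priority to Chinese patent application No. 201811654429.7, entitled "Search engine evaluation method, device, equipment, and readable storage medium", filed with the China Patent Office on December 29, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of search engines, and in particular to a search engine evaluation method, device, equipment, and computer-readable storage medium.
Background
With the rapid development of the Internet industry, search engines have become an essential part of people's lives. Besides the web search engines built by large Internet companies such as Google and Baidu, specialized search engines aimed at particular industries are receiving more and more attention. As a result, many companies now need search engines built specifically for their own business.
In the testing process of a search engine system, online testing is often required. However, for a company with little experience in the search engine business, launching a search engine product directly without offline testing is very risky. An offline rehearsal test is therefore often added for performance that would otherwise be tested online, which means hiring industry experts to test manually. This consumes a great deal of labor and time, and the accuracy depends heavily on the testers' professional skill and diligence.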
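Technical Solution
The main purpose of this application is to provide a search engine evaluation method, device, equipment, and computer storage medium, aiming to solve the technical problem in the prior art that quantitative testing of search engines has low accuracy.
To achieve the above purpose, this application provides a search engine evaluation method, which includes the following steps:
obtaining a test question in a search engine to be tested, and obtaining a search result list based on the test question;
obtaining a relevant result set corresponding to the test question and historical data of a preset search account, inputting the relevant result set and the historical data into a recommendation algorithm, and generating a recommendation order table based on the recommendation algorithm;
comparing the search result list with the recommendation order table to obtain a test value;
obtaining a preset number of the preset search accounts, obtaining the average of the test values based on the preset number, and using the average as the evaluation value of the search engine to be tested.
In addition, to achieve the above purpose, this application further provides a search engine evaluation device, which includes:
an obtaining module, configured to obtain a test question in a search engine to be tested, and obtain a search result list based on the test question;
a generating module, configured to obtain a relevant result set corresponding to the test question and historical data of a preset search account, input the relevant result set and the historical data into a recommendation algorithm, and generate a recommendation order table based on the recommendation algorithm;
a comparison module, configured to compare the search result list with the recommendation order table to obtain a test value;
a target value module, configured to obtain a preset number of the preset search accounts, obtain the average of the test values based on the preset number, and use the average as the evaluation value of the search engine to be tested.
In addition, to achieve the above purpose, this application further provides a mobile terminal;
the mobile terminal includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein:
the computer program, when executed by the processor, implements the steps of the search engine evaluation method described above.
In addition, to achieve the above purpose, this application further provides a computer storage medium;
the computer storage medium stores a computer program that, when executed by a processor, implements the steps of the search engine evaluation method described above.
In this application, a test question in the search engine to be tested is obtained, and a search result list is obtained based on the test question; a relevant result set corresponding to the test question and historical data of a preset search account are obtained, the relevant result set and the historical data are input into a recommendation algorithm, and a recommendation order table is generated based on the recommendation algorithm; the search result list is compared with the recommendation order table to obtain a test value; and a preset number of the preset search accounts is obtained, the average of the test values is obtained based on the preset number, and the average is used as the evaluation value of the search engine to be tested. By inputting the historical data of the preset search accounts and the relevant results corresponding to the test question into a recommendation algorithm, this solution simulates real user behavior in place of expert evaluation, which saves cost and raises confidence in the test. Because multiple preset search accounts are used and the average over these accounts is taken as the target value, the solution also avoids the situation that arises when industry experts are hired for manual testing, where accuracy depends too heavily on the testers' professional skill and diligence. This improves the accuracy of search engine evaluation, makes the evaluation more objective, and solves the technical problem in the prior art that quantitative testing of search engines has low accuracy.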
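Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of the terminal/device in the hardware operating environment involved in the embodiments of this application;
FIG. 2 is a schematic flowchart of the first embodiment of the search engine evaluation method of this application;
FIG. 3 is a schematic flowchart of the second embodiment of the search engine evaluation method of this application;
FIG. 4 is a schematic diagram of the functional modules of the search engine evaluation device of this application;
FIG. 5 is a test flowchart of the search engine evaluation method of this application.
The realization of the purpose, the functional features, and the advantages of this application are further described below with reference to the drawings and in conjunction with the embodiments.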
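Embodiments of the Invention
It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it.
As shown in FIG. 1, FIG. 1 is a schematic structural diagram of the terminal in the hardware operating environment involved in the embodiments of this application.
The terminal in the embodiments of this application is search engine evaluation equipment.
As shown in FIG. 1, the terminal may include a processor 1001 (for example, a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 implements connection and communication among these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, it may also include standard wired and wireless interfaces. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a Wi-Fi interface). The memory 1005 may be high-speed RAM or non-volatile memory, such as disk storage. Optionally, the memory 1005 may also be a storage device independent of the processor 1001.
Optionally, the terminal may also include a camera, an RF (Radio Frequency) circuit, sensors, an audio circuit, a Wi-Fi module, and so on. The sensors include, for example, light sensors, motion sensors, and other sensors. Specifically, the light sensors may include an ambient light sensor, which adjusts the brightness of the display according to the ambient light, and a proximity sensor, which turns off the display and/or backlight when the terminal device is moved to the ear. The terminal device may of course also be equipped with other sensors such as a gyroscope, barometer, hygrometer, thermometer, and infrared sensor, which are not described further here.
Those skilled in the art will understand that the terminal structure shown in FIG. 1 does not limit the terminal, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and computer-readable instructions.
In the terminal shown in FIG. 1, the network interface 1004 is mainly used to connect to a back-end server and exchange data with it; the user interface 1003 is mainly used to connect to a client (user side) and exchange data with it; and the processor 1001 may be used to call the computer-readable instructions stored in the memory 1005 and perform the following operations:
obtaining a test question in a search engine to be tested, and obtaining a search result list based on the test question;
obtaining a relevant result set corresponding to the test question and historical data of a preset search account, inputting the relevant result set and the historical data into a recommendation algorithm, and generating a recommendation order table based on the recommendation algorithm;
comparing the search result list with the recommendation order table to obtain a test value;
obtaining a preset number of the preset search accounts, obtaining the average of the test values based on the preset number, and using the average as the evaluation value of the search engine to be tested.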
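Referring to FIG. 2, this application provides a search engine evaluation method. In the first embodiment, the search engine evaluation method includes the following steps:
Step S10: obtain a test question in the search engine to be tested, and obtain a search result list based on the test question;
This application is mainly applied to evaluating the performance of a search engine under test. A search engine is a system that collects information from the Internet according to certain strategies using specific computer programs, organizes and processes the information, provides a retrieval service, and presents the information relevant to a user's query to that user, for example Baidu or Google.
Understandably, the test question in the search engine is obtained first. Test questions are entered at random by testers, so their content may come from any technical field. The search engine then extracts the keywords from the test question, retrieves search results for those keywords, and sorts the results to obtain the search result list. One way to sort the search results is to first determine how many times each keyword has been searched. For example, suppose the received keywords are "mobile phone" and "pencil". If "mobile phone" has not been received before, i.e., no user has searched for it, "mobile phone" is recorded. If "pencil" has been received before, its search count is accumulated: if "pencil" had been searched 6 times, 1 is added to 6 and its search count is recorded as 7. A low count indicates that few users care about the keyword and a high count that many do, so the order of the search results can be determined from the order of the keywords' search counts: results for keywords with higher counts are placed toward the front of the search result list, and results for keywords with lower counts toward the back. The search result list may be the already sorted search results that the search engine under test retrieves for the test question.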
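The keyword-count ordering in the example above can be sketched as follows. This is only an illustration under stated assumptions: the in-memory `search_counts` store and the helper names `record_search` and `order_by_popularity` are hypothetical and do not come from the application.

```python
from collections import Counter

# Hypothetical in-memory store of how often each keyword has been searched.
# "pencil" starts at 6 to mirror the example in the text.
search_counts = Counter({"pencil": 6})


def record_search(keyword: str) -> int:
    """Increment and return the search count of a keyword."""
    search_counts[keyword] += 1
    return search_counts[keyword]


def order_by_popularity(keywords: list) -> list:
    """Return the keywords with the most frequently searched ones first."""
    return sorted(keywords, key=lambda k: search_counts[k], reverse=True)


record_search("mobile phone")  # first time this keyword is seen: count becomes 1
record_search("pencil")        # searched 6 times before: count becomes 7
print(order_by_popularity(["mobile phone", "pencil"]))
```

Results for "pencil" would then be placed ahead of results for "mobile phone" in the search result list.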
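Step S20: obtain the relevant result set corresponding to the test question and the historical data of a preset search account, input the relevant result set and the historical data into a recommendation algorithm, and generate a recommendation order table based on the recommendation algorithm;
A recommendation algorithm is the core algorithm of a recommender system. Recommender systems fall into two main categories: schemes based on collaborative filtering and schemes based on product content. A collaborative filtering recommender takes users' historical behavior toward products (clicks, purchases, ratings, etc.) as the model input and computes the similarity between different users to produce its output. A content-based recommender takes product attribute information as input and tries to recommend products whose attributes are close to those of products the user likes. Whatever the type of recommender system, the output data are the same: a predicted evaluation of a specified product by a specified user, usually a number whose magnitude reflects how much the recommendation algorithm believes the user likes the product.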
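A minimal sketch of the user-based collaborative filtering idea described above, assuming explicit ratings stored in nested dictionaries. The cosine similarity measure, the similarity-weighted average, and all names and sample ratings are illustrative assumptions; the application does not specify a particular similarity measure or prediction formula.

```python
from math import sqrt


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_score(target: str, item: str, ratings: dict) -> float:
    """Predict the target user's score for an item as the
    similarity-weighted average of other users' ratings of that item."""
    numerator = denominator = 0.0
    for user, user_ratings in ratings.items():
        if user == target or item not in user_ratings:
            continue
        similarity = cosine_similarity(ratings[target], user_ratings)
        numerator += similarity * user_ratings[item]
        denominator += abs(similarity)
    return numerator / denominator if denominator else 0.0


# Hypothetical historical ratings (user -> item -> score).
ratings = {
    "alice": {"film_a": 5, "film_b": 3},
    "bob": {"film_a": 5, "film_b": 3, "film_c": 4},
    "carol": {"film_a": 1, "film_b": 5, "film_c": 2},
}
print(predict_score("alice", "film_c", ratings))
```

Because "alice" rates films much as "bob" does, the prediction for "film_c" is pulled toward bob's rating rather than carol's.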
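The relevant result set may be all the search results related to the test question that can be found. For example, when the test question is a search for war films, the relevant results are all war films. Note that the relevant result set may contain the search result list but is not limited to it. The relevant result set may also be obtained by means other than searching with the search engine under test (for example, offline collection or authoritative websites), and the results obtained are not sorted.
The preset search account may be a search account set in advance. For example, when a search engine for a film and television domain is tested offline, the account of any user who has watched a film on the corresponding video platform can be used as a preset search account. Once a preset search account has been obtained, its historical data must also be obtained. The relevant result set and the historical data are then input into the recommendation algorithm, which outputs a score predicting how attractive each result is to the input user (i.e., the preset search account). Only the predicted scores of the relevant results corresponding to the test question are picked out, and all relevant results are ranked by score to give a recommendation order table. This table is regarded as the ideal search result for that user when asking the test question (it may be called the standard result). The historical data of a preset search account may be the user's behavior toward target results that was collected without the user using the search engine. For example, when testing a search engine for the film vertical, the search engine returns results (films) that match the user's search question, but the user has in fact already watched some films, and even rated them, without using the search engine. Likewise, when testing a search engine within a social network, the search engine returns results (other users in the network) that match the user's search question, but the user has in fact already established particular connections with some other users without using the search engine.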
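The filter-then-rank step described above can be sketched as follows. It assumes the recommendation algorithm has already produced a dictionary of predicted scores for one account; the function name and the sample data are hypothetical.

```python
def build_recommendation_order(predicted_scores: dict, relevant_results: set) -> list:
    """Keep only the results relevant to the test question and
    rank them from highest to lowest predicted score."""
    relevant_scores = {
        item: score
        for item, score in predicted_scores.items()
        if item in relevant_results
    }
    return sorted(relevant_scores, key=relevant_scores.get, reverse=True)


# Hypothetical predicted scores for one preset search account.
predicted_scores = {"film_a": 0.9, "film_b": 0.2, "film_c": 0.7, "film_d": 0.4}
# Hypothetical relevant result set for the test question.
relevant_results = {"film_b", "film_c", "film_d"}

print(build_recommendation_order(predicted_scores, relevant_results))
```

Here "film_a" has the highest score but is dropped because it is not in the relevant result set for the test question.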
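Step S30: compare the search result list with the recommendation order table to obtain a test value;
After the search result list and the recommendation order table have been obtained, they are compared to obtain the test result (i.e., the test value). The comparison requires quantifying the difference between the two lists, for which schemes such as MAP (Mean Average Precision), i.e., the mean of the precision values computed after each relevant document is retrieved, can be used. The test value may include a relevant quality evaluation value and a sequential quality evaluation value. Relevant quality evaluation considers how the results returned by the search engine relate to the user's question, namely how many of the results returned are relevant to the question and how many results relevant to the question exist in total; it is generally quantified with precision and recall. Precision measures the search system's ability to exclude irrelevant documents, and recall measures a query's ability to find all relevant documents. Sequential quality evaluation considers the ranking quality of the results returned by the search engine. Users facing a search result (a result list) almost never read the whole list; they usually care only about the results at the top or in certain special positions. The ranking order of the returned results is therefore also an important part of the comparison test.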
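As an illustration of the MAP scheme mentioned above, the sketch below computes average precision for one ranked list and the mean over several lists. It uses the standard information-retrieval definition, which divides by the total number of relevant items; that normalization and the function names are assumptions, since the application only names the metric.

```python
def average_precision(ranked_results: list, relevant: set) -> float:
    """Average of precision@k taken at each rank k where a relevant item appears,
    normalized by the total number of relevant items."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_results, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(runs: list) -> float:
    """Mean of average precision over (ranked_results, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Relevant items A and D appear at ranks 1 and 4: (1/1 + 2/4) / 2 = 0.75.
print(average_precision(["A", "B", "C", "D"], {"A", "D"}))
```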
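Step S40: obtain a preset number of the preset search accounts, obtain the average of the test values based on the preset number, and use the average as the evaluation value of the search engine to be tested.
To prevent special cases from making the test inaccurate, a search engine is usually tested not with one user's account information but with multiple accounts, which improves the accuracy of the test. In other words, each time the historical data of a different preset search account are input into the recommendation algorithm, a different test value is produced, so the number of test values equals the number of preset search accounts. Once enough test values have been obtained, their average is computed and used as the final evaluation value of the search engine to be tested, which is the result of this evaluation of the search engine. The preset number may be any number of preset search accounts set in advance by the testers.
To help explain the testing flow of this application, an example is given below.
For example, as shown in FIG. 5, the designed test question is first input into the target search engine to be tested, which outputs the search result list corresponding to the test question. Next, all relevant results corresponding to the test question are obtained, together with the historical data of a specific user. The relevant results and the user's historical data are input into the recommendation algorithm, which outputs a score predicting how attractive each result is to the input user. Only the predicted scores of the relevant results corresponding to the test question are picked out, and all relevant results are ranked by score to give a recommendation order table. Finally, the search result list is compared with the recommendation order table to obtain the test result. This process is repeated until enough users' data have been input into the recommender, and the average of the test values is taken; the flow then continues until all test questions have been tested.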
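The overall flow of FIG. 5 can be sketched as a loop over test questions and preset search accounts. The three callables `search`, `recommend`, and `compare` are hypothetical stand-ins for the engine under test, the recommendation algorithm, and the comparison test; how values from different test questions are combined is not specified in the text, so this sketch simply reports one averaged value per question.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence


def evaluate_engine(
    test_questions: Sequence[str],
    accounts: Sequence[str],
    search: Callable[[str], List[str]],
    recommend: Callable[[str, str], List[str]],
    compare: Callable[[List[str], List[str]], float],
) -> Dict[str, float]:
    """Return, for each test question, the test value averaged over all accounts."""
    if not accounts:
        raise ValueError("at least one preset search account is required")
    evaluation = {}
    for question in test_questions:
        result_list = search(question)  # step S10
        test_values = [
            # steps S20 and S30, once per preset search account
            compare(result_list, recommend(question, account))
            for account in accounts
        ]
        evaluation[question] = mean(test_values)  # step S40
    return evaluation


# Toy stand-ins, for illustration only.
def toy_search(question):
    return ["A", "B"]


def toy_recommend(question, account):
    return ["A"] if account == "u1" else ["A", "B"]


def toy_compare(results, recommended):
    return len(set(results) & set(recommended)) / len(results)


print(evaluate_engine(["war films"], ["u1", "u2"], toy_search, toy_recommend, toy_compare))
```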
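In this embodiment, a test question in the search engine to be tested is obtained, and a search result list is obtained based on the test question; a relevant result set corresponding to the test question and historical data of a preset search account are obtained, the relevant result set and the historical data are input into a recommendation algorithm, and a recommendation order table is generated based on the recommendation algorithm; the search result list is compared with the recommendation order table to obtain a test value; and a preset number of the preset search accounts is obtained, the average of the test values is obtained based on the preset number, and the average is used as the evaluation value of the search engine to be tested. By inputting the historical data of the preset search accounts and the relevant results corresponding to the test question into a recommendation algorithm, this solution simulates real user behavior in place of expert evaluation, which saves cost and raises confidence in the test. Because multiple preset search accounts are used and the average over these accounts is taken as the target value, the solution also avoids the situation that arises when industry experts are hired for manual testing, where accuracy depends too heavily on the testers' professional skill and diligence. This improves the accuracy of search engine evaluation, makes the evaluation more objective, and solves the technical problem in the prior art that quantitative testing of search engines has low accuracy.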
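Further, on the basis of the first embodiment of this application, a second embodiment of the search engine evaluation method is proposed. This embodiment refines step S30 of the first embodiment and, referring to FIG. 3, includes:
Step S31: obtain each recommendation result in the recommendation order table and each search result in the search result list;
Note that in this embodiment the test value includes a relevant quality evaluation value.
The recommendation results may be the relevant results corresponding to the test question, except that the recommendation results, unlike the relevant results, have already been sorted. Each recommendation result is obtained from the recommendation order table, and each search result is obtained from the search result list. The search results do not necessarily contain all the relevant results for the test question.
Step S32: compare the recommendation results with the search results, and count the first number of search results that match the recommendation results;
Each recommendation result is compared with each search result to determine whether any search result matches a recommendation result. When matches are found, the first number of search results that match recommendation results is counted automatically. For example, when the search results are A, B, C, and D and the recommendation results are A, D, E, R, and T, the search result A is compared in turn with the recommendation results A, D, E, R, and T to determine whether A is a primary result. The search results B, C, and D are likewise compared with A, D, E, R, and T to determine the primary results. If the search result A matches the recommendation result A and the search result D matches the recommendation result D, then A and D are the primary results, and the number of primary results is two.
Step S33: obtain the second number of search results and the third number of recommendation results, determine the ratios of the first number to the second number and to the third number, and use the ratios as the relevant quality evaluation value.
The second number may be the total number of search results, and the third number may be the total number of recommendation results. The second number of search results and the third number of recommendation results are obtained first. The proportion of the second number that the first number accounts for is then determined by dividing the first number by the second number, which gives one ratio. The proportion of the third number that the first number accounts for is determined in the same way, which gives another ratio. The two ratios together serve as the relevant quality evaluation value.
In this embodiment, the relevant quality evaluation value of the search engine is obtained by comparing the search result list with the recommendation order table, which ensures the accuracy of the search engine test; and because the comparison is carried out without manual work, the test is also more automated.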
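Specifically, the step of determining the ratios of the first number to the second number and to the third number, and using the ratios as the relevant quality evaluation value, includes:
Step S331: obtain the first ratio of the first number to the second number, and use the first ratio as the precision value;
Note that in this embodiment the relevant quality evaluation value includes a precision value and a recall value, and the ratios include a first ratio and a second ratio.
The second number may be the number of search results in the search result list. Once the second number of search results and the first number of primary results have been obtained, the first ratio of the first number to the second number is determined and can be used as the precision value. For example, when the first number is 5 and the second number is 10, the ratio is 0.5, so the precision value is also 0.5. The precision value is the proportion of correct results among all the results returned by the search engine.
Step S332: obtain the second ratio of the first number to the third number, and use the second ratio as the recall value.
The third number may be the number of recommendation results in the recommendation order table. Once the third number of recommendation results and the first number of primary results have been obtained, the second ratio of the first number to the third number is determined and can be used as the recall value. The recall value may be the ratio of the number of relevant documents retrieved to the number of all relevant documents in the document collection.
In this embodiment, determining the precision value and the recall value of the search engine establishes how effectively the search engine searches; and because this is done without manual work, the accuracy of the test is also ensured.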
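Steps S31 to S332 can be sketched as follows, using the A/B/C/D versus A/D/E/R/T example from the description. The function name and the returned tuple layout are illustrative assumptions; exact identity of items is assumed as the matching rule.

```python
def relevant_quality(search_results: list, recommended_results: list) -> tuple:
    """Return (precision, recall) of the search results against the recommendation results."""
    recommended = set(recommended_results)
    # Primary results: search results that also appear among the recommendation results.
    primary = [item for item in search_results if item in recommended]
    first = len(primary)               # first number: matching results
    second = len(search_results)       # second number: all search results
    third = len(recommended_results)   # third number: all recommendation results
    precision = first / second if second else 0.0
    recall = first / third if third else 0.0
    return precision, recall


# Example from the description: A and D are the primary results.
precision, recall = relevant_quality(["A", "B", "C", "D"], ["A", "D", "E", "R", "T"])
print(precision, recall)  # 2/4 and 2/5
```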
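Specifically, the step of comparing the search result list with the recommendation order table to obtain the test value further includes:
Step S34: obtain the first ranking position of each primary result in the search result list;
Note that in this embodiment the test value includes a sequential quality evaluation value.
The first ranking position may be the position of each primary result in the search result list. Because the search results in the search result list are already sorted, once the primary results have been obtained, the first ranking position of each primary result in the search result list is determined.
Step S35: obtain the second ranking position of each primary result in the recommendation order table, and compare the first ranking position with the second ranking position to obtain the sequential quality evaluation value.
The second ranking position may be the position of each primary result in the recommendation order table. Because the recommendation results in the recommendation order table are already sorted, once the primary results have been obtained, the second ranking position of each primary result in the recommendation order table is determined. The two positions are then checked to see whether they are the same, which yields the sequential quality evaluation value. The sequential quality evaluation value reflects the ranking quality of the search results returned by the search engine. Users facing a search result (a search result list) almost never read the whole list; they usually care only about the search results at the top or in certain special positions, so the ranking order of the search results is also an important part of their quality.
In this embodiment, determining the sequential quality evaluation value of the search engine makes it possible to test how well the search engine ranks its results; and because this is done without manual work, the evaluation of the search engine is also more objective.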
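One possible way to quantify the position comparison in steps S34 and S35 is sketched below. The text only says that the two positions are checked for equality, so both measures here (the share of primary results at identical positions and the mean absolute displacement) are assumptions made for illustration, as is the function name. The lists are assumed to contain no duplicates.

```python
def sequential_quality(search_results: list, recommended_results: list) -> tuple:
    """Return (share of primary results at the same position in both lists,
    mean absolute difference between the two positions)."""
    first_pos = {item: i for i, item in enumerate(search_results, start=1)}
    second_pos = {item: i for i, item in enumerate(recommended_results, start=1)}
    primary = [item for item in search_results if item in second_pos]
    if not primary:
        return 0.0, 0.0
    same = sum(1 for item in primary if first_pos[item] == second_pos[item])
    displacement = sum(abs(first_pos[item] - second_pos[item]) for item in primary)
    return same / len(primary), displacement / len(primary)


# A is first in both lists; D is 4th in the search list but 2nd in the recommendation table.
print(sequential_quality(["A", "B", "C", "D"], ["A", "D", "E", "R", "T"]))
```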
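Further, on the basis of the first and second embodiments of this application, a third embodiment of the search engine evaluation method is proposed. This embodiment refines the part of step S20 of the first embodiment in which the relevant results and the historical data are input into the recommendation algorithm and a recommendation order table is generated based on the recommendation algorithm, and includes:
Step S21: input the relevant result set and the historical data into the recommendation algorithm, and obtain the content information of the relevant result set;
Once all the relevant results corresponding to the test question have been obtained, they are input into the recommendation algorithm together with the historical data of the preset search account, and the content information of the relevant results, which may also be described as keywords, is determined from the relevant results.
Step S22: obtain the application scenario corresponding to the content information and the data condition corresponding to the historical data, determine a recommendation algorithm scheme based on the application scenario and the data condition, and generate a recommendation order table based on the recommendation algorithm scheme.
Recommendation algorithm schemes may include a collaborative filtering scheme, a product-content-based recommendation scheme, and a hybrid scheme based on collaborative filtering and product-content recommendation. The data condition may be that the data contain a large amount of multi-dimensional product information. Product information may include users' historical behavior information (such as a record of a user opening a web page), product attribute information (such as a film's running time or director), and users' own account information (such as gender and age). Note that the data condition must include at least users' historical behavior information.
Once the content information of the relevant result set has been obtained, the corresponding application scenario is determined from it. At the same time, the data condition corresponding to the historical data is obtained. The recommendation algorithm scheme of the recommender system is then determined from the application scenario and the data condition, and the relevant result set is sorted according to that scheme to generate a recommendation order table. That is, the recommendation algorithm outputs a score predicting how attractive each result is to the input user; only the predicted scores of the relevant results corresponding to the test question are picked out, and all relevant results are ranked by score to give the recommendation order table.
In this embodiment, the recommendation algorithm scheme is determined according to the application scenario and the data condition, which improves the accuracy of search engine evaluation and the user experience.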
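Specifically, the step of determining a recommendation algorithm scheme based on the application scenario and the data condition, and generating a recommendation order table based on the recommendation algorithm scheme, includes:
Step S221: determine whether the data condition satisfies a preset condition;
Step S222: if the data condition does not satisfy the preset condition, obtain the collaborative filtering scheme of the recommendation algorithm based on the application scenario and the data condition, and sort the relevant result set based on the collaborative filtering scheme to generate a recommendation order table.
The preset condition may be a condition set in advance by the user. Determining whether the data condition satisfies the preset condition may mean checking whether the data contain a large amount of multi-dimensional product information. If they do not, the collaborative filtering scheme can be used for the experiment; if they do, the hybrid scheme based on collaborative filtering and product-content recommendation can be used. If the experiment with the collaborative filtering scheme gives poor results, the hybrid scheme can be tried. If the hybrid scheme still performs poorly, the product-content-based scheme can be tried. If the product-content-based scheme gives good results, it is adopted; otherwise, if the collaborative filtering scheme has not yet been tried, an experiment with it is added, and finally the best-performing of all the schemes tried is adopted. Whether a result counts as good or poor depends on the metrics of the recommender system's evaluation scheme. Moreover, if the raw data cannot support building a collaborative filtering model or a product-content-based recommendation model, there is no need to choose among schemes; the model that the data can support is simply used.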
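The selection and fallback logic described above can be sketched as follows. The `evaluate` callback (which scores a scheme with whatever metric the recommender's evaluation scheme uses) and the `good_enough` threshold are hypothetical; the text does not define how "good" is measured.

```python
from typing import Callable


def choose_scheme(
    has_rich_product_info: bool,
    evaluate: Callable[[str], float],
    good_enough: float,
) -> str:
    """Try schemes in the order implied by the data condition and
    fall back to the best-scoring scheme if none is good enough."""
    if has_rich_product_info:
        # Preset condition satisfied: start with the hybrid scheme.
        order = ["hybrid", "content_based", "collaborative_filtering"]
    else:
        # Preset condition not satisfied: start with collaborative filtering.
        order = ["collaborative_filtering", "hybrid", "content_based"]
    scores = {}
    for scheme in order:
        scores[scheme] = evaluate(scheme)
        if scores[scheme] >= good_enough:
            return scheme
    # No scheme reached the threshold: adopt the best of all schemes tried.
    return max(scores, key=scores.get)


# Hypothetical experiment results for each scheme.
toy_scores = {"collaborative_filtering": 0.4, "hybrid": 0.6, "content_based": 0.9}
print(choose_scheme(False, toy_scores.get, 0.8))
```

With these toy scores, collaborative filtering and the hybrid scheme both fall short of the threshold, so the content-based scheme is chosen.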
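In this embodiment, the recommendation algorithm scheme is determined by checking whether the data condition satisfies the preset condition, which improves the accuracy of the recommendation order table and the user experience.
Specifically, after the step of determining whether the data condition satisfies the preset condition, the method includes:
Step S223: if the data condition satisfies the preset condition, obtain the hybrid scheme of the recommendation algorithm based on the application scenario and the data condition, and sort the relevant result set based on the hybrid scheme to generate a recommendation order table.
The hybrid scheme may be a hybrid of collaborative filtering and product-content-based recommendation. When the data condition is found to satisfy the preset condition, the hybrid scheme can be used for the experiment: the historical data and the relevant result set are input into the hybrid scheme, and the relevant result set is then sorted according to the scheme's output to generate a recommendation order table. That is, the recommendation algorithm outputs a score predicting how attractive each result is to the input user; only the predicted scores of the relevant results corresponding to the test question are picked out, and all relevant results are ranked by score to give the recommendation order table.
In this embodiment, the hybrid scheme of the recommendation algorithm is selected when the data condition satisfies the preset condition, which improves the accuracy of the search engine evaluation method and the user experience.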
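In addition, referring to FIG. 4, an embodiment of this application further provides a search engine evaluation device, which includes:
an obtaining module, configured to obtain a test question in a search engine to be tested, and obtain a search result list based on the test question;
a generating module, configured to obtain a relevant result set corresponding to the test question and historical data of a preset search account, input the relevant result set and the historical data into a recommendation algorithm, and generate a recommendation order table based on the recommendation algorithm;
a comparison module, configured to compare the search result list with the recommendation order table to obtain a test value;
a target value module, configured to obtain a preset number of the preset search accounts, obtain the average of the test values based on the preset number, and use the average as the evaluation value of the search engine to be tested.
Optionally, the test value includes a relevant quality evaluation value, and the comparison module is further configured to:
obtain each recommendation result in the recommendation order table and each search result in the search result list;
compare the recommendation results with the search results, and count the first number of search results that match the recommendation results;
obtain the second number of search results and the third number of recommendation results, determine the ratios of the first number to the second number and to the third number, and use the ratios as the relevant quality evaluation value.
Optionally, the relevant quality evaluation value includes a precision value and a recall value, the ratios include a first ratio and a second ratio, and the comparison module is further configured to:
obtain the first ratio of the first number to the second number, and use the first ratio as the precision value;
obtain the second ratio of the first number to the third number, and use the second ratio as the recall value.
Optionally, the test value includes a sequential quality evaluation value, and the comparison module is further configured to:
obtain the first ranking position of each primary result in the search result list;
obtain the second ranking position of each primary result in the recommendation order table, and compare the first ranking position with the second ranking position to obtain the sequential quality evaluation value.
Optionally, the generating module is further configured to:
input the relevant result set and the historical data into the recommendation algorithm, and obtain the content information of the relevant result set;
obtain the application scenario corresponding to the content information and the data condition corresponding to the historical data, determine a recommendation algorithm scheme based on the application scenario and the data condition, and generate a recommendation order table based on the recommendation algorithm scheme.
Optionally, the generating module is further configured to:
determine whether the data condition satisfies a preset condition;
if the data condition does not satisfy the preset condition, obtain the collaborative filtering scheme of the recommendation algorithm based on the application scenario and the data condition, and sort the relevant results based on the collaborative filtering scheme to generate a recommendation order table.
Optionally, the generating module is further configured to:
if the data condition satisfies the preset condition, obtain the hybrid scheme of the recommendation algorithm based on the application scenario and the data condition, and sort the relevant result set based on the hybrid scheme to generate a recommendation order table.
For the steps implemented by the functional modules of the search engine evaluation device, refer to the embodiments of the search engine evaluation method of this application; they are not repeated here.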
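This application further provides a terminal, which includes a memory, a processor, a communication bus, and computer-readable instructions stored in the memory:
the communication bus is used to implement connection and communication between the processor and the memory;
the processor is used to execute the computer-readable instructions to implement the steps of the embodiments of the search engine evaluation method described above.
This application further provides a computer-readable storage medium, which may be a non-volatile readable storage medium. The computer-readable storage medium stores one or more programs, and the one or more programs may be executed by one or more processors to implement the steps of the embodiments of the search engine evaluation method described above.
The specific implementations of the computer-readable storage medium of this application are essentially the same as the embodiments of the search engine evaluation method described above and are not repeated here.
Note that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or system that includes the element.
The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to perform the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.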

Claims (20)

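  1. A search engine evaluation method, wherein the search engine evaluation method includes the following steps:
    obtaining a test question in a search engine to be tested, and obtaining a search result list based on the test question;
    obtaining a relevant result set corresponding to the test question and historical data of a preset search account, inputting the relevant result set and the historical data into a recommendation algorithm, and generating a recommendation order table based on the recommendation algorithm;
    comparing the search result list with the recommendation order table to obtain a test value;
    obtaining a preset number of the preset search accounts, obtaining the average of the test values based on the preset number, and using the average as the evaluation value of the search engine to be tested.
  2. The search engine evaluation method according to claim 1, wherein the test value includes a relevant quality evaluation value,
    and the step of comparing the search result list with the recommendation order table to obtain a test value includes:
    obtaining each recommendation result in the recommendation order table and each search result in the search result list;
    comparing the recommendation results with the search results, and counting the first number of search results that match the recommendation results;
    obtaining the second number of search results and the third number of recommendation results, determining the ratios of the first number to the second number and to the third number, and using the ratios as the relevant quality evaluation value.
  3. The search engine evaluation method according to claim 2, wherein the relevant quality evaluation value includes a precision value and a recall value, and the ratios include a first ratio and a second ratio,
    and the step of determining the ratios of the first number to the second number and to the third number, and using the ratios as the relevant quality evaluation value, includes:
    obtaining the first ratio of the first number to the second number, and using the first ratio as the precision value;
    obtaining the second ratio of the first number to the third number, and using the second ratio as the recall value.
  4. The search engine evaluation method according to claim 2, wherein the test value includes a sequential quality evaluation value,
    and the step of comparing the search result list with the recommendation order table to obtain a test value further includes:
    obtaining the first ranking position of each primary result in the search result list;
    obtaining the second ranking position of each primary result in the recommendation order table, and comparing the first ranking position with the second ranking position to obtain the sequential quality evaluation value.
  5. The search engine evaluation method according to claim 1, wherein the step of inputting the relevant result set and the historical data into a recommendation algorithm and generating a recommendation order table based on the recommendation algorithm includes:
    inputting the relevant result set and the historical data into the recommendation algorithm, and obtaining the content information of the relevant result set;
    obtaining the application scenario corresponding to the content information and the data condition corresponding to the historical data, determining a recommendation algorithm scheme based on the application scenario and the data condition, and generating a recommendation order table based on the recommendation algorithm scheme.
  6. The search engine evaluation method according to claim 5, wherein the step of determining a recommendation algorithm scheme based on the application scenario and the data condition, and generating a recommendation order table based on the recommendation algorithm scheme, includes:
    upon determining that the data condition does not satisfy a preset condition, obtaining the collaborative filtering scheme of the recommendation algorithm based on the application scenario and the data condition, and sorting the relevant result set based on the collaborative filtering scheme to generate a recommendation order table.
  7. The search engine evaluation method according to claim 5, wherein the step of determining a recommendation algorithm scheme based on the application scenario and the data condition, and generating a recommendation order table based on the recommendation algorithm scheme, includes:
    upon determining that the data condition satisfies a preset condition, obtaining the hybrid scheme of the recommendation algorithm based on the application scenario and the data condition, and sorting the relevant result set based on the hybrid scheme to generate a recommendation order table.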
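  8. A search engine evaluation device, wherein the search engine evaluation device includes:
    an obtaining module, configured to obtain a test question in a search engine to be tested, and obtain a search result list based on the test question;
    a generating module, configured to obtain a relevant result set corresponding to the test question and historical data of a preset search account, input the relevant result set and the historical data into a recommendation algorithm, and generate a recommendation order table based on the recommendation algorithm;
    a comparison module, configured to compare the search result list with the recommendation order table to obtain a test value;
    a target value module, configured to obtain a preset number of the preset search accounts, obtain the average of the test values based on the preset number, and use the average as the evaluation value of the search engine to be tested.
  9. The search engine evaluation device according to claim 8, wherein the test value includes a relevant quality evaluation value, and the comparison module is further configured to:
    obtain each recommendation result in the recommendation order table and each search result in the search result list;
    compare the recommendation results with the search results, and count the first number of search results that match the recommendation results;
    obtain the second number of search results and the third number of recommendation results, determine the ratios of the first number to the second number and to the third number, and use the ratios as the relevant quality evaluation value.
  10. The search engine evaluation device according to claim 9, wherein the relevant quality evaluation value includes a precision value and a recall value, the ratios include a first ratio and a second ratio, and the comparison module is further configured to:
    obtain the first ratio of the first number to the second number, and use the first ratio as the precision value;
    obtain the second ratio of the first number to the third number, and use the second ratio as the recall value.
  11. The search engine evaluation device according to claim 9, wherein the test value includes a sequential quality evaluation value, and the comparison module is further configured to:
    obtain the first ranking position of each primary result in the search result list;
    obtain the second ranking position of each primary result in the recommendation order table, and compare the first ranking position with the second ranking position to obtain the sequential quality evaluation value.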
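  12. Search engine evaluation equipment, wherein the search engine evaluation equipment includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, and the computer-readable instructions, when executed by the processor, implement the following steps:
    obtaining a test question in a search engine to be tested, and obtaining a search result list based on the test question;
    obtaining a relevant result set corresponding to the test question and historical data of a preset search account, inputting the relevant result set and the historical data into a recommendation algorithm, and generating a recommendation order table based on the recommendation algorithm;
    comparing the search result list with the recommendation order table to obtain a test value;
    obtaining a preset number of the preset search accounts, obtaining the average of the test values based on the preset number, and using the average as the evaluation value of the search engine to be tested.
  13. The search engine evaluation equipment according to claim 12, wherein the test value includes a relevant quality evaluation value,
    and the step of comparing the search result list with the recommendation order table to obtain a test value includes:
    obtaining each recommendation result in the recommendation order table and each search result in the search result list;
    comparing the recommendation results with the search results, and counting the first number of search results that match the recommendation results;
    obtaining the second number of search results and the third number of recommendation results, determining the ratios of the first number to the second number and to the third number, and using the ratios as the relevant quality evaluation value.
  14. The search engine evaluation equipment according to claim 13, wherein the relevant quality evaluation value includes a precision value and a recall value, and the ratios include a first ratio and a second ratio,
    and the step of determining the ratios of the first number to the second number and to the third number, and using the ratios as the relevant quality evaluation value, includes:
    obtaining the first ratio of the first number to the second number, and using the first ratio as the precision value;
    obtaining the second ratio of the first number to the third number, and using the second ratio as the recall value.
  15. The search engine evaluation equipment according to claim 13, wherein the test value includes a sequential quality evaluation value,
    and the step of comparing the search result list with the recommendation order table to obtain a test value further includes:
    obtaining the first ranking position of each primary result in the search result list;
    obtaining the second ranking position of each primary result in the recommendation order table, and comparing the first ranking position with the second ranking position to obtain the sequential quality evaluation value.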
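  16. A computer-readable storage medium, wherein the computer-readable storage medium stores computer-readable instructions that, when executed by a processor, implement the following steps:
    obtaining a test question in a search engine to be tested, and obtaining a search result list based on the test question;
    obtaining a relevant result set corresponding to the test question and historical data of a preset search account, inputting the relevant result set and the historical data into a recommendation algorithm, and generating a recommendation order table based on the recommendation algorithm;
    comparing the search result list with the recommendation order table to obtain a test value;
    obtaining a preset number of the preset search accounts, obtaining the average of the test values based on the preset number, and using the average as the evaluation value of the search engine to be tested.
  17. The computer-readable storage medium according to claim 16, wherein the test value includes a relevant quality evaluation value,
    and the step of comparing the search result list with the recommendation order table to obtain a test value includes:
    obtaining each recommendation result in the recommendation order table and each search result in the search result list;
    comparing the recommendation results with the search results, and counting the first number of search results that match the recommendation results;
    obtaining the second number of search results and the third number of recommendation results, determining the ratios of the first number to the second number and to the third number, and using the ratios as the relevant quality evaluation value.
  18. The computer-readable storage medium according to claim 17, wherein the relevant quality evaluation value includes a precision value and a recall value, and the ratios include a first ratio and a second ratio,
    and the step of determining the ratios of the first number to the second number and to the third number, and using the ratios as the relevant quality evaluation value, includes:
    obtaining the first ratio of the first number to the second number, and using the first ratio as the precision value;
    obtaining the second ratio of the first number to the third number, and using the second ratio as the recall value.
  19. The computer-readable storage medium according to claim 17, wherein the test value includes a sequential quality evaluation value,
    and the step of comparing the search result list with the recommendation order table to obtain a test value further includes:
    obtaining the first ranking position of each primary result in the search result list;
    obtaining the second ranking position of each primary result in the recommendation order table, and comparing the first ranking position with the second ranking position to obtain the sequential quality evaluation value.
  20. The computer-readable storage medium according to claim 16, wherein the step of inputting the relevant result set and the historical data into a recommendation algorithm and generating a recommendation order table based on the recommendation algorithm includes:
    inputting the relevant result set and the historical data into the recommendation algorithm, and obtaining the content information of the relevant result set;
    obtaining the application scenario corresponding to the content information and the data condition corresponding to the historical data, determining a recommendation algorithm scheme based on the application scenario and the data condition, and generating a recommendation order table based on the recommendation algorithm scheme.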
PCT/CN2019/124637 2018-12-29 2019-12-11 Search engine evaluation method, device, equipment, and readable storage medium WO2020135059A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811654429.7 2018-12-29
CN201811654429.7A CN109739768B (zh) 2018-12-29 2018-12-29 Search engine evaluation method, device, equipment, and readable storage medium

Publications (1)

Publication Number Publication Date
WO2020135059A1 true WO2020135059A1 (zh) 2020-07-02

Family

ID=66363076

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/124637 WO2020135059A1 (zh) 2018-12-29 2019-12-11 Search engine evaluation method, device, equipment, and readable storage medium

Country Status (2)

Country Link
CN (1) CN109739768B (zh)
WO (1) WO2020135059A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
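CN109739768B (zh) * 2018-12-29 2021-03-30 深圳Tcl新技术有限公司 Search engine evaluation method, device, equipment, and readable storage medium
CN110196796B (zh) * 2019-05-15 2023-04-28 无线生活(杭州)信息科技有限公司 Method and device for evaluating the effect of a recommendation algorithm
CN110415101A (zh) * 2019-06-19 2019-11-05 深圳壹账通智能科技有限公司 Product recommendation testing method, device, computer equipment, and storage medium
CN110472034B (zh) * 2019-08-21 2022-11-15 北京百度网讯科技有限公司 Detection method, device, equipment, and computer-readable storage medium for a question answering system
CN111444438B (zh) * 2020-03-24 2023-09-01 北京百度网讯科技有限公司 Method, device, equipment, and storage medium for determining the precision and recall of a recall strategy
CN111475409B (zh) * 2020-03-30 2023-06-30 深圳追一科技有限公司 System testing method, device, electronic equipment, and storage medium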

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106777248A (zh) * 2016-12-27 2017-05-31 努比亚技术有限公司 Search engine test and evaluation method and device
CN107273404A (zh) * 2017-04-26 2017-10-20 努比亚技术有限公司 Search engine evaluation method, device, and computer-readable storage medium
CN107291607A (zh) * 2016-03-31 2017-10-24 高德信息技术有限公司 Evaluation method and device for search engines
US20180018333A1 (en) * 2016-07-18 2018-01-18 Bioz, Inc. Continuous evaluation and adjustment of search engine results
CN109739768A (zh) * 2018-12-29 2019-05-10 深圳Tcl新技术有限公司 Search engine evaluation method, device, equipment, and readable storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110191311A1 (en) * 2010-02-03 2011-08-04 Gartner, Inc. Bi-model recommendation engine for recommending items and peers
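CN104239216A (zh) * 2014-10-14 2014-12-24 北京全路通信信号研究设计院有限公司 Method and system for testing software data
CN106776333A (zh) * 2016-12-27 2017-05-31 努比亚技术有限公司 Search engine testing method and mobile terminal
CN106709076B (zh) * 2017-02-27 2023-09-29 华南理工大学 Social network recommendation device and method based on collaborative filtering
CN108710620B (zh) * 2018-01-18 2022-05-20 日照格朗电子商务有限公司 Book recommendation method based on a user-based k-nearest-neighbor algorithm
CN108334592B (zh) * 2018-01-30 2021-11-02 南京邮电大学 Personalized recommendation method combining content-based and collaborative filtering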

Also Published As

Publication number Publication date
CN109739768B (zh) 2021-03-30
CN109739768A (zh) 2019-05-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19903093

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19903093

Country of ref document: EP

Kind code of ref document: A1