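WO2018205391A1 - Information retrieval accuracy evaluation method, system, apparatus, and computer-readable storage medium - Google Patents
Information retrieval accuracy evaluation method, system, apparatus, and computer-readable storage medium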
- Publication number
- WO2018205391A1 PCT/CN2017/091355
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- search
- retrieval
- accuracy
- result
- search result
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2425—Iterative querying; Query formulation based on the results of a preceding query
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2474—Sequence data queries, e.g. querying versioned data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90344—Query processing by using string matching techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/9038—Presentation of query results
Definitions
- the present invention relates to the field of information retrieval, and in particular, to a method, system, device and computer readable storage medium for evaluating information retrieval accuracy.
- MRR: Mean Reciprocal Rank
- MAP: Mean Average Precision is the arithmetic mean of the Average Precision values, where Average Precision is the average of the precision obtained at each relevant document retrieved.
- DCG: Discounted Cumulative Gain
- the first method is the simplest and most versatile, but it is computationally expensive, requires the relevance of all search results to be labeled manually, and does not account for how the ordering of the results affects accuracy.
- the second method is relatively simple, but it only considers the first relevant result in the search. In practical engineering applications, the user may need to review multiple results and evaluate them together rather than focus only on the first relevant result, so this method does not serve users well in practice and its accuracy is low.
- the third method takes into account both the ranking of the relevant results and all of the relevant results, but it needs to consider the ordering of all results in the repository, which requires large-scale manual screening, wastes manpower and material resources, is inefficient, and is prone to errors.
- the fourth method likewise involves too many human factors in the scoring process, which makes it difficult to quantify.
- in summary, the current methods for judging the accuracy of information retrieval results are computationally intensive, require large-scale manual screening, and have low accuracy.
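For reference, two of the standard metrics named above can be sketched as follows. This is a minimal illustration, not part of the patent: the function names are hypothetical, the input is assumed to be a list of per-result relevance flags in ranked order, and Average Precision is normalized here by the number of relevant results actually retrieved. Both metrics depend on relevance labels for the returned results, which is the manual labeling cost discussed above.

```python
from typing import Sequence


def reciprocal_rank(relevant: Sequence[bool]) -> float:
    """Return 1/rank of the first relevant result, or 0.0 if there is none."""
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def average_precision(relevant: Sequence[bool]) -> float:
    """Average the precision measured at each relevant result in the ranking.

    Assumption: normalizes by the number of relevant results retrieved,
    not by the number of relevant documents in the whole collection.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean; MRR and MAP are means of the per-query values."""
    return sum(values) / len(values) if values else 0.0
```

MRR is then `mean` of `reciprocal_rank` over a set of queries, and MAP is `mean` of `average_precision` over the same set.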
- An object of the present invention is to provide an information retrieval accuracy evaluation method, apparatus and computer readable storage medium, which aim to solve the above problems in the current information retrieval accuracy evaluation method.
- a first aspect of the present invention provides an information retrieval accuracy evaluation method, the method comprising the following steps:
- a second aspect of the present invention provides an information retrieval accuracy evaluation system, including:
- a search module configured to retrieve at least one first search result corresponding to the predetermined keyword by using a predetermined first retrieval system, and to retrieve at least one second search result corresponding to the keyword by using a predetermined second retrieval system;
- a sequence number generating module configured to generate a first search serial number corresponding to the first search result and a second search serial number corresponding to the second search result according to a preset sequence number generation rule;
- an accuracy judging module configured to analyze the generated first search serial number and second search serial number according to a predetermined accuracy analysis rule to determine the accuracy of the first retrieval system relative to the second retrieval system.
- a third aspect of the present invention provides an information retrieval accuracy evaluation apparatus, comprising: a memory, a processor, and an information retrieval accuracy evaluation system stored on the memory and operable on the processor, wherein the following steps are performed when the information retrieval accuracy evaluation system is executed by the processor:
- a fourth aspect of the invention provides a computer readable storage medium having stored thereon at least one computer readable instruction executable by a processing device to:
- in the information retrieval accuracy evaluation method, device and computer readable storage medium of the present invention, a search result corresponding to the predetermined keyword is first retrieved by the retrieval system and a search serial number corresponding to the search result is generated according to a preset sequence number generation rule; the search serial number is then analyzed according to the predetermined accuracy analysis rule to determine the accuracy of the retrieval system.
- the method, device and computer readable storage medium of the present invention thereby avoid manual labeling of all retrieval results and reduce the amount of calculation, while also taking into account the ranking, within the returned results, of the retrieval results related to the preset keyword, which effectively improves the accuracy of the evaluation of the retrieval system.
- FIG. 1 is a schematic diagram of an operating environment of an embodiment of an information retrieval accuracy evaluation system according to the present invention
- FIG. 3 is a flowchart of the accuracy analysis rule in step S3 shown in FIG. 2;
- FIG. 4 is a schematic diagram of functional modules according to an embodiment of the present invention.
- FIG. 5 is a schematic structural diagram of a serial number generating module shown in FIG. 4;
- FIG. 6 is a schematic structural diagram of the accuracy judging module shown in FIG. 4.
- FIG. 1 is a schematic diagram of an operating environment of a preferred embodiment of the information retrieval accuracy evaluation system 10 of the present invention.
- the information retrieval accuracy evaluation system 10 is installed and operated in the information retrieval accuracy evaluation device 1.
- the information retrieval accuracy evaluation device 1 is an apparatus capable of automatically performing numerical calculation and/or information processing in accordance with instructions that are set or stored in advance.
- the information retrieval accuracy evaluation apparatus 1 may be a computer, a single network server, a server group composed of a plurality of network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a kind of distributed computing: a super virtual computer consisting of a group of loosely coupled computers.
- the information retrieval accuracy evaluation apparatus 1 includes, but is not limited to, a memory 11, a processor 12, and a network interface 13 communicably connected to each other through a system bus. It is to be noted that FIG. 1 only shows the information retrieval accuracy evaluation device 1 having the components 11-13, but it should be understood that not all of the illustrated components are required, and alternative implementations may have more or fewer components.
- the memory 11 includes an internal memory and at least one type of readable storage medium.
- the internal memory provides a cache for the operation of the information retrieval accuracy evaluation device 1;
- the readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card type memory, or the like.
- the storage medium may be an internal storage unit of the information retrieval accuracy evaluation device 1, such as a hard disk of the information retrieval accuracy evaluation device 1.
- the non-volatile storage medium may also be an external storage device of the information retrieval accuracy evaluation device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card with which the information retrieval accuracy evaluation device 1 is equipped.
- the readable storage medium of the memory 11 is generally used to store an operating system and the various types of application software installed in the information retrieval accuracy evaluation device 1, for example, the program code of the information retrieval accuracy evaluation system 10 in an embodiment of the present application.
- the memory 11 can also be used to temporarily store various types of data that have been output or are to be output.
- Processor 12 may be a Central Processing Unit (CPU), microprocessor or other data processing chip in some embodiments.
- the processor 12 is generally used to control the overall operation of the information retrieval accuracy evaluation apparatus 1. In the present embodiment, it runs program code or processes data stored in the memory 11, for example, by running the information retrieval accuracy evaluation system 10.
- the network interface 13 may include a wireless network interface or a wired network interface, and the network interface 13 is generally used to establish a communication connection between the information retrieval accuracy evaluation device 1 and other electronic devices.
- the information retrieval accuracy evaluation apparatus 1 further includes a display (not shown in the figure). In some embodiments, the display may be an LED display, a liquid crystal display, a touch liquid crystal display, or an OLED (Organic Light-Emitting Diode) touch display.
- the display is used to display information processed in the information retrieval accuracy evaluation device 1 and to display a visual user interface, such as an information retrieval result display interface.
- the information retrieval accuracy evaluation system 10 includes at least one computer readable instruction stored in the memory 11, the at least one computer readable instruction being executable by the processor 12 to implement the information retrieval accuracy evaluation method of the embodiments of the present application. As described later, the at least one computer readable instruction can be classified into different logic modules depending on the functions implemented by its various parts.
- when the information retrieval accuracy evaluation system 10 is executed by the processor 12, the following operations are performed: first, at least one first search result corresponding to a predetermined keyword is retrieved by using a predetermined first retrieval system, and at least one second search result corresponding to the keyword is retrieved by using a predetermined second retrieval system; then, a first search serial number corresponding to the first search result and a second search serial number corresponding to the second search result are generated according to a preset sequence number generation rule; finally, the generated first search serial number and second search serial number are analyzed according to a predetermined accuracy analysis rule to determine the accuracy of the first retrieval system relative to the second retrieval system.
- FIG. 2 is a schematic flowchart of an embodiment of the present invention.
- the information retrieval accuracy evaluation method of this embodiment includes the following steps:
- step S1: the search results corresponding to the predetermined keyword are retrieved by using a predetermined retrieval system.
- the predetermined retrieval system includes a first retrieval system and a second retrieval system.
- the first retrieval system and the second retrieval system may be unrelated retrieval systems, or the second retrieval system may be an upgraded system obtained by optimizing the first retrieval system.
- the first retrieval system is used to retrieve a first search result corresponding to the predetermined keyword, and the second retrieval system is used to retrieve a second search result corresponding to the same keyword.
- understandably, the first search result consists of a plurality of search results with different contents, and the second search result likewise consists of a plurality of search results with different contents. The numbers of first search results and second search results may be the same or different.
- step S2: a search serial number is generated according to a preset sequence number generation rule.
- specifically, the first search serial number corresponding to the first search result and the second search serial number corresponding to the second search result are generated according to the preset sequence number generation rule.
- the step comprises:
- the third search result matching the predetermined keyword is selected from the first search result according to the predetermined screening rule, and the fourth search result matching the predetermined keyword is selected from the second search result.
- the search content includes a name of the related webpage and a link address content matching the search keyword, a name of the related document matching the search keyword, and a link address content.
- the predetermined screening rule includes: manually extracting the search results matching the predetermined keyword from the first search result and the second search result; or determining the related words corresponding to the predetermined keyword according to a predetermined mapping relationship between keywords and related words, and counting, for each search result, the total number of occurrences of the predetermined keyword and its related words. If the total number for a search result is greater than or equal to the preset number, the search result is determined to match the predetermined keyword; if the total number is less than the preset number, the search result is determined not to match the predetermined keyword.
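The automatic branch of the screening rule can be sketched as below. This is only an illustration under stated assumptions: the function names, the case-insensitive substring counting, and the shape of the keyword-to-related-words mapping are hypothetical choices, since the source does not specify how occurrences are counted.

```python
from typing import Mapping, Sequence


def total_term_count(text: str, keyword: str, related: Sequence[str]) -> int:
    """Count occurrences of the keyword and its related words in one result.

    Assumption: non-overlapping, case-insensitive substring matches.
    """
    lowered = text.lower()
    return sum(lowered.count(term.lower()) for term in (keyword, *related))


def is_match(
    text: str,
    keyword: str,
    related_map: Mapping[str, Sequence[str]],
    preset_number: int,
) -> bool:
    """Apply the threshold rule: match if the total count reaches the preset number."""
    related = related_map.get(keyword, ())
    return total_term_count(text, keyword, related) >= preset_number
```

With a mapping such as `{"retrieval": ["search", "query"]}`, a result whose text contains each of the three terms once matches for a preset number of 3 and does not match for a preset number of 4.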
- step S3: the generated search serial number is analyzed according to a predetermined accuracy analysis rule to determine the accuracy of the retrieval system.
- specifically, the generated first search serial number and second search serial number are analyzed according to the predetermined accuracy analysis rule to determine the accuracy of the first retrieval system relative to the second retrieval system.
- the present embodiment uses different retrieval systems to retrieve the search results corresponding to the predetermined keyword, and then filters out of each set of results the search results that match the keyword.
- the matching search results are ranked to obtain the sorting numbers corresponding to each retrieval system; finally, the sorting numbers are analyzed according to the predetermined formula to determine the accuracy of the corresponding retrieval systems.
- the accuracy analysis rule includes the following steps:
- the step includes:
- the preset formula is 1/Log(1+N), where N represents the number in the retrieval serial number.
- the step includes summing the respective discount values in the first discount set, obtaining a first accuracy rate corresponding to the first retrieval system, and summing the respective discount values in the second discount set. A second accuracy corresponding to the second retrieval system is obtained.
- the step includes analyzing the first accuracy rate and the second accuracy rate to determine the accuracy of the first retrieval system relative to the second retrieval system. Specifically, the accuracy of the first retrieval system and the second retrieval system is determined by comparing the magnitude relationship between the first accuracy rate and the second accuracy rate.
- determining the accuracy of the first retrieval system and the second retrieval system comprises analyzing the magnitude relationship between the first accuracy rate and the second accuracy rate: if the first accuracy rate is greater than the second accuracy rate, it is determined that the retrieval result of the first retrieval system is more accurate than the retrieval result of the second retrieval system; if the first accuracy rate is less than the second accuracy rate, it is determined that the retrieval result of the second retrieval system is more accurate than the retrieval result of the first retrieval system; if the first accuracy rate is equal to the second accuracy rate, it is determined that the retrieval results of the first retrieval system and the second retrieval system are equally accurate.
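The accuracy analysis rule described above (discount each number with 1/Log(1+N), sum the discounts, compare the sums) can be sketched as follows. The function names and the returned labels are hypothetical, the natural logarithm is an assumption because the source does not state the base, and the tolerance used for the "equal" case is an implementation choice for floating-point sums.

```python
import math
from typing import List, Sequence


def discount_set(serial_numbers: Sequence[int]) -> List[float]:
    """Apply the preset formula 1/Log(1+N) to each number N (N >= 1)."""
    return [1.0 / math.log(1 + n) for n in serial_numbers]


def accuracy_rate(serial_numbers: Sequence[int]) -> float:
    """Sum the discount values to obtain the accuracy rate of one system."""
    return sum(discount_set(serial_numbers))


def compare_systems(first: Sequence[int], second: Sequence[int]) -> str:
    """Return which system is judged more accurate: 'first', 'second' or 'same'."""
    rate_1 = accuracy_rate(first)
    rate_2 = accuracy_rate(second)
    if math.isclose(rate_1, rate_2, rel_tol=1e-12):
        return "same"
    return "first" if rate_1 > rate_2 else "second"
```

Because every discount is divided by the same logarithm, choosing a different base greater than 1 rescales both sums by the same factor and does not change the outcome of the comparison.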
- for example, two different retrieval systems, the first retrieval system and the second retrieval system, each perform a search with the same keyword. For the results returned in order by the first retrieval system, discounting according to the preset formula 1/Log(1+N) gives the first discount set: 1/Log(1+1), 1/Log(1+2), 1/Log(1+4), 1/Log(1+5), 1/Log(1+9), which corresponds to a first serial number of 1, 2, 4, 5, 9.
- the first 10 retrieval results returned by the second retrieval system are selected in order, and 6 matching retrieval results are obtained according to the preset judgment criteria; the resulting second serial number is 1, 6, 7, 8, 9, 10, so discounting according to the preset formula 1/Log(1+N) gives the second discount set: 1/Log(1+1), 1/Log(1+6), 1/Log(1+7), 1/Log(1+8), 1/Log(1+9), 1/Log(1+10).
- the discount values in the first discount set are summed to obtain the first accuracy rate L1 corresponding to the first retrieval system: L1 = (1/Log(1+1)) + (1/Log(1+2)) + (1/Log(1+4)) + (1/Log(1+5)) + (1/Log(1+9)).
- the discount values in the second discount set are summed to obtain the second accuracy rate L2 corresponding to the second retrieval system: L2 = (1/Log(1+1)) + (1/Log(1+6)) + (1/Log(1+7)) + (1/Log(1+8)) + (1/Log(1+9)) + (1/Log(1+10)).
- comparing the magnitudes of L1 and L2 shows that the value of L1 is greater than the value of L2, so it is determined that the retrieval result of the first retrieval system is more accurate than the retrieval result of the second retrieval system.
- if the second retrieval system is a retrieval system obtained by optimizing the first retrieval system, it can be determined that the optimization of the first retrieval system was unsuccessful.
- although the second retrieval system retrieves more results matching the preset keyword (6) than the first retrieval system (5), the matching results retrieved by the first retrieval system rank higher overall in its returned results than those retrieved by the second retrieval system do in its returned results. Therefore, it is determined that the retrieval result of the first retrieval system is more accurate than the retrieval result of the second retrieval system, and an accurate analysis of retrieval accuracy is obtained with a small amount of calculation.
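The comparison in this example can be checked numerically. The values below assume the natural logarithm, which the source does not specify, and take the first serial number 1, 2, 4, 5, 9 as read off the first discount set; the figures are rounded to three decimals.

```latex
\begin{align*}
L_1 &= \frac{1}{\ln 2} + \frac{1}{\ln 3} + \frac{1}{\ln 5} + \frac{1}{\ln 6} + \frac{1}{\ln 10}
     \approx 1.443 + 0.910 + 0.621 + 0.558 + 0.434 \approx 3.97 \\
L_2 &= \frac{1}{\ln 2} + \frac{1}{\ln 7} + \frac{1}{\ln 8} + \frac{1}{\ln 9} + \frac{1}{\ln 10} + \frac{1}{\ln 11}
     \approx 1.443 + 0.514 + 0.481 + 0.455 + 0.434 + 0.417 \approx 3.74
\end{align*}
```

So L1 > L2 even though the second system contributes one more term. With a different logarithm base greater than 1, both sums are multiplied by the same constant, so the ordering is unchanged.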
- FIG. 4 is a functional block diagram of a preferred embodiment of the information retrieval accuracy evaluation system 10 of the present invention.
- the information retrieval accuracy evaluation system 10 may be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to implement the present invention.
- the information retrieval accuracy evaluation system 10 can be divided into a retrieval module 101, a serial number generation module 102, and an accuracy determination module 103.
- module refers to a series of computer program instruction segments capable of performing a specific function, and is more suitable for describing the execution process of the information retrieval accuracy evaluation system 10 in the electronic device 1 than the program.
- the search module 101 is configured to retrieve at least one first search result corresponding to the predetermined keyword by using a predetermined first retrieval system, and to retrieve at least one second search result corresponding to the same keyword by using a predetermined second retrieval system.
- the serial number generating module 102 is configured to generate a first search serial number corresponding to the first search result and a second search serial number corresponding to the second search result according to a preset sequence number generation rule.
- the accuracy determining module 103 is configured to analyze the generated first search serial number and second search serial number according to a predetermined accuracy analysis rule to analyze the accuracy of the first search system and the second search system.
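The division into modules 101, 102 and 103 can be pictured as a small pipeline. The sketch below is hypothetical: the class name, the field names and the callable signatures are illustrative assumptions, and each module is injected as a plain function so that only the data flow between the modules is shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class AccuracyEvaluationSystem:
    """Wires together stand-ins for modules 101, 102 and 103."""

    # Module 101: keyword -> (first search results, second search results)
    search_module: Callable[[str], Tuple[List[str], List[str]]]
    # Module 102: (search results, keyword) -> search serial number
    serial_number_module: Callable[[List[str], str], List[int]]
    # Module 103: (first serial number, second serial number) -> verdict
    accuracy_module: Callable[[List[int], List[int]], str]

    def evaluate(self, keyword: str) -> str:
        """Run retrieval, serial number generation and accuracy judgment in order."""
        first_results, second_results = self.search_module(keyword)
        first_serial = self.serial_number_module(first_results, keyword)
        second_serial = self.serial_number_module(second_results, keyword)
        return self.accuracy_module(first_serial, second_serial)
```

Keeping the three stages behind separate callables mirrors the modular split in the text: the screening rule or the accuracy formula can be swapped without touching the retrieval step.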
- the serial number generation module 102 is divided into a screening unit 1021, a sorting number generating unit 1022, and a serial number generating unit 1023.
- the screening unit 1021 is configured to select, according to a predetermined screening rule, a third search result that matches the predetermined keyword from the first search result, and a fourth search result that matches the predetermined keyword from the second search result.
- the sorting number generating unit 1022 is configured to determine a first sorting number of each search content of the third search result within the first search result, and a second sorting number of each search content of the fourth search result within the second search result.
- the serial number generating unit 1023 is configured to generate a first search serial number corresponding to the first search result according to the first sorting number, and generate a second search serial number corresponding to the second search result according to the second sorting number.
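The three units of the serial number generation module can be sketched as three small functions. The names are hypothetical, the matching predicate is passed in rather than fixed, and the sketch assumes that the results in one list have distinct contents and that the search serial number is simply the ascending sequence of sorting numbers.

```python
from typing import Callable, List, Sequence


def screen(results: Sequence[str], is_match: Callable[[str], bool]) -> List[str]:
    """Screening unit: keep only the results that match the keyword."""
    return [content for content in results if is_match(content)]


def sorting_numbers(results: Sequence[str], selected: Sequence[str]) -> List[int]:
    """Sorting number unit: 1-based position of each selected result in the full list.

    Assumption: result contents are distinct within one result list.
    """
    position = {content: rank for rank, content in enumerate(results, start=1)}
    return [position[content] for content in selected]


def search_serial_number(numbers: Sequence[int]) -> List[int]:
    """Serial number unit: arrange the sorting numbers in ascending order."""
    return sorted(numbers)
```

For a result list in which the first, fourth and fifth entries match, the three functions applied in sequence give the serial number 1, 4, 5.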
- the accuracy judging module 103 is divided into a first calculating unit 1031, a second calculating unit 1032, a third calculating unit 1033, and a judging unit 1034.
- the first calculating unit 1031 is configured to respectively substitute each of the generated first search serial numbers into a preset formula, calculate a first discount value corresponding to each number in the first search serial number, and calculate The set of the respective first discount values is the first discount set corresponding to the first retrieval system.
- the second calculating unit 1032 is configured to respectively substitute each number in the generated second search serial number into a preset formula and calculate a second discount value corresponding to each number in the second search serial number, where the set of the calculated second discount values is the second discount set corresponding to the second retrieval system.
- the third calculating unit 1033 is configured to sum the discount values in the first discount set to obtain a first accuracy rate corresponding to the first retrieval system, and to sum the discount values in the second discount set to obtain a second accuracy rate corresponding to the second retrieval system.
- the determining unit 1034 is configured to analyze the first accuracy rate and the second accuracy rate to determine the accuracy of the first retrieval system and the second retrieval system.
- the determining unit 1034 includes an analysis subunit (not shown in the figure), which is configured to analyze the magnitude relationship between the first accuracy rate and the second accuracy rate;
- if the first accuracy rate is greater than the second accuracy rate, it is determined that the retrieval result of the first retrieval system is more accurate than the retrieval result of the second retrieval system;
- if the first accuracy rate is less than the second accuracy rate, it is determined that the retrieval result of the second retrieval system is more accurate than the retrieval result of the first retrieval system;
- if the first accuracy rate is equal to the second accuracy rate, it is determined that the retrieval results of the first retrieval system and the second retrieval system are equally accurate.
- compared with the accuracy evaluation methods in common use today, the information retrieval accuracy evaluation method and system of the present invention omit the step of large-scale manual labeling of data and reduce the manual workload.
- the accuracy of evaluating the search results of a retrieval system is further improved.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Fuzzy Systems (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (20)
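- An information retrieval accuracy evaluation method, characterized in that the method comprises the following steps: A. retrieving at least one first search result corresponding to a predetermined keyword by using a predetermined first retrieval system, and retrieving at least one second search result corresponding to the keyword by using a predetermined second retrieval system; B. generating, according to a preset serial number generation rule, a first search serial number corresponding to the first search result and a second search serial number corresponding to the second search result; C. analyzing the generated first search serial number and second search serial number according to a predetermined accuracy analysis rule, to determine the accuracy of the first retrieval system relative to the second retrieval system.
- The information retrieval accuracy evaluation method according to claim 1, characterized in that step B comprises the following steps: E. selecting, according to a predetermined screening rule, a third search result matching the keyword from the first search result, and selecting a fourth search result matching the keyword from the second search result; F. determining a first sorting number of each search content of the third search result within the first search result, and determining a second sorting number of each search content of the fourth search result within the second search result; G. generating the first search serial number corresponding to the first search result according to the first sorting number, and generating the second search serial number corresponding to the second search result according to the second sorting number.
- The information retrieval accuracy evaluation method according to claim 1 or 2, characterized in that the predetermined screening rule comprises: manually selecting the search results matching the keyword from the first search result and the second search result; or determining the related words corresponding to the keyword according to a predetermined mapping relationship between keywords and related words, and counting the total number of occurrences of the keyword and its corresponding related words in each search result, wherein if the total number for a search result is greater than or equal to a preset number, that search result is determined to be a search result matching the keyword, and if the total number for a search result is less than the preset number, that search result is determined to be a search result not matching the keyword.
- The information retrieval accuracy evaluation method according to claim 1, characterized in that the predetermined accuracy analysis rule comprises: substituting each number in the generated first search serial number into a preset formula to calculate a first discount value corresponding to each number in the first search serial number, wherein the set of the calculated first discount values is a first discount set corresponding to the first retrieval system; substituting each number in the generated second search serial number into the preset formula to calculate a second discount value corresponding to each number in the second search serial number, wherein the set of the calculated second discount values is a second discount set corresponding to the second retrieval system; summing the discount values in the first discount set to obtain a first accuracy rate corresponding to the first retrieval system, and summing the discount values in the second discount set to obtain a second accuracy rate corresponding to the second retrieval system; and analyzing the first accuracy rate and the second accuracy rate to determine the accuracy of the first retrieval system relative to the second retrieval system.
- The information retrieval accuracy evaluation method according to claim 4, characterized in that the step of analyzing the first accuracy rate and the second accuracy rate to determine the accuracy of the first retrieval system relative to the second retrieval system comprises: analyzing the magnitude relationship between the first accuracy rate and the second accuracy rate; if the first accuracy rate is greater than the second accuracy rate, determining that the search results of the first retrieval system are more accurate than the search results of the second retrieval system; if the first accuracy rate is less than the second accuracy rate, determining that the search results of the second retrieval system are more accurate than the search results of the first retrieval system; if the first accuracy rate is equal to the second accuracy rate, determining that the search results of the first retrieval system and the search results of the second retrieval system have the same accuracy.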
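- An information retrieval accuracy evaluation system, characterized in that the system comprises: a search module, configured to retrieve at least one first search result corresponding to a predetermined keyword by using a predetermined first retrieval system, and to retrieve at least one second search result corresponding to the keyword by using a predetermined second retrieval system; a serial number generating module, configured to generate, according to a preset serial number generation rule, a first search serial number corresponding to the first search result and a second search serial number corresponding to the second search result; and an accuracy judging module, configured to analyze the generated first search serial number and second search serial number according to a predetermined accuracy analysis rule, to determine the accuracy of the first retrieval system relative to the second retrieval system.
- The information retrieval accuracy evaluation system according to claim 6, characterized in that the serial number generating module comprises a screening unit, a sorting number generating unit and a serial number generating unit; the screening unit is configured to select, according to a predetermined screening rule, a third search result matching the predetermined keyword from the first search result, and a fourth search result matching the predetermined keyword from the second search result; the sorting number generating unit is configured to determine a first sorting number of each search content of the third search result within the first search result, and a second sorting number of each search content of the fourth search result within the second search result; the serial number generating unit is configured to generate the first search serial number corresponding to the first search result according to the first sorting number, and to generate the second search serial number corresponding to the second search result according to the second sorting number.
- The information retrieval accuracy evaluation system according to claim 6, characterized in that the predetermined screening rule comprises: determining the related words corresponding to the keyword according to a predetermined mapping relationship between keywords and related words, and counting the total number of occurrences of the keyword and its corresponding related words in each search result, wherein if the total number for a search result is greater than or equal to a preset number, that search result is determined to be a search result matching the keyword, and if the total number for a search result is less than the preset number, that search result is determined to be a search result not matching the keyword.
- The information retrieval accuracy evaluation system according to claim 6, characterized in that the accuracy judging module comprises: a first calculating unit, configured to substitute each number in the generated first search serial number into a preset formula to calculate a first discount value corresponding to each number in the first search serial number, wherein the set of the calculated first discount values is a first discount set corresponding to the first retrieval system; a second calculating unit, configured to substitute each number in the generated second search serial number into the preset formula to calculate a second discount value corresponding to each number in the second search serial number, wherein the set of the calculated second discount values is a second discount set corresponding to the second retrieval system; a third calculating unit, configured to sum the discount values in the first discount set to obtain a first accuracy rate corresponding to the first retrieval system, and to sum the discount values in the second discount set to obtain a second accuracy rate corresponding to the second retrieval system; and a judging unit, configured to analyze the first accuracy rate and the second accuracy rate to determine the accuracy of the first retrieval system relative to the second retrieval system.
- The information retrieval accuracy evaluation system according to claim 9, characterized in that the judging unit comprises an analysis subunit, configured to analyze the magnitude relationship between the first accuracy rate and the second accuracy rate; if the first accuracy rate is greater than the second accuracy rate, it is determined that the search results of the first retrieval system are more accurate than the search results of the second retrieval system; if the first accuracy rate is less than the second accuracy rate, it is determined that the search results of the second retrieval system are more accurate than the search results of the first retrieval system; if the first accuracy rate is equal to the second accuracy rate, it is determined that the search results of the first retrieval system and the search results of the second retrieval system have the same accuracy.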
- 一种信息检索准确性评估装置,其特征在于,包括:存储器、处理器及存储在所述存储器上并可在所述处理器上运行的信息检索准确性评估系统,所述信息检索准确性评估系统被所述处理器执行时执行如下步骤:A、利用预先确定的第一检索系统检索出与预先确定的关键词对应的至少一个第一检索结果,且利用预先确定的第二检索系统检索出与所述关键词对应的至少一个第二检索结果;B、根据预先设定的序列号生成规则,生成所述第一检索结果对应的第一检索序列号、及所述第二检索结果对应的第二检索序列号;C、根据预先确定的准确性分析规则对生成的所述第一检索序列号和所述第二检索序列号进行分析,以分析出所述第一检索系统相对于所述第二检索系统的准确性。
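- The information retrieval accuracy evaluation device according to claim 11, wherein the processor's execution of step B comprises the following steps: E. screening out, according to a predetermined screening rule, a third search result that matches the keyword from the first search result, and a fourth search result that matches the keyword from the second search result; F. determining a first rank number, within the first search result, of each retrieved item in the third search result, and a second rank number, within the second search result, of each retrieved item in the fourth search result; G. generating the first retrieval sequence number corresponding to the first search result according to the first rank number, and the second retrieval sequence number corresponding to the second search result according to the second rank number.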
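- The information retrieval accuracy evaluation device according to claim 11, wherein the predetermined screening rule comprises: determining the associated words corresponding to the keyword according to a predetermined mapping between keywords and associated words; counting, for each search result, the total number of the keyword and its corresponding associated words contained in that search result; if the total number for a search result is greater than or equal to a preset number, determining that the search result matches the keyword; and if the total number for a search result is less than the preset number, determining that the search result does not match the keyword.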
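- The information retrieval accuracy evaluation device according to claim 11, wherein the predetermined accuracy analysis rule comprises: substituting each number in the generated first retrieval sequence number into a preset formula to calculate a first discount value corresponding to each number in the first retrieval sequence number, wherein the set of calculated first discount values is a first discount set corresponding to the first retrieval system; substituting each number in the generated second retrieval sequence number into the preset formula to calculate a second discount value corresponding to each number in the second retrieval sequence number, wherein the set of calculated second discount values is a second discount set corresponding to the second retrieval system; summing the discount values in the first discount set to obtain a first accuracy rate corresponding to the first retrieval system, and summing the discount values in the second discount set to obtain a second accuracy rate corresponding to the second retrieval system; and analyzing the first accuracy rate and the second accuracy rate to determine the accuracy of the first retrieval system relative to the second retrieval system.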
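- The information retrieval accuracy evaluation device according to claim 14, wherein the step, executed by the processor, of analyzing the first accuracy rate and the second accuracy rate to determine the accuracy of the first retrieval system relative to the second retrieval system comprises: analyzing the magnitude relationship between the first accuracy rate and the second accuracy rate; if the first accuracy rate is greater than the second accuracy rate, determining that the search results of the first retrieval system are more accurate than those of the second retrieval system; if the first accuracy rate is less than the second accuracy rate, determining that the search results of the second retrieval system are more accurate than those of the first retrieval system; and if the first accuracy rate is equal to the second accuracy rate, determining that the search results of the first retrieval system and those of the second retrieval system have the same accuracy rate.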
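- A computer-readable storage medium storing at least one computer-readable instruction executable by a processing device to implement the following operations: A. retrieving, using a predetermined first retrieval system, at least one first search result corresponding to a predetermined keyword, and retrieving, using a predetermined second retrieval system, at least one second search result corresponding to the keyword; B. generating, according to a preset sequence number generation rule, a first retrieval sequence number corresponding to the first search result and a second retrieval sequence number corresponding to the second search result; C. analyzing the generated first retrieval sequence number and second retrieval sequence number according to a predetermined accuracy analysis rule, so as to determine the accuracy of the first retrieval system relative to the second retrieval system.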
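- The storage medium according to claim 16, wherein step B as executed by the at least one computer-readable instruction comprises the following steps: E. screening out, according to a predetermined screening rule, a third search result that matches the keyword from the first search result, and a fourth search result that matches the keyword from the second search result; F. determining a first rank number, within the first search result, of each retrieved item in the third search result, and a second rank number, within the second search result, of each retrieved item in the fourth search result; G. generating the first retrieval sequence number corresponding to the first search result according to the first rank number, and the second retrieval sequence number corresponding to the second search result according to the second rank number.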
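- The storage medium according to claim 16, wherein the predetermined screening rule comprises: manually screening out, from the first search result and the second search result, the search results that match the keyword; or determining the associated words corresponding to the keyword according to a predetermined mapping between keywords and associated words, counting, for each search result, the total number of the keyword and its corresponding associated words contained in that search result, determining that a search result matches the keyword if its total number is greater than or equal to a preset number, and determining that a search result does not match the keyword if its total number is less than the preset number.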
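- The storage medium according to claim 16, wherein the predetermined accuracy analysis rule comprises: substituting each number in the generated first retrieval sequence number into a preset formula to calculate a first discount value corresponding to each number in the first retrieval sequence number, wherein the set of calculated first discount values is a first discount set corresponding to the first retrieval system; substituting each number in the generated second retrieval sequence number into the preset formula to calculate a second discount value corresponding to each number in the second retrieval sequence number, wherein the set of calculated second discount values is a second discount set corresponding to the second retrieval system; summing the discount values in the first discount set to obtain a first accuracy rate corresponding to the first retrieval system, and summing the discount values in the second discount set to obtain a second accuracy rate corresponding to the second retrieval system; and analyzing the first accuracy rate and the second accuracy rate to determine the accuracy of the first retrieval system relative to the second retrieval system.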
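- The storage medium according to claim 19, wherein the step, executed by the at least one computer-readable instruction, of analyzing the first accuracy rate and the second accuracy rate to determine the accuracy of the first retrieval system relative to the second retrieval system comprises: analyzing the magnitude relationship between the first accuracy rate and the second accuracy rate; if the first accuracy rate is greater than the second accuracy rate, determining that the search results of the first retrieval system are more accurate than those of the second retrieval system; if the first accuracy rate is less than the second accuracy rate, determining that the search results of the second retrieval system are more accurate than those of the first retrieval system; and if the first accuracy rate is equal to the second accuracy rate, determining that the search results of the first retrieval system and those of the second retrieval system have the same accuracy rate.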
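The screening, rank numbering, discount summation, and comparison recited in the claims can be illustrated with a minimal Python sketch. The claims do not state the preset formula, so the sketch assumes a DCG-style discount of 1 / log2(rank + 1). It also assumes that the "total number" in the screening rule counts occurrences of the keyword and its associated words. All function names and sample data are hypothetical and are not taken from the patent.

```python
import math
from typing import List, Sequence


def is_match(text: str, keyword: str, associated: Sequence[str], preset_count: int) -> bool:
    """Screening rule (assumed reading): count occurrences of the keyword and
    its associated words; the result matches if the total reaches preset_count."""
    terms = [keyword, *associated]
    total = sum(text.count(term) for term in terms)
    return total >= preset_count


def retrieval_sequence(results: Sequence[str], keyword: str,
                       associated: Sequence[str], preset_count: int) -> List[int]:
    """Return the 1-based rank numbers of the matching items within the full result list."""
    return [rank for rank, text in enumerate(results, start=1)
            if is_match(text, keyword, associated, preset_count)]


def discount(rank: int) -> float:
    """ASSUMPTION: DCG-style discount; the claims only say 'preset formula'."""
    return 1.0 / math.log2(rank + 1)


def accuracy_rate(sequence: Sequence[int]) -> float:
    """Sum the discount values of all rank numbers in the sequence."""
    return sum(discount(rank) for rank in sequence)


def compare(first_rate: float, second_rate: float) -> str:
    """Three-way comparison of the two accuracy rates."""
    if first_rate > second_rate:
        return "first"
    if first_rate < second_rate:
        return "second"
    return "equal"


# Hypothetical result lists returned by two retrieval systems for the keyword "apple".
first_results = ["apple pie recipe", "car repair", "apple fruit tart"]
second_results = ["car repair", "apple pie recipe", "bike shop"]

first_seq = retrieval_sequence(first_results, "apple", ["fruit"], 1)    # [1, 3]
second_seq = retrieval_sequence(second_results, "apple", ["fruit"], 1)  # [2]
verdict = compare(accuracy_rate(first_seq), accuracy_rate(second_seq))
print(first_seq, second_seq, verdict)  # [1, 3] [2] first
```

Under the assumed formula, a system that places matching results nearer the top accumulates larger discount values, so its summed accuracy rate is higher; a different preset formula would change the numbers but not the structure of the comparison.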
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018553419A JP6588661B2 (ja) | 2017-05-10 | 2017-06-30 | Information retrieval accuracy evaluation method, system, device, and computer-readable storage medium |
US16/088,829 US20200380037A1 (en) | 2017-05-10 | 2017-06-30 | Information Retrieval Precision Evaluation Method, System and Device and Computer-Readable Storage Medium |
SG11201900254RA SG11201900254RA (en) | 2017-05-10 | 2017-06-30 | Information retrieval precision evaluation method, system and device and computer-readable storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710327380.3A CN107688595B (zh) | 2017-05-10 | 2017-05-10 | Information retrieval accuracy evaluation method, device, and computer-readable storage medium |
CN201710327380.3 | 2017-05-10 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018205391A1 (zh) | 2018-11-15 |
Family
ID=61152458
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/091355 WO2018205391A1 (zh) | 2017-05-10 | 2017-06-30 | Information retrieval accuracy evaluation method, system, device, and computer-readable storage medium |
Country Status (5)
Country | Link |
---|---|
US (1) | US20200380037A1 (zh) |
JP (1) | JP6588661B2 (zh) |
CN (1) | CN107688595B (zh) |
SG (1) | SG11201900254RA (zh) |
WO (1) | WO2018205391A1 (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
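CN109582751B (zh) * | 2018-11-29 | 2021-01-01 | 百度在线网络技术(北京)有限公司 | Method and server for measuring retrieval effectiveness |
CN111402973B (zh) * | 2020-03-02 | 2023-07-07 | 平安科技(深圳)有限公司 | Information matching analysis method, device, computer system, and readable storage medium |
CN113254766A (zh) * | 2021-05-20 | 2021-08-13 | 北京百度网讯科技有限公司 | Information retrieval method and device |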
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050076019A1 (en) * | 2003-10-06 | 2005-04-07 | Lincoln Jackson | Smart browser panes |
CN202033748U (zh) * | 2011-04-22 | 2011-11-09 | 阿里巴巴集团控股有限公司 | Search engine performance testing system |
CN102622296A (zh) * | 2012-02-21 | 2012-08-01 | 百度在线网络技术(北京)有限公司 | Testing method, system, and device for a search engine module |
CN105095464A (zh) * | 2015-07-30 | 2015-11-25 | 北京奇虎科技有限公司 | Method and device for testing a retrieval system |
CN106156179A (zh) * | 2015-04-20 | 2016-11-23 | 阿里巴巴集团控股有限公司 | Information retrieval method and device |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008017103A1 (en) * | 2006-08-10 | 2008-02-14 | National Ict Australia Limited | Optimisation of a scoring function |
CN100440224C (zh) * | 2006-12-01 | 2008-12-03 | 清华大学 | Automated processing method for search engine performance evaluation |
US8935258B2 (en) * | 2009-06-15 | 2015-01-13 | Microsoft Corporation | Identification of sample data items for re-judging |
RU2608886C2 (ru) * | 2014-06-30 | 2017-01-25 | Общество С Ограниченной Ответственностью "Яндекс" | Search result ranker |
CN105573887B (zh) * | 2015-12-14 | 2018-07-13 | 合一网络技术(北京)有限公司 | Method and device for evaluating search engine quality |
2017
- 2017-05-10 CN CN201710327380.3A patent/CN107688595B/zh active Active
- 2017-06-30 SG SG11201900254RA patent/SG11201900254RA/en unknown
- 2017-06-30 JP JP2018553419A patent/JP6588661B2/ja active Active
- 2017-06-30 US US16/088,829 patent/US20200380037A1/en not_active Abandoned
- 2017-06-30 WO PCT/CN2017/091355 patent/WO2018205391A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
JP2019521406A (ja) | 2019-07-25 |
CN107688595A (zh) | 2018-02-13 |
US20200380037A1 (en) | 2020-12-03 |
JP6588661B2 (ja) | 2019-10-09 |
SG11201900254RA (en) | 2019-02-27 |
CN107688595B (zh) | 2019-03-15 |
Similar Documents
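Publication | Title |
---|---|
US11182445B2 (en) | Method, apparatus, server, and storage medium for recalling for search |
CN110674429B (zh) | Method, apparatus, device, and computer-readable storage medium for information retrieval |
JP5575902B2 (ja) | Information retrieval based on semantic patterns of queries |
US11210334B2 (en) | Method, apparatus, server and storage medium for image retrieval |
WO2012135319A1 (en) | Processing data in a mapreduce framework |
US10565253B2 (en) | Model generation method, word weighting method, device, apparatus, and computer storage medium |
CN110222203B (zh) | Metadata search method, apparatus, device, and computer-readable storage medium |
CN109325108B (zh) | Query processing method, apparatus, server, and storage medium |
CN110046298A (zh) | Query term recommendation method, apparatus, terminal device, and computer-readable medium |
CN110880136A (zh) | Recommendation method, system, device, and storage medium for complementary products |
CN109933502B (zh) | Electronic device, method for processing user operation records, and storage medium |
US11288266B2 (en) | Candidate projection enumeration based query response generation |
WO2019214142A1 (zh) | Electronic device, prediction method based on research report data, program, and computer storage medium |
WO2018205391A1 (zh) | Information retrieval accuracy evaluation method, system, device, and computer-readable storage medium |
US9244889B2 (en) | Creating tag clouds based on user specified arbitrary shape tags |
CN112084150A (zh) | Model training and data retrieval methods, apparatus, device, and storage medium |
CN107735785B (zh) | Automatic information retrieval |
US9705972B2 (en) | Managing a set of data |
CN103678642A (zh) | Search-engine-based method for measuring conceptual semantic similarity |
CN114139530A (zh) | Synonym extraction method, apparatus, electronic device, and storage medium |
CN114201376A (zh) | Artificial-intelligence-based log parsing method, apparatus, terminal device, and medium |
CN113656586A (zh) | Sentiment classification method, apparatus, electronic device, and readable storage medium |
CN109597873B (zh) | Corpus data processing method, apparatus, computer-readable medium, and electronic device |
US20140164397A1 (en) | Apparatus and method for searching information |
CN113793193B (zh) | Data search accuracy verification method, apparatus, device, and computer-readable medium |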
Legal Events
Code | Title | Description |
---|---|---|
ENP | Entry into the national phase | Ref document number: 2018553419; Country of ref document: JP; Kind code of ref document: A |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 17908843; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | EP: PCT application non-entry in European phase | Ref document number: 17908843; Country of ref document: EP; Kind code of ref document: A1 |
32PN | EP: public notification in the EP bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 18.03.2020) |