WO2024032781A1 - Algorithm testing method, device and storage medium

Info

Publication number
WO2024032781A1
Authority
WO
WIPO (PCT)
Prior art keywords
test
run
algorithm
resources
task
Prior art date
Application number
PCT/CN2023/112635
Other languages
English (en)
French (fr)
Inventor
祝丽蓉
商庆园
贾俊诚
陈雲飞
陈明生
吴光明
Original Assignee
虹软科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 虹软科技股份有限公司
Publication of WO2024032781A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3696 Methods or tools to render software testable
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • This article relates to but is not limited to algorithm modeling technology, especially an algorithm testing method, device and storage medium.
  • AI algorithms play an important role in the development of artificial intelligence technology.
  • AI algorithms are characterized by output uncertainty, that is, for the same set of input data, the output data is not always the same. Therefore, it is necessary to test the AI algorithm, construct a test material set as the input of the AI algorithm, and conduct data analysis on the uncertain output of the AI algorithm to judge whether the performance of the AI algorithm meets expectations.
  • The current testing of AI algorithms lacks a unified, automated process. Technicians mainly complete the tests manually according to actual needs, so the test effect is difficult to evaluate and testing efficiency is low.
  • Embodiments of the present application provide an algorithm testing method, device and storage medium, which can realize automatic testing of algorithms and improve testing efficiency.
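  • An embodiment of this application provides an algorithm testing method, which includes: encapsulating the algorithm; generating and storing test tasks according to the encapsulated algorithm and the obtained test material set for the algorithm; running the stored test tasks and outputting the running results; and
  • performing data analysis based on the running results to test the algorithm.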
  • encapsulating the algorithm includes:
  • the first encapsulation operation includes: one or more of compilation and compression;
  • the algorithm that has been encapsulated for the first time is encapsulated a second time to make the algorithm become a World Wide Web web service.
  • The method for obtaining the test material set for the algorithm includes: sending an HTTP request to the web service, the request including the network storage path of the test material set; and accessing that network storage path through the web service to obtain the corresponding test material set.
  • Running a stored test task includes: obtaining the current total available resources and the resources required to run each test task; and controlling the running of the stored test tasks according to the obtained resources.
  • As an example, controlling the running of the stored test tasks according to the obtained resources includes: when the current total available resources are greater than or equal to the sum of the resources required to run each test task, running all stored test tasks.
  • As an example, when the current total available resources are less than the sum of the resources required to run each test task, the test tasks are run one after another in the order in which they were generated. After each test task finishes, the resources used to run it are recycled into the current total available resources, and it is determined whether the current total available resources are greater than or equal to the resources required by the next test task to be run. If so, the next test task is run.
  • As an example, when the current total available resources are less than the sum of the resources required to run each test task, the test tasks are run one after another in descending order of priority, with the same recycle-and-check step after each test task finishes.
  • As an example, when the current total available resources are less than the sum of the resources required to run each test task, it is determined whether any test task has a set priority. If so, those test tasks are run in descending order of priority. When the test tasks with set priorities have finished and test tasks remain, the remaining test tasks are run in the order in which they were generated. After each test task finishes, the resources used to run it are recycled into the current total available resources, and the next test task is run if the current total available resources are greater than or equal to the resources it requires.
  • the method further includes:
  • When test tasks are run in descending order of priority, each time a test task is run it is determined whether there is a test task with a higher priority than the running one. If there is, the currently running test task is suspended and the test tasks are run again in descending order of priority.
  • data analysis based on the running results includes one or more of the following:
  • Analyze the test material distribution in the test material set according to the running results, obtain the dimensions of the test material set from that distribution, and select at least one dimension to generate statistical analysis results.
  • The method further includes: when the data analysis based on the running results includes adjusting the test material set for the algorithm according to the running results, generating and storing test tasks according to the encapsulated algorithm and the adjusted test material set for the algorithm; running the stored test tasks and outputting the running results; and performing data analysis based on the running results.
  • Embodiments of the present application also provide a computer-readable storage medium.
  • A computer program is stored on the computer-readable storage medium. When the computer program is executed by a processor, the steps of the method described in any of the preceding embodiments are implemented.
  • An embodiment of the present application also provides an algorithm testing device, which includes a memory and a processor.
  • the memory stores a program. When the program is read and executed by the processor, the method described in the previous embodiment is implemented.
  • The technical solution described in the embodiments of this application encapsulates the algorithm and obtains a test material set for the algorithm to generate a test task. By running the test task and analyzing the running results, it realizes automatic testing of the algorithm and improves algorithm testing efficiency.
  • Figure 1 is a flow chart of the algorithm testing method provided by the embodiment of this application.
  • Figure 2 is a flow chart for encapsulating an algorithm provided by an embodiment of the present application
  • Figure 3 is a flow chart of running a stored test task provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of an implementation of the algorithm testing method provided by the embodiment of the present application.
  • Figure 5 is a structural diagram of an algorithm testing device provided by an embodiment of the present application.
  • the embodiment of this application provides an algorithm testing method, as shown in Figure 1.
  • the method includes:
  • Step S100 encapsulates the algorithm
  • Encapsulation operations for algorithms include: one or more of compilation and packaging;
  • the algorithm may include multiple versions
  • Step S101 generates and stores test tasks based on the encapsulated algorithm and the obtained test material set for the algorithm
  • When the algorithm has multiple versions, each version of the algorithm corresponds to a test material set;
  • For a face recognition algorithm, for example, the corresponding test material set can include male face material, female face material, child face material, adult face material, etc.;
  • For a face recognition algorithm, the test materials in the test material set can be pictures or videos;
  • All test materials in the test material set can be stored in the same folder or in different folders; storing all test materials for the same algorithm in the same folder helps improve the efficiency of generating test tasks;
  • The algorithm may include one or more copies;
  • When the algorithm includes multiple copies, each copy corresponds to one or more test materials in the test material set. Different copies can correspond to different test materials or to the same test materials; the set of test materials corresponding to all copies of an algorithm is the test material set for that algorithm;
  • When the algorithm includes one copy, the test material set corresponds only to that copy;
  • Step S102 runs the stored test task and outputs the running results;
  • There can be one or more stored test tasks, and running each test task outputs a running result;
  • Step S103 performs data analysis based on the running results to test the algorithm.
  • step S100 encapsulates the algorithm, as shown in Figure 2, including:
  • Step S1001 performs a first encapsulation operation on the algorithm.
  • the first encapsulation operation includes: one or more of compilation and compression;
  • Step S1002 Encapsulate the algorithm that has been encapsulated for the first time for a second time to make the algorithm a World Wide Web web service.
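  • Existing manual testing workflows also encapsulate the algorithm, but only once, to compile and compress it. The technical solution described here performs one more encapsulation operation so that the algorithm becomes a web service. As a web service, the algorithm can be accessed remotely: for example, it can receive HTTP requests, obtain the corresponding test material set according to the network storage path carried in the request, and return it to the requester. This makes it possible to access external storage devices remotely during testing to obtain the data the test requires.
  • In an exemplary embodiment, the method for obtaining a test material set for the algorithm includes: sending an HTTP request to the web service, the request including the network storage path of the test material set; and accessing that network storage path through the web service to obtain the corresponding test material set.
  • Because the algorithm has become a web service, the technical solution described in the embodiments of this application enables remote access to the test material set through the web service, which makes the test material set easier to select and call.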
  • step S102 runs the stored test task, as shown in Figure 3, including:
  • Step S1021 obtains the current total available resources and the resources required to run each test task
  • Step S1022 performs operation control on the stored test tasks according to the obtained resources.
  • When the test tasks are run by calling a server, the resources may be the server's memory resources, computing resources, etc.
  • In an exemplary embodiment, step S1022 performs run control on the stored test tasks according to the obtained resources, including: when it is determined that the current total available resources are greater than or equal to the sum of the resources required to run each test task, running all stored test tasks. This shortens the test process and improves testing efficiency.
  • In an exemplary embodiment, step S1022 performs run control on the stored test tasks according to the obtained resources, including one of the following options:
  • When it is determined that the current total available resources are less than the sum of the resources required to run each test task, the test tasks are run one after another in the order in which they were generated. After each test task finishes, the resources used to run it are recycled into the current total available resources, and it is determined whether the current total available resources are greater than or equal to the resources required by the next test task to be run; if so, the next test task is run; or
  • The test tasks are run one after another in descending order of priority, with the same recycle-and-check step after each test task finishes; or
  • It is determined whether any test task has a set priority. If so, those test tasks are run in descending order of priority; when they have finished and test tasks remain, the remaining test tasks are run in the order in which they were generated. After each test task finishes, the resources used to run it are recycled into the current total available resources, and the next test task is run if the current total available resources are greater than or equal to the resources it requires.
  • In this embodiment, when the current total available resources are less than the sum of the resources required to run each test task, the test tasks to be run need to be selected. They may be selected in order of generation time, or in order of priority; or, when only some of the stored test tasks have a set priority, first in order of priority and then in order of generation time. Each time a test task is run, it is determined whether the current total available resources can support the next test task to be run. The current total available resources change with the number of running test tasks: the more test tasks are running, the fewer resources are available. If the next test task can be supported, it is run; if not, no further test tasks are started.
  • the solutions described in the embodiments of this application support the simultaneous execution of multiple test tasks and multiple copies when the total available resources are sufficient, effectively improving testing efficiency.
  • In an exemplary embodiment, the method further includes:
  • When test tasks are run in descending order of priority, each time a test task is run it is determined whether there is a test task with a higher priority than the running one. If there is, the currently running test task is suspended and the test tasks are run again in descending order of priority. The resources used by the suspended test task can be recycled into the current total available resources; or they can be retained, and the suspended test task resumed after the high-priority test task has finished.
  • The priority of a stored test task is not fixed; a tester may modify it, for example. While running test tasks by priority, the embodiment of the present application therefore keeps re-evaluating priorities. If a test task with a higher priority than the currently running one is found, the running test task is suspended and the higher-priority test task is run, which ensures that high-priority test tasks run first.
  • In an exemplary embodiment, step S103 performs data analysis based on the running results, which may include one or more of the following:
  • Compare the running results with the test material set for the algorithm to obtain the evaluation indicators of the algorithm. The evaluation indicators may include: true positives (TP): predicted positive, actually positive; true negatives (TN): predicted negative, actually negative; false positives (FP): predicted positive, actually negative; false negatives (FN): predicted negative, actually positive;
  • Compare different versions of the algorithm according to the running results to evaluate the algorithm versions;
  • Analyze the test material distribution in the test material set according to the running results. For example, if the test material set used by a face recognition algorithm contains 100 test materials covering male, female, child and adult face material, the number of materials of each kind is the test material distribution. The material distribution result can be represented by visual graphics;
  • Adjust the test material set for the algorithm according to the running results, for example by judging from the material distribution whether the material selection is reasonable and adjusting unreasonable test materials. For example, when the material distribution does not match a preset distribution model, the distribution can be considered unreasonable. As a further example, child face material should in theory cover children of all ages; if the running results show that face material for a certain age group is missing, test materials for that age group can be added;
  • Generate statistical analysis results for all dimensions, or for preset dimensions, determined by the test material set. For example, if the test material set of a face recognition algorithm includes face materials of different genders and ages, gender and age are the dimensions determined by the test material set. Statistical analysis results can be generated for all dimensions, or only for the preset dimensions selected by the tester. The statistical analysis results can include: algorithm indicator results (such as one or more of precision, recall and F1 score), material distribution results, and the corresponding missed or misdetected pictures or videos. The statistical analysis results can be represented by visual graphics;
  • Analyze the test material distribution in the test material set according to the running results, obtain the dimensions of the test material set from that distribution, and select at least one dimension to generate statistical analysis results; the statistical analysis results can be represented by visual graphics.
  • In an exemplary embodiment, the method may further include: when data analysis based on the running results shows that any of the precision, recall or F1 score of the current version of the algorithm is lower than a preset value, using the corresponding test material set to test the next version of the algorithm.
  • In an exemplary embodiment, when the data analysis includes adjusting the test material set for the algorithm according to the running results, the method further includes: generating and storing test tasks according to the encapsulated algorithm and the adjusted test material set; running the stored test tasks and outputting the running results; and performing data analysis based on the running results.
  • The solution described in the embodiments of this application uses the test material set to obtain the running results and then adjusts the test material set according to those results, thereby realizing a closed-loop test process.
  • Compared with an open-loop test process, the closed-loop test process can achieve a self-optimizing effect.
  • In an exemplary embodiment, the method further includes:
  • Generate a test report based on the results of data analysis. For example, automatically collect the information required by the test report to be generated, and automatically generate the test report from the collected information.
  • The collected information may include one or more of: tester, test time, project information, and version information.
  • the technical solutions recorded in the embodiments of this application can automatically generate test reports, saving testers time in writing test reports, and improving testing efficiency.
  • Figure 4 is a schematic diagram of an implementation of the algorithm testing method recorded in the embodiment of the present application.
  • Step S400 obtains the algorithm
  • Step S401 performs a first encapsulation operation on the algorithm.
  • the first encapsulation operation includes: one or more of compilation and compression;
  • Step S402 Encapsulate the algorithm that has been encapsulated for the first time for a second time to make the algorithm a World Wide Web web service;
  • Step S403 sends a hypertext transfer protocol http request to the web service, where the request includes the network storage path of the test material set;
  • Step S404 Access the network storage path of the test material set through the web service to obtain the corresponding test material set;
  • Step S405 binds the twice-encapsulated algorithm and the obtained test material set for the algorithm, and generates and stores a test task;
  • Optionally, the test task can also be bound with attribute description information of the algorithm;
  • Step S406 calls the server to run the test task and determines whether the total available resources of the current server are greater than or equal to the sum of the resources required to run each test task. If so, execute step S407; if not, execute step S408.
  • Step S407 runs all stored test tasks and executes step S409;
  • Step S408 performs operation control on the stored test tasks according to the obtained resources
  • Step S409 performs data analysis based on the running results of the test task
  • Step S410 generates an analysis report based on the data analysis results; the analysis report can be uploaded to a unified platform or sent to a designated tester, and the process ends.
  • Embodiments of the present application also provide a computer-readable storage medium.
  • A computer program is stored on the computer-readable storage medium. When the computer program is executed by a processor, the steps of the method described in any of the preceding embodiments are implemented.
  • The embodiment of the present application also provides an algorithm testing device, as shown in Figure 5, including a memory 501 and a processor 502.
  • The memory 501 stores a program. When the program is read and executed by the processor 502, the method described in any of the preceding embodiments is implemented.
  • The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disk (DVD) or other optical disk storage, magnetic cassettes, tapes, disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Computer Hardware Design (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

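An algorithm testing method, device and storage medium, wherein the method includes: encapsulating an algorithm; generating and storing test tasks according to the encapsulated algorithm and an obtained test material set for the algorithm; running the stored test tasks and outputting running results; and performing data analysis based on the running results to test the algorithm. This realizes automatic testing of the algorithm and improves testing efficiency.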

Description

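Algorithm testing method, device and storage medium
This application claims priority to Chinese patent application No. 202210968964X, entitled "Algorithm testing method, device and storage medium", filed with the China Patent Office on August 12, 2022, the contents of which should be understood as being incorporated herein by reference.
Technical Field
This document relates to, but is not limited to, algorithm modeling technology, and in particular to an algorithm testing method, device and storage medium.
Background
Artificial Intelligence (AI) algorithms play an important role in the development of AI technology. AI algorithms are characterized by output uncertainty: for the same set of input data, the output is not always the same. AI algorithms therefore need to be tested: a test material set is constructed as the algorithm's input, and data analysis is performed on the algorithm's uncertain output to judge whether its performance meets expectations.
The current testing of AI algorithms lacks a unified, automated process. Technicians mainly complete the tests manually according to actual needs, so the test effect is difficult to evaluate and testing efficiency is low.
Summary
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
Embodiments of the present application provide an algorithm testing method, device and storage medium that can test algorithms automatically and improve testing efficiency.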
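An embodiment of the present application provides an algorithm testing method, the method including:
encapsulating an algorithm;
generating and storing test tasks according to the encapsulated algorithm and an obtained test material set for the algorithm;
running the stored test tasks and outputting running results;
performing data analysis based on the running results to test the algorithm.
As an example, encapsulating the algorithm includes:
performing a first encapsulation operation on the algorithm, the first encapsulation operation including one or more of compilation and compression;
performing a second encapsulation on the algorithm that has undergone the first encapsulation, so that the algorithm becomes a World Wide Web (web) service.
As an example, the method for obtaining the test material set for the algorithm includes:
sending a Hypertext Transfer Protocol (HTTP) request to the web service, the request including the network storage path of the test material set;
accessing the network storage path of the test material set through the web service to obtain the corresponding test material set.
As an example, running the stored test tasks includes:
obtaining the current total available resources and the resources required to run each test task;
performing run control on the stored test tasks according to the obtained resources.
As an example, performing run control on the stored test tasks according to the obtained resources includes:
when the current total available resources are greater than or equal to the sum of the resources required to run each test task, running all stored test tasks.
As an example, performing run control on the stored test tasks according to the obtained resources includes:
when the current total available resources are less than the sum of the resources required to run each test task, running the test tasks one after another in the order in which they were generated; after each test task finishes, recycling the resources used to run it into the current total available resources, and determining whether the current total available resources are greater than or equal to the resources required by the next test task to be run; if so, running the next test task to be run.
As an example, performing run control on the stored test tasks according to the obtained resources includes:
when the current total available resources are less than the sum of the resources required to run each test task, running the test tasks one after another in descending order of priority; after each test task finishes, recycling the resources used to run it into the current total available resources, and determining whether the current total available resources are greater than or equal to the resources required by the next test task to be run; if so, running the next test task to be run.
As an example, performing run control on the stored test tasks according to the obtained resources includes:
when the current total available resources are less than the sum of the resources required to run each test task, determining whether any test task has a set priority; if so, running those test tasks one after another in descending order of priority; when the test tasks with set priorities have finished and test tasks remain, running the remaining test tasks in the order in which they were generated; wherein, after each test task finishes, the resources used to run it are recycled into the current total available resources, and it is determined whether the current total available resources are greater than or equal to the resources required by the next test task to be run; if so, the next test task to be run is run.
As an example, the method further includes:
when test tasks are run in descending order of priority, each time a test task is run, determining whether there is a test task with a higher priority than the running one; if so, suspending the currently running test task and running the test tasks again in descending order of priority.
As an example, performing data analysis based on the running results includes one or more of the following:
comparing the running results with the test material set for the algorithm to obtain the evaluation indicators of the algorithm;
comparing different versions of the algorithm according to the running results to evaluate the algorithm versions;
analyzing, according to the running results, the test material distribution in the test material set for the algorithm;
adjusting the test material set for the algorithm according to the running results;
generating statistical analysis results for all dimensions, or for preset dimensions, determined by the test material set for the algorithm;
analyzing the test material distribution in the test material set according to the running results, obtaining the dimensions of the test material set from that distribution, and selecting at least one dimension to generate statistical analysis results.
As an example, the method further includes: when the data analysis based on the running results includes adjusting the test material set for the algorithm according to the running results, generating and storing test tasks according to the encapsulated algorithm and the adjusted test material set for the algorithm; running the stored test tasks and outputting running results; and performing data analysis based on the running results.
An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the method described in any of the preceding embodiments are implemented.
An embodiment of the present application also provides an algorithm testing device, including a memory and a processor; the memory stores a program which, when read and executed by the processor, implements the method described in any of the preceding embodiments.
The technical solution described in the embodiments of the present application encapsulates the algorithm and obtains a test material set for the algorithm to generate test tasks; by running the test tasks and analyzing the running results, it realizes automatic testing of the algorithm and improves algorithm testing efficiency.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.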
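Brief Description of the Drawings
The drawings provide an understanding of the technical solution of the present application and form part of the specification. Together with the embodiments of the present application they serve to explain the technical solution, and they do not limit it.
Figure 1 is a flow chart of the algorithm testing method provided by an embodiment of the present application;
Figure 2 is a flow chart of encapsulating an algorithm provided by an embodiment of the present application;
Figure 3 is a flow chart of running a stored test task provided by an embodiment of the present application;
Figure 4 is a schematic diagram of one implementation of the algorithm testing method provided by an embodiment of the present application;
Figure 5 is a structural diagram of the algorithm testing device provided by an embodiment of the present application.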
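Detailed Description
This application describes a number of embodiments, but the description is illustrative rather than restrictive, and it will be apparent to those of ordinary skill in the art that more embodiments and implementations are possible within the scope of the embodiments described herein. Although many possible combinations of features are shown in the drawings and discussed in the optional embodiments, many other combinations of the disclosed features are also possible. Unless expressly limited, any feature or element of any embodiment may be used in combination with, or may replace, any other feature or element of any other embodiment.
This application includes and contemplates combinations with features and elements known to those of ordinary skill in the art. The embodiments, features and elements disclosed in this application may also be combined with any conventional features or elements to form a unique inventive solution defined by the claims. Any feature or element of any embodiment may also be combined with features or elements from other inventive solutions to form another unique inventive solution defined by the claims. It should therefore be understood that any feature shown and/or discussed in this application may be implemented individually or in any appropriate combination. Accordingly, the embodiments are not limited except by the appended claims and their equivalents. In addition, various modifications and changes may be made within the scope of protection of the appended claims.
Furthermore, in describing representative embodiments, the specification may have presented a method and/or process as a particular sequence of steps. However, to the extent that the method or process does not rely on the particular order of steps set forth herein, it should not be limited to that order. As those of ordinary skill in the art will appreciate, other orders of steps are also possible. The particular order of steps set forth in the specification should therefore not be construed as limiting the claims. In addition, claims directed to the method and/or process should not be limited to performing their steps in the order written; those skilled in the art can readily appreciate that the order may vary and still remain within the spirit and scope of the embodiments of the present application.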
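An embodiment of the present application provides an algorithm testing method. As shown in Figure 1, the method includes:
Step S100: encapsulate the algorithm;
The encapsulation operations on the algorithm include one or more of compilation and packaging;
The algorithm may have multiple versions;
Step S101: generate and store test tasks according to the encapsulated algorithm and the obtained test material set for the algorithm;
When the algorithm has multiple versions, each version of the algorithm corresponds to a test material set;
For a face recognition algorithm, for example, the test materials in the corresponding test material set can include male face material, female face material, child face material, adult face material, etc.;
For a face recognition algorithm, the test materials in the test material set can be pictures or videos;
All test materials in the test material set can be stored in the same folder or in different folders; storing all test materials for the same algorithm in the same folder helps improve the efficiency of generating test tasks;
The algorithm may include one or more copies;
When the algorithm includes multiple copies, each copy corresponds to one or more test materials in the test material set; different copies can correspond to different test materials or to the same test materials; the set of test materials corresponding to all copies of an algorithm is the test material set for that algorithm;
When the algorithm includes one copy, the test material set corresponds only to that copy;
Step S102: run the stored test tasks and output the running results;
There can be one or more stored test tasks, and running each test task outputs a running result;
Step S103: perform data analysis based on the running results to test the algorithm.
The technical solution described in the embodiments of the present application can test an algorithm automatically, which improves algorithm testing efficiency.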
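Steps S100 to S103 can be pictured with a minimal Python sketch. Everything named here (`encapsulate`, `TestTask`, `generate_task`, `run_task`, `analyze`) is hypothetical and only illustrates the flow; the source does not specify data structures, and real encapsulation would compile and package the algorithm rather than wrap a function.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def encapsulate(algorithm: Callable[[str], str]) -> Callable[[str], str]:
    """S100 stand-in: wrap the algorithm behind a uniform call interface."""
    def wrapped(material: str) -> str:
        return algorithm(material)
    return wrapped


@dataclass
class TestTask:
    """A test task binds an encapsulated algorithm to its test material set."""
    algorithm: Callable[[str], str]
    materials: List[str]
    results: List[str] = field(default_factory=list)


def generate_task(algorithm: Callable[[str], str], materials: List[str]) -> TestTask:
    """S101: generate (and, in a real system, store) a test task."""
    return TestTask(encapsulate(algorithm), list(materials))


def run_task(task: TestTask) -> List[str]:
    """S102: run the task on every material and output the running results."""
    task.results = [task.algorithm(m) for m in task.materials]
    return task.results


def analyze(results: List[str], expected: List[str]) -> float:
    """S103: simplest possible analysis, the fraction of matching outputs."""
    matches = sum(r == e for r, e in zip(results, expected))
    return matches / len(expected)
```

A toy algorithm that labels a file name as "face" or "none" is enough to exercise the four steps end to end.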
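In an exemplary embodiment, step S100 of encapsulating the algorithm, as shown in Figure 2, includes:
Step S1001: perform a first encapsulation operation on the algorithm, the first encapsulation operation including one or more of compilation and compression;
Step S1002: perform a second encapsulation on the algorithm that has undergone the first encapsulation, so that the algorithm becomes a World Wide Web (web) service.
Existing manual testing workflows also encapsulate the algorithm, but only once, to compile and compress it. The technical solution described in the embodiments of the present application performs one more encapsulation operation on top of the existing one so that the algorithm becomes a web service. As a web service, the algorithm can be accessed remotely: for example, it can receive HTTP requests, obtain the corresponding test material set according to the network storage path carried in the request, and return it to the requester. This makes it possible to access external storage devices remotely during testing to obtain the data the test requires.
In an exemplary embodiment, the method for obtaining the test material set for the algorithm includes:
sending a Hypertext Transfer Protocol (HTTP) request to the web service, the request including the network storage path of the test material set;
accessing the network storage path of the test material set through the web service to obtain the corresponding test material set.
Because the algorithm has become a web service, the technical solution described in the embodiments of the present application enables remote access to the test material set through the web service, which makes the test material set easier to select and call.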
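As a rough illustration of the second encapsulation, the sketch below exposes a material lookup over HTTP using only the Python standard library. The URL shape (`/materials?path=...`), the JSON response and the names `collect_materials`, `handle_request_path` and `AlgorithmServiceHandler` are assumptions made for this example; the source only states that the request carries the network storage path of the test material set.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def collect_materials(storage_path):
    """Return the sorted file paths found under the given storage path."""
    found = []
    for root, _dirs, files in os.walk(storage_path):
        for name in files:
            found.append(os.path.join(root, name))
    return sorted(found)


def handle_request_path(request_path):
    """Map a request such as /materials?path=/data/faces to (status, payload)."""
    query = parse_qs(urlparse(request_path).query)
    paths = query.get("path")
    if not paths:
        return 400, {"error": "missing 'path' parameter"}
    if not os.path.isdir(paths[0]):
        return 404, {"error": "storage path not found"}
    return 200, {"materials": collect_materials(paths[0])}


class AlgorithmServiceHandler(BaseHTTPRequestHandler):
    """Hypothetical web front end placed around the packaged algorithm."""

    def do_GET(self):
        status, payload = handle_request_path(self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the service; blocks until interrupted."""
    HTTPServer(("", port), AlgorithmServiceHandler).serve_forever()
```

Keeping the request parsing in `handle_request_path` separates the lookup logic from the network layer, so it can be checked without starting a server.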
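In an exemplary embodiment, step S102 of running the stored test tasks, as shown in Figure 3, includes:
Step S1021: obtain the current total available resources and the resources required to run each test task;
Step S1022: perform run control on the stored test tasks according to the obtained resources.
When the test tasks are run by calling a server, the resources may be the server's memory resources, computing resources, etc.
In an exemplary embodiment, step S1022 of performing run control on the stored test tasks according to the obtained resources includes:
when it is determined that the current total available resources are greater than or equal to the sum of the resources required to run each test task, running all stored test tasks.
In this embodiment, when the current total available resources can support running all stored test tasks, all of them are run, which effectively shortens the test process and improves testing efficiency.
In an exemplary embodiment, step S1022 of performing run control on the stored test tasks according to the obtained resources includes:
when it is determined that the current total available resources are less than the sum of the resources required to run each test task, running the test tasks one after another in the order in which they were generated; after each test task finishes, recycling the resources used to run it into the current total available resources, and determining whether the current total available resources are greater than or equal to the resources required by the next test task to be run; if so, running the next test task to be run; or
running the test tasks one after another in descending order of priority; after each test task finishes, recycling the resources used to run it into the current total available resources, and determining whether the current total available resources are greater than or equal to the resources required by the next test task to be run; if so, running the next test task to be run; or
determining whether any test task has a set priority; if so, running those test tasks one after another in descending order of priority; when the test tasks with set priorities have finished and test tasks remain, running the remaining test tasks in the order in which they were generated; wherein, after each test task finishes, the resources used to run it are recycled into the current total available resources, and it is determined whether the current total available resources are greater than or equal to the resources required by the next test task to be run; if so, the next test task to be run is run.
In this embodiment, when the current total available resources are less than the sum of the resources required to run each test task, the test tasks to be run need to be selected. They may be selected in order of generation time, or in order of priority; or, when only some of the stored test tasks have a set priority, first in order of priority and then in order of generation time. Each time a test task is run, it is determined whether the current total available resources can support the next test task to be run. The current total available resources change with the number of running test tasks: the more test tasks are running, the fewer resources are available, and the fewer are running, the more are available. If the next test task can be supported, it is run; if not, no further test tasks are started. When the total available resources are sufficient, the solution described in the embodiments of the present application supports running multiple test tasks and multiple copies at the same time, which effectively improves testing efficiency.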
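The run-control rules above can be sketched as a small planner. This is an illustration under stated assumptions, not the patented implementation: resources are a single integer, a larger `priority` value means more urgent, and when the next task does not fit, the longest-running task is assumed to finish first so its resources can be recycled. The names `Task`, `run_order` and `schedule` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Task:
    name: str
    required: int                    # resources needed to run (hypothetical unit)
    created: int                     # generation time, smaller means earlier
    priority: Optional[int] = None   # larger means more urgent; None means not set


def run_order(tasks: List[Task]) -> List[Task]:
    """Tasks with a set priority first (high to low), then the rest by generation time."""
    with_priority = sorted((t for t in tasks if t.priority is not None),
                           key=lambda t: (-t.priority, t.created))
    without = sorted((t for t in tasks if t.priority is None),
                     key=lambda t: t.created)
    return with_priority + without


def schedule(tasks: List[Task], total_available: int) -> List[Tuple[str, str]]:
    """Return a log of ("start" | "finish" | "stop", task name) events."""
    if total_available >= sum(t.required for t in tasks):
        # Enough resources for everything: run all stored test tasks.
        return [("start", t.name) for t in tasks]
    log = []
    available = total_available
    running = deque()
    for task in run_order(tasks):
        # Recycle resources from finished tasks until the next task fits.
        while available < task.required and running:
            done = running.popleft()  # assumption: oldest running task finishes first
            available += done.required
            log.append(("finish", done.name))
        if available < task.required:
            # Even with everything recycled the task does not fit: stop.
            log.append(("stop", task.name))
            break
        available -= task.required
        running.append(task)
        log.append(("start", task.name))
    return log
```

With 6 units available and tasks needing 4, 3 and 2 units, the one prioritised task starts first, the oldest unprioritised task starts alongside it, and the last task waits until enough resources have been recycled.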
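In an exemplary embodiment, the method further includes:
when test tasks are run in descending order of priority, each time a test task is run, determining whether there is a test task with a higher priority than the running one; if so, suspending the currently running test task and running the test tasks again in descending order of priority. The resources used by the suspended test task can be recycled into the current total available resources; or they can be retained, and the suspended test task resumed after the high-priority test task has finished.
The priority of a stored test task is not fixed; a tester may modify it, for example. While running test tasks by priority, the embodiment of the present application therefore keeps re-evaluating priorities. If a test task with a higher priority than the currently running one is found, the running test task is suspended and the higher-priority test task is run, which ensures that high-priority test tasks run first.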
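The suspend-and-rerun behaviour can be illustrated with a tick-based simulation built on `heapq`. The time model (one unit of work per tick), the `arrivals` mapping and the function name `simulate` are assumptions for this sketch; it also does not model whether a suspended task keeps or releases its resources, which the source leaves as two alternatives.

```python
import heapq
from itertools import count


def simulate(initial, arrivals):
    """Simulate priority-based running with suspension.

    initial:  list of (name, priority, work_units) known at the start.
    arrivals: {tick: [(name, priority, work_units), ...]} for tasks that
              appear (or are re-prioritised) while others are running.
    Returns the name of the task that ran at each tick.
    """
    order = count()   # tie-breaker: earlier submission runs first
    pending = []      # max-heap on priority via negated values

    def submit(name, priority, work):
        heapq.heappush(pending, (-priority, next(order), name, work))

    for item in initial:
        submit(*item)

    timeline, current, tick = [], None, 0
    while pending or current or any(t >= tick for t in arrivals):
        for item in arrivals.get(tick, []):
            submit(*item)
        # Re-check priorities before every tick: suspend the running task
        # if a higher-priority task is waiting.
        if current and pending and -pending[0][0] > current[1]:
            submit(*current)   # the suspended task returns to the queue
            current = None
        if current is None and pending:
            neg_priority, _, name, work = heapq.heappop(pending)
            current = (name, -neg_priority, work)
        if current:
            name, priority, work = current
            timeline.append(name)
            current = (name, priority, work - 1) if work > 1 else None
        tick += 1
    return timeline
```

A low-priority task that is interrupted at tick 1 by a higher-priority arrival resumes only after the newcomer has finished.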
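In an exemplary embodiment, step S103 of performing data analysis based on the running results may include one or more of the following:
Compare the running results with the test material set for the algorithm to obtain the evaluation indicators of the algorithm. The evaluation indicators may include: true positives (TP): predicted positive, actually positive; true negatives (TN): predicted negative, actually negative; false positives (FP): predicted positive, actually negative; false negatives (FN): predicted negative, actually positive;
Compare different versions of the algorithm according to the running results to evaluate the algorithm versions;
Analyze the test material distribution in the test material set according to the running results. For example, if the test material set used by a face recognition algorithm contains 100 test materials covering male face material, female face material, child face material and adult face material, the number of materials of each kind is the test material distribution; the material distribution result can be represented by visual graphics;
Adjust the test material set for the algorithm according to the running results, for example by judging from the material distribution whether the material selection is reasonable and adjusting unreasonable test materials. For example, when the material distribution does not match a preset distribution model, the distribution can be considered unreasonable. As a further example, child face material should in theory cover the faces of children of all ages; if the running results show that face material for a certain age group is missing, test materials for that age group can be added;
Generate statistical analysis results for all dimensions, or for preset dimensions, determined by the test material set. For example, if the test material set of a face recognition algorithm includes face materials of different genders and ages, gender and age are the dimensions determined by the test material set. Statistical analysis results can be generated for all dimensions, or only for the preset dimensions selected by the tester. The statistical analysis results can include: algorithm indicator results (such as one or more of the algorithm's precision, recall and F1 score), material distribution results, and the corresponding missed or misdetected pictures or videos; the statistical analysis results can be represented by visual graphics;
Analyze the test material distribution in the test material set according to the running results, obtain the dimensions of the test material set from that distribution, and select at least one dimension to generate statistical analysis results; the statistical analysis results can be represented by visual graphics.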
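The indicators named above follow the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 is the harmonic mean of the two. The sketch below computes them overall and per dimension. The sample layout (dicts with `predicted`, `actual` and attribute keys such as `gender`) and the function names are assumptions for illustration.

```python
from collections import defaultdict


def confusion_counts(records):
    """Count TP/TN/FP/FN from (predicted, actual) boolean pairs."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for predicted, actual in records:
        if predicted and actual:
            counts["TP"] += 1
        elif predicted and not actual:
            counts["FP"] += 1
        elif not predicted and actual:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def metrics(counts):
    """Precision, recall and F1; each is 0.0 when its denominator is zero."""
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def metrics_by_dimension(samples, dimension):
    """Group samples by one dimension (e.g. 'gender') and score each group."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[dimension]].append((sample["predicted"], sample["actual"]))
    return {value: metrics(confusion_counts(records))
            for value, records in groups.items()}
```

Grouping by a dimension before scoring is what lets a tester see, for instance, that recall is lower on one age group than on another.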
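In an exemplary embodiment, the method may further include: when data analysis based on the running results shows that any of the precision, recall or F1 score of the current version of the algorithm is lower than a preset value, using the corresponding test material set to test the next version of the algorithm.
In an exemplary embodiment, when the data analysis includes adjusting the test material set for the algorithm according to the running results, the method further includes:
generating and storing test tasks according to the encapsulated algorithm and the adjusted test material set for the algorithm; running the stored test tasks and outputting running results; and performing data analysis based on the running results.
The solution described in the embodiments of the present application uses the test material set to obtain the running results and then adjusts the test material set according to those results, thereby realizing a closed-loop test process. Compared with an open-loop test process, a closed-loop test process can achieve a self-optimizing effect.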
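One way to picture the closed loop is a driver that alternates run, analyze and adjust until the analysis reports nothing left to fix. The callbacks, the round limit and the age-group example below are all hypothetical; the source does not prescribe how materials are adjusted.

```python
EXPECTED_AGE_GROUPS = ("0-3", "4-7", "8-12")   # hypothetical dimension values


def closed_loop(materials, run, analyze, adjust, max_rounds=3):
    """Run, analyze and adjust until the analysis reports no findings."""
    history = []
    for _ in range(max_rounds):
        results = run(materials)
        findings = analyze(materials, results)
        history.append(findings)
        if not findings:
            break
        materials = adjust(materials, findings)
    return materials, history


def run(materials):
    """Stand-in for running a test task: echo each material's age group."""
    return [m["age_group"] for m in materials]


def analyze(materials, results):
    """Report the expected age groups that have no material at all."""
    seen = set(results)
    return [g for g in EXPECTED_AGE_GROUPS if g not in seen]


def adjust(materials, findings):
    """Add one placeholder material for every missing age group."""
    return materials + [{"file": f"added_{g}.jpg", "age_group": g} for g in findings]
```

Starting from a set that covers only one age group, the first round reports two gaps, the adjustment fills them, and the second round ends the loop with an empty finding list.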
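In an exemplary embodiment, the method further includes:
generating a test report from the results of the data analysis. For example, the information required by the test report to be generated is collected automatically, and the test report is generated automatically from the collected information. The collected information may include one or more of: tester, test time, project information, and version information.
The technical solution described in the embodiments of the present application can generate test reports automatically, which saves testers the time of writing them and improves testing efficiency.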
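A report generator along these lines can be very small. The sketch below assembles a plain-text report from the collected fields; the layout, the function name `build_report` and the assumption that analysis results are a flat name-to-number mapping are illustrative only.

```python
from datetime import datetime


def build_report(analysis, tester, project, version, when=None):
    """Assemble a plain-text test report from collected information."""
    when = when or datetime.now()
    lines = [
        f"Test report: {project} {version}",
        f"Tester: {tester}",
        f"Test time: {when:%Y-%m-%d %H:%M}",
        "",
    ]
    # One line per analysis indicator, in a stable alphabetical order.
    lines += [f"{name}: {value:.3f}" for name, value in sorted(analysis.items())]
    return "\n".join(lines)
```

Passing an explicit `when` keeps the output reproducible; leaving it out stamps the report with the current time.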
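Figure 4 is a schematic diagram of one implementation of the algorithm testing method described in the embodiments of the present application.
Step S400: obtain the algorithm;
Step S401: perform a first encapsulation operation on the algorithm, the first encapsulation operation including one or more of compilation and compression;
Step S402: perform a second encapsulation on the algorithm that has undergone the first encapsulation, so that the algorithm becomes a World Wide Web (web) service;
Step S403: send a Hypertext Transfer Protocol (HTTP) request to the web service, the request including the network storage path of the test material set;
Step S404: access the network storage path of the test material set through the web service to obtain the corresponding test material set;
Step S405: bind the twice-encapsulated algorithm to the obtained test material set for the algorithm, and generate and store a test task;
Optionally, attribute description information of the algorithm can also be bound into the test task;
Step S406: run the test tasks by calling a server, and determine whether the total available resources of the current server are greater than or equal to the sum of the resources required to run each test task; if so, go to step S407; if not, go to step S408;
Step S407: run all stored test tasks, then go to step S409;
Step S408: perform run control on the stored test tasks according to the obtained resources;
Step S409: perform data analysis based on the running results of the test tasks;
Step S410: generate an analysis report from the results of the data analysis; the analysis report can be uploaded to a unified platform or sent to a designated tester, and the process ends.
An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the method described in any of the preceding embodiments are implemented.
An embodiment of the present application also provides an algorithm testing device, as shown in Figure 5, including a memory 501 and a processor 502; the memory 501 stores a program which, when read and executed by the processor 502, implements the method described in any of the preceding embodiments.
Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units in the systems and devices, can be implemented as software, firmware, hardware, or appropriate combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the description above does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components working together. Some or all components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information (such as computer-readable instructions, data structures, program modules or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.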

Claims (13)

  1. 一种算法测试方法,所述方法包括:
    对算法进行封装;
    根据封装后的算法以及获取的针对所述算法的测试素材集生成并存储测试任务;
    运行存储的测试任务,输出运行结果;
    基于所述运行结果进行数据分析,以实现对所述算法的测试。
  2. 根据权利要求1所述的方法,其中,所述对算法进行封装,包括:
    对算法进行第一次封装操作,所述第一次封装操作包括:编译和压缩中的一种或多种;
    对经过第一次封装的算法再进行第二次封装使所述算法成为万维网web服务。
  3. 根据权利要求2所述的方法,其中,针对所述算法的测试素材集的获取方法,包括:
    向所述web服务发送超文本传输协议http请求,所述请求包括所述测试素材集的网络存储路径;
    通过所述web服务访问所述测试素材集的网络存储路径以获取相应的测试素材集。
  4. 根据权利要求1所述的方法,其中,所述运行存储的测试任务,包括:
    获得当前总可用资源以及每个测试任务运行所需资源;
    根据获得的资源对存储的测试任务进行运行控制。
  5. 根据权利要求4所述的方法,其中,所述根据获得的资源对存储的测试任务进行运行控制,包括:
    当所述当前总可用资源大于或等于每个测试任务运行所需资源之和时,运行存储的全部测试任务。
  6. 根据权利要求4所述的方法,其中,所述根据获得的资源对存储的测试任务进行运行控制,包括:
    当所述当前总可用资源小于每个测试任务运行所需资源之和时,按照生成每个测试任务的时间先后顺序依次运行测试任务,每一测试任务运行结束,将运行该测试任务的资源重新回收至当前总可用资源中,以及判断当前总可用资源是否大于或等于下一待运行测试任务所需资源,如果是,则运行所述下一待运行测试任务。
  7. 根据权利要求4所述的方法,其中,所述根据获得的资源对存储的测试任务进行运行控制,包括:
    当所述当前总可用资源小于每个测试任务运行所需资源之和时,按照测试任务优先级从高到低的顺序依次运行测试任务,每一测试任务运行结束,将运行该测试任务的资源重新回收至当前总可用资源中,以及判断当前总可用资源是否大于或等于下一待运行测试任务所需资源,如果是,则运行所述下一待运行测试任务。
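  8. The method according to claim 4, wherein performing run control over the stored test tasks according to the obtained resources comprises:
    when the current total available resources are less than the sum of the resources required to run each test task, determining whether there are test tasks for which a priority has been set; if so, running those test tasks one after another in descending order of priority; if test tasks remain to be run after the test tasks with a set priority have finished running, running the remaining test tasks one after another in the chronological order in which they were generated; wherein, when each test task finishes running, the resources used to run that test task are reclaimed into the current total available resources, and it is determined whether the current total available resources are greater than or equal to the resources required by the next test task to be run; if so, the next test task to be run is run.
  9. The method according to claim 7 or 8, the method further comprising:
    when running test tasks one after another in descending order of priority, each time a test task is run, determining whether there currently exists a test task whose priority is higher than that of the running test task; if so, suspending the currently running test task and again running the test tasks one after another in descending order of priority.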
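  10. The method according to claim 1, wherein performing data analysis based on the run result comprises one or more of the following:
    comparing the run result with the test material set for the algorithm to obtain an evaluation metric of the algorithm;
    comparing different versions of the algorithm according to the run result, so as to evaluate the algorithm versions;
    analyzing, according to the run result, the distribution of test materials in the test material set for the algorithm;
    adjusting the test material set for the algorithm according to the run result;
    generating statistical analysis results for all dimensions, or for preset dimensions, determined by the test material set for the algorithm;
    analyzing, according to the run result, the distribution of test materials in the test material set, obtaining the dimensions of the test material set from that distribution, and selecting at least one dimension to generate a statistical analysis result.
  11. The method according to claim 10, the method further comprising:
    when performing data analysis based on the run result includes adjusting the test material set for the algorithm according to the run result, generating and storing a test task according to the encapsulated algorithm and the obtained adjusted test material set for the algorithm; running the stored test task and outputting a run result; and performing data analysis based on the run result.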
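  12. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method according to any one of claims 1 to 11.
  13. An algorithm testing apparatus comprising a memory and a processor, the memory storing a program which, when read and executed by the processor, implements the method according to any one of claims 1 to 11.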
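
For illustration only, the run order of claim 8 and the check of claim 9 can be sketched in Python. This is a hedged sketch, not the claimed implementation: the names `QueuedTask`, `run_order` and `should_preempt` are hypothetical, a larger number is assumed to mean a higher priority, ties between equal priorities are broken by generation time, and a running task with no priority is treated as lowest. None of these choices is stated in the claims.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueuedTask:
    name: str
    created_at: float               # generation time; smaller means earlier
    priority: Optional[int] = None  # None means no priority was set


def run_order(tasks: List[QueuedTask]) -> List[str]:
    """Order tasks as in claim 8: prioritized first, then by generation time."""
    prioritized = [t for t in tasks if t.priority is not None]
    remaining = [t for t in tasks if t.priority is None]
    # Descending priority; equal priorities fall back to generation time
    # (the tie-break is an assumption).
    prioritized.sort(key=lambda t: (-t.priority, t.created_at))
    # Tasks without a priority run afterwards, earliest generated first.
    remaining.sort(key=lambda t: t.created_at)
    return [t.name for t in prioritized + remaining]


def should_preempt(running: QueuedTask, waiting: List[QueuedTask]) -> bool:
    """Claim 9 check: does any waiting task outrank the running one?"""
    current = running.priority
    return any(
        t.priority is not None and (current is None or t.priority > current)
        for t in waiting
    )


if __name__ == "__main__":
    queue = [
        QueuedTask("a", 1.0),
        QueuedTask("b", 2.0, priority=1),
        QueuedTask("c", 3.0, priority=5),
        QueuedTask("d", 0.5),
    ]
    print(run_order(queue))
    print(should_preempt(queue[1], [queue[2]]))
```

When `should_preempt` returns true, the claim suspends the running task and restarts the priority-ordered pass; how a suspended task is resumed is left open in the text and is not modeled here.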
PCT/CN2023/112635 2022-08-12 2023-08-11 Algorithm testing method, device and storage medium WO2024032781A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210968964.XA CN115344487A (zh) 2022-08-12 2022-08-12 Algorithm testing method, device and storage medium
CN202210968964.X 2022-08-12

Publications (1)

Publication Number Publication Date
WO2024032781A1 true WO2024032781A1 (zh) 2024-02-15

Family

ID=83952842

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/112635 WO2024032781A1 (zh) 2022-08-12 2023-08-11 Algorithm testing method, device and storage medium

Country Status (2)

Country Link
CN (1) CN115344487A (zh)
WO (1) WO2024032781A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115344487A (zh) * 2022-08-12 2022-11-15 虹软科技股份有限公司 Algorithm testing method, device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
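CN106874205A (zh) * 2017-02-27 2017-06-20 郑州云海信息技术有限公司 Automated functional testing device and method for continuous integration
CN107153714A (zh) * 2017-06-01 2017-09-12 国家基础地理信息中心 On-demand generation method for change detection service chains based on service relations
CN111651246A (zh) * 2020-04-24 2020-09-11 平安科技(深圳)有限公司 Task scheduling method, device and scheduler between a terminal and a server
CN113886052A (zh) * 2021-10-26 2022-01-04 上海商汤科技开发有限公司 Task scheduling method, apparatus, device and storage medium
CN114637511A (zh) * 2022-02-21 2022-06-17 北京奕斯伟计算技术有限公司 Code testing system, method, apparatus, electronic device and readable storage medium
CN115344487A (zh) * 2022-08-12 2022-11-15 虹软科技股份有限公司 Algorithm testing method, device and storage medium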

Also Published As

Publication number Publication date
CN115344487A (zh) 2022-11-15

Similar Documents

Publication Publication Date Title
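CN107291545B (zh) Task scheduling method and device for multiple users in a computing cluster
WO2023071075A1 (zh) Method and system for building an automated production line for machine learning models
CN111414233A (zh) Online model inference system
JP2023525393A (ja) Method and apparatus for updating gateway resources, and IoT control platform
US20060041539A1 (en) Method and apparatus for organizing, visualizing and using measured or modeled system statistics
WO2024032781A1 (zh) Algorithm testing method, device and storage medium
WO2019100635A1 (zh) Automated test script editing method, apparatus, terminal device and storage medium
CN107870949B (zh) Method and system for generating data analysis job dependencies
CN109344189B (zh) NiFi-based big data computing method and device
CN110347407A (zh) Method, apparatus, computer device and medium for obtaining memory usage
WO2023131121A1 (zh) Automated parallel simulation method and simulation device for integrated circuits
CN104598299A (zh) System and method for performing aggregation processing on each piece of received data
WO2023231704A1 (zh) Algorithm running method, apparatus, device and storage medium
WO2022048648A1 (zh) Method, apparatus, electronic device and storage medium for automatic model building
CN110971439A (zh) Policy decision method and apparatus, system, storage medium, policy decision unit and cluster
CN107368490A (zh) Data processing method and device
CN109460365A (zh) System performance testing method, apparatus, device and storage medium
CN107704362A (zh) Method and device for monitoring big data components based on Ambari
WO2023103624A1 (zh) Task optimization method, apparatus and computer-readable storage medium
CN115114275A (zh) Data collection method, device and medium
CN112000657A (zh) Data management method, apparatus, server and storage medium
TW201941124A (zh) Sample playback data access method and device
CN115576924A (zh) Data migration method
US9659041B2 (en) Model for capturing audit trail data with reduced probability of loss of critical data
CN110928860B (zh) Data migration method and device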

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23852001

Country of ref document: EP

Kind code of ref document: A1