GB2624418A - Method and system for automatic test plan generation - Google Patents

Method and system for automatic test plan generation

Info

Publication number
GB2624418A
GB2624418A GB2217178.9A GB202217178A GB2624418A GB 2624418 A GB2624418 A GB 2624418A GB 202217178 A GB202217178 A GB 202217178A GB 2624418 A GB2624418 A GB 2624418A
Authority
GB
United Kingdom
Prior art keywords
test
plans
test case
computer
software
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2217178.9A
Other versions
GB202217178D0 (en)
Inventor
Li Yanran
Cao Yushi
Zheng Yan
Shin Teo Yon
Liang Zhexin
Lin Shang-Wei
Liu Yang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Continental Automotive Technologies GmbH
Nanyang Technological University
Original Assignee
Continental Automotive Technologies GmbH
Nanyang Technological University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Continental Automotive Technologies GmbH, Nanyang Technological University filed Critical Continental Automotive Technologies GmbH
Priority to GB2217178.9A priority Critical patent/GB2624418A/en
Publication of GB202217178D0 publication Critical patent/GB202217178D0/en
Priority to PCT/EP2023/080502 priority patent/WO2024104781A1/en
Publication of GB2624418A publication Critical patent/GB2624418A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3684Test management for test design, e.g. generating new test cases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3688Test management for test execution, e.g. scheduling of test suites
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/12Computing arrangements based on biological models using genetic models
    • G06N3/126Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Quality & Reliability (AREA)
  • Computer Hardware Design (AREA)
  • Computational Linguistics (AREA)
  • Genetics & Genomics (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Physiology (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

A method for automatically generating test plans for assessing flaws in software comprises defining a test case set 108 for assessing software based on the software to be assessed. The test case set comprises multiple test cases, each test case being defined by at least one test parameter. An initial population of candidate test plans is initialised 116, each test plan comprising one or more test cases selected from the test case set. The candidate test plans are optimised 124 based on an evaluation function that accounts for one or more test objectives. One or more test plans are selected 132 that have the best evaluation function score. The test case set may be selected from a library of predefined test cases. Optimising the test plans may involve using an evolutionary algorithm to mutate and evolve the initial population of test plans based on the evaluation function.

Description

METHOD AND SYSTEM FOR AUTOMATIC TEST PLAN GENERATION
TECHNICAL FIELD
[0001] This disclosure is related generally to the identification of flaws in a software, and more specifically to the generation and implementation of a test plan to identify flaws in a software.
BACKGROUND
[0002] Software testing is becoming increasingly costly and challenging due to the increasing complexity, functional scope, and variety of product lines in the market. In reality, a full test which includes all test cases is not feasible, and a test plan must be carefully designed by the test manager, who takes into account the test objectives, user or customer expectations and the triple constraints: scope, time and budget. As multiple varying features may be introduced to the software during the course of its development life cycle, different testing strategies have to be adopted for each software at each stage or phase. In the early phase, testing tends to be focused on functional tests to ensure that the developed features satisfy the software requirement specifications. As the software matures, the focus of testing gradually shifts to system-level and integration tests, and finally to out-of-requirement testing (or "monkey tests") before the software is released. An effective test plan must be adaptive to the testing needs, past test results and external factors such as the launch date of competing products and new standards or regulations. As a result, a good test manager must come up with test plans that continuously evolve with the software. Test plan creation essentially becomes a complex optimisation problem with multiple dynamic constraints.
[0003] Current methods for generating test plans still heavily rely on the discretion of test managers since there is no systematic approach that guides the decision-making process. In most practical scenarios, the test manager creates a test plan by selecting test cases from a set of test case candidates according to his domain knowledge, test constraints and his understanding of a list of factors which includes past test results, characteristics of the product, user or customer expectations, strengths and knowledge of the test engineers, and so on. An example of a standard scenario is as follows: a collection of test cases is maintained by the testing team, and each test case has previously been executed a certain number of times. All test cases are evaluated with multiple business and performance metrics (e.g., code coverage, priority, past failure rate, etc.). The test manager must then select a number of test cases that fit within present test constraints to construct a test plan to test the software. The test results will be recorded and stored in a database for tracking issues as well as guiding future test case selection.
[0004] The current method of generating test plans is undesirable as it is time consuming and highly dependent on the expertise, knowledge, and discretion of the test manager. Furthermore, the current methods are convoluted approaches done on a high-level consideration, rather than on a low-level, data-driven basis. In addition, such methods are not feasible for the testing of sophisticated software, as the manual generation of test plans is a computational problem that scales exponentially with the number of test case candidates.
SUMMARY
[0005] It is an object of the present invention to provide a method of test plan generation that is automatic and data-driven, produces test plans of consistent quality, and is not reliant on the subjective or discretionary expertise of a human test manager. The test plan so generated may then be implemented on a machine (such as a vehicle) to identify flaws in software developed for such machine. In general, candidate test plans are initialised from a set of test cases. The candidate test plans are then optimised using an evaluation function based on one or more test objectives. The test plan(s) with the best evaluation function score(s) calculated using the evaluation function is then selected and implemented by testing the software on a machine (such as a vehicle). The method of generating test plan(s) and the evaluation function used are highly customisable to suit any requirements and test objectives and may be incorporated at any stage of software development before and/or after release.
[0006] The object of the present invention is solved by the subject-matter of the independent claims, wherein further embodiments are incorporated in the dependent claims. [0007] It shall be noted that all embodiments of the present invention concerning a method might be carried out with the order of the steps as described; nevertheless, this need not be the only and essential order of the steps of the method. The herein presented methods can be carried out with another order of the disclosed steps without departing from the respective method embodiment, unless explicitly mentioned to the contrary hereinafter.
[0008] To solve the above technical problems, the present invention provides a computer-implemented method for assessing flaws of a software, the method comprising: defining a test case set for assessing a software based on the software to be assessed, wherein the test case set comprises a plurality of test cases and each test case is associated with at least one test parameter defining such test case; initialising an initial population of candidate test plans, wherein each test plan comprises one or more test cases selected from the defined test case set; optimising the initial population of candidate test plans based on an evaluation function that accounts at least for one or more test objectives; and selecting one or more candidate test plans with best evaluation function score or scores respectively calculated using the evaluation function.
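By way of illustration only, the following Python sketch wires the four recited steps together in the order given above; the data layout (dictionaries with "module", "time", "priority" and "fail_rate" keys) and the simple replace-the-worst optimiser are assumptions made for this example, not part of the claimed method.

```python
import random

def evaluate(plan, time_budget):
    """Toy evaluation function: reward priority and historical fail rate,
    penalise exceeding the time budget. Field names are illustrative."""
    total_time = sum(tc["time"] for tc in plan)
    reward = sum(tc["priority"] + tc["fail_rate"] for tc in plan)
    penalty = min(0.0, time_budget - total_time)  # zero while within budget
    return reward + penalty

def generate_test_plan(test_case_library, modules, time_budget,
                       population_size=20, iterations=200):
    # Define a test case set based on the software (here: modules) to be assessed
    test_case_set = [tc for tc in test_case_library if tc["module"] in modules]

    # Initialise an initial population of candidate test plans (random subsets)
    population = [random.sample(test_case_set, random.randint(1, len(test_case_set)))
                  for _ in range(population_size)]

    # Optimise the population against the evaluation function (a trivial
    # replace-the-worst loop stands in for the evolutionary algorithm described below)
    for _ in range(iterations):
        candidate = random.sample(test_case_set, random.randint(1, len(test_case_set)))
        worst = min(range(population_size),
                    key=lambda i: evaluate(population[i], time_budget))
        if evaluate(candidate, time_budget) > evaluate(population[worst], time_budget):
            population[worst] = candidate

    # Select the candidate test plan with the best evaluation function score
    return max(population, key=lambda p: evaluate(p, time_budget))
```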
[0009] The computer-implemented method of the present invention is advantageous over known methods as the test plan may be automatically generated and optimised based on an evaluation function that accounts for any defined test objectives (technical or otherwise).
The resulting test plan generated may therefore be of consistent quality and may achieve a desired balance between defined test objectives. The evaluation function that accounts at least for one or more test objectives is dynamic in nature and may be easily changed, amended, or modified accordingly for different test iterations, thus ensuring that the one or more candidate test plans generated are optimised for the purpose and/or objectives they are generated for.
[0010] A preferred method of the present invention is a computer-implemented method as described above, wherein the test case set is selected from a test case library comprising test cases executed on previous versions of the software, test cases previously executed on similar software, newly defined test cases and/or generic test cases.
[0011] The above-described aspect of the present invention has the advantage that a wide variety of test cases is available from the test case library and may be considered and incorporated for the generation and optimisation of the test plan, thus potentially generating a test plan that is of higher quality.
[0012] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein optimising the initial population of candidate test plans comprises using an evolutionary algorithm to evolve the initial population of candidate test plans into an evolved population of candidate test plans based on the evaluation function.
[0013] The above-described aspect of the present invention has the advantage that optimised test plans may be generated while keeping the computational power required low. An evolutionary algorithm is robust and flexible and can capture global solutions of complex optimisation problems. Evolutionary algorithms are advantageous in optimisation, particularly in multi-objective optimisation such as test plan generation, as they can be applied to multiple objectives simultaneously and have a generally low computational cost, growing linearly with problem size. Evolutionary algorithms are also advantageous as they have a lower computational cost compared to other machine learning paradigms like deep neural networks.
[0014] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein each candidate test plan comprises a plurality of bit-strings, wherein each bit-string represents a feature of the candidate test plan and/or a feature of each test case that forms the test case set.
[0015] The above-described aspect of the present invention has the advantage that higher quality of test plans may be generated as the usage of a plurality of bit-strings allows all relevant features of each candidate test plan and/or test case to be faithfully captured as well as optimised during the optimisation step. In addition, simultaneously optimising the features represented by the plurality of bit-strings reduces the overall computational power required as compared to sequentially optimising the different features of a test plan. Simultaneously optimising the features represented by the plurality of bit-strings is also advantageous over sequentially optimising the different features of a test plan as the simultaneous optimisation reduces the number of conflicts. For example, where there are two objectives, the best combination of test cases and features for a first objective may perform poorly for a second objective. By running the optimisation simultaneously, an acceptable combination of test cases and features may be obtained that satisfies both objectives to a reasonable or acceptable extent.
[0016] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein each candidate test plan comprises a first bit-string denoting the one or more test cases selected to form such
candidate test plan, and at least one second bit-string each denoting a feature of each test case of the test case set, wherein the first bit-string is preferably a bit-string of length corresponding to the number of test cases in the test case set and denotes which test cases of the test case set were assigned to form the test plan, and each of the at least one second bit-string is preferably a bit-string representing the feature.
[0017] The above-described aspect of the present invention has the advantage that higher quality test plans may be generated as both the selected test cases of a test plan and the features of each test case of the test case set are faithfully captured as well as optimised during the optimisation step.
[0018] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein each of the at least one second bit-string denotes a number of test cycles for each test case.
[0019] The above-described aspect of the present invention has the advantage that higher quality test plans may be generated. By representing the number of test cycles as bit-strings, the number of test cycles may be subsequently optimised for each test case of a test plan, thus resulting in a more optimised test plan which allocates an appropriate number of test cycles for each test case, while keeping the total test plan implementation time low.
[0020] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the evaluation function comprises one or more reward terms and/or one or more weight functions that are determined based on the one or more test objectives.
[0021] The above-described aspect of the present invention has the advantage of generating higher quality and/or more customised and/or appropriate test plans by ensuring that the test objectives are satisfied while obeying any test constraints present though the incorporation of reward terms, as well as accounting for any relative importance of the test objectives through weight function(s).
[0022] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the one or more reward terms is determined based on a function that models a relationship between two or more test parameters.
[0023] The above-described aspect of the present invention has the advantage of generating higher quality test plans by optimising two or more test parameters that are correlated and/or have a relationship with each other together instead of separately to achieve the one or more test objectives. The parameters of test cases are usually not independent and are usually correlated or have some relationship with each other, which is accounted for with the incorporation of the function that models a relationship between parameters.
[0024] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the evaluation function comprises priority level and failure rate as positive reward terms and/or time spent as a negative reward term and/or one or more weight functions.
[0025] The above-described aspect of the present invention has the advantage of generating higher quality test plans. The test objectives may be accounted for using priority level and failure rate as positive reward terms while obeying the test constraint of run time (through the incorporation of time spent as a negative reward term).
[0026] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein the failure rate is determined using a function that models a relationship between a number of system failures and a number of test cycles.
[0027] The above-described aspect of the present invention has the advantage that higher quality test plans may be generated using a function that models a relationship between a number of system failures and a number of test cycles, which is particularly important for safety-relevant software (e.g., seatbelt systems), as multiple repetitions of a test may be required before any defects surface. The quality of the test plan is also enhanced as historical data is accounted for when building the model, thus providing a data-driven optimisation of the test plan.
[0028] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, further comprising: executing at least one test case of the one or more selected candidate test plans to generate test results; and identifying one or more flaws in the software based on the generated test results, wherein executing the at least one test case preferably comprises testing the software on a vehicle and generating test results using signals received from one or more vehicle components. [0029] The above-described aspect of the present invention has the advantage that flaws in a software may be identified using the results of the generated test plan, thus assisting in the creation of software with fewer flaws. Executing the at least one test case by testing on a vehicle and generating test results using signals received from the one or more vehicle components also ensures that the software would be safe for implementation on a vehicle. [0030] A preferred method of the present invention is a computer-implemented method as described above or as described above as preferred, wherein executing at least one of the one or more selected candidate test plans is carried out automatically, preferably using one or more automation scripts.
[0031] The above-described aspect of the present invention has the advantage that the implementation of the test plan may be automatically carried out without human intervention, thus increasing the efficacy of the implementation of the test plan.
[0032] The above-described advantageous aspects of a computer-implemented method of the invention also hold for all aspects of a below-described computing system of the invention. All below-described advantageous aspects of a computing system of the invention also hold for all aspects of an above-described computer-implemented method of the invention.
[0033] The invention also relates to a computing system for assessing flaws of a software, the computing system comprising one or more processors and memory storing one or more programs for execution by the one or more processors, the one or more programs including instructions for carrying out a computer-implemented method according to any of the preceding claims.
[0034] A preferred computing system of the present invention is a computing system as described above, wherein the computing system is connected to a vehicle and configured to receive signals from one or more vehicle components.
[0035] The above-described aspect of the present invention has the advantage that the test plan may be implemented on software for vehicle applications and may be tested based on signals from vehicle components, thus ensuring that the software for vehicle applications is safe to be implemented in vehicles. [0036] The above-described advantageous aspects of a computer-implemented method, computing system, data structure, or data processing system of the invention also hold for all aspects of a below-described computer program, a machine-readable storage medium, or a data carrier signal of the invention. All below-described advantageous aspects of a computer program, a machine-readable storage medium, or a data carrier signal of the invention also hold for all aspects of an above-described computer-implemented method, computing system, data structure, or data processing system of the invention. [0037] The invention also relates to a computer program, a machine-readable storage medium, or a data carrier signal that comprises instructions that, upon execution on a data processing device and/or control unit, cause the data processing device and/or control unit to perform the steps of a computer-implemented method according to the invention. The machine-readable medium may include any medium and/or mechanism for storing or transmitting information in a form readable by a machine (e.g., a computing device). The machine-readable medium may be any medium, such as, for example, read-only memory (ROM); random access memory (RAM); a universal serial bus (USB) stick; a compact disc (CD); a digital video disc (DVD); a data storage device; a hard disk; electrical, acoustical, optical, or other forms of propagated signals (e.g., digital signals, data carrier signal, carrier waves); or any other medium on which a program element as described above can be transmitted and/or stored.
[0038] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "test case" refers to a pre-defined testing procedure that tests certain functionalities of the software to be tested. An example is testing whether a seat belt detection system of a vehicle is working.
[0039] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "test cycle" represents a number of times a test case is conducted.
[0040] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "test plan" refers to a plurality of test cases that may be implemented to test a software. [0041] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "test objective" refers to the goals, reasons and/or purpose of the test execution, and may be defined based on one or more software features and/or modules to be tested. Examples of test objectives include functional correctness, authorisation, service level, safety, and usability.
[0042] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "test constraint" refers to factors or criteria that are placed on the test plan to restrict the number of test plan solutions that will meet the test objectives. Test constraints may include technical and/or business constraints. Examples of test constraints include scope, coverage, duration, hardware availability, resource availability, test techniques, and costs.
[0043] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "bit-string" refers to a sequence of bits or binary digits.
[0044] As used in this summary, in the description below, in the claims below, and in the accompanying drawings, the term "component" refers to any hardware that is a part or element of a vehicle and may be part of one or more systems within a vehicle. Examples of components include the engine, battery, alternator, brakes, radiator, transmission, shock absorbers, convertors, steering, electronic control units, suspension, sensors, etc.
BRIEF DESCRIPTION OF THE DRAWINGS
[0045] These and other features, aspects, and advantages will become better understood with regard to the following description, appended claims, and accompanying drawings where: [0046] Fig. 1 is a schematic illustration of a method of assessing flaws of a software, in accordance with embodiments of the present disclosure; [0047] Fig. 2 is a schematic illustration of an example of a candidate test plan, in accordance with embodiments of the present disclosure; [0048] Fig. 3 is a schematic illustration of the optimisation of candidate test plans using an evolutionary algorithm, in accordance with embodiments of the present disclosure; [0049] Fig. 4 is a schematic illustration of a system that may be used for the implementation of one or more test cases, in accordance with embodiments of the present disclosure; [0050] Fig. 5 illustrates an example of a computing system, in accordance with embodiments
of the present disclosure; and
[0051] Figs. 6 to 8 show the results of experiments conducted using methods in the present disclosure, in accordance with embodiments of the present disclosure.
[0052] In the drawings, like parts are denoted by like reference numerals.
[0053] It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and executed by a computer or processor, whether or not such computer or processor is explicitly shown.
DETAILED DESCRIPTION
[0054] In the summary above, in this description, in the claims below, and in the accompanying drawings, reference is made to particular features (including method steps) of the invention. It is to be understood that the disclosure of the invention in this specification includes all possible combinations of such particular features. For example, where a particular feature is disclosed in the context of a particular aspect or embodiment of the invention, or a particular claim, that feature can also be used, to the extent possible, in combination with and/or in the context of other particular aspects and embodiments of the invention, and in the inventions generally.
[0055] In the present document, the word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any embodiment or implementation of the present subject matter described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
[0056] While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and will be described in detail below. It should be understood, however, that it is not intended to limit the disclosure to the forms disclosed; on the contrary, the disclosure is to cover all modifications, equivalents, and alternatives falling within the scope of the disclosure.
[0057] The present disclosure is directed to computer-implemented methods, computing systems, computer programs, machine-readable storage media, and data carrier signals for the assessment of flaws of a software by generating and/or implementing a test plan for the testing of the software. The test plan is optimised based on an evaluation function that may be defined and customised based on the test objectives, test constraints, and/or any relative importance between the test objectives. The test plan may be implemented manually or automatically by machines. The method, according to the principles, can be used for the testing of any type of software, including automotive-related software.
[0058] According to various embodiments, the test plans may be initialised from a plurality of test cases. The test plans may be optimised using an evolutionary algorithm. The test plans may also be encoded using multiple bit-strings such that each test plan candidate is faithfully captured and optimised. The bit-strings may denote any information or features relevant to a test plan, including test case set(s), the number of test cycles for each test case, the run time for each test case, the priority of each test case, etc. [0059] Fig. 1 is a schematic illustration of a method of assessing flaws of a software, in accordance with embodiments of the present disclosure. Method 100 for assessing flaws of a software may be implemented by a data processing device on any architecture and/or computing system. For example, various architectures employing, for example, multiple integrated circuit (IC) chips and/or packages, and/or various computing devices and/or consumer electronic (CE) devices, such as multi-function devices, tablets, smart phones, etc., may implement the techniques and/or arrangements described herein. Method 100 may be stored as executable instructions that, upon execution on a data processing device and/or control unit, cause the data processing device and/or control unit to perform the steps of method 100.
[0060] According to some embodiments, method 100 for assessing flaws of a software may comprise step 108 wherein a test case set for assessing a software may be defined. In some embodiments, the test case set may be defined based on the software to be assessed. For example, the test case set may comprise test cases for the testing of automotive functions for a software to be used for an automotive application. In some embodiments, the test case set may be defined based on the test objectives, the software features and/or functions to be tested, and/or the stage or phase of software development. The test objectives may be any test objectives, including technical and/or business objectives. In some embodiments, the test case set may comprise a plurality of test cases, and each test case may be associated with at least one test parameter defining such test case. A test case set may comprise any number of test cases. The at least one test parameter may define technical and/or business requirements. In some embodiments, each test case may be associated with a unique ID, resources required (e.g., the cost, manpower effort and time cost for executing the test case), allocated time (e.g., test duration, the allocated hours for running the test case), performance (e.g., scope, the test coverage of the test case), or some combination thereof. Examples of test parameters also include, but are not limited to, test case name, test case ID, test case priority, test type, test setup, test setup effort, module/source, test script name, test script path, pre-conditions, test steps/variations, expected results, and resources required. In some embodiments, each test case may be assigned a priority based on the test objectives, the software features or functions to be tested, the stage or phase of software development, and/or the results of previous implementations of the test cases, etc. In some embodiments, the test case may be any type of test case, including functionality test cases, performance test cases, unit test cases, user interface test cases, security test cases, integration test cases, database test cases, usability test cases, user acceptance test cases, regression test cases, lifecycle tests, stress tests, or any combination thereof. In some embodiments, the test case may be selected from a test case library comprising test cases executed on previous versions of the software, test cases previously executed on similar software, newly defined test cases and/or generic test cases. A newly defined test case may be a test case that is newly created, designed, or defined to test the software and has yet to be implemented and/or executed. New test cases may be defined and added to the test case library in situations where there are new requirements or features and the test cases in the test case library do not cover such new requirements. A generic test case may be a test case defined for features that may be common across different software, applications, and/or product lines, and may be reused and/or readapted with minimal effort. Examples of generic test cases include test cases designed to test functionalities such as backup battery, Bluetooth, and emergency call. In some embodiments, the test case library may be stored on a data storage device and retrieved when implementing method 100.
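A minimal sketch of how a test case and its defining parameters might be represented in such a test case library; the field names below are illustrative assumptions drawn from the parameters listed above, not a prescribed schema.

```python
from dataclasses import dataclass, field

@dataclass
class TestCase:
    """Illustrative record for one entry in a test case library."""
    case_id: str                 # unique ID
    name: str                    # test case name
    priority: float              # e.g. in [0, 1], set from the test objectives
    module: str                  # module/source under test
    setup_effort_h: float        # effort to prepare the test setup, in hours
    run_time_min: float          # allocated running time per cycle, in minutes
    coverage: float              # fraction of code covered by this case
    preconditions: list[str] = field(default_factory=list)
    steps: list[str] = field(default_factory=list)

# Hypothetical example entry in a test case library
seatbelt_case = TestCase(
    case_id="TC-042", name="Seat belt detection", priority=0.9,
    module="safety", setup_effort_h=1.5, run_time_min=20.0, coverage=0.04,
    preconditions=["Ignition ON"], steps=["Buckle belt", "Check warning lamp off"],
)
```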
[0061] According to some embodiments, method 100 may comprise step 116 wherein an initial population of candidate test plans is initialised. The initial population of candidate test plans may comprise any number of candidate test plans. In some embodiments, each test plan may comprise one or more test cases selected from the defined test case set from step 108. Each test plan may comprise any number of test cases selected from the defined test case set. For example, where the defined test case set comprises 10 test cases, a first candidate test plan may comprise 4 out of the 10 test cases in the defined test case set, a second candidate test plan may comprise 1 out of the 10 test cases in the defined test case set, a third candidate test plan may comprise 10 out of the 10 test cases in the defined test case set, and so on. In some embodiments, the one or more test cases may be randomly selected from the defined test case set from step 108. In some embodiments, the one or more test cases may be a subset of test cases correlated to the test objective(s) that were preselected from the defined test case set from step 108 via feature engineering. In some embodiments, each candidate test plan may comprise, or may be represented by, a plurality of bit-strings, wherein each bit-string may represent a feature of the candidate test plan and/or a feature of each test case of the test case set. This ensures that all relevant features, properties, and/or dimensions of each candidate test plan and/or test case can be faithfully captured and optimised in subsequent steps.
[0062] In some embodiments, each candidate test plan may comprise a first bit-string denoting the one or more test cases selected for such candidate test plan. In some embodiments, the first bit-string may be a bit-string of length corresponding to the number of test cases in the test case set and may denote which test cases of the test case set were selected to form the test plan. In some embodiments, each test case of the test case set may be assigned with a unique index and may be represented by '1' if such test case is selected for the test plan and represented by '0' if such test case is not selected for the test plan, or vice versa. It is contemplated that any other representation may also be employed.
[0063] In some embodiments, each candidate test plan may comprise at least one second bit-string each denoting a feature of each test case of the test case set and/or feature of the candidate test plan. In some embodiments, each of the at least one second bit-string may be an 8-bit bit-string. In some embodiments, a feature of a test case may be a test parameter or may be calculated based on one or more test parameters. Examples of features include test priority, test bench availability, tester availability, hardware availability, and module tested. In some embodiments, the at least one second bit-string may denote a number of test cycles for each test case of the test case set. In some embodiments, the number of test cycles may be encoded as an 8-bit bit-string.
[0064] Fig. 2 is a schematic illustration of an example of a candidate test plan, in accordance with embodiments of the present disclosure. In some embodiments, a candidate test plan 200 may comprise a first bit-string 208 denoting the one or more test cases selected to form such candidate test plan 200. The length of the first bit-string 208 may correspond to the length of the test case set defined in step 108. The first bit-string 208 illustrated in Fig. 2 has a length of 5, indicating a defined test case set of 5 test cases. The first bit-string 208 illustrated in Fig. 2 is [0, 1, 0, 0, 1], which represents that the second and fifth test cases of the defined test case set were selected for test plan 200 (denoted as '1'), and wherein the first, third, and fourth test cases of the defined test case set were not selected for test plan 200 (denoted as '0'). Although selected cases were denoted as '1' and unselected cases were denoted as '0' in Fig. 2, it is contemplated that any other method of denoting the selection of test cases may be employed. In some embodiments, candidate test plan 200 may comprise at least one second bit-string 216 each denoting a feature of each test case of the test case set. Two second bit-strings 216 are illustrated in Fig. 2. Second bit-string 216a represents a feature of the first test case of the defined test case set, while second bit-string 216b represents a feature of the second test case of the defined test case set. It is noted that features of test cases may be included even though a specific test case was not selected for the test plan.
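Under the encoding of Fig. 2, a candidate plan might be held as a selection bit-string of length 5 plus one 8-bit cycle bit-string per test case; the following Python sketch is an illustrative assumption of such an encoding, not a prescribed data format.

```python
import random

NUM_TEST_CASES = 5   # size of the defined test case set (as in Fig. 2)
CYCLE_BITS = 8       # each test-cycle count encoded as an 8-bit string

def random_candidate_plan():
    """Encode a candidate plan as a first (selection) bit-string plus one second
    (cycle) bit-string per test case, including unselected cases as noted above."""
    selection = [random.randint(0, 1) for _ in range(NUM_TEST_CASES)]
    cycles = [[random.randint(0, 1) for _ in range(CYCLE_BITS)]
              for _ in range(NUM_TEST_CASES)]
    return selection, cycles

def decode_cycles(bits):
    """Interpret an 8-bit string as an integer number of test cycles."""
    return int("".join(str(b) for b in bits), 2)

selection, cycles = random_candidate_plan()
chosen = [i for i, bit in enumerate(selection) if bit == 1]
plan = {i: decode_cycles(cycles[i]) for i in chosen}  # test-case index -> cycle count
```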
[0065] Returning to Fig. 1, according to some embodiments, method 100 may comprise step 124 wherein the initial population of candidate test plans is optimised based on an evaluation function that accounts at least for one or more test objectives. In general, the test plans may be optimised by determining an evaluation function score for each of the test plans and optimising the test plans by maximising the evaluation function score calculated using the evaluation function. Optimisation of the initial population of candidate test plans may be carried out using any known methods, including any known algorithms and/or machine learning methods (e.g., neural networks or deep learning). In some embodiments, the initial population may be optimised using an evolutionary algorithm to evolve the initial population of candidate test plans into an evolved population of candidate test plans based on the evaluation function. [0066] In some embodiments, the evaluation function may comprise one or more reward terms that are determined based on the one or more test objectives. The reward terms may be positive reward terms or negative reward terms (also termed penalty terms). Optionally, the evaluation function may comprise one or more weight functions determined based on the one or more test objectives, wherein the one or more weight functions control the relative importance of the one or more reward terms. The one or more weight functions may be provided and/or adjusted based on the required testing conditions and/or test objectives.
[0067] Given a test plan p = {c_1, c_2, ..., c_n, r_1, r_2, ..., r_n} comprised of n test cases, denoted by c_i, and their corresponding run times r_i, an example of an evaluation function is Equation (1) as follows:

Quality(p) = Σ_{c_i ∈ p} [ coverage(c_i) + time_reward(c_i) + fail_rate(c_i, r_i) ]     (1)

where coverage reflects the percentage of code that is covered by each test case, and time_reward is calculated as 1 − abs(T_p − L), where T_p = Σ_{c_i ∈ p} r_i is the total testing time for test plan p and L is the time constraint. Using the absolute value of the difference between the recommended testing time and the time limitation may push the method to generate test plans that make full use of the time allocated (within the time constraint) instead of generating test plans having small running times r_i. The fail_rate function takes as input the r_i, which is required to measure the fail rate of a test case. In this manner, each test case c_i is measured by multiple considerations and the quality of a test plan Quality(p) is measured by summing up the quality of all constituent test cases. Equation (1) may further include one or more weight functions and may be expressed as Equation (2) as follows:

Quality(p) = Σ_{c_i ∈ p} [ α · coverage(c_i) + β · time_reward(c_i) + γ · fail_rate(c_i, r_i) ]     (2)

where α, β and γ are weights controlling the relative importance of the various reward terms.
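A hedged Python sketch of Equation (2); the dictionary fields "coverage" and "fail_rate", and the way fail_rate(c_i, r_i) is approximated from a stored per-case rate, are assumptions made purely for illustration.

```python
def quality(plan, time_limit, alpha=1.0, beta=1.0, gamma=1.0):
    """Sketch of Equation (2): weighted coverage, time reward and fail-rate terms.

    `plan` is a list of (case, run_time) pairs, where `case` is a dict with
    "coverage" and historical "fail_rate" entries (illustrative field names).
    """
    total_time = sum(run_time for _, run_time in plan)
    time_reward = 1.0 - abs(total_time - time_limit)  # rewards full use of the budget
    score = 0.0
    for case, run_time in plan:
        score += alpha * case["coverage"]
        score += beta * time_reward
        score += gamma * case["fail_rate"] * run_time  # simple stand-in for fail_rate(c_i, r_i)
    return score
```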
[0068] In some embodiments, the one or more reward terms may be determined based on a function that models a relationship between two or more test parameters. The function used to model the relationship between two or more test parameters may be any known algorithm and/or machine learning model. In some embodiments, a regression model may be used to model a relationship between two or more test parameters. In some embodiments, a belief model may be used to model a relationship between two or more test parameters. In some embodiments, the function used to model the relationship between two or more test parameters may be built using test records comprising information previously obtained for each test case, such as the test case name, priority, number of test cycles, running time, and result (e.g., pass or fail). In some embodiments, the function modelling the relationship may leverage a geometric distribution model generated based on past test records, from which a confidence value is subsequently determined.
[0069] In some embodiments, the evaluation function may comprise priority level and failure rate as positive reward terms, and/or time spent as a negative reward term. In some embodiments, the evaluation function may further comprise one or more weight functions. The priority level may be provided, adjusted, assigned and/or defined according to different test objectives. For example, where the test objective is to test safety features or functions of a software for a vehicle, test cases that are designed to test safety functions (e.g., a seat belt detection system) may be assigned higher priority. The failure rate corresponds to how often the test case has failed and is derived from past results of running such test case. In some embodiments, the failure rate may be determined by dividing the number of failures by the number of test cycles. In some embodiments, the failure rate may be determined using a function that models a relationship between a number of system failures and a number of test cycles. For example, a belief model may be constructed to estimate the dependence of system failures on the number of test cycles.
[0070] For example, a belief model to estimate the dependence of system failures on the number of test cycles may be constructed by leveraging a geometric distribution, which describes the probability distribution of the number of failures before the first success. Specifically, success means finding defects while failure means passing the test (i.e., without any defects found). In general, a belief model may be generated for each type of test case based on past test records for the test case. The past test records may be generated based on results obtained from running tests on previous software releases. The geometric distribution gives the probability of the first success after k failed trials. For example, the probability of the first success after k failed trials may be expressed as P(Y = k) = p(1 − p)^k, where p is the success probability of one trial, which may be unknown to human experts. For each test case, given a set of past records {x_1, x_2, x_3, ..., x_n}, where x_t represents the number of trials to get one success in the t-th experiment, the probability p may be estimated using Equation (3) as follows:

p̂ = n / Σ_{t=1}^{n} x_t     (3)

[0071] In some embodiments, a confidence value C(·) ∈ [0, 1] may be used to represent how likely defects may be found by running a certain test case for r_i cycles. The cumulative distribution may be calculated using the p̂ estimated for each test case and used as the confidence value using Equation (4) as follows:

C((tc_i, r_i)) = 1 − (1 − p̂_{tc_i})^{r_i}     (4)

where tc_i is the i-th test case and r_i is its number of test cycles. [0072] In some embodiments, the evaluation function may comprise priority level and failure rate as positive reward terms, and/or time spent as a negative reward term, and/or one or more weight functions. An example of such an evaluation function is Equation (5) as follows:

E(t) = γ · penalty(t) + Σ_{(tc_i, r_i) ∈ t} [ α · C((tc_i, r_i)) + β · priority(tc_i) ]     (5)

where t = {(tc_1, r_1), (tc_2, r_2), ..., (tc_n, r_n)} is the test plan composed of n test cases, tc_i denotes the selected test case, and r_i denotes the corresponding number of test cycles. The time negative reward term is defined as penalty(t) = min(0, L − T_t), where T_t is the total testing time for test plan t and L is the time budget allocated for the current test plan. When the total time of a test plan exceeds the time budget, it penalises the evaluation function score; otherwise, this term is zero, i.e., it has no effect on the evaluation. The three coefficients α, β, γ are the changeable weight functions controlling the relative importance of the various terms, which can be freely adjusted to reflect realistic testing conditions. The confidence value and priority may be calculated based on test records. Each test record may comprise the following: * Test Case Name: The unique identifier of each test case.
* Priority: Test priority (a real number in [0, 1] where 0 denotes the lowest priority) is set based on test objectives.
* Test Cycles: The number of test cycles conducted.
* Running Time: The total test time (minutes) spent for running this test case.
* Result: A binary value indicates the result: pass or fail.
Given the test records, the probability p̂_{tc_i} for each test case tc_i may be estimated according to Equation (3). The belief model may be constructed and thus the confidence value may be calculated according to Equation (4).
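The belief model and evaluation function of Equations (3) to (5) might be sketched in Python as follows; the per-case fields ("p_hat", "priority", "run_time_min") are illustrative assumptions rather than a prescribed record layout.

```python
def estimate_success_probability(trial_counts):
    """Equation (3): MLE of the per-trial success probability of a geometric model,
    given the number of trials needed to find a defect in each past experiment."""
    return len(trial_counts) / sum(trial_counts)

def confidence(p_hat, cycles):
    """Equation (4): probability of finding at least one defect within `cycles` runs."""
    return 1.0 - (1.0 - p_hat) ** cycles

def evaluate_plan(plan, time_budget, alpha=1.0, beta=1.0, gamma=1.0):
    """Equation (5): confidence and priority rewards plus a time-budget penalty.

    `plan` is a list of (test_case, cycles) pairs; each test_case is a dict with
    "priority", "run_time_min" and "p_hat" entries (illustrative field names)."""
    total_time = sum(case["run_time_min"] * cycles for case, cycles in plan)
    penalty = min(0.0, time_budget - total_time)  # zero within budget, negative otherwise
    score = gamma * penalty
    for case, cycles in plan:
        score += alpha * confidence(case["p_hat"], cycles)
        score += beta * case["priority"]
    return score
```

For instance, with past records [3, 5, 2] (a defect found on the 3rd, 5th and 2nd trial respectively), the estimated per-trial success probability is 3/10 = 0.3, and running that test case for 4 cycles gives a confidence of 1 − 0.7^4 ≈ 0.76.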
[0073] According to some embodiments, method 100 may comprise step 132 wherein one or more candidate test plans with the best evaluation function score or scores respectively are selected. In general, the aim is to select the candidate test plan(s) that maximises the evaluation function score.
[0074] According to some embodiments, method 100 may comprise step 140 wherein at least one test case of the one or more selected candidate test plans is executed to generate test results. In some embodiments, the at least one test case may be executed manually. In some embodiments, the at least one test case may be executed automatically, preferably using one or more automation scripts. Examples of test cases that may be executed automatically include BUB (backup battery test), FOTA (firmware over the air), ATB (engine cranking, voice call), and AFT (diagnostic DTC). In some embodiments, where the software is for a vehicle application, execution of the at least one test case may comprise embedding and/or testing the software on a vehicle. In such embodiments, the test results may be generated using signals received from one or more vehicle components.
[0075] According to some embodiments, method 100 may comprise step 148 wherein one or more flaws are identified in the software based on the test results generated in step 140.
For example, when running the BUB test, if the test results indicate that the back-up battery is not actively supplying power or is not deactivating when the power has been turned off, flaws may be identified in the backup battery module (i.e., the module that controls the backup battery) of the software. The implementation of steps 140 and 148 is discussed in further detail in relation to Fig. 4.
[0076] Fig. 3 is a schematic illustration of the optimisation of candidate test plans using an evolutionary algorithm, in accordance with embodiments of the present disclosure. According to some embodiments, an evolutionary algorithm may be used in step 124 of method 100 to evolve the initial population of candidate test plans initialised in step 116 into an evolved population of candidate test plans based on the evaluation function. In particular, the aim of the evolutionary algorithm is to evolve the test plans in a direction that maximises the evaluation function score calculated based on the evaluation function. In general, an evolutionary algorithm comprises four stages: an initialisation stage 308, an evolution stage 316, an evaluation stage 324, and a selection stage 332, and the stages are repeated over multiple cycles or iterations until a predefined number of cycles or iterations is reached, or the solution converges (i.e., no visible changes are observed in the test plans after a certain number of cycles or iterations).
[0077] According to some embodiments, the initialisation stage 308 may correspond with steps 108 and 116 of method 100, wherein test case set 340 may be defined from a test case library 348 and encoded as a plurality of bit-strings, followed by the initialisation of an initial population 356 of candidate test plans, each test plan represented as an uncoloured circle in Fig. 3. Although Fig. 3 illustrates 10 candidate test plans in the initial population 356, it is contemplated that the initial population 356 of candidate test plans may comprise any number of candidate test plans. Although Fig. 3 illustrates that the encoding comprises a test case set and test cycles associated with each test case of the test case set, it is contemplated that the number of bit-strings and information encoded by the bit-strings may be adjusted based on the requirements of the tester and/or test objectives.
[0078] According to some embodiments, the candidate test plans may be evolved in the evolution stage 316. The evolution stage 316 may comprise an evolutionary procedure 364 comprising a tournament selection 372, crossover 376, and/or mutation 380. During tournament selection 372, two candidate test plans are selected from the initial population of candidate test plans and designated as parent test plans (indicated as 'Parent A' and 'Parent B'). In some embodiments, the two candidate test plans selected and designated as the parent test plans may be selected randomly. In some embodiments, the two candidate test plans selected and designated as the parent test plans may be the two candidate test plans with the highest evaluation function scores. Offspring test plans are generated from the parent test plans 'Parent A' and 'Parent B' through crossover 376 and/or mutation 380. Crossover 376 comprises the exchange of values between corresponding bit-strings of 'Parent A' and 'Parent B' (illustrated as double-headed arrows), while mutation 380 comprises the random conversion of '0' values to '1', and '1' values to '0' (illustrated as rectangles). Additional information on crossover and mutation may be found in "Generating Behavior-Diverse Game AIs with Evolutionary Multi-Objective Deep Reinforcement Learning" by Shen et al. In some embodiments, simulated binary crossover (SBX) may be used for crossover 376, and random bit selection may be used for mutation 380. Any number of offspring test plans may be generated. In some embodiments, the number of offspring test plans generated may be determined using hyperparameters and/or may be user-defined.
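A simplified Python sketch of one iteration of stages 316 to 332, treating each candidate plan as a single flat bit-string; single-point crossover and bit-flip mutation are used here purely for illustration in place of the SBX and random-bit-selection operators mentioned above, and the helper names are assumptions.

```python
import random

def tournament_select(population, scores, k=2):
    """Tournament selection 372: pick the better of k randomly drawn candidates."""
    contenders = random.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]

def crossover(parent_a, parent_b):
    """Single-point crossover: exchange values between corresponding bit-strings."""
    point = random.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:], parent_b[:point] + parent_a[point:]

def mutate(bits, rate=0.05):
    """Mutation 380: randomly flip '0' values to '1' and '1' values to '0'."""
    return [1 - b if random.random() < rate else b for b in bits]

def next_generation(population, evaluate, elite=2):
    """One cycle: evolve offspring, evaluate them, and merge the best of both."""
    scores = [evaluate(ind) for ind in population]
    offspring = []
    while len(offspring) < len(population):
        parent_a = tournament_select(population, scores)
        parent_b = tournament_select(population, scores)
        child_a, child_b = crossover(parent_a, parent_b)
        offspring.extend([mutate(child_a), mutate(child_b)])
    best_parents = sorted(population, key=evaluate, reverse=True)[:elite]
    best_offspring = sorted(offspring, key=evaluate, reverse=True)[:len(population) - elite]
    return best_parents + best_offspring
```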
[0079] In some embodiments, the offspring test plans generated may be evaluated at the evaluation stage 324 based on the evaluation function and, optionally, a belief model 384 generated based on test records stored on a test record database 388. The evaluation function used may be the above-described Equation (5), although other evaluation functions may be used.
[0080] In some embodiments, the offspring test plans may then be selected in selection stage 332, wherein a population 392 of offspring test plans (illustrated as patterned circles) with the highest evaluation function scores is selected, and the population of selected offspring test plans is merged with a number of test plans from the initial population 356 of candidate test plans with the highest evaluation function scores to form a next generation population 396. The population 392 of offspring test plans may comprise any number of offspring test plans. The next generation population 396 may comprise any number of test plans from the initial population 356 of candidate test plans. In some embodiments, the number of offspring test plans and/or test plans from the initial population of candidate test plans may be determined using hyperparameters and/or may be user-defined. The next generation population 396 is then designated as the initial population 356 of candidate test plans for the next iteration or cycle.
[0081] Fig. 4 is a schematic illustration of a system that may be used for the implementation of one or more test cases, in accordance with embodiments of the present disclosure. System 400 may be used to implement one or more steps of method 100. In some embodiments, system 400 may comprise one or more processors 408 configured to carry out step 140 of method 100. In some embodiments, system 400 may be installed in a vehicle 416 for testing the software to be tested and executing one or more test cases on such software on the vehicle. In some embodiments, executing the one or more test cases on such software on the vehicle may comprise the control of one or more vehicle components 424 and/or receipt of signals from the one or more vehicle components 424. An example of a test case that may be run on a vehicle is a test for emergency call (safety system), wherein airbag pulse width modulation (PWM) and controller area network (CAN) signals are collected and used by a data communication module (DCM) to make collision detection decisions, in which an Advanced Collision Notification Call feature will be triggered (ECall). Another example of a test case that may be run on a vehicle is a test for the backup battery (BUB), wherein an Arduino platform is used to control a data communication module (DCM) with coding, and a system on chip (SOC) and vehicle microcontroller (VUC) are used to monitor the DCM activity. The preconditions for such a BUB test are that the backup battery is connected, the DCM power is ON, the ignition (IGN) is ON, and the accessory (ACC) is ON. The steps for such a BUB test are as follows:
1. Check that the serial ports are not occupied by any communication channel other than the automation test platform (ATP) (to avoid failures due to occupied ports).
2. Turn off DCM power (IGN and ACC still on).
3. ATP waits for 10 seconds.
4. Check that the BUB is actively supplying power.
5. ATP waits for 5 seconds.
6. Turn on DCM power (IGN and ACC still on).
7. Check that the BUB is inactive.
If the check in step 4 or the check in step 7 fails, flaws in the BUB module of the tested software might be identified. In some embodiments, the flaws may be identified by further attention of the tester. The flaws may be in the software and/or hardware.
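The sequence above lends itself to scripted automation; the following sketch shows how the seven steps might be automated, where the `atp` object and every one of its methods (serial_ports_busy, set_dcm_power, bub_is_supplying_power) are hypothetical placeholders rather than a real platform API.

```python
import time

def run_bub_test(atp):
    """Hypothetical automation of the BUB test steps listed above; `atp` stands
    for an automation-test-platform object whose methods are assumed, not real."""
    if atp.serial_ports_busy():                  # step 1: ports must be free for the ATP
        return {"result": "blocked", "reason": "serial ports occupied"}

    atp.set_dcm_power(False)                     # step 2: DCM off, IGN and ACC stay on
    time.sleep(10)                               # step 3: wait 10 s
    supplying = atp.bub_is_supplying_power()     # step 4: BUB should be active

    time.sleep(5)                                # step 5: wait 5 s
    atp.set_dcm_power(True)                      # step 6: DCM back on
    inactive = not atp.bub_is_supplying_power()  # step 7: BUB should be inactive again

    passed = supplying and inactive
    return {"result": "pass" if passed else "fail",
            "step4_bub_active": supplying, "step7_bub_inactive": inactive}
```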
[0082] Fig. 5 illustrates an example of a computing system, in accordance with embodiments of the present disclosure. Computing system 500 can be used, for example, for one or more steps of method 100. Computing system 500 can be used for one or more of the components of system 400 of Fig. 4. System 500 can be a computer connected to a network. System 500 can be a client or a server. As shown in Fig. 5, system 500 can be any suitable type of processor-based system, such as a personal computer, workstation, server, handheld computing device (portable electronic device) such as a phone or tablet, or an embedded system or other dedicated device. The system 500 can include, for example, one or more of input device 520, output device 530, one or more processors 510, storage 540, and communication device 560. Input device 520 and output device 530 can generally be either connected to or integrated with the computing system 500. In some embodiments, storage 540 may store the test case library 348 and/or the test record database 388.
[0083] Input device 520 can be any suitable device that provides input, such as a touch screen, keyboard or keypad, mouse, gesture recognition component of a virtual/augmented reality system, or voice-recognition device. Output device 530 can be or include any suitable device that provides output, such as a display, touch screen, haptics device, virtual/augmented reality display, or speaker.
[0084] Storage 540 can be any suitable device that provides storage, such as an electrical, magnetic, or optical memory including a RAM, cache, hard drive, removable storage disk, or other non-transitory computer readable medium. Communication device 560 can include any suitable device capable of transmitting and receiving signals over a network, such as a network interface chip or device. The components of the computing system 500 can be connected in any suitable manner, such as via a physical bus or wirelessly.
[0085] Processor(s) 510 can be any suitable processor or combination of processors, including any of or any combination of a central processing unit (CPU), graphics processing unit (GPU), field programmable gate array (FPGA), and application-specific integrated circuit (ASIC). Software 550, which can be stored in storage 540 and executed by one or more processors 510, can include, for example, the programming that embodies the functionality or portions of the functionality of the present disclosure (e.g., as embodied in the devices described above). For example, software 550 can include one or more programs for execution by one or more processor(s) 510 for performing one or more of the steps of method 100.
[0086] Software 550 can also be stored and/or transported within any non-transitory computer-readable storage medium for use by or in connection with an instruction execution system, apparatus, or device, such as those described above, that can fetch instructions associated with the software from the instruction execution system, apparatus, or device and execute the instructions. In the context of this disclosure, a computer-readable storage medium can be any medium, such as storage 540, that can contain or store programming for use by or in connection with an instruction execution system, apparatus, or device.
[0087] Software 550 can also be propagated within any transport medium for use by or in connection with an instruction execution system, apparatus, or device, such as those described above, that can fetch instructions associated with the software from the instruction execution system, apparatus, or device and execute the instructions. In the context of this disclosure, a transport medium can be any medium that can communicate, propagate or transport programming for use by or in connection with an instruction execution system, apparatus, or device. The transport computer readable medium can include, but is not limited to, an electronic, magnetic, optical, electromagnetic, or infrared wired or wireless propagation medium.

[0088] System 500 may be connected to a network, which can be any suitable type of interconnected communication system. The network can implement any suitable communications protocol and can be secured by any suitable security protocol. The network can comprise network links of any suitable arrangement that can implement the transmission and reception of network signals, such as wireless network connections, T1 or T3 lines, cable networks, DSL, or telephone lines.
[0089] System 500 can implement any operating system suitable for operating on the network. Software 550 can be written in any suitable programming language, such as C, C++, Java, or Python. In various embodiments, application software embodying the functionality of the present disclosure can be deployed in different configurations, such as in a client/server arrangement or through a Web browser as a Web-based application or Web service, for example.

[0090] Figs. 6 to 8 show the results of experiments conducted using methods in the present disclosure, in accordance with embodiments of the present disclosure. Figs. 6a to 6c show the results of test plans generated using an evolutionary algorithm based on Equation (1) as compared to test plans generated using random selection methods. As illustrated in Fig. 6a, the method disclosed in the present disclosure is able to learn and discard solutions with low fail-rates and converges quickly as compared to a random selection method, thus producing test plans with high fail-rates. As illustrated in Fig. 6b, the method disclosed in the present disclosure is able to retain solutions that maximise the allocated resources without violating any time constraints as compared to a random selection method. Although Fig. 6c illustrates that the method disclosed in the present disclosure may generate test plans with lower coverage as compared to random selection methods, the coverage is satisfactory in view of the high fail-rate and the adherence to the testing time constraints.
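For context, the random-selection baseline against which the evolutionary approach is compared in Figs. 6a to 6c could take a form such as the sketch below. This is an assumption made only to illustrate the comparison: the selection probability, iteration count, and function names are not taken from the experiments.

```python
import random

# Hypothetical sketch of a random-selection baseline: each candidate test plan
# is a random subset of the test case set, with no learning across iterations.

def random_test_plan(test_case_set, rng=random):
    return [case for case in test_case_set if rng.random() < 0.5]

def random_search(test_case_set, evaluate, n_iterations=100):
    best_plan, best_score = None, float("-inf")
    for _ in range(n_iterations):
        plan = random_test_plan(test_case_set)
        score = evaluate(plan)
        if score > best_score:
            best_plan, best_score = plan, score
    return best_plan, best_score
```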
[0091] Figs. 7a to 7c show the results of test plans generated using an evolutionary algorithm based on Equation (2) as compared to test plans generated using random selection methods. "Equal" means that the reward terms were treated equally. In the experiment, the values were set as α = 1/3, β = 1/3, and γ = 1/3 for "Equal". "Fail" means that the method prioritised failure rate by setting γ higher than α and β. In the experiment, the values were set as α = 0.1, β = 0.1, and γ = 0.8 for "Fail". "Time" means that the method prioritised test time constraints by setting β higher than α and γ. In this experiment, the values were set as α = 0.1, β = 0.8, and γ = 0.1 for "Time". "Coverage" means that the method prioritised coverage by setting α higher than γ and β. In this experiment, α = 0.8, β = 0.1, and γ = 0.1 for "Coverage". As illustrated in Figs. 7a to 7c, when the weight functions were treated equally, the test plans generated found a balance among the failure rate, testing time, and coverage. When the weight functions were adjusted, the test plans generated had better results for the reward term with the higher weightage. For example, when γ is set much higher than α and β in "Fail", the method generates test cases with a high failure rate while ignoring coverage and test time.
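Equation (2) itself is defined earlier in the description; purely to illustrate how the weight settings above could be applied, a minimal sketch of a weighted evaluation is given below. The function name weighted_score, the mapping of α, β, γ to coverage, time, and failure-rate reward terms (taken from the prioritisation described above), and the additive combination are assumptions for this example only.

```python
# Illustrative sketch only: applying the weight presets used in the experiment
# to coverage, time, and failure-rate reward terms. The exact form of
# Equation (2) is given earlier in the description.

WEIGHT_PRESETS = {
    "Equal":    {"alpha": 1 / 3, "beta": 1 / 3, "gamma": 1 / 3},
    "Fail":     {"alpha": 0.1, "beta": 0.1, "gamma": 0.8},
    "Time":     {"alpha": 0.1, "beta": 0.8, "gamma": 0.1},
    "Coverage": {"alpha": 0.8, "beta": 0.1, "gamma": 0.1},
}

def weighted_score(coverage_reward, time_reward, fail_rate_reward, preset="Equal"):
    """Combine the reward terms with the chosen weight preset."""
    w = WEIGHT_PRESETS[preset]
    return (w["alpha"] * coverage_reward
            + w["beta"] * time_reward
            + w["gamma"] * fail_rate_reward)
```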
[0092] Figs. 8a to 8c show the results of test plans generated using various embodiments of the present disclosure based on Equation (5). In all the embodiments, the weight functions α, β, and γ were set to equal values, indicating that all terms were treated equally. "EA-only" means that the method used an evolutionary algorithm, and the test plans were each encoded with a single bit-string. "EA-Belief" means that the method used an evolutionary algorithm, incorporated a belief model, and the test plans were each encoded with a single bit-string.
"EA-Encoding" means that the method used an evolutionary algorithm and the test plans were each encoded with a plurality of bit strings, in particular a first bit-string denoting the one or more test cases randomly selected to form such test plan, and at least one second bit-string denoting a number of test cycles for each test case. "EA-Encoding-Belief' means that the method used an evolutionary algorithm, incorporated a belief model, and the test plans were each encoded with a plurality of bit strings, in particular a first bit-string denoting the one or more test cases randomly selected to form such test plan, and at least one second bit-string denoting a number of test cycles for each test case. As illustrated in Figs. 8a to 8c, the "EA-Encoding-Belief' method had the best performance (at least 50% improvement) in meeting the test objectives (high defect detection and high test priorities) while obeying the time limit. Specifically, by comparing the results from "EA-Only" and "EA-Encoding", we can see that encoding the test plans as a plurality of bit-strings is able to encode the information to satisfy the test objectives. The "EA-Only" method appeared to only optimize test priority and time limit while showing poor performance in discovering defects. By encoding the test plans as a plurality of bit-strings, the "EA-Encoding" method is able to balance the performance between defect discovery and test priority while obeying the time limit. However, without the belief model, the "EA-Encoding" method may assign inefficient test cycles for the test cases to detect defects. Additionally, the -EA-Encoding' method appears to also assigns unnecessary numbers of test cycles to some test cases, leaving no time budget for selecting the remaining high-priority test cases. Therefore, with the incorporation of the belief model, the -EA-Encoding-Belief' method is able to assign suitable numbers of test cycles to the test cases, leading to a higher probability of finding defects while covering the test cases with high priorities.
[0093] Finally, the language used in the specification has been principally selected for readability and instructional purposes, and it may not have been selected to delineate or circumscribe the inventive subject matter. It is therefore intended that the scope of the invention be limited not by this detailed description, but rather by any claims that issue on an application based hereon. Accordingly, the embodiments of the present invention are intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.

Claims (15)

  1. A computer-implemented method for assessing flaws of a software, the method comprising:
defining a test case set for assessing a software based on the software to be assessed, wherein the test case set comprises a plurality of test cases and each test case is associated with at least one test parameter defining such test case;
initialising an initial population of candidate test plans, wherein each test plan comprises one or more test cases selected from the defined test case set;
optimising the initial population of candidate test plans based on an evaluation function that accounts at least for one or more test objectives; and
selecting one or more candidate test plans with best evaluation function score or scores respectively calculated using the evaluation function.
  2. The computer-implemented method of claim 1, wherein the test case set is selected from a test case library comprising test cases executed on previous versions of the software, test cases previously executed on similar software, newly defined test cases and/or generic test cases.
  3. The computer-implemented method of any of the preceding claims, wherein optimising the initial population of candidate test plans comprises using an evolutionary algorithm to evolve the initial population of candidate test plans into an evolved population of candidate test plans based on the evaluation function.
  4. The computer-implemented method of any of the preceding claims, wherein each candidate test plan comprises a plurality of bit-strings, wherein each bit-string represents a feature of the candidate test plan and/or a feature of each test case that forms the test case set.
  5. The computer-implemented method of any of the preceding claims, wherein each candidate test plan comprises a first bit-string denoting the one or more test cases selected to form such candidate test plan, and at least one second bit-string each denoting a feature of each test case of the test case set, wherein the first bit-string is preferably a bit-string of length corresponding to the number of test cases in the test case set and denotes which test cases of the test case set were assigned to form the test plan, and each of the at least one second bit-string is preferably a bit-string representing the feature.
  6. The computer-implemented method of claim 5, wherein each of the at least one second bit-string denotes a number of test cycles for each test case.
  7. The computer-implemented method of any of the preceding claims, wherein the evaluation function comprises one or more reward terms and/or one or more weight functions that are determined based on the one or more test objectives.
  8. The computer-implemented method of claim 7, wherein the one or more reward terms is determined based on a function that models a relationship between two or more test parameters.
  9. The computer-implemented method of any of the preceding claims, wherein the evaluation function comprises priority level and failure rate as positive reward terms and/or time spent as a negative reward term and/or one or more weight functions.
  10. The computer-implemented method of claim 9, wherein the failure rate is determined using a function that models a relationship between a number of system failures and a number of test cycles.
  11. The computer-implemented method of any of the preceding claims, further comprising:
executing at least one test case of the one or more selected candidate test plans to generate test results; and
identifying one or more flaws in the software based on the generated test results,
wherein executing the at least one test case preferably comprises testing the software on a vehicle and generating test results using signals received from one or more vehicle components.
  12. The computer-implemented method of claim 11, wherein executing at least one of the one or more selected candidate test plans is carried out automatically, preferably using one or more automation scripts.
  13. A computing system for assessing flaws of a software, the computing system comprising one or more processors and memory storing one or more programs for execution by the one or more processors, the one or more programs including instructions for carrying out a computer-implemented method according to any of the preceding claims.
  14. The computing system of claim 13, wherein the computing system is connected to a vehicle and configured to receive signals from one or more vehicle components.
  15. A computer program, a machine-readable storage medium, or a data carrier signal that comprises instructions, that upon execution on a data processing device and/or control unit, cause the data processing device and/or control unit to perform the steps of a computer-implemented method according to any one of claims 1 to 12.
GB2217178.9A 2022-11-17 2022-11-17 Method and system for automatic test plan generation Pending GB2624418A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB2217178.9A GB2624418A (en) 2022-11-17 2022-11-17 Method and system for automatic test plan generation
PCT/EP2023/080502 WO2024104781A1 (en) 2022-11-17 2023-11-02 Method and system for automatic test plan generation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB2217178.9A GB2624418A (en) 2022-11-17 2022-11-17 Method and system for automatic test plan generation

Publications (2)

Publication Number Publication Date
GB202217178D0 GB202217178D0 (en) 2023-01-04
GB2624418A true GB2624418A (en) 2024-05-22

Family

ID=84888958

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2217178.9A Pending GB2624418A (en) 2022-11-17 2022-11-17 Method and system for automatic test plan generation

Country Status (2)

Country Link
GB (1) GB2624418A (en)
WO (1) WO2024104781A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114610614A (en) * 2022-03-08 2022-06-10 北京京航计算通讯研究所 Safety test method and system for programmable logic device packaging module
WO2022134581A1 (en) * 2020-12-24 2022-06-30 深圳壹账通智能科技有限公司 Test case sorting method and related device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IN2014CH01329A (en) * 2014-03-13 2015-09-18 Infosys Ltd
CN107909194B (en) * 2017-11-07 2021-07-13 电子科技大学 System-level testability design multi-objective optimization method
CN116974879A (en) * 2022-04-14 2023-10-31 戴尔产品有限公司 Machine learning based generation of test plans for testing information technology assets

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022134581A1 (en) * 2020-12-24 2022-06-30 深圳壹账通智能科技有限公司 Test case sorting method and related device
CN114610614A (en) * 2022-03-08 2022-06-10 北京京航计算通讯研究所 Safety test method and system for programmable logic device packaging module

Also Published As

Publication number Publication date
GB202217178D0 (en) 2023-01-04
WO2024104781A1 (en) 2024-05-23
