CN117370151B - Reduction and optimization method, device, medium and equipment for test case execution

Info

Publication number: CN117370151B
Application number: CN202311161198.7A
Authority: CN
Inventors: 刘云龙
Original assignee: China Software Evaluation Center
Current assignee: China Software Evaluation Center
Priority date: 2023-09-08
Filing date: 2023-09-08
Publication date: 2024-03-29
Anticipated expiration: 2043-09-08
Also published as: CN117370151A

Abstract

The invention provides a reduction and optimization method, device, medium and equipment for test case execution. The method comprises: collecting the number of times each test case covers each code unit in a program file; calculating the coverage density of the code units traversed by each test case using a kernel density estimation algorithm; calculating the coverage intensity of each test case on each code unit in the program file to obtain a code unit coverage intensity matrix; quantitatively calculating the code unit coverage priority; clustering the test cases in the test case set using a split (divisive) hierarchical clustering method and storing the cluster division process; extracting a target cluster division set that satisfies a first preset condition and forming intra-cluster queues; taking one test case from each intra-cluster queue to form an inter-cluster queue; and taking at least one test case from the inter-cluster queue and executing it. The method describes the features of test cases more accurately, allows the test case clusters to be configured flexibly according to user requirements, and helps improve the efficiency of continuous integration and development of software and reduce research and development cost.

Description

Reduction and optimization method, device, medium and equipment for test case execution

Technical Field

The present invention relates to the field of software testing, and in particular, to a method, an apparatus, a medium, and a device for reducing and optimizing test case execution.

Background

With the development of the information industry, software-based information needs keep growing, software keeps increasing in scale and complexity, hidden quality problems multiply, and information risks become more prominent. Software testing is a necessary means of guaranteeing software quality and an essential stage of the software life cycle. It is a key step in verifying and improving software design and in achieving quality assurance, and its role in the software life cycle is increasingly important. Finding errors in software efficiently in order to ensure software quality is therefore a central concern of software testing.

Test cases are the basis of software testing. As software grows in scale and the demand for continuous integration and continuous development increases, test case sets keep growing, and problems such as redundancy and insufficient coverage become unavoidable. The size and quality of a test case set determine the cost, efficiency, and effectiveness of software testing: a concise test case set with high coverage reduces test cost and improves test efficiency and effectiveness. Because the number and priority of test cases are closely tied to test cost, efficiency, and quality, they have long been a focus of software testing research. In software testing, test case reduction and execution means deriving a minimal subset from an existing test case set and specifying the execution order of its cases, such that the reduced subset is equivalent to the original set in testing capability and, following a given execution rule, covers the code and finds problems well enough to meet the test requirements, thereby improving test efficiency and reducing cost. As a key means of improving software testing efficiency and reducing its cost, it has become an important research topic in software quality assurance and has received wide attention in industry and academia.

During software development and maintenance, code must be modified for defect fixing, functional improvement, performance optimization and the like, which drives software evolution and changes the software code. Code changes typically require running regression tests to evaluate the effects of the modifications and of any potential changes they introduce. Regression testing is an effective software testing method and an important way to guarantee software correctness and improve software quality. It can take place during unit testing, integration testing and other stages, and is an important, complex and time-consuming task. The applicant has found that test requirements change continuously during regression testing and the regression test set keeps growing; if all existing test cases are executed without any strategy, test cost increases greatly, and problems such as an excessive number of cases, redundant cases, and low test efficiency arise. Regression testing can account for up to 80% of the overall testing cost and more than 50% of the software maintenance cost. Because testing resources such as manpower and time are limited, running all regression test cases is generally not feasible. Reduction and optimization of the regression test set is therefore necessary to reduce regression testing cost.

Disclosure of Invention

The invention aims to provide a reduction and optimization method, device, medium and equipment for test case execution, which reduce regression testing cost by reducing and optimizing the test cases in the regression test set.

In a first aspect, an embodiment of the present invention provides a method for reducing and optimizing test case execution, including:

collecting the coverage times of each test case to each code unit in the program file;

calculating the coverage density of the code units traversed by each test case by adopting a kernel density estimation algorithm;

calculating the coverage intensity of each test case to each code unit in the program file based on the coverage density and the coverage times to obtain a code unit coverage intensity matrix;

quantitatively calculating code unit coverage priorities based on the code unit coverage intensity matrix;

based on the code unit coverage intensity matrix, carrying out cluster division on the test cases in the test case set by utilizing a split hierarchical clustering method, and storing a cluster division process;

extracting a target cluster division set meeting a first preset condition in the cluster division process;

ordering the test cases within each cluster in the target cluster division set by priority to form an intra-cluster queue for each cluster;

and taking one test case from each intra-cluster queue, and ordering the test cases taken based on the code unit coverage priority to form an inter-cluster queue.

In some implementations, calculating the coverage intensity of each test case to each code unit in the program file based on the coverage density and the coverage times includes:

the coverage intensity of each test case to each code unit in the program file is calculated by adopting the following calculation formula:

wherein $\mathrm{covst}(u_{ij})$ represents the coverage intensity of the j-th code unit covered by the i-th test case, $p_h(u_{ij})$ represents the coverage density of the j-th code unit covered by the i-th test case, $\mathrm{count}(u_{ij})$ represents the number of times the j-th code unit is covered by the i-th test case, $m$ represents the number of test cases in the test case set, and $n$ represents the number of code units in the program file.

In some implementations, the elements in the code unit coverage intensity matrix are values obtained by normalizing the coverage intensity of the code units in the program file for each test case.

In some implementations, quantitatively calculating the code unit coverage priority based on the code unit coverage intensity matrix includes:

calculating the overall coverage intensity of each code unit based on the code unit coverage intensity matrix;

quantitatively calculating the code unit importance based on the overall coverage intensity of the code unit;

quantitatively calculating the code unit coverage priority of each test case based on the code unit importance.

In some implementations, the computing the overall coverage intensity of the code unit based on the code unit coverage intensity matrix includes:

the overall coverage strength of the code unit is calculated using the following calculation formula:

wherein $\mathrm{covst}(u_j)$ represents the overall coverage intensity of all test cases on the j-th code unit, $\mathrm{covst}(u_{ij})$ represents the coverage intensity of the j-th code unit covered by the i-th test case, $m$ represents the number of test cases in the test case set, and $n$ represents the number of code units in the program file.

In some implementations, the computing code unit importance based on the overall coverage intensity of the code unit includes:

the code unit importance is quantitatively calculated using the following calculation formula:

wherein $\mathrm{impt}(j)$ represents the code unit importance of the j-th code unit, $\mathrm{covst}(u_j)$ represents the overall coverage intensity of all test cases on the j-th code unit, $m$ represents the number of test cases in the test case set, and $n$ represents the number of code units in the program file.

In some implementations, the quantitatively calculating the code unit coverage priority of each test case based on the code unit importance includes:

wherein $\mathrm{priority}_{covum}(t_i)$ represents the unnormalized code unit coverage priority of the i-th test case $t_i$, $\mathrm{priority}_{cov}(t_i)$ represents the normalized code unit coverage priority of the i-th test case $t_i$, $\mathrm{impt}(i,j)$ represents the code unit importance of the j-th code unit covered by the i-th test case $t_i$, $\mathrm{impt}(j)$ represents the code unit importance of the j-th code unit, $m$ represents the number of test cases in the test case set, $n$ represents the number of code units in the program file, and $\delta(\mathrm{impt}(i,j))$ represents an indicator function that determines whether the i-th test case $t_i$ covers the j-th code unit: its value is 1 if the unit is covered and 0 otherwise.

In some implementations, carrying out cluster division on the test cases in the test case set by using a split hierarchical clustering method based on the code unit coverage intensity matrix, and storing the cluster division process, includes:

taking the test case set as the cluster to be split $c_o$, with $C = \{c_o\}$, where $C$ represents the cluster division set;

computing the diameter of cluster $c_o$, saving cluster $c_o$ and its diameter to a cluster diameter dictionary, adding the cluster diameter dictionary to a cluster division list, and adding the cluster division list to the cluster division process;

judging whether the number of test cases in cluster $c_o$ is equal to 2:

if the number of test cases in cluster $c_o$ is equal to 2, splitting cluster $c_o$ directly into 2 clusters $c_{o+1}$ and $c_{o+2}$, each containing 1 test case, replacing the cluster diameter dictionary of $c_o$ in the cluster division list with $\{c_{o+1}: 0\}$, $\{c_{o+2}: 0\}$, adding the replaced cluster division list to the cluster division process, and executing the step of judging whether the number of test cases in each cluster in the cluster division set $C$ is 1;

if the number of test cases in cluster $c_o$ is greater than 2, calculating the average distance between each test case in $c_o$ and the other test cases in $c_o$;

finding the test case with the maximum average distance as the split test case, and recording the sequence number of the split test case;

splitting cluster $c_o$ into 2 initial clusters $c_{o+1}$ and $c_{o+2}$, wherein cluster $c_{o+1}$ contains the split test case and cluster $c_{o+2}$ contains the other test cases of cluster $c_o$;

judging whether the number of test cases in cluster $c_{o+2}$ is 1:

if the number of test cases in cluster $c_{o+2}$ is 1, executing the step of judging whether the number of test cases in each cluster in the cluster division set $C$ is 1;

if the number of test cases in cluster $c_{o+2}$ is not 1, calculating, for each test case in cluster $c_{o+2}$, its average distance to the test cases in cluster $c_{o+1}$ and its average distance to the other test cases in cluster $c_{o+2}$; according to the comparison of these average distances, assigning the test case in cluster $c_{o+2}$ that is closest to cluster $c_{o+1}$ to cluster $c_{o+1}$ and removing it from cluster $c_{o+2}$, until the second preset condition is met; and then executing the step of judging whether the number of test cases in each cluster in the cluster division set $C$ is 1;

judging whether the number of test cases in each cluster in the cluster division set $C$ is 1:

if the number of test cases in each cluster in the cluster division set $C$ is 1, ending the cluster division process;

if the number of test cases in some cluster in the cluster division set $C$ is not equal to 1, taking the cluster with the largest diameter as the cluster to be split $c_o$, and executing the step of judging whether the number of test cases in cluster $c_o$ is equal to 2.
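The splitting procedure above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it assumes Euclidean distance between rows of the coverage intensity matrix (one of the metrics the description allows), it assumes the "second preset condition" means that no remaining test case is closer on average to the split-off cluster than to its own cluster, and the function names `diameter`, `avg_dist`, `split` and `divisive_clustering` are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def diameter(cluster, rows):
    # Largest pairwise distance inside the cluster (0 for a single case).
    return max(euclidean(rows[a], rows[b]) for a in cluster for b in cluster)

def avg_dist(i, group, rows):
    others = [o for o in group if o != i]
    if not others:
        return 0.0
    return sum(euclidean(rows[i], rows[o]) for o in others) / len(others)

def split(cluster, rows):
    """Split one cluster (frozenset of row indices, size >= 2) into two."""
    # The case with the largest average distance seeds the new cluster.
    seed = max(sorted(cluster), key=lambda i: avg_dist(i, cluster, rows))
    splinter, rest = {seed}, set(cluster) - {seed}
    while len(rest) > 1:
        # Positive gain: the case is closer to the splinter than to the rest.
        gains = {i: avg_dist(i, rest, rows) - avg_dist(i, splinter, rows)
                 for i in sorted(rest)}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:  # assumed stopping rule
            break
        splinter.add(best)
        rest.remove(best)
    return frozenset(splinter), frozenset(rest)

def divisive_clustering(rows):
    """Return the division process: a list of {cluster: diameter} dicts."""
    start = frozenset(range(len(rows)))
    partition = {start: diameter(start, rows)}
    history = [dict(partition)]
    while any(len(c) > 1 for c in partition):
        # Always split the multi-case cluster with the largest diameter.
        target = max((c for c in partition if len(c) > 1), key=partition.get)
        del partition[target]
        for part in split(target, rows):
            partition[part] = diameter(part, rows)
        history.append(dict(partition))
    return history

# Hypothetical intensity rows for four test cases.
rows = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
history = divisive_clustering(rows)
```

Each entry of `history` plays the role of one saved cluster division list, so a later step can pick the division that matches a requested cluster count or diameter limit without re-running the clustering.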

In some implementations, the extracting the target cluster partition set satisfying the first preset condition in the cluster partition process includes:

extracting a target cluster division set meeting a first preset condition based on a preset cluster number or a diameter upper limit threshold value for use case reduction;

the target cluster division set satisfying the first preset condition is the cluster division set corresponding to the cluster division list, in the cluster division process, whose number of elements equals the preset cluster number, or the cluster division set corresponding to the first cluster division list, in the cluster division process, whose diameter is less than or equal to the diameter upper limit threshold.

In some implementations, before ordering the test cases within each cluster in the target cluster division set by priority to form the intra-cluster queues, the method further includes:
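A minimal sketch of this extraction step is given below. It assumes the stored cluster division process is a list of dictionaries mapping each cluster to its diameter, ordered from coarsest to finest, and it interprets the diameter criterion as "the first division whose largest cluster diameter does not exceed the threshold"; both are assumptions. The function name `select_partition` is hypothetical, while `clstnum` and `tstthresh` follow the setting names used in the embodiment.

```python
def select_partition(history, clstnum=None, tstthresh=None):
    """Return the first stored division that meets the configured criterion."""
    for partition in history:
        if clstnum is not None and len(partition) == clstnum:
            return partition
        if tstthresh is not None and max(partition.values()) <= tstthresh:
            return partition
    return None  # no stored division satisfies the criterion

# Hypothetical division process for four test cases a, b, c, d.
history = [
    {"abcd": 5.0},
    {"ab": 1.0, "cd": 2.0},
    {"a": 0.0, "b": 0.0, "cd": 2.0},
    {"a": 0.0, "b": 0.0, "c": 0.0, "d": 0.0},
]
by_count = select_partition(history, clstnum=3)
by_diameter = select_partition(history, tstthresh=2.0)
```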

the intra-cluster priority of the test case is calculated using the following formula:

wherein $\mathrm{priority}_{clst}(t_i)$ represents the intra-cluster priority of the i-th test case $t_i$, $d(t_i, t_{cen})$ represents the distance from the i-th test case $t_i$ to the cluster center $t_{cen}$, $\mathrm{priority}_{cov}(t_i)$ represents the code unit coverage priority of the i-th test case $t_i$, $|c|$ represents the number of test cases in cluster $c$, and cluster $c$ represents any cluster in the target cluster division set.

In some implementations, the method of the embodiment further includes: at least one test case is fetched from the inter-cluster queue and executed.

In some implementations, the method of the embodiment further includes:

if the currently fetched and executed test case does not reach the execution expectation, continuing to fetch and execute a case from the inter-cluster queue if the inter-cluster queue is not empty, otherwise, executing the step of fetching a test case from each intra-cluster queue and forming the inter-cluster queue based on the code unit coverage priority ordering;

and ending the test case execution under the condition that the currently fetched and executed test case reaches the execution expectation.
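The round-based execution described above can be sketched as follows. The sketch assumes that each intra-cluster queue is already ordered by intra-cluster priority and that a higher code unit coverage priority means earlier execution; the callbacks `execute` and `expectation_met` and the function name `run_reduced_suite` are hypothetical placeholders for the real test runner and stopping criterion.

```python
from collections import deque

def run_reduced_suite(intra_queues, cov_priority, execute, expectation_met):
    """Execute test cases round by round until the expectation is met."""
    executed = []
    while any(intra_queues):  # some intra-cluster queue is still non-empty
        # Take one test case from every non-empty intra-cluster queue.
        heads = [q.popleft() for q in intra_queues if q]
        # Order the round by code unit coverage priority (assumed descending).
        inter_queue = deque(
            sorted(heads, key=lambda t: cov_priority[t], reverse=True))
        while inter_queue:
            case = inter_queue.popleft()
            execute(case)
            executed.append(case)
            if expectation_met(executed):
                return executed
    return executed

# Hypothetical queues and priorities for three test cases in two clusters.
priorities = {"t1": 0.2, "t2": 0.5, "t3": 0.3}
order = run_reduced_suite(
    [deque(["t1", "t3"]), deque(["t2"])],
    priorities,
    execute=lambda case: None,
    expectation_met=lambda done: len(done) >= 3,
)
```

Because each round draws at most one case per cluster, early rounds sample every cluster before any cluster is visited twice, which is what lets a short run still exercise dissimilar coverage patterns.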

In a second aspect, an embodiment of the present invention provides a reduction and optimization apparatus for test case execution, including:

the collection module is used for collecting the coverage times of each test case to each code unit in the program file;

the first calculation module is used for calculating the coverage density of the code units traversed by each test case by adopting a kernel density estimation algorithm;

the second calculation module is used for calculating the coverage intensity of each test case to each code unit in the program file based on the coverage density and the coverage times to obtain a code unit coverage intensity matrix;

a third calculation module for quantitatively calculating code unit coverage priorities based on the code unit coverage intensity matrix;

the clustering module is used for carrying out cluster division on the test cases in the test case set by utilizing a split hierarchical clustering method based on the code unit coverage intensity matrix and storing a cluster division process;

the extraction module is used for extracting a target cluster division set meeting a first preset condition in the cluster division process;

the intra-cluster sequencing module is used for sequencing the priority of each intra-cluster test case in the target cluster partition set to form an intra-cluster queue;

and the inter-cluster sequencing module is used for taking a test case from each intra-cluster queue and sequencing based on the code unit coverage priority to form an inter-cluster queue.

In a third aspect, an embodiment of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by at least one processor, implements a method as described in the first aspect.

In a fourth aspect, an embodiment of the present invention provides an electronic device, including a memory and at least one processor, where the memory stores a computer program, where the computer program implements the method according to the first aspect when executed by the at least one processor.

The embodiment of the invention has at least the following beneficial effects:

The scheme provided by the embodiment of the invention describes the features of test cases more accurately, allows the test case clusters to be configured flexibly according to user requirements, helps improve the efficiency of continuous integration and development of software and reduce development cost, and can promote further optimization and improvement of software development capability and testing technology. While guaranteeing software correctness and improving software quality, it effectively reduces the number and redundancy of test cases, improves test efficiency, and reduces test cost, which in turn improves the efficiency of software evolution and maintenance and reduces service cost.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate certain embodiments of the present invention and therefore should not be considered as limiting the scope.

FIG. 1 is a flow chart of a method for reducing and optimizing test case execution provided by an embodiment of the invention;

FIG. 2 is a flowchart of a method for reducing and optimizing test case execution provided by an embodiment of the present invention;

FIG. 3 is a block diagram of a reduction and optimization apparatus for test case execution provided by an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention.

During software development and maintenance, code must be modified for defect fixing, functional improvement, performance optimization and the like, which drives software evolution and changes the software code. Code changes often require running regression tests to evaluate the effects of the modifications and of any potential changes they introduce.

In the related art, regression testing is an effective software testing method and an important way to guarantee software correctness and improve software quality. It can take place during unit testing, integration testing and other stages, and is an important, complex and time-consuming task. If all existing test cases are executed without any strategy, test cost increases greatly, and problems such as an excessive number of cases, redundant cases, and low test efficiency arise. Regression testing can account for up to 80% of the overall testing cost and more than 50% of the software maintenance cost. Because testing resources such as manpower and time are limited, running all regression test cases is generally not feasible. Reduction and optimization of the regression test set is therefore necessary to reduce regression testing cost.

The following describes in detail the embodiments of the invention in connection with several examples.

Example 1

In regression testing, code coverage information is widely used as a common criterion for test case reduction and optimization. In test optimization based on code coverage information, the testing capability of a test case is usually expressed as its ability to cover code units, so code coverage information is an important index for measuring the testing capability of a test case. During testing, the statements in the program under test should generally be covered as fully as possible, and the program can be divided into program code units at different granularities. Common code units include branches, blocks, and statements, which form a branch coverage matrix, a block coverage matrix, and a statement coverage matrix respectively. The program elements obtained at a given division granularity are defined as code units. Conventionally, when a code unit is executed by a test case, the code unit is covered and marked as 1; otherwise it is marked as 0.

In the aspect of code unit coverage matrix generation, the embodiment gives 4 influencing factors for the coverage situation of the code units based on the consideration of the relevance of the running context of the program:

1. during the execution of a test case, the same code unit may be covered repeatedly;

2. the code executed by a test case has a running context, that is, a logical order and relevance;

3. the logical distance between code units on adjacent steps of the path traversed by each test case is 1;

4. in view of the execution context, the coverage of a code unit should take into account not only whether the unit itself is covered but also the influence of other code units on it, based on the logical distance between code units.

Based on the above analysis of the factors influencing code unit coverage, the embodiment of the invention creatively provides a method for building the code unit coverage intensity matrix from coverage counts weighted by kernel density estimation. The method assigns values to the elements of each test case's row of the code unit coverage intensity matrix, takes into account the mutual influence among the code units of the same test case through density estimation, and uses the code unit coverage count as the main basis and base value of the code unit coverage.

In practical application, when executing the reduction and optimization method for test case execution provided in this embodiment, the following may be preset:

1. selecting a kernel function;

2. setting the bandwidth parameter h of the kernel density estimation;

3. setting the measure of the distance between test cases, selected from Euclidean distance, Manhattan distance, Chebyshev distance, cosine distance and Minkowski distance.

The following settings may also be entered by the user:

1. the code unit type, such as branch, block, or statement;

2. the cluster number clstnum used for test case reduction, or the upper diameter threshold tstthresh of the clusters at termination;

3. a case execution switch, where 'on' means executing the step of taking at least one test case from the inter-cluster queue, executing it, and feeding back the execution result, and 'off' means taking the inter-cluster queue generated by the reduction and optimization method as the operation result;

4. the number of test cases to execute;

5. the code unit coverage statistics mode, which is one of: counting code unit coverage for all test cases, or counting code unit coverage for specified test cases;

6. the update index of a test case, namely, the index of the test case in the code unit coverage statistics;

7. the update operation of a test case, namely, the operation (addition, deletion, or modification) applied to the test case indexed in item 6.
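As an illustration only, the preset and user-entered settings listed above could be grouped into a single configuration object. In the sketch below, `h`, `clstnum` and `tstthresh` follow the names used in the text; the class name `ReductionConfig`, the remaining field names, the default values, and the rule that exactly one of `clstnum` and `tstthresh` is set are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

DISTANCES = {"euclidean", "manhattan", "chebyshev", "cosine", "minkowski"}
UNIT_TYPES = {"branch", "block", "statement"}

@dataclass
class ReductionConfig:
    kernel: str = "gaussian"           # kernel function for density estimation
    h: float = 1.0                     # bandwidth parameter
    distance: str = "euclidean"        # distance between test cases
    unit_type: str = "statement"       # code unit granularity
    clstnum: Optional[int] = None      # requested number of clusters
    tstthresh: Optional[float] = None  # upper diameter threshold
    execute_cases: bool = False        # the 'on'/'off' execution switch
    exec_count: int = 1                # number of test cases to execute

    def validate(self):
        if self.h <= 0:
            raise ValueError("bandwidth h must be positive")
        if self.distance not in DISTANCES:
            raise ValueError("unsupported distance metric")
        if self.unit_type not in UNIT_TYPES:
            raise ValueError("unsupported code unit type")
        # Assumed rule: the reduction criterion is one or the other.
        if (self.clstnum is None) == (self.tstthresh is None):
            raise ValueError("set exactly one of clstnum or tstthresh")
        return self

config = ReductionConfig(clstnum=3, distance="manhattan").validate()
```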

Test cases from the test case set T are loaded into the program one by one and run, to obtain the execution coverage of the program code units by the test cases.

The method for reducing and optimizing test case execution provided in this embodiment, as shown in FIG. 1, at least includes steps S101 to S109:

step S101, collecting the coverage times of each test case to each code unit in the program file.

For each test case $t_i$ in the test case set T, the coverage of each code unit in the program file and the number of times (count) that $t_i$ covers each code unit are recorded. In one example, the code unit coverage count matrix is shown in Table 1:

Table 1. Schematic of the code unit coverage count matrix

The test case set T includes test cases t1, t2, t3, and t4; the code units in the program file include u1, u2, u3, u4, u5, u6, u7, u8, and u9; the number of times test case t1 covers code unit u1 is 2, and so on.
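A minimal sketch of building such a coverage count matrix from execution traces is shown below. The trace format (an ordered list of 1-based code unit numbers per test case) and the function name `coverage_count_matrix` are assumptions, and the example traces are hypothetical rather than the data of Table 1.

```python
from collections import Counter

def coverage_count_matrix(traces, n_units):
    """Row i holds how often test case i executed code units 1..n_units."""
    matrix = []
    for trace in traces:
        hits = Counter(trace)  # code unit number -> number of executions
        matrix.append([hits.get(j, 0) for j in range(1, n_units + 1)])
    return matrix

# Two hypothetical test cases over four code units.
traces = [[1, 3, 1, 4], [2, 3]]
counts = coverage_count_matrix(traces, 4)
```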

Step S102, calculating the coverage density of the code units traversed by each test case by adopting a kernel density estimation algorithm.

In some implementations, the following calculation is specifically employed:

$es_i = \{\, q \mid q \text{ is the sequence number of a code unit on the execution path of test case } i,\ q \in \mathbb{N}^+,\ 1 \le q \le n \,\}$, where $n$ is the total number of code units in the program file,

$1 \le |es_i| \le n$

wherein $p_h(u_{ij})$ represents the coverage density estimate, under bandwidth parameter $h$, of the j-th code unit covered by the i-th test case. The bandwidth parameter $h$ is a hyperparameter: the smaller $h$ is, the fewer points in the neighborhood participate in the fit. $d_u(u_{ij}, u_{iq})$ represents the distance between the j-th code unit and the q-th code unit covered by the i-th test case: it is 0 if $j = q$, 1 for code units adjacent on the execution path, and increases by 1 with each further step from near to far. $m$ represents the number of test cases in the test case set, $n$ represents the total number of code units in the program file, $es_i$ is the set of sequence numbers of the code units on the execution path of the i-th test case, where the sequence numbers take positive integer values from 1 to $n$, and $|es_i|$ represents the number of code units on the execution path of the i-th test case. $K$ represents the kernel function, which is non-negative, integrates to 1, satisfies the properties of a probability density, and has a mean of 0; different functions can be selected, such as a linear kernel function, a polynomial kernel function, or a Gaussian kernel function. In this way, the coverage density of each code unit covered by each test case can be computed.
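The density formula itself is not reproduced in this text, so the sketch below assumes the standard kernel density estimator, $p_h(u_{ij}) = \frac{1}{|es_i|\,h} \sum_{q \in es_i} K\!\left(\frac{d_u(u_{ij}, u_{iq})}{h}\right)$, with a Gaussian kernel. It further assumes that the execution path is given as an ordered list of distinct code unit numbers and that $d_u$ is the difference in position along that path. The function names are hypothetical.

```python
import math

def gaussian_kernel(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def coverage_density(path, n_units, h=1.0, kernel=gaussian_kernel):
    """Density estimate for every code unit; uncovered units get 0.

    path: ordered list of distinct 1-based code unit numbers executed
    by one test case (assumed representation of the execution path).
    """
    position = {unit: k for k, unit in enumerate(path)}
    density = [0.0] * n_units
    for unit_j, pos_j in position.items():
        # Distance is 0 to itself, 1 to a path neighbour, and so on.
        total = sum(kernel(abs(pos_j - pos_q) / h)
                    for pos_q in position.values())
        density[unit_j - 1] = total / (len(position) * h)
    return density

# Hypothetical path u1 -> u3 -> u4 in a program with four code units.
density = coverage_density([1, 3, 4], n_units=4, h=1.0)
```

With this choice, a unit in the middle of the path receives a higher density than the units at its ends, because more of its neighbours lie at short distances.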

Step S103, calculating the coverage intensity of each test case on each code unit in the program file based on the coverage density and the coverage count, to obtain a code unit coverage intensity matrix.

wherein $\mathrm{covst}(u_{ij})$ represents the coverage intensity of the j-th code unit covered by the i-th test case, $p_h(u_{ij})$ represents the coverage density of the j-th code unit covered by the i-th test case, $\mathrm{count}(u_{ij})$ represents the number of times the j-th code unit is covered by the i-th test case, $m$ represents the number of test cases in the test case set, and $n$ represents the total number of code units in the program file.

In this embodiment, the coverage intensity values of the code units covered by each test case are normalized to values between 0 and 1, which yields the final m × n normalized code unit coverage intensity matrix. The code unit coverage intensity matrix obtained in one example is shown in Table 2.

Table 2. Code unit coverage intensity matrix

Table 2 is generated from Table 1: the coverage count of each code unit is weighted by its coverage density to form the elements of each row of the coverage intensity matrix, and each row is then normalized to produce the final coverage intensity matrix. Because code units are executed sequentially within a test case, the distance between code units should be negatively correlated with the coverage intensity. In the distance setting of the kernel density estimation, the distance between code units adjacent on the path is 1, and the distance increases by 1 with each step from near to far. The coverage intensity of a code unit not covered by a test case is 0; for example, in Table 2 the coverage intensity of test case t1 on code unit u2 is 0, which indicates that code unit u2 is not covered by test case t1.
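A minimal sketch of this weighting step follows. It assumes the coverage intensity is the product of coverage count and coverage density, as the description of Table 2 suggests, and that row normalization divides each row by its sum; the exact formula and normalization are not reproduced in this text, so both are assumptions, as is the function name.

```python
def coverage_intensity_matrix(counts, densities):
    """Weight counts by densities, then normalize every row to sum to 1."""
    matrix = []
    for count_row, density_row in zip(counts, densities):
        raw = [c * p for c, p in zip(count_row, density_row)]
        total = sum(raw)
        # A test case that covers nothing keeps an all-zero row.
        matrix.append([v / total if total > 0 else 0.0 for v in raw])
    return matrix

# Hypothetical counts and densities for one test case over three units.
intensity = coverage_intensity_matrix([[2, 0, 1]], [[0.3, 0.0, 0.4]])
```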

The coverage intensity of each code unit after each test case is executed is the basis for quantifying test case priority. Code unit coverage intensity is negatively correlated with code unit importance, and code unit importance is positively correlated with test case priority. The quantification of test case priority therefore needs to aggregate the coverage intensity of the test case over each code unit and be statistically negatively correlated with that coverage intensity.

Step S104, quantitatively calculating the code unit coverage priority based on the code unit coverage intensity matrix.

In some implementations, quantitatively calculating the code unit coverage priority based on the code unit coverage intensity matrix includes:

Step S104a, calculating the overall coverage intensity of each code unit based on the code unit coverage intensity matrix.

In some implementations, calculating the overall coverage intensity of the code unit based on the code unit coverage intensity matrix includes:

wherein $\mathrm{covst}(u_j)$ represents the overall coverage intensity of all test cases on the j-th code unit, $\mathrm{covst}(u_{ij})$ represents the coverage intensity of the j-th code unit covered by the i-th test case, $m$ represents the number of test cases in the test case set, and $n$ represents the number of code units in the program file.

In this embodiment, the overall coverage intensity values of the code units are normalized to values between 0 and 1, which yields the final overall coverage intensity of each code unit.

Step S104b, based on the overall coverage intensity of the code unit, the importance of the code unit is calculated in a quantization mode.

In some implementations, computing the code unit importance based on the overall coverage intensity of the code unit includes:

where impt(j) denotes the code unit importance of the j-th code unit, covst(u_j) denotes the overall coverage intensity of all test cases on the j-th code unit, m denotes the number of test cases in the test case set, and n denotes the number of code units in the program file. Because normalization is applied, the values impt(j) sum to 1.

Step S104c: quantitatively calculate the code unit coverage priority of each test case based on the code unit importance.

The priority of test case t_i is equivalent to the average importance of the code units it covers and takes two aspects into account: first, the sum of the importance of the code units covered by the test case, weighted by the coverage counts; second, the total coverage count of the code units covered by the test case. Calculating the average importance of the code units covered by each test case yields the unnormalized code unit coverage priority of the test case, which is then normalized so that the code unit coverage priorities of the test cases lie in the interval [0, 1] and sum to 1.
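The formulas for steps S104a and S104b are not preserved in this text, so the following is only a minimal sketch under stated assumptions. It assumes that the overall coverage intensity is the column sum of the intensity matrix scaled to sum to 1, and that importance is proportional to (1 - overall intensity), which is one simple way to obtain the negative correlation and the sum-to-1 property described above. The function names and the sample matrix are hypothetical.

```python
from typing import List


def overall_coverage_intensity(covst: List[List[float]]) -> List[float]:
    """Sum each column of the m x n intensity matrix, then scale to [0, 1].

    ASSUMPTION: sum-to-one scaling stands in for the normalization step.
    """
    n = len(covst[0])
    totals = [sum(row[j] for row in covst) for j in range(n)]
    grand_total = sum(totals)
    if grand_total == 0:
        return [0.0] * n
    return [t / grand_total for t in totals]


def code_unit_importance(overall: List[float]) -> List[float]:
    """Map overall intensity to importance so that the values sum to 1.

    ASSUMPTION: importance is proportional to (1 - overall intensity).
    """
    raw = [1.0 - v for v in overall]
    total = sum(raw)
    if total == 0:  # single code unit: nothing to compare against
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


# Hypothetical 2 x 3 intensity matrix (rows: test cases, columns: code units).
covst_matrix = [[0.5, 0.0, 0.5],
                [0.2, 0.8, 0.0]]
overall = overall_coverage_intensity(covst_matrix)
impt = code_unit_importance(overall)
```

With this assumed mapping, the code unit with the lowest overall coverage intensity receives the highest importance, and the importance values sum to 1.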

In some implementations, based on the code unit importance, the code unit coverage priorities for each test case are quantitatively calculated, including:

where priority_covum(t_i) denotes the unnormalized code unit coverage priority of the i-th test case t_i, priority_cov(t_i) denotes the normalized code unit coverage priority of the i-th test case t_i, impt(i, j) denotes the code unit importance of the j-th code unit covered by the i-th test case t_i, and delta(impt(i, j)) denotes a decision function that determines whether the i-th test case t_i covers the j-th code unit: its value is 1 if the code unit is covered and 0 otherwise. In this way the code unit coverage priorities of all test cases are quantitatively calculated.

Step S105: based on the code unit coverage intensity matrix, divide the test cases in the test case set into clusters by using a split hierarchical clustering method, and save the cluster division process.

In this embodiment, the row vectors of the code unit coverage intensity matrix are used as the sample set and clustered with the split hierarchical clustering method to form the cluster division of the test case set. The test case set is thereby classified into a cluster division set C that supports test case reduction.

In some implementations, dividing the test cases in the test case set into clusters by using the split hierarchical clustering method based on the code unit coverage intensity matrix, and saving the cluster division process, includes:

Step S105a: take the entire test case set T as the cluster c_o to be split, with c_o as the sole element of C, where C denotes the cluster division set of the entire test case set.

Step S105b: compute the diameter of cluster c_o, save cluster c_o and its diameter to a cluster diameter dictionary, add the cluster diameter dictionary to a cluster division list, add the cluster division list to the cluster division process, and save the cluster division process.
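The two aspects named for step S104c (the count-weighted sum of importance and the total coverage count) suggest a weighted average. The sketch below follows that reading; since the formula itself is not preserved here, the exact form is an assumption, and the function name and sample data are hypothetical.

```python
from typing import List


def coverage_priority(counts: List[List[int]], impt: List[float]) -> List[float]:
    """Return the normalized code unit coverage priority of each test case.

    counts[i][j] is the number of times test case i covers code unit j.
    ASSUMED form: priority_covum = sum(count * impt * delta) / sum(count * delta),
    followed by sum-to-one normalization.
    """
    raw = []
    for row in counts:
        # delta is 1 when the code unit is covered by the test case, else 0.
        delta = [1 if c > 0 else 0 for c in row]
        weighted = sum(c * w * d for c, w, d in zip(row, impt, delta))
        total = sum(c * d for c, d in zip(row, delta))
        raw.append(weighted / total if total else 0.0)
    norm = sum(raw)
    return [r / norm if norm else 0.0 for r in raw]


# Hypothetical coverage counts for 2 test cases over 3 code units.
counts = [[2, 0, 1],
          [0, 3, 0]]
impt = [0.325, 0.3, 0.375]
priority_cov = coverage_priority(counts, impt)
```

Here the first test case covers the two more important code units, so its normalized priority is higher than that of the second test case.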

clstprcs = [clstdictlist],

clstdictlist = [{c_o: D(c_o)}],

where {c_o: D(c_o)} is the cluster diameter dictionary, clstdictlist is the cluster division list, and clstprcs is the cluster division process (clustering process).

where D(c_o) denotes the diameter of cluster c_o, D(c_o) = max{d(t_i, t_d)}, that is, the maximum distance between test cases in cluster c_o, and d denotes the distance measure between test cases, which may be the Euclidean distance, Manhattan distance, Chebyshev distance, cosine distance, Minkowski distance or the like; the specific measure can be selected by configuration.
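As an illustration of the cluster diameter with a selectable distance measure, the following minimal sketch computes D(c) over test case row vectors. The function names and the sample vectors are hypothetical; only three of the listed measures are shown.

```python
from itertools import combinations
from math import sqrt
from typing import Callable, Iterable, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Vector, b: Vector) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a: Vector, b: Vector) -> float:
    return max(abs(x - y) for x, y in zip(a, b))


def cluster_diameter(cluster: Iterable[int],
                     vectors: Sequence[Vector],
                     dist: Callable[[Vector, Vector], float] = euclidean) -> float:
    """Maximum pairwise distance between the test cases of one cluster.

    cluster holds row indices into vectors; a single-case cluster has diameter 0.
    """
    pairs = combinations(list(cluster), 2)
    return max((dist(vectors[i], vectors[k]) for i, k in pairs), default=0.0)


# Hypothetical row vectors of three test cases.
vectors = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
d_euclid = cluster_diameter([0, 1, 2], vectors)
d_manhattan = cluster_diameter([0, 1, 2], vectors, manhattan)
d_chebyshev = cluster_diameter([0, 1, 2], vectors, chebyshev)
```

Passing the measure as a parameter mirrors the statement that the specific measure is chosen by configuration.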

In this embodiment, the row vector of the test case coverage intensity matrix is used as a cluster, the diameter of the cluster is calculated, and the mapping relation between the cluster and the diameter is stored in a cluster diameter dictionary.

Step S105c: determine whether the number of test cases in cluster c_o is equal to 2:

if the number of test cases in cluster c_o is equal to 2, split cluster c_o directly into 2 clusters c_{o+1} and c_{o+2}, each containing 1 test case, replace the cluster diameter dictionary {c_o: D(c_o)} in the cluster division list clstdictlist with {c_{o+1}: 0}, {c_{o+2}: 0}, add the replaced cluster division list clstdictlist to the cluster division process clstprcs to record the cluster division process, and execute step S105d of determining whether the number of test cases in each cluster in the cluster division set C is 1;

if the number of test cases in cluster c_o is greater than 2, calculate the average distance between each test case in cluster c_o and the other test cases.

The calculation formula is as follows:

where the average distance is that from the i-th test case to the other test cases in cluster c_o, s denotes the number of test cases in cluster c_o, and t_i and t_k denote the i-th test case and any other test case in the cluster, respectively. d(t_i, t_k) denotes the distance between the i-th test case and the k-th test case, calculated from the i-th and k-th row vectors of the code unit coverage intensity matrix. cov_i denotes the covst(u_j) values of the i-th test case, and cov_k denotes the covst(u_j) values of the k-th test case.

Step S105c-3: find the maximum of the average distances, take the test case corresponding to the maximum as the split test case, and record its serial number, thereby locating the outlier test case with the largest deviation in the cluster.

where s denotes the serial number of the split test case, and the split test case is the test case t_s with the largest average distance in the cluster.

Step S105c-4: split cluster c_o into 2 initial clusters c_{o+1} and c_{o+2}, where cluster c_{o+1} contains the split test case and cluster c_{o+2} contains the other test cases of cluster c_o.

c_{o+1} = {t_s}

c_{o+2} = c_o - {t_s}

Test case t_s alone forms the initial cluster c_{o+1}, and the remaining test cases form the other initial cluster c_{o+2}.

Step S105c-5: determine whether the number of test cases in cluster c_{o+2} is 1:

if the number of test cases in cluster c_{o+2} is 1, execute step S105d of determining whether the number of test cases in each cluster in the cluster division set C is 1;

if the number of test cases in cluster c_{o+2} is not 1, allocate test cases between the 2 initial clusters c_{o+1} and c_{o+2} as follows:

for each test case in cluster c_{o+2}, calculate its average distance to the test cases in cluster c_{o+1} and its average distance to the other test cases in cluster c_{o+2}; according to the comparison of these average distances, assign the test case in cluster c_{o+2} that is closest to cluster c_{o+1} to cluster c_{o+1} and delete it from cluster c_{o+2}; repeat until the second preset condition is satisfied, and then execute step S105d of determining whether the number of test cases in each cluster in the cluster division set C is 1.

In one example, the allocation procedure may be expressed as:

D(c_{o+1}) = max{d(t_i, t_j)}

D(c_{o+2}) = max{d(t_i, t_j)}

where |c_{o+1}| and |c_{o+2}| denote the numbers of test cases in the 2 initial clusters c_{o+1} and c_{o+2}, respectively. If the i-th test case t_i is closest to c_{o+1}, t_i is assigned to c_{o+1} and deleted from c_{o+2}. If, for all i, the second preset condition on the average distances is satisfied, or the number of test cases in c_{o+2} is 1, the splitting of c_{o+2} stops, which completes the splitting of one cluster. D(c_{o+1}) denotes the diameter of cluster c_{o+1}, and D(c_{o+2}) denotes the diameter of cluster c_{o+2}. {c_o: D(c_o)} in clstdictlist is replaced with {c_{o+1}: D(c_{o+1})}, {c_{o+2}: D(c_{o+2})}, the replaced clstdictlist is added to clstprcs to record the splitting process of the cluster, and the process goes to step S105d. Otherwise, the splitting of c_{o+2} continues.
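The splitting of one cluster in steps S105c to S105c-5 can be sketched as follows. The stopping formula is not preserved in this text, so the sketch assumes a common rule for divisive clustering: a test case moves to c_{o+1} only while its average distance to the rest of c_{o+2} exceeds its average distance to c_{o+1}. The function names are hypothetical, and the cluster is assumed to hold at least 2 test cases.

```python
from math import sqrt
from typing import Callable, Iterable, List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_distance(i: int, group: List[int], vectors: Sequence[Vector],
                  dist: Callable[[Vector, Vector], float]) -> float:
    """Average distance from case i to the members of group other than i."""
    others = [k for k in group if k != i]
    if not others:
        return 0.0
    return sum(dist(vectors[i], vectors[k]) for k in others) / len(others)


def split_cluster(cluster: Iterable[int], vectors: Sequence[Vector],
                  dist: Callable[[Vector, Vector], float] = euclidean
                  ) -> Tuple[List[int], List[int]]:
    """Split one cluster (at least 2 cases) into c_{o+1} and c_{o+2}."""
    members = list(cluster)
    if len(members) == 2:  # step S105c: split directly
        return [members[0]], [members[1]]
    # Step S105c-3: the case with the largest average distance is the split case.
    t_s = max(members, key=lambda i: mean_distance(i, members, vectors, dist))
    c1 = [t_s]                                # c_{o+1}
    c2 = [i for i in members if i != t_s]     # c_{o+2}
    # Step S105c-5: reassign cases while one is closer to c1 than to c2.
    while len(c2) > 1:
        gains = {i: mean_distance(i, c2, vectors, dist)
                    - mean_distance(i, c1, vectors, dist) for i in c2}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:  # ASSUMED stopping rule, see lead-in
            break
        c1.append(best)
        c2.remove(best)
    return c1, c2


# Hypothetical one-dimensional row vectors: two cases far from the other three.
vectors = [[0.0], [1.0], [10.0], [11.0], [0.5]]
c1, c2 = split_cluster(range(5), vectors)
```

In this example the case at 11.0 is chosen as the split case, the case at 10.0 then joins it, and the three cases near 0 remain in the second cluster.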

Step S105d: determine whether the number of test cases in each cluster in the cluster division set C is 1:

if not every cluster in the cluster division set C contains exactly 1 test case, take the cluster with the largest diameter as the cluster c_o to be split and execute step S105c of determining whether the number of test cases in cluster c_o is equal to 2.

If the number of test cases in each cluster in C is 1, the cluster diameters are no longer calculated, the cluster division of C is complete, no further division is performed, and the cluster analysis process ends. Otherwise, the cluster with the largest diameter is located as the cluster to be split, and cluster division by the split hierarchical clustering method continues.

where D(c) denotes the diameter of cluster c, which is the maximum distance between test cases within the cluster.

In this embodiment, if not every cluster in the cluster division set C contains exactly 1 test case, the cluster c_o with the largest diameter in C is selected as the cluster to be split, and the process proceeds to step S105c.
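The outer loop of step S105d, together with the recording of the cluster division process, might look like the sketch below. As a simplification, each snapshot is a single dictionary mapping a cluster (a tuple of test case indices) to its diameter, rather than a list of one-entry dictionaries. The diameter and split routines are passed in as parameters; the toy `diameter` and `peel_last` helpers are hypothetical stand-ins used only to make the example runnable.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Cluster = Tuple[int, ...]
Snapshot = Dict[Cluster, float]


def run_divisive(cases: Sequence[int],
                 diameter: Callable[[Sequence[int]], float],
                 split: Callable[[Cluster], Tuple[List[int], List[int]]]
                 ) -> List[Snapshot]:
    """Split the largest-diameter cluster until every cluster has one case.

    Returns clstprcs: one snapshot of {cluster: diameter} per division step.
    """
    current: Snapshot = {tuple(cases): diameter(cases)}
    clstprcs = [dict(current)]
    while True:
        splittable = [c for c in current if len(c) > 1]
        if not splittable:  # step S105d: all clusters hold exactly one case
            return clstprcs
        target = max(splittable, key=current.get)
        left, right = split(target)
        del current[target]  # replace the split cluster by its two parts
        current[tuple(left)] = diameter(left)
        current[tuple(right)] = diameter(right)
        clstprcs.append(dict(current))


# Toy stand-ins: one-dimensional values and a split that peels the last case.
values = [0.0, 1.0, 10.0]


def diameter(cluster: Sequence[int]) -> float:
    vals = [values[i] for i in cluster]
    return max(vals) - min(vals)


def peel_last(cluster: Cluster) -> Tuple[List[int], List[int]]:
    return list(cluster[:-1]), [cluster[-1]]


clstprcs = run_divisive([0, 1, 2], diameter, peel_last)
```

Each snapshot has one more cluster than the previous one, so a snapshot with a given number of clusters can later be looked up directly.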

Step S106: extract, from the cluster division process, a target cluster division set that satisfies a first preset condition.

In some implementations, since either the number clstnum of clusters for use case reduction or the cluster diameter upper limit threshold tstthresh at termination is preset, extracting the target cluster division set that satisfies the first preset condition in the cluster division process includes:

step S106a, extracting a target cluster division set meeting a first preset condition based on a preset cluster number or a diameter upper limit threshold for use case reduction;

The first preset condition is satisfied by the cluster division set corresponding to the cluster division list in the cluster division process whose number of elements equals the number of clusters, or by the cluster division set corresponding to the first cluster division list in the cluster division process whose cluster diameters are less than or equal to the diameter upper limit threshold.

If the number clstnum of clusters for use case reduction is preset, the clstdictlist (cluster division list) whose number of elements is clstnum is located in clstprcs, and the corresponding cluster division of the test case set is extracted as the basis for use case reduction.

If the cluster diameter upper limit threshold tstthresh for dividing similar use cases is preset, the cluster diameter dictionaries of each clstdictlist in clstprcs are searched, the first clstdictlist (cluster division list) whose cluster diameters are less than or equal to tstthresh is located, and the cluster division set of the test case set corresponding to that list is extracted as the basis for use case reduction.
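The two extraction modes can be sketched as simple lookups over the saved process. The sketch assumes each saved step is a dictionary mapping a cluster (tuple of test case indices) to its diameter, and it reads the diameter condition as "every cluster diameter in the step is at most tstthresh"; both are assumptions, and the function names and sample data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Cluster = Tuple[int, ...]
Snapshot = Dict[Cluster, float]


def extract_by_count(clstprcs: List[Snapshot], clstnum: int) -> Optional[List[Cluster]]:
    """Return the clusters of the saved step that has exactly clstnum clusters."""
    for snapshot in clstprcs:
        if len(snapshot) == clstnum:
            return list(snapshot)
    return None


def extract_by_diameter(clstprcs: List[Snapshot], tstthresh: float) -> Optional[List[Cluster]]:
    """Return the clusters of the first step whose diameters are all <= tstthresh."""
    for snapshot in clstprcs:
        if max(snapshot.values()) <= tstthresh:
            return list(snapshot)
    return None


# Hypothetical saved process for three test cases.
clstprcs = [
    {(0, 1, 2): 10.0},
    {(2,): 0.0, (0, 1): 1.0},
    {(2,): 0.0, (1,): 0.0, (0,): 0.0},
]
by_count = extract_by_count(clstprcs, 2)
by_diameter = extract_by_diameter(clstprcs, 1.0)
```

Because the process is saved step by step, changing clstnum or tstthresh only requires another lookup, not a new clustering run.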

Step S107: sort the test cases within each cluster of the target cluster division set by intra-cluster priority to form intra-cluster queues. For example, the test cases in each cluster are arranged in descending order of intra-cluster priority, so that each cluster forms its own intra-cluster queue.

Each cluster is a reduced set, and the execution of test cases in the cluster needs to consider the representativeness of the cases and the code unit coverage priority of the test cases. In this embodiment, the quantized computation of the intra-cluster priority of the intra-cluster use case is implemented by integrating the distance between the test use case and the cluster center in the cluster and the factors of the use case coverage priority.

In some implementations, before sorting the test cases within each cluster of the target cluster division set by intra-cluster priority to form the intra-cluster queues, the method further includes:

the intra-cluster priority of the test cases is quantitatively calculated by adopting the following calculation formula:

where priority_clst(t_i) denotes the intra-cluster priority of the i-th test case t_i, d(t_i, t_cen) denotes the distance from the i-th test case t_i to the cluster center t_cen, priority_cov(t_i) denotes the code unit coverage priority of the i-th test case t_i, |c| denotes the number of test cases in cluster c, and cluster c is any cluster in the target cluster division set. The distance between a test case and the cluster center is negatively correlated with the intra-cluster priority.

The intra-cluster priority priority_clst(t_i) of all test cases in each cluster is calculated, the test cases in each cluster are arranged in descending order of intra-cluster priority, and each cluster forms its own intra-cluster queue.

Step S108: take one test case from each intra-cluster queue and form an inter-cluster queue ordered by code unit coverage priority.

In this embodiment, one test case is sequentially taken out from each in-cluster queue to form an inter-cluster queue, and the test cases in the inter-cluster queue are arranged in descending order according to the coverage priority of the code units. In practical applications, the inter-cluster queues may refer to an overall queue obtained by sequencing test cases in each intra-cluster queue.

Under the condition that the lengths of the queues in each cluster are the same, one test case is sequentially taken out from each cluster queue and is arranged in a descending order according to the code unit coverage priority, and a group of test cases arranged in each descending order are sequentially arranged, so that an integral inter-cluster queue is obtained.

In some cases, the lengths of the queues in each cluster may be different, and at this time, under the condition that one test case can be fetched at the same position of the queue in each cluster, the fetched test cases are arranged in descending order according to the coverage priority of the code units; under the condition that at least one cluster queue cannot take out test cases, sequentially taking out one test case from the rest cluster queues with the test cases capable of being taken out, and arranging the test cases in a descending order according to the coverage priority of the code units; and under the condition that only one cluster queue exists and the removable test cases exist, the rest of the test cases in the cluster queue are continued to the tail of the ordered queue, so that an integral inter-cluster queue is obtained.
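The round-by-round construction of the inter-cluster queue, including queues of unequal length, can be sketched as follows. The function name, the test case labels and the priority values are hypothetical.

```python
from typing import Dict, List


def build_inter_cluster_queue(intra_queues: List[List[str]],
                              priority_cov: Dict[str, float]) -> List[str]:
    """Take one case per intra-cluster queue per round, sort each round
    in descending order of code unit coverage priority, and concatenate."""
    queue: List[str] = []
    depth = max((len(q) for q in intra_queues), default=0)
    for pos in range(depth):
        # Exhausted queues simply contribute nothing in later rounds.
        round_cases = [q[pos] for q in intra_queues if pos < len(q)]
        round_cases.sort(key=lambda t: priority_cov[t], reverse=True)
        queue.extend(round_cases)
    return queue


# Hypothetical intra-cluster queues, already sorted by intra-cluster priority.
intra_queues = [["t1", "t4"], ["t2"], ["t3", "t5", "t6"]]
priority_cov = {"t1": 0.10, "t2": 0.30, "t3": 0.20,
                "t4": 0.25, "t5": 0.05, "t6": 0.10}
inter_queue = build_inter_cluster_queue(intra_queues, priority_cov)
```

When only one intra-cluster queue still has test cases, each remaining round contains a single case, so the remainder of that queue is appended to the tail in its existing order.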

In some implementations, the method of this embodiment further includes:

step S109, at least one test case is fetched from the inter-cluster queue and executed.

In some implementations, the number of test cases executed may be preset to limit the number of test cases that are fetched and executed from the inter-cluster queue each time; and setting the case execution switch to be on, in which case, after forming the inter-cluster queues, test cases consistent with the set execution number are taken out from the inter-cluster queues and executed. In practical application, when the test does not reach the expected or exit, the user can modify the execution number of the use cases required by the next execution according to the feedback result after the execution, for example, the currently set execution number is 4, after 4 test use cases are taken out from the inter-cluster queue and executed, the user considers that the execution effect is not ideal, and the granularity and the effect of the regression test can be adjusted by modifying the execution number to be 1 according to the requirement. Of course, after the test expectation or exit is reached, before the next execution of the method of the embodiment re-performs the reduction and optimization of the test case, whether to adjust the parameters input by the user according to the actual requirement may also be determined, including the code unit type, the execution number, and the cluster number (or the upper diameter threshold) for use case reduction.

In practical application, test cases may be updated. When test cases in the test case set are added, deleted or modified, the test case set, the test case clustering result (cluster division set), the coverage intensity of the code units by the test cases, the code unit coverage priority of the test cases, the intra-cluster priority of the test cases and the like change, and the coverage intensity covst(u_ij) of the test cases on the code units can be updated accordingly.

When test cases in the test case set are added or modified and the test case clustering result needs to be adjusted, the coverage intensity covst(u_ij) of the corresponding test cases on the code units can be updated according to steps S101 to S103 of this embodiment.

When test cases in the test case set are deleted and the test case clustering result needs to be adjusted, steps S101 to S103 may be skipped and the coverage intensity covst(u_ij) of the deleted test cases is deleted directly.

Then, according to step S104 and the subsequent steps, the quantitative calculation of code unit coverage priority, the clustering of test cases, the quantitative calculation of intra-cluster priority and other operations are performed to form the inter-cluster sequence for test case reduction and optimized execution.

In practical application, if the program file is only updated slightly, the existing inter-cluster sequence for test case reduction and prioritized execution can continue to be used. If the program file changes substantially, for example through a fundamental change, the method of this embodiment needs to be rerun to regenerate the inter-cluster sequence for use case reduction and optimized execution.

The method completes the division of clusters of the test case code unit coverage intensity space through split hierarchical clustering, thereby forming the classification of the test case space. The division number of clusters may be accomplished by configuring the number of clusters for use case reduction or an upper cluster diameter threshold at termination. And applying the distribution of the test cases in the clusters and the code unit coverage priority of the test cases, and quantifying the priority of the test cases in each cluster. By establishing the in-cluster queues and the inter-cluster queues, test cases of the in-cluster queues are ordered in descending order of the priority in the cluster, and test cases of the inter-cluster queues are ordered in descending order of the priority by covering the code units. The use cases of the inter-cluster queues are formed by cyclically taking out the use cases from the intra-cluster queues. At test time, use cases of the inter-cluster queues are executed until expectations are reached. The ordering of intra-cluster use cases and inter-cluster use cases is effectively a hierarchical mechanism of use cases.

In some implementations, the method of this embodiment further includes:

step S110, if the currently fetched and executed test case does not reach the execution expectation, if the inter-cluster queue is not empty, the step S109 is shifted to continue to fetch a case from the inter-cluster queue and execute the test case, otherwise, the step of fetching a test case from each intra-cluster queue and forming the inter-cluster queue based on the code unit coverage priority order is executed; and under the condition that the currently fetched and executed test case reaches the execution expectation, ending the test case execution, and thus completing the reduction and optimization of the test case execution.
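The fetch-and-execute loop of steps S109 and S110 can be sketched as below. As a simplification, the whole inter-cluster queue is assumed to be built in advance, so the loop only drains it in batches; `run_case`, `expectation_met` and `exec_num` are hypothetical names for the execution hook, the exit check and the preset execution number.

```python
from collections import deque
from typing import Callable, Iterable, List, Tuple


def execute_until_expected(inter_queue: Iterable[str],
                           run_case: Callable[[str], None],
                           expectation_met: Callable[[List[str]], bool],
                           exec_num: int = 1) -> Tuple[List[str], bool]:
    """Execute exec_num cases per round until the expectation is reached
    or the queue is empty. Returns the executed cases and a success flag."""
    pending = deque(inter_queue)
    executed: List[str] = []
    while pending:
        for _ in range(min(exec_num, len(pending))):
            case = pending.popleft()
            run_case(case)
            executed.append(case)
        if expectation_met(executed):
            return executed, True
    return executed, False


# Hypothetical run: the expectation is reached once 3 cases have been executed.
log: List[str] = []
executed, reached = execute_until_expected(
    ["a", "b", "c", "d", "e"],
    run_case=log.append,
    expectation_met=lambda done: len(done) >= 3,
    exec_num=2,
)
```

A returned flag of False corresponds to the situation in which all test cases have been executed without reaching the expectation, where the parameters would be reset and the method rerun.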

In this embodiment, if all test cases have been executed and have not yet reached the expectations, the relevant parameters need to be reset, the execution of the reduction and optimization cases is performed again, and the bandwidth parameter h of the kernel density estimation and the kernel function are set; meanwhile, the code unit type (such as branches, blocks, sentences) and the cluster number clstnum for test case reduction are set by the user, or the upper diameter threshold tstthresh of the cluster at the termination is set. In the aspect of setting, the number of clusters can be set or indirectly planned through the diameter of the clusters, so that granularity control of test case set reduction is realized, and meanwhile, the relevance of the context of the code unit can be flexibly adapted through setting a kernel function and bandwidth parameters for estimating the coverage intensity of the code unit.

In a specific example, the execution flow of the method of the present embodiment is shown in fig. 2.

The embodiment takes code unit coverage information as input, utilizes the test case execution path context logic coverage relation, realizes the estimation of the code unit coverage intensity based on a kernel density estimation algorithm, estimates the importance weight of the code unit, establishes the estimation of the code unit coverage priority of the test case which takes the code unit importance weight as input, and forms test case vector representation based on the coverage intensity corresponding to the code unit traversed by each test case, and completes the clustering division of the test case set based on a split hierarchical clustering algorithm. Further integrating the importance of the code units of the test cases and the distribution of the test cases in the clusters, as factors of reduction of execution of the test cases in the clusters and optimization of coverage of the code units, determining the priority of the test cases in the clusters, sequencing the priority of the test cases in each cluster in a descending order, sequentially taking out the test cases with the highest current corresponding priority from each cluster, and comparing the test cases through the coverage priority of the code units to form an execution optimization sequence of the test cases with the coverage priority arranged in a descending order.

The test case reduction and execution optimization method of the embodiment is more accurate in description of the characteristics of the test cases, and the test case clusters can be flexibly configured according to the user requirements, so that the method has positive effects on improving the continuous integrated development efficiency of software and reducing the development cost, and can promote further optimization and improvement of the software development capability and the test technology. On the basis of guaranteeing the software correctness and improving the software quality, the number of test cases and the redundancy of the test cases are effectively reduced, the test efficiency is improved, the test cost is reduced, the software evolution and maintenance efficiency is further improved, and the service cost is reduced.

Example two

The embodiment provides a reduction and optimization device for test case execution, as shown in fig. 3, including:

a collecting module 201, configured to collect the number of times each test case covers each code unit in the program file;

a first calculation module 202, configured to calculate a coverage density of the code units traversed by each test case by using a kernel density estimation algorithm;

the second calculating module 203 is configured to calculate, based on the coverage density and the coverage times, the coverage intensity of each test case to each code unit in the program file, so as to obtain a code unit coverage intensity matrix;

A third calculation module 204, configured to quantitatively calculate a code unit coverage priority based on the code unit coverage intensity matrix;

the clustering module 205 is configured to perform cluster division on test cases in the test case set by using a split hierarchical clustering method based on the code unit coverage intensity matrix, and save a cluster division process;

the extracting module 206 is configured to extract a target cluster partition set that meets a first preset condition in a cluster partition process;

an intra-cluster sorting module 207, configured to sort the intra-cluster priorities of each intra-cluster test case in the target cluster partition set to form an intra-cluster queue;

the inter-cluster ordering module 208 is configured to take a test case from each intra-cluster queue, and form an inter-cluster queue based on the code unit coverage priority ordering;

in some implementations, the system further includes an execution module to fetch and execute at least one test case from the inter-cluster queue.

The specific implementation manner of each module in this embodiment may refer to the first embodiment, and this embodiment is not repeated. It should be appreciated that the device of this embodiment has at least all of the benefits that can be achieved by the first embodiment.

Example III

The present embodiment provides a computer-readable storage medium having a computer program stored thereon, which when executed by at least one processor, implements the method as in the first embodiment.

The aforementioned computer-readable storage medium may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.

The processor may be implemented as an application specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field programmable gate array (FPGA), a controller, a microcontroller unit (MCU), a microprocessor or another electronic component for performing the above method.

Example IV

The present embodiment provides an electronic device including a memory and at least one processor, the memory storing a computer program that when executed by the at least one processor implements the method as in the first embodiment.

The processor may be implemented as an application specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field programmable gate array (FPGA), a controller, a microcontroller unit (MCU), a microprocessor or another electronic component for performing the above method. In practical applications, the electronic device may be a notebook computer, a desktop computer, a server or the like.

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus and method embodiments described above are merely illustrative.

It should be noted that, in this document, the terms "first," "second," and the like in the description and the claims of the present application and the above drawings are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Although the embodiments of the present invention are described above, the embodiments are only used for facilitating understanding of the present invention, and are not intended to limit the present invention. Any person skilled in the art can make any modification and variation in form and detail without departing from the spirit and scope of the present disclosure, but the scope of the present disclosure is still subject to the scope of the appended claims.

Claims

1. The method for reducing and optimizing the execution of the test case is characterized by comprising the following steps:

taking a test case from each cluster queue, and forming an inter-cluster queue based on code unit coverage priority ordering;

the method for calculating the coverage intensity of each test case to each code unit in the program file based on the coverage density and the coverage times comprises the following steps:

wherein covst(u_ij) denotes the coverage intensity of the j-th code unit covered by the i-th test case, p_h(u_ij) denotes the coverage density of the j-th code unit covered by the i-th test case, count(u_ij) denotes the number of times the j-th code unit is covered by the i-th test case, m denotes the number of test cases in the test case set, and n denotes the number of code units in the program file;

each element in the code unit coverage intensity matrix is a value obtained by normalizing the coverage intensity of each code unit in the program file for each test case;

The process for carrying out cluster division on the test cases in the test case set and storing cluster division based on the code unit coverage intensity matrix by utilizing a split hierarchical clustering method comprises the following steps:

determining whether the number of test cases in cluster c_o is equal to 2:

determining whether the number of test cases in cluster c_{o+2} is 1:

2. The method for reducing and optimizing test case execution according to claim 1, wherein the quantizing the computing of the code unit coverage priority based on the code unit coverage intensity matrix comprises:

3. The method for reducing and optimizing test case execution according to claim 2, wherein the calculating the overall coverage of the code unit based on the code unit coverage matrix comprises:

wherein covst(u_j) denotes the overall coverage intensity of all test cases on the j-th code unit, covst(u_ij) denotes the coverage intensity of the j-th code unit covered by the i-th test case, m denotes the number of test cases in the test case set, and n denotes the number of code units in the program file.

4. The method for reducing and optimizing test case execution according to claim 2, wherein the calculating the code unit importance based on the overall coverage strength of the code unit comprises:

5. The method for reducing and optimizing test case execution according to claim 2, wherein the quantitatively calculating the code unit coverage priority of each test case based on the code unit importance includes:

wherein priority_covum(t_i) represents the unnormalized code unit coverage priority of the ith test case t_i, priority_cov(t_i) represents the normalized code unit coverage priority of the ith test case t_i, impt(i, j) represents the code unit importance of the jth code unit covered by the ith test case t_i, impt(j) represents the code unit importance of the jth code unit, m represents the number of test cases in the test case set, n represents the total number of code units in the program file, and δ(impt(i, j)) represents an indicator function that judges whether the ith test case t_i covers the jth code unit: its value is 1 if the code unit is covered and 0 otherwise.

6. The method for reducing and optimizing test case execution according to claim 1, wherein extracting the target cluster division set satisfying the first preset condition in the cluster division process comprises:

7. The method for reducing and optimizing test case execution according to claim 1, wherein, before sorting the test cases within each cluster of the target cluster division set by intra-cluster priority to form an intra-cluster queue, the method further comprises:

8. The method for reducing and optimizing test case execution according to claim 1, further comprising: fetching at least one test case from the inter-cluster queue and executing it.
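
For illustration only, the formation and consumption of the inter-cluster queue can be sketched as follows. This is a minimal sketch, not the claimed implementation. It assumes that each intra-cluster queue is already sorted and that a larger code unit coverage priority value means earlier execution. The names `build_inter_cluster_queue`, `intra_queues`, and `priority`, and all data values, are hypothetical.

```python
from collections import deque

def build_inter_cluster_queue(intra_queues, priority):
    """Take the head of each non-empty intra-cluster queue and sort the
    taken test cases by code unit coverage priority (highest first).

    Assumption: a larger priority value means the test case runs earlier.
    """
    heads = [q.popleft() for q in intra_queues if q]
    return deque(sorted(heads, key=lambda t: priority[t], reverse=True))

# Hypothetical data: three clusters, each already ordered within the cluster.
intra_queues = [deque(["t1", "t4"]), deque(["t2"]), deque(["t3", "t5"])]
priority = {"t1": 0.9, "t2": 0.4, "t3": 0.7, "t4": 0.3, "t5": 0.2}

inter_queue = build_inter_cluster_queue(intra_queues, priority)
next_case = inter_queue.popleft()  # the test case fetched for execution
```

Taking one test case per cluster keeps the executed subset spread across clusters whose members cover code units differently, which is the stated purpose of forming an inter-cluster queue.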

9. The method for reducing and optimizing test case execution according to claim 8, further comprising:

10. A reduction and optimization device for test case execution, characterized by comprising:

the inter-cluster sequencing module is configured to take one test case from each intra-cluster queue and sort the taken test cases based on the code unit coverage priority to form an inter-cluster queue;

the second calculating module is configured to calculate, based on the coverage density and the coverage frequency, coverage intensity of each test case to each code unit in the program file, and includes:

the clustering module is configured to cluster the test cases in the test case set by a split hierarchical clustering method based on the code unit coverage intensity matrix and to store the cluster division process, which comprises the following steps:

judging whether the number of test cases in cluster c_o is equal to 2:

if the number of test cases in cluster c_o is equal to 2, splitting cluster c_o directly into two clusters c_{o+1} and c_{o+2}, each containing exactly one test case, replacing the cluster diameter dictionary in the cluster division list with {c_{o+1}: 0}, {c_{o+2}: 0}, adding the replaced cluster division list to the cluster division process, and executing the step of judging whether the number of test cases in each cluster of the cluster division set C is 1;

judging whether the number of test cases in cluster c_{o+2} is 1:
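
For illustration only, the step that splits a two-test-case cluster c_o into the single-test-case clusters c_{o+1} and c_{o+2} can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. It assumes the cluster division list is a list of one-entry dictionaries mapping a cluster name to its diameter and that cluster membership is held in a separate mapping. The names `split_two_case_cluster`, `division_list`, `members`, and `process`, and the diameter value 0.8, are hypothetical.

```python
def split_two_case_cluster(division_list, members, o):
    """Split cluster c_o (exactly two test cases) into c_{o+1} and c_{o+2}.

    division_list: list of one-entry dicts {cluster_name: diameter} (assumed layout)
    members: dict cluster_name -> list of test cases (updated in place)
    Returns a new division list with c_o replaced by two diameter-0 entries.
    """
    name = f"c{o}"
    first, second = members.pop(name)  # raises unless c_o holds exactly 2 cases
    a, b = f"c{o + 1}", f"c{o + 2}"
    members[a], members[b] = [first], [second]
    new_list = []
    for entry in division_list:
        if name in entry:
            # A cluster with a single test case has diameter 0.
            new_list.extend([{a: 0}, {b: 0}])
        else:
            new_list.append(entry)
    return new_list

# Hypothetical state: c0 is already a singleton, c1 holds two test cases.
members = {"c0": ["t1"], "c1": ["t2", "t3"]}
division_list = [{"c0": 0}, {"c1": 0.8}]
process = [division_list]  # stored cluster division process

division_list = split_two_case_cluster(division_list, members, 1)
process.append(division_list)  # add the replaced list to the process
all_singletons = all(len(cases) == 1 for cases in members.values())
```

Because every intermediate division list is appended to `process`, a target cluster division can later be selected from the stored history without clustering again.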

11. A computer-readable storage medium, on which a computer program is stored which, when executed by at least one processor, implements the method according to any one of claims 1 to 9.

12. An electronic device comprising a memory and at least one processor, the memory having stored thereon a computer program which, when executed by the at least one processor, implements the method of any of claims 1-9.