CN113590472A

CN113590472A - Test case priority ranking method in regression test

Info

Publication number: CN113590472A
Application number: CN202110762419.0A
Authority: CN
Inventors: 杨秋辉; 刘巧韵; 潘春霞
Original assignee: Sichuan University
Current assignee: Sichuan University
Priority date: 2021-07-06
Filing date: 2021-07-06
Publication date: 2021-11-02
Anticipated expiration: 2041-07-06
Also published as: CN113590472B

Abstract

The invention relates to a test case prioritization method for regression testing. First, test cases are clustered according to their text topic similarity and statement coverage similarity. Next, with maximizing code coverage, maximizing historical execution failure rate, and minimizing execution time as objectives, the test cases are ordered by a multi-objective optimization algorithm in combination with the clustering results. Finally, the ordering is adjusted dynamically using the association relationships among the test cases, yielding the final priority order. The method combines data mining and multi-objective optimization techniques and draws on both the static text information and the dynamic execution information of the test cases. It thereby integrates the advantages of white-box and black-box test case prioritization, considers test case similarity more comprehensively, and improves the defect detection rate; experiments show that it produces better orderings.

Description

Test case priority ranking method in regression test

Technical Field

The invention belongs to the field of software testing in software engineering, and particularly relates to a test case priority ranking method in regression testing.

Background

Regression testing verifies whether a change to a program or its code adversely affects existing functionality, and is very common in the software industry. However, test suites are typically large, and executing every test case in a regression test can be costly. Ordering test cases by priority lets those more likely to detect faults run first, providing greater fault detection capability within the available test time, which improves test efficiency and reduces time and resource overhead. How to compute test case priority effectively is therefore an important problem in regression testing.

Existing test case prioritization methods for regression testing usually determine test case similarity from a single kind of information, so they cannot consider similarity across all aspects; in addition, many ordering methods only aim to rank defect-finding test cases first, without considering whether those test cases reveal different defects.

Disclosure of Invention

The invention provides a test case prioritization method for regression testing. First, the similarity of test cases is calculated by combining their static and dynamic information, and the test cases are clustered. Next, the multi-objective optimization ordering is adjusted using the clustering result. Finally, whether test cases reveal the same defect is taken into account when applying association rules, and the ordering is adjusted further. Experiments show that the method achieves a better ordering.

In order to achieve the above purpose, the invention adopts a test case prioritization method for regression testing, which comprises the following three steps:

step 1, clustering test cases according to the text topic similarity and the statement coverage similarity of the test cases;

the step 1, clustering test cases according to the text topic similarity and the statement coverage similarity of the test cases, comprises the following steps:

step 1.1, preprocessing the test case text, performing topic modeling on the preprocessed text, and calculating the text topic similarity of the test cases;

the reason for this step is that the test case text contains various information about the test case, so the functional similarity of test cases can be judged from their static text information;

step 1.2, collecting the statement coverage of the test cases, and calculating the statement coverage similarity of the test cases;

the reason for this step is that statement coverage describes which portion of the program source code a test case exercises, and to what extent, so the similarity of test cases in code coverage range can be judged from their dynamic execution information;

step 1.3, determining the weighting coefficients of the two similarities through experiments, and calculating the weighted similarity sum;

step 1.4, performing hierarchical clustering on the test cases according to the weighted similarity sum to obtain a clustering result, in which the test cases are divided into N different classes according to similarity;

step 2, performing multi-objective ordering on the test cases, and adjusting the ordering according to the clustering result;

the step 2 of performing multi-objective ordering on the test cases and adjusting the ordering according to the clustering result comprises the following steps:

step 2.1, performing multi-objective ordering on the test cases by using a multi-objective genetic algorithm to obtain an ordering, wherein the objectives are maximizing code coverage, maximizing historical execution failure rate and minimizing execution time;

step 2.2, according to the clustering result obtained in step 1, adjusting the ordering obtained in step 2.1 so that the test cases at the front belong to different clusters;

the reason for this step is that test cases detecting the same error may all be placed at the front of the ordering, so the ordering needs to be adjusted according to the clustering result to speed up error detection.

Step 3, mining association rules from the historical execution results of the test cases, and dynamically adjusting the ordering;

the step 3 of mining association rules from the historical execution results of the test cases and dynamically adjusting the ordering comprises the following steps:

step 3.1, mining association rules of execution failure among the test cases from their historical execution information;

step 3.2, if the antecedent and the consequent of an obtained association rule reveal the same defect, discarding the association rule;

the reason for this step is that, when one test case fails, immediately executing another test case is worthwhile only if the two reveal different defects.

step 3.3, executing the ordering obtained in step 2, and if a test case fails during execution, searching for association rules with that test case as the antecedent and moving the consequent test cases up for immediate execution;

the reason for this step is to ensure that, after a test case fails, the test cases that are associated with it by execution failure and reveal different defects are executed immediately, so as to improve the error detection rate.

The resulting test case execution sequence is the prioritized sequence.

Using text topic similarity and statement coverage similarity, the method measures the similarity of test cases along a static and a dynamic dimension respectively, and clusters them. Next, with maximizing code coverage, maximizing historical execution failure rate, and minimizing execution time as objectives, it orders the test cases with a multi-objective optimization algorithm in combination with the clustering results. Finally, it adjusts the ordering dynamically using the association relationships among the test cases to obtain the final priority order. The method combines data mining and multi-objective optimization techniques, judges similarity from both the static information and the dynamic execution information of the test cases, considers the problem more comprehensively, improves the defect detection rate, and obtains better results.

Drawings

FIG. 1 is a schematic general flow diagram of the process of the present invention.

Detailed Description

In order to more clearly show the objects and technical solutions of the present invention, embodiments of the present invention will be described in more detail below with reference to the accompanying drawings.

A test case prioritization method for regression testing, as shown in FIG. 1, includes the following three steps:

Step 1, clustering test cases according to the text topic similarity and the statement coverage similarity of the test cases;

In this step, the similarity of test cases is measured by text topic similarity and by statement coverage similarity, the two similarities are combined with weights, and the test cases are clustered by hierarchical clustering so that similar test cases fall into one class, which gives the clustering result.

Step 1, as shown in FIG. 1, includes the following steps:

Step 1.1, preprocessing the test case text, performing topic modeling on the preprocessed text, and calculating the text topic similarity of the test cases;

test case text preprocessing refers to extracting the corpus of the test case text and obtaining the business concerns in the test case;

topic modeling means that a topic vector represents the correlation between each test case and the different semantic topics, as an approximation of the test case's function;

the text topic similarity of test cases is measured by calculating the distance between their text topic vectors; the smaller the distance, the more similar the test cases.
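The distance-based comparison of topic vectors described above can be sketched as follows. This is only an illustration: the source does not say which distance is used, so cosine distance is an assumption here, and the topic vectors and test case names are hypothetical values standing in for the output of a topic model.

```python
import math

def cosine_distance(u, v):
    """Cosine distance between two topic vectors (0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumption: an all-zero vector is treated as maximally distant
    return 1.0 - dot / (norm_u * norm_v)

def topic_similarity(u, v):
    """Turn the distance into a similarity: smaller distance, larger similarity."""
    return 1.0 - cosine_distance(u, v)

# Hypothetical topic vectors over three topics, one per test case.
topics = {
    "tc_login": [0.8, 0.1, 0.1],
    "tc_logout": [0.7, 0.2, 0.1],
    "tc_report": [0.1, 0.1, 0.8],
}

sim_close = topic_similarity(topics["tc_login"], topics["tc_logout"])
sim_far = topic_similarity(topics["tc_login"], topics["tc_report"])
print(sim_close > sim_far)
```

Any other vector distance could be substituted in `cosine_distance` without changing the rest of the sketch.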

Step 1.2, collecting the statement coverage of the test cases, and calculating the statement coverage similarity of the test cases;

the statement coverage of a test case refers to which statements in the source code of the program under test are executed when the program is run with that test case. The more the statement coverage of two test cases overlaps, the more similar they are;

the statement coverage similarity of test cases is calculated with the Jaccard distance between the statement coverage of two test cases; the smaller the Jaccard distance, the more similar the test cases.
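As a minimal sketch of the Jaccard distance over statement coverage, each test case can be represented by the set of statement (line) numbers it executes. The coverage sets and test case names below are hypothetical, and the handling of two empty sets is an assumption.

```python
def jaccard_distance(cov_a, cov_b):
    """Jaccard distance between two sets of covered statements."""
    a, b = set(cov_a), set(cov_b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty coverage sets are treated as identical
    return 1.0 - len(a & b) / len(union)

# Hypothetical statement coverage: line numbers executed by each test case.
coverage = {
    "tc1": {1, 2, 3, 4},
    "tc2": {3, 4, 5},
    "tc3": {10, 11},
}

d12 = jaccard_distance(coverage["tc1"], coverage["tc2"])  # shares lines 3 and 4
d13 = jaccard_distance(coverage["tc1"], coverage["tc3"])  # no shared lines
print(d12, d13)
```

Here `tc1` and `tc2` share 2 of 5 distinct statements, so their distance is 1 - 2/5, while `tc1` and `tc3` share none and are at the maximum distance of 1.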

Step 1.3, determining the weighting coefficients of the two similarities through experiments, and calculating the weighted similarity sum;

for example, if the text topic similarity St is assigned a weight of 0.7 and the statement coverage similarity Sc a weight of 0.3, the weighted similarity sum is 0.7 × St + 0.3 × Sc. The two weighting coefficients are determined by experiments on actual project data.
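The weighted combination can be written as a one-line function. The default weights below are the 0.7 and 0.3 from the example in the text; the input values are hypothetical, and taking the coverage similarity as one minus the Jaccard distance is an assumption, since the text does not spell out that conversion.

```python
def weighted_similarity(s_t, s_c, w_t=0.7, w_c=0.3):
    """Weighted sum of text topic similarity s_t and statement coverage similarity s_c."""
    return w_t * s_t + w_c * s_c

# Hypothetical inputs: topic similarity 0.9, Jaccard distance 0.6.
s_t = 0.9
s_c = 1.0 - 0.6  # assumption: similarity = 1 - Jaccard distance
combined = weighted_similarity(s_t, s_c)
print(round(combined, 2))
```

With these inputs the sum is 0.7 × 0.9 + 0.3 × 0.4 = 0.75.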

Step 1.4, carrying out hierarchical clustering on the test cases according to the weighted similarity sum to obtain a clustering result;

if the threshold value of the hierarchical clustering is set to be N, the test cases are divided into N different classes according to the similarity, and the N different classes are used as clustering results.
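A rough sketch of hierarchical clustering into N classes follows. It uses bottom-up (agglomerative) merging with average linkage, which is an assumption: the source does not state the linkage criterion. The pairwise similarity table and test case names are hypothetical.

```python
def agglomerative_clusters(names, similarity, n_clusters):
    """Merge the two most similar clusters until n_clusters remain (average linkage)."""
    n_clusters = max(1, n_clusters)
    clusters = [[name] for name in names]

    def avg_sim(c1, c2):
        total = sum(similarity(a, b) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None  # (similarity, i, j) of the most similar pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = avg_sim(clusters[i], clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical weighted similarity sums between four test cases.
table = {
    frozenset(("a", "b")): 0.9, frozenset(("a", "c")): 0.2,
    frozenset(("a", "d")): 0.1, frozenset(("b", "c")): 0.3,
    frozenset(("b", "d")): 0.2, frozenset(("c", "d")): 0.8,
}
result = agglomerative_clusters(
    ["a", "b", "c", "d"], lambda x, y: table[frozenset((x, y))], n_clusters=2
)
print(result)
```

In this example `a` and `b` merge first, then `c` and `d`, leaving two classes.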

Step 2, performing multi-objective ordering on the test cases, and adjusting the ordering according to the clustering result;

In this step, the three objectives of maximizing code coverage, maximizing historical execution failure rate and minimizing execution time are combined, a multi-objective genetic algorithm orders the test cases, and the ordering is then adjusted according to the clustering result obtained in step 1 so that the test cases at the front of the sequence all belong to different classes. A preliminary ordering is obtained when this step is finished.

Step 2, as shown in FIG. 1, includes the following steps:

Step 2.1, performing multi-objective ordering on the test cases by using a multi-objective genetic algorithm to obtain an ordering;

multi-objective ordering of test cases means searching for a compromise ordering among several ordering objectives so as to improve the defect detection rate of the test cases; the three objective values are each calculated from their function expressions, and a multi-objective genetic algorithm produces a test case ordering optimized against those values.
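The idea of a compromise among several objectives can be illustrated with Pareto dominance, the comparison that Pareto-based multi-objective genetic algorithms commonly rely on. This is a sketch under assumptions: the source gives neither the objective function expressions nor the specific genetic algorithm, so the objective values below are hypothetical numbers attached to three hypothetical candidate orderings.

```python
def objectives(coverage, failure_rate, exec_time):
    """Objective vector in 'larger is better' form; time is minimized, so it is negated."""
    return (coverage, failure_rate, -exec_time)

def dominates(a, b):
    """True if a is at least as good as b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates):
    """Names of the candidates that no other candidate dominates."""
    return [
        name
        for name, obj in candidates.items()
        if not any(
            dominates(other, obj)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]

# Hypothetical objective values for three candidate orderings.
candidates = {
    "order_A": objectives(0.90, 0.30, 120.0),
    "order_B": objectives(0.85, 0.40, 100.0),
    "order_C": objectives(0.80, 0.25, 130.0),
}
front = pareto_front(candidates)
print(front)
```

Here `order_C` is worse than `order_A` on all three objectives and is dropped, while `order_A` and `order_B` each win on a different objective and both remain as compromise candidates.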

Step 2.2, according to the clustering result obtained in step 1, adjusting the ordering obtained in step 2.1 so that the test cases at the front belong to different clusters;

if the clustering result contains N classes of test cases, the ordering is adjusted so that the first N test cases belong to different classes, each being the highest-ranked test case within its own class.
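One way to perform this adjustment is a single pass over the ordering that pulls the first test case of each class to the front and keeps everything else in its original relative order. The function name, ordering and class labels below are hypothetical.

```python
def diversify_front(order, cluster_of):
    """Move the highest-ranked test case of each class to the front of the ordering."""
    seen = set()
    front, rest = [], []
    for tc in order:
        label = cluster_of[tc]
        if label not in seen:
            seen.add(label)   # first (highest-ranked) test case of this class
            front.append(tc)
        else:
            rest.append(tc)   # later members keep their relative order
    return front + rest

# Hypothetical ordering from step 2.1 and class labels from step 1.
order = ["t1", "t2", "t3", "t4", "t5"]
cluster_of = {"t1": 0, "t2": 0, "t3": 1, "t4": 2, "t5": 1}
adjusted = diversify_front(order, cluster_of)
print(adjusted)
```

With three classes, the first three positions of `adjusted` are `t1`, `t3` and `t4`, one per class, followed by `t2` and `t5`.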

Step 3, mining association rules from the historical execution results of the test cases, and dynamically adjusting the ordering;

In this step, association rule mining is first used to extract, from the historical execution information, the association relationships among test cases that fail together; the ordering obtained in the previous step is then adjusted according to these relationships, which improves the error detection rate of the test cases.

Step 3, as shown in FIG. 1, includes the following steps:

Step 3.1, mining association rules of execution failure among the test cases from their historical execution information;

an association rule of execution failure means that whenever a certain test case fails, certain other test cases usually fail as well. For example, the rule X → Y means that when test case X fails, test case Y often fails at the same time.
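A minimal sketch of mining one-to-one failure rules X → Y from execution history is shown below. It computes the usual support and confidence measures directly; the source does not name a mining algorithm or thresholds, so the `min_support` and `min_confidence` values and the history data are hypothetical.

```python
from itertools import permutations

def mine_failure_rules(history, min_support=0.2, min_confidence=0.8):
    """history: list of sets, each holding the test cases that failed in one past run."""
    n_runs = len(history)
    tests = sorted(set().union(*history))
    rules = []
    for x, y in permutations(tests, 2):
        fails_x = sum(1 for run in history if x in run)
        fails_both = sum(1 for run in history if x in run and y in run)
        support = fails_both / n_runs      # share of runs where X and Y both failed
        confidence = fails_both / fails_x  # share of X's failures where Y also failed
        if support >= min_support and confidence >= min_confidence:
            rules.append((x, y, support, confidence))
    return rules

# Hypothetical history of five regression runs.
history = [{"t1", "t2"}, {"t1", "t2", "t3"}, {"t3"}, {"t1", "t2"}, set()]
rules = mine_failure_rules(history)
for x, y, support, confidence in rules:
    print(f"{x} -> {y} support={support:.2f} confidence={confidence:.2f}")
```

In this history `t1` and `t2` always fail together, so both `t1 -> t2` and `t2 -> t1` pass the thresholds, while rules involving `t3` do not.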

Step 3.2, if the antecedent and the consequent of an obtained association rule reveal the same defect, discarding the association rule;

that is, if an association rule X → Y is obtained and the defects revealed by the antecedent X and the consequent Y are the same, the rule is discarded.
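The filtering in step 3.2 can be sketched as below. How the revealed defects are identified is not described in the source, so the mapping from test case to a set of defect IDs (`defects_of`) and the IDs themselves are assumptions made for illustration.

```python
def drop_same_defect_rules(rules, defects_of):
    """Keep only rules whose antecedent and consequent reveal different defects."""
    kept = []
    for x, y in rules:
        dx, dy = defects_of.get(x), defects_of.get(y)
        if dx is not None and dx == dy:
            continue  # same defect: running y right after x adds no new information
        kept.append((x, y))
    return kept

# Hypothetical rules (antecedent, consequent) and defect records.
rules = [("t1", "t2"), ("t3", "t4")]
defects_of = {
    "t1": {"BUG-1"}, "t2": {"BUG-1"},
    "t3": {"BUG-2"}, "t4": {"BUG-3"},
}
kept = drop_same_defect_rules(rules, defects_of)
print(kept)
```

The rule `t1 -> t2` is discarded because both test cases reveal the same defect, while `t3 -> t4` is kept.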

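Step 3.3, executing the ordering obtained in step 2, and if a test case fails during execution, searching for association rules with that test case as the antecedent and moving the consequent test cases up for immediate execution;

if a test case X fails, the association rules X → Y with X as the antecedent are looked up, and the consequent test case Y is moved up to be executed immediately;

the resulting test case execution sequence is the prioritized sequence.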
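The dynamic adjustment during execution can be sketched as a loop over a pending queue. The rule table, the ordering and the `run_test` callback are hypothetical stand-ins; in practice `run_test` would invoke the real test runner.

```python
def run_with_rules(order, rules, run_test):
    """rules: dict mapping an antecedent to its consequents; run_test(tc) is True on pass."""
    pending = list(order)
    executed = []
    while pending:
        tc = pending.pop(0)
        executed.append(tc)
        if run_test(tc):
            continue
        # On failure, move the still-pending consequents to the front of the queue.
        promoted = []
        for y in rules.get(tc, []):
            if y in pending:
                pending.remove(y)
                promoted.append(y)
        pending = promoted + pending
    return executed

# Hypothetical ordering from step 2, one retained rule t2 -> t5, and two failing tests.
order = ["t1", "t2", "t3", "t4", "t5"]
rules = {"t2": ["t5"]}
failing = {"t2", "t5"}
executed = run_with_rules(order, rules, lambda tc: tc not in failing)
print(executed)
```

When `t2` fails, `t5` jumps ahead of `t3` and `t4`, so the second failure is found at position 3 instead of position 5.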

Claims

1. A method for sequencing the priority of test cases in regression test is characterized by comprising the following steps:

therefore, the test case priority is sequenced.

2. The method according to claim 1, wherein the step 1 of clustering the test cases according to the text topic similarity and the statement coverage similarity of the test cases comprises the following steps:

3. The method according to claim 1, wherein the step 2 of performing multi-objective ordering on the test cases and adjusting the ordering according to the clustering result comprises the following steps:

4. The method according to claim 1, wherein the step 3 of mining association rules from the historical execution results of the test cases and dynamically adjusting the ordering comprises the following steps:

the resulting test case execution sequence is the prioritized sequence.