CN102323906B

CN102323906B - MC/DC test data automatic generation method based on genetic algorithm

Info

Publication number: CN102323906B
Application number: CN201110265194.4A
Authority: CN
Inventors: 高峰; 刘厂; 赵玉新; 李刚
Original assignee: Harbin Engineering University
Current assignee: Harbin Engineering University
Priority date: 2011-09-08
Filing date: 2011-09-08
Publication date: 2014-01-08
Anticipated expiration: 2031-09-08
Also published as: CN102323906A

Abstract

The invention discloses a method for automatically generating MC/DC (Modified Condition/Decision Coverage) test data based on a genetic algorithm, comprising the following steps: statically analyzing the program under test to generate a control flow graph, a data flow graph, an abstract syntax tree and an abstract analysis tree; generating the expected MC/DC test case set; instrumenting the code of the program under test; constructing a fitness function; randomly generating test data and checking whether it satisfies the expected execution path; and obtaining suitable test data through the genetic operations of the genetic algorithm, such as selection, crossover and mutation. For the construction of the fitness function, the method follows the idea of the chaining approach: building on the conventional fitness function, it uses data dependence to collect the control nodes that directly or indirectly influence traversal of the problem node, and thereby refines the approach-level fitness evaluation. The method has considerable practical value for testing systems with complex logical relations.

Description

A method for automatically generating MC/DC test data based on a genetic algorithm

Technical field

The invention belongs to the field of software testing technology and specifically relates to a method for automatically generating MC/DC test data based on a genetic algorithm, in particular to the automatic generation of test data that satisfies MC/DC (Modified Condition/Decision Coverage) in engineering testing.

Background technology

Automatic test data generation refers to the automatic construction of test input data by a specific algorithm, according to the specification or the program structure of the software. Its purpose is to relieve testers of a large amount of work, reduce the high cost of manual testing, and at the same time improve confidence in the test process. A genetic algorithm is an adaptive artificial intelligence technique that solves extremum problems by simulating the process and mechanisms of biological evolution; it has distinct advantages for highly complex problems such as those with large search spaces or nonlinearity.

The idea of generating test data by dynamically executing the program was first proposed by Miller and Spooner, and later scholars have carried out much further research. In 1992, Xanthakis was the first to apply genetic algorithms to test case generation: an encoding technique maps D to the gene space G, and the search direction is determined by genetic operations such as selection, crossover and mutation together with survival-of-the-fittest natural selection. The crossover and mutation operations introduce new information into the population, which helps to find the global optimum, avoids the tendency of earlier algorithms to become trapped in local extrema, and makes it possible to generate test data satisfying the all-branches criterion. In China, from 1996 to 1998, Jia Wei, Gao Zhongyi and others published several papers discussing the application of genetic algorithms to the automatic generation of path-coverage-based structural test data for Ada software, and concluded that genetic algorithms generate test data more efficiently than hill climbing and random methods. In 2001, Wang Hao et al. gave a formal description of the genetic algorithm and a prototype test data generation system based on it. In 2003, Jing Zhiyuan analyzed, from a mathematical perspective, the application of improved algorithms such as MGA and genetic K-means to the automatic generation of test cases. In 2006, Cheng Ye applied genetic algorithms to the automatic generation of path coverage test data in a thesis and developed a corresponding tool model. In 2008, Lv Shanshan used artificial neural networks and genetic algorithms to generate input-domain-based test data in her thesis. In general, researchers in China and abroad currently mostly adopt genetic algorithms, or improved variants of them, as the core algorithm for generating test data. However, all of the above work generates test cases for statement coverage or branch coverage, whose logical structure is relatively simple, and therefore cannot be widely applied in practical engineering.

Summary of the invention

To address the problems in the prior art, the present invention proposes a method for automatically generating MC/DC test data based on a genetic algorithm. The traditional fitness function in genetic algorithms relies only on the control dependences between nodes in the control flow graph and does not consider the data dependences inside the program under test. To remedy this defect, the invention uses the idea of the chaining approach to collect the control nodes that influence traversal of the problem node, either directly or indirectly through data dependence, and combines them with the traditional fitness function, which is good at characterizing control dependence, to construct a new fitness function. This overcomes the lack of guidance information, and the resulting degeneration of the search, that the traditional fitness function suffers from because it ignores the data dependences inside the program. On this basis, the automatic generation of test data satisfying the MC/DC criterion based on a genetic algorithm is designed and implemented. The MC/DC criterion emphasizes the ability of the truth value of a single variable to affect the final expression: each condition must independently affect the outcome of the decision expression. The criterion is of great practical value when testing software systems with complex logical relations and high safety requirements.

The method for automatically generating MC/DC test data based on a genetic algorithm proposed by the present invention specifically comprises the following steps:

Step 1: Perform static analysis on the program under test to produce a control flow graph, a data flow graph, an abstract syntax tree and an abstract analysis tree.

Step 2: Generate the expected MC/DC test case set.

Step 2.1: Extract nodes from the abstract analysis tree; these condition nodes form the leaves of the abstract analysis tree. Each leaf is a variable, and N denotes the number of leaf variables extracted.

Step 2.2: Build a truth table; for N leaf variables there are 2^N combinations.

Step 2.3: Fill the truth table with the digits 0 and 1.

Step 2.4: For each row in the truth table:

Step 2.4.1: Assign the Boolean values in the current row of the truth table to the corresponding leaf variables of the abstract analysis tree;

Step 2.4.2: Evaluate the Boolean value of each condition node bottom-up until the top of the abstract analysis tree is reached; the Boolean value finally obtained for the condition node at the top of the tree is the output result value of the decision statement;

Step 2.4.3: Fill the output column of the truth table with the output result value.

Step 2.5: For each leaf variable, find its test cases and add them to the MC/DC test case set of that leaf variable:

Step 2.5.1: Create an empty MC/DC test case set of pairs for each leaf variable;

Step 2.5.2: In the truth table, find two rows in which the values of the other leaf variables are fixed and only the target leaf variable changes;

Step 2.5.3: Compare the output result values of the two rows found in step 2.5.2. If the output result values of the two rows differ, the two rows are a pair of test cases for the target leaf variable; add them, as a pair, to the MC/DC test case set created in step 2.5.1.

Step 2.6: Merge the MC/DC test case sets of all leaf variables to obtain the test case set, and minimize it.

Step 3: Instrument the code of the program under test.

First traverse the abstract syntax tree to find a candidate instrumentation position and determine whether it is an instrumentation point. If it is not, continue traversing the abstract syntax tree to find the next position. If it is, implant a probe statement by inserting the syntax tree fragment of the probe code directly into the abstract syntax tree as a subtree, and then check whether instrumentation is complete. If it is complete, compile and run the program under test; if not, continue traversing the abstract syntax tree to find further instrumentation positions until instrumentation is complete.

Step 4: Construct the fitness function.

Step 4.1: Establish the control dependence fitness function:

Step 4.1.1: Obtain the control dependence set of each decision from the control flow graph of the program under test. Control dependence describes how the execution of a target node y depends on the outcomes of the branch nodes that precede it. When every path from the target node y to the exit node e contains node z, node z is said to post-dominate the target node y. For an arbitrary node x, the two nodes (y, x) can form a branch; when every path from the target node y to the exit node e that passes through the branch (y, x) contains node z, node z is said to post-dominate the branch (y, x). When node z post-dominates a branch of y but does not post-dominate node y itself, node z is said to be control dependent on the target node y. Control dependence measures, from a structural point of view, how close the current input test data comes to reaching the target; it is computed from the critical nodes in the control flow graph that lie between the target and the node at which execution diverges.

Step 4.1.2: Establish the control dependence fitness function:

The control dependence fitness function comprises the objective functions of each branch node in the control dependence set. The control dependence fitness function is established as:

ControlDepFit_{testdata} = dependent_{decisions} - executed_{decisions} \quad (1)

where dependent_{decisions} is the number of control nodes in the control dependence set of the target; executed_{decisions} is the number of those control nodes executed with the current test data as input; and ControlDepFit_{testdata} denotes the control dependence fitness function.

If the value of the control dependence fitness function is 0, the test data reaches the target decision. If the value is greater than 0, the test data diverged from the target at some node, and the diverging node diverged_node is obtained from the value of the control dependence fitness function.
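Equation (1) can be illustrated with a minimal Python sketch. The function name, the ordered list of control nodes and the set of executed nodes are hypothetical representations chosen for illustration; the patent does not prescribe a data structure.

```python
def control_dep_fitness(control_dep_nodes, executed_nodes):
    """Return (fitness, diverged_node) for one run of the instrumented program.

    control_dep_nodes: control nodes the target depends on, outermost first
                       (hypothetical representation of the dependence set).
    executed_nodes:    control nodes whose required branch was actually taken.
    """
    executed = 0
    diverged_node = None
    for node in control_dep_nodes:
        if node in executed_nodes:
            executed += 1
        else:
            diverged_node = node  # first node where execution left the path
            break
    fitness = len(control_dep_nodes) - executed  # equation (1)
    return fitness, diverged_node
```

A value of 0 means every control node on the way to the target was passed; a positive value counts the control nodes still separating the test data from the target.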

Step 4.2: Establish the data dependence fitness function:

Let pn be the problem node and S the set used to store the dependences of a given node; pn and S are the inputs of the method for obtaining the data dependence fitness function. DepSets stores the sets of nodes with dependence relations collected so far, and PV stores the variables used by the problem node. S and DepSets are initialized to the empty set. The method for obtaining the dependence fitness function of a node is:

Step 4.2.1: Obtain the control dependence set ControlDep(pn) of the problem node pn and assign it to S, and add the variables UsedVariables(pn) used by pn to PV;

Step 4.2.2: For each variable pv in PV, obtain the set lastDefs of last definitions of pv;

(a) for each last definition ld in lastDefs, obtain the control dependence set ControlDep(ld) of ld and merge it with S to obtain the extended S, then add the variables UsedVariables(ld) used by ld to a newly created set PVnew;

(b) for each control node cd in ControlDep(ld), obtain the variables UsedVariables(ld) it uses and add them to the set PVnew created in step 4.2.2(a);

Step 4.2.3: For each variable pv in PV, iteratively obtain the dependence set S(ld) of PV: first take the first variable in PV, obtain its dependence set S(ld) and return to step 4.2.2; then take the next variable in PV, and so on until all variables in PV have been processed. End the iteration and add S(ld) to DepSets;

Step 4.2.4: Establish the dependence sets used to define the fitness function:

Let DepFit be the collection of fitness function dependence sets. For each subset s_i in DepSets:

(a) add s_i to DepFit;

(b) for each subset s_j in DepSets other than s_i, determine whether s_j conflicts with s_i on then/else branches. If there is no conflict, merge the two subsets s_i and s_j to obtain a new subset S_{i,j} and add S_{i,j} to DepFit; if there is a then/else branch conflict, the subsets cannot be merged, and s_i and s_j are both added to DepFit;

Step 4.2.5: Establish the data dependence fitness function:

Once the data dependence sets have been established, the data dependence fitness function is obtained by the same method used to compute the control dependence fitness function.
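Steps 4.2.1 to 4.2.3 can be read as a worklist traversal over variables and their last definitions. The sketch below is a simplified interpretation, not the patented procedure itself: it merges everything into a single set S instead of keeping separate entries in DepSets, it reads step 4.2.2(b) as collecting the variables used by each control node cd, and the three input mappings are hypothetical outputs of the static analysis.

```python
def collect_dependences(pn, control_dep, used_vars, last_defs):
    """Collect control nodes that influence reaching problem node pn.

    control_dep[n] -> control nodes that node n is control dependent on
    used_vars[n]   -> variables read at node n
    last_defs[v]   -> nodes holding a last definition of variable v
    All three mappings are hypothetical inputs assumed for this sketch.
    """
    s = set(control_dep.get(pn, ()))            # step 4.2.1: S = ControlDep(pn)
    pending = list(used_vars.get(pn, ()))       # PV = UsedVariables(pn)
    seen_vars = set(pending)
    while pending:                              # steps 4.2.2 and 4.2.3
        pv = pending.pop()
        for ld in last_defs.get(pv, ()):
            deps = set(control_dep.get(ld, ()))
            s |= deps                           # extend S with ControlDep(ld)
            new_vars = set(used_vars.get(ld, ()))
            for cd in deps:                     # variables used by control nodes
                new_vars |= set(used_vars.get(cd, ()))
            for v in new_vars - seen_vars:      # visit each variable only once
                seen_vars.add(v)
                pending.append(v)
    return s
```

Tracking already visited variables guarantees termination even when definitions depend on each other cyclically, for example inside a loop.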

Step 4.3: Establish the branch fitness function:

When the test data reaches the target, the branch distance is used to measure whether the test data satisfies an expected test case: computing the branch distance measures how close the test data is to the expected test case. If the test data reaches the target decision but does not satisfy any MC/DC test case, the approach level of the test data is 0 but its branch distance is not 0. If the test data reaches the target and realizes a test case, both its branch distance and its approach level are 0. The magnitude of the computed branch distance is used to assess which test data is closer to satisfying the expected branch test case.
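The text does not give a formula for the branch distance. As an illustration only, the sketch below assumes a commonly used distance for relational conditions (the numeric gap plus a constant K when the condition has the wrong outcome) and sums it over the conditions of one MC/DC test case. The constant K, the function names and the tuple format are all assumptions.

```python
K = 1.0  # assumed penalty added when a condition has the wrong outcome

def condition_distance(op, lhs, rhs, want_true):
    """Distance of the relational condition `lhs op rhs` from the wanted outcome."""
    if op == ">":
        holds = lhs > rhs
        gap = rhs - lhs if want_true else lhs - rhs
    elif op == "<":
        holds = lhs < rhs
        gap = lhs - rhs if want_true else rhs - lhs
    else:
        raise ValueError(f"unsupported operator: {op}")
    if holds == want_true:
        return 0.0          # condition already has the wanted truth value
    return gap + K

def branch_distance(conditions, test_case):
    """Sum the distances of all conditions from one MC/DC test case.

    conditions: list of (op, lhs, rhs) evaluated on the current test data
    test_case:  string such as "110", one wanted truth value per condition
    """
    return sum(
        condition_distance(op, lhs, rhs, bit == "1")
        for (op, lhs, rhs), bit in zip(conditions, test_case)
    )
```

For x = 5, y = 3, z = 7 and the conditions A: x > y, B: x > z, C: x > y + z, the test case (100) is already realized and has distance 0, while (110) still requires B to become true.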

Step 5: On the basis of the MC/DC test cases, the fitness function and the instrumented code, randomly generate test data, run the instrumented program under test on this test data, obtain the fitness values, and at the same time check whether the expected execution path is satisfied. If it is, go to step 7; otherwise go to step 6.

Compute the approach level fitness function ApproachLevelFitness. If ApproachLevelFitness is 0, the test data reaches the target; the branch distance BranchFitness is then computed at the target decision. If ApproachLevelFitness and BranchFitness are both 0, the test data achieves the MC/DC test target. Otherwise, the fitness function value of the test data equals ApproachLevelFitness + normalized(BranchFitness), where normalized(BranchFitness) denotes the normalization of the branch distance BranchFitness. The detailed computation is:

Step 5.1: Generate the expected MC/DC test case set by the method of step 2.

Step 5.2: Obtain the dependence sets, comprising the control dependence set and the data dependence set, by the methods of step 4.1 and step 4.2.

Step 5.3: For each target decision, randomly generate test data T_a and test data T_b.

Step 5.4: Using the computation method of step 4.1 and the dependence sets, compute the control dependence fitness function and the data dependence fitness function of test data T_a and of test data T_b.

Step 5.5: Using the computation method of step 4.3, compute the branch distances of test data T_a and test data T_b.

Step 5.6: Normalize the branch distance to obtain BranchFitness(T_a)_{normalised}; the formula is:

BranchFitness {(T_{a})}_{normalised} = \frac{BranchFitness (T_{a})}{BranchFitness (T_{a}) + BranchFitness (T_{b})}

where BranchFitness(T_a) denotes the branch distance of test data T_a, and BranchFitness(T_b) denotes the branch distance of test data T_b.

Step 5.7: The total fitness function value Fitness(T) is:

Fitness(T) = ApproachLevelFitness(T) + BranchFitness(T)_{normalized}

where ApproachLevelFitness(T) denotes the approach level fitness function of test data T, and BranchFitness(T)_{normalized} denotes the normalized branch distance of test data T.

Step 5.8: Use the total fitness function values to compare which of the two test data is closer to reaching the decision target.

Step 5.9: If ApproachLevelFitness and BranchFitness are both 0, the test data achieves the MC/DC test target; go to step 7. Otherwise, go to step 6.
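The arithmetic of steps 5.6 and 5.7 can be sketched as follows. The function names are hypothetical, and returning 0 when both branch distances are 0 is an assumed convention, since the formula of step 5.6 is undefined in that case.

```python
def normalized_branch_fitness(bf_a, bf_b):
    """Normalize the branch distance of T_a against the pair (T_a, T_b)."""
    total = bf_a + bf_b
    if total == 0:
        return 0.0  # assumed convention: both test data already realize the test case
    return bf_a / total

def total_fitness(approach_level, bf_self, bf_other):
    """Fitness(T) = ApproachLevelFitness(T) + normalized BranchFitness(T)."""
    return approach_level + normalized_branch_fitness(bf_self, bf_other)
```

Because the normalized branch distance never exceeds 1, a test datum that is one approach level closer to the target is never ranked behind one that is further away, whatever their branch distances.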

Step 6: According to the fitness values obtained, use the genetic operations of the genetic algorithm, such as selection, crossover and mutation, to generate new test data, then return to step 5 and compute the fitness values of the newly generated test data.

Step 6.1: Select the encoding strategy and convert the parameter set X and its domain into the bit-string structure space S.

Chromosomes are encoded using a mix of integers and reals: the parameters of the problem are encoded as genes on the chromosome, the number of parameters becomes the chromosome length, and the value interval of each parameter is mapped to the value range of the corresponding gene. The detailed process is as follows:

Suppose the problem to be solved has n input variables X_1, X_2, ..., X_n. First, the value domains of the parameters are processed by equivalence class partitioning and boundary value analysis, where Y_i (1 ≤ i ≤ n) denotes the finite set of discrete points that the parameter X_i (1 ≤ i ≤ n) can take and |Y_i| denotes the size of that set. A mapping between the solution space and the chromosome space is then established, and the chromosome is expressed as:

X = (X_1, X_2, \ldots, X_n) \rightarrow C = (C_1, C_2, \ldots, C_n) \quad (2)

where C is a solution in the chromosome space and X is a solution in the problem space;
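One possible reading of mapping (2) is sketched below: each gene C_i is the index of the chosen value within the discrete set Y_i, so the gene range is 0 to |Y_i| - 1. This index-based representation, the function names and the example domains are assumptions made for illustration.

```python
def encode(x, domains):
    """Map a problem-space solution X to a chromosome C of gene indices."""
    return [domains[i].index(value) for i, value in enumerate(x)]

def decode(chromosome, domains):
    """Map a chromosome C back to the problem-space solution X."""
    return [domains[i][gene] for i, gene in enumerate(chromosome)]

# Hypothetical domains Y_1 (integers) and Y_2 (reals) after equivalence class
# partitioning and boundary value analysis.
EXAMPLE_DOMAINS = [[-1, 0, 1, 100], [0.5, 2.5]]
```

With this representation the chromosome length equals the number of parameters, and integer and real parameters can be mixed freely because only their positions in Y_i are stored.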

Step 6.2: Design and select the genetic operations, including the population size and the selection, crossover and mutation methods, and determine genetic parameters such as the crossover probability p_c and the mutation probability p_m.

Step 6.2.1: Apply the rank selection strategy and the elitist retention strategy:

The detailed process of the rank selection strategy is:

(a) sort all individuals in the population in descending order of fitness value;

(b) design an allocation table and, according to the fitness values, assign a probability value to each individual in ascending order;

(c) the probability that each individual is passed on to the next generation is determined by the probability value assigned in step (b); based on these probability values, roulette wheel selection is then used to choose the chromosomes to be eliminated and those to be copied. After one round of the rank selection strategy a new population is obtained, and the elitist retention strategy is then applied to this new population. The detailed process is:

(a) according to the fitness function values, find the best individual and the worst individual in the current population, that is, the new population obtained after the rank selection strategy;

(b) if the fitness of the best individual of the current population is higher than the fitness of the best individual found so far, replace the best individual found so far with the best individual of the current population;

(c) keep the best individual found so far unchanged and pass it intact into the next-generation population.
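The two strategies of step 6.2.1 can be sketched as below. The sketch assumes that a larger score is better, uses linear rank weights as the allocation table, and lets the elite replace the worst individual; none of these details is fixed by the text, and the function names are hypothetical. Since a total fitness of 0 marks success in step 5, a score such as the negated fitness value would be needed in practice.

```python
import random

def rank_select(population, scores, rng):
    """Rank selection: sample a new population with a rank-based roulette wheel."""
    order = sorted(range(len(population)), key=lambda i: scores[i])  # worst first
    weights = [0.0] * len(population)
    for rank, i in enumerate(order, start=1):
        weights[i] = rank              # better rank gets a larger slice of the wheel
    return rng.choices(population, weights=weights, k=len(population))

def apply_elitism(new_population, new_scores, best_so_far, best_score):
    """Elitist retention: keep the best individual seen so far in the population."""
    best_i = max(range(len(new_population)), key=lambda i: new_scores[i])
    if new_scores[best_i] > best_score:
        best_so_far, best_score = new_population[best_i], new_scores[best_i]
    worst_i = min(range(len(new_population)), key=lambda i: new_scores[i])
    survivors = list(new_population)
    survivors[worst_i] = best_so_far   # assumed: the elite replaces the worst individual
    return survivors, best_so_far, best_score
```

Selecting by rank rather than by raw fitness keeps the selection pressure constant even when a few individuals have fitness values far from the rest.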

Step 6.2.2: Crossover operation and mutation operation:

An adaptive crossover probability p_c and mutation probability p_m are adopted: they are adjusted automatically according to the average fitness of the population and the fitness of the best individual in the current population. Let f_max denote the fitness of the best individual in a given generation and F_avg the average fitness of the population in that generation; the difference between them is Δ = f_max - F_avg. When Δ is small, the fitness differences between individuals in the population are small, which indicates that the population is more likely to have reached a local optimum and that premature convergence is more likely. When Δ is large, the fitness differences between individuals are large. The crossover probability p_c and the mutation probability p_m are determined by Δ and are computed as:

p_c = k_1 / \Delta \quad (3)

p_m = k_2 / \Delta \quad (4)

where k_1 and k_2 are the crossover probability adjustment coefficient and the mutation probability adjustment coefficient, respectively.
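Equations (3) and (4) translate directly into code. The sketch below adds two safeguards that the text does not mention and that are therefore assumptions: a small lower bound on Δ to avoid division by zero when all individuals have the same fitness, and clamping of the results to at most 1 so that they remain valid probabilities.

```python
def adaptive_probabilities(fitnesses, k1, k2, eps=1e-9):
    """Return (p_c, p_m) from equations (3) and (4), clamped to at most 1."""
    f_max = max(fitnesses)
    f_avg = sum(fitnesses) / len(fitnesses)
    delta = max(f_max - f_avg, eps)   # assumed guard against division by zero
    p_c = min(1.0, k1 / delta)        # equation (3)
    p_m = min(1.0, k2 / delta)        # equation (4)
    return p_c, p_m
```

As Δ shrinks, both probabilities rise, which injects more variation into a population that is at risk of premature convergence.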

Step 6.3: Randomly initialize the population P.

Step 6.4: Decode the bit string of each individual in population P and compute its fitness value.

Step 6.5: According to the genetic strategy, apply the genetic operations designed in step 6.2 to the population; selection, crossover and mutation produce the new generation of the population.

Step 6.6: Take the newly produced chromosomes as test data and return to step 5 to compute their fitness values, then judge whether their performance meets the target or whether the predetermined number of iterations has been completed. If the target is not met and the iterations are not complete, go to step 6.1: the genetic algorithm starts again from the encoding operation and applies selection and copying, crossover and mutation to the new generation of the population, iterating continuously. If the target is met or the iterations are complete, go directly to step 7.

Step 7: The run ends and suitable test data is obtained.

The advantages of the present invention are:

1. The present invention proposes a method for automatically generating MC/DC test data based on a genetic algorithm. The traditional fitness function in genetic algorithms relies only on the control dependences between nodes in the control flow graph and does not consider the data dependences inside the program under test. To remedy this defect, the invention uses the idea of the chaining approach to collect the control nodes that influence traversal of the problem node, either directly or indirectly through data dependence, and combines them with the traditional fitness function, which is good at characterizing control dependence, to construct a new fitness function. This overcomes the lack of guidance information, and the resulting degeneration of the search, that the traditional fitness function suffers from because it ignores the data dependences inside the program.

2. The present invention proposes a method for automatically generating MC/DC test data based on a genetic algorithm and implements the automatic generation of test data satisfying the MC/DC criterion, which is of great practical value when testing software systems with complex logical relations and high safety requirements.

Brief description of the drawings

Fig. 1: flowchart of the method for automatically generating MC/DC test data based on a genetic algorithm proposed by the present invention;

Fig. 2: flowchart of the abstract syntax tree generation process in the present invention;

Fig. 3: flowchart of the abstract analysis tree generation process in the present invention;

Fig. 4: flowchart of the generation of the expected MC/DC test case set in the present invention;

Fig. 5: flowchart of code instrumentation in the present invention;

Fig. 6: a program fragment used in establishing the control dependence fitness function in the present invention;

Fig. 7: basic flowchart of the genetic algorithm in the present invention.

Detailed description of the embodiments

The present invention is described in detail below with reference to the accompanying drawings.

The present invention proposes a method for automatically generating MC/DC test data based on a genetic algorithm. Its key elements include generation of the abstract syntax tree, generation of the abstract analysis tree, generation of the expected MC/DC test case set, code instrumentation, construction of the fitness function, and design of the genetic algorithm. The overall flow is shown in Fig. 1 and specifically comprises the following steps:

Step 1: Perform static analysis on the program under test to produce information such as the control flow graph, data flow graph, abstract syntax tree and abstract analysis tree. The program under test is the software code to be tested.

An analysis tool (such as the testing tool Testbed) performs lexical analysis, syntax analysis and semantic analysis on the program under test and generates the abstract syntax tree. The abstract syntax tree is another representation of the source program that the compiler obtains after syntax analysis. Each grammar rule corresponds to a processing function, which is attached to the abstract syntax tree as one of its nodes and provides an external interface, preparing for the code instrumentation in the next step.

The generation process of the abstract syntax tree is shown in Fig. 2. Lexical analysis and syntax analysis are first performed on the code of the program under test. Lexical analysis provides the symbol nodes needed by the abstract syntax tree, such as constants and names; syntax analysis provides the abstract syntax tree containing intermediate nodes that represent the corresponding syntactic structures. Semantic analysis is then performed on the code: names, symbols and so on are processed, the syntax tree is converted into a canonical form that includes expression type information and a symbol table, and these are linked into a tree structure, finally yielding the abstract syntax tree. During code analysis, the abstract syntax tree of the code under test is built with the help of the tools Flex and Bison.

By visiting the abstract syntax tree that has been built, the decision statement subtrees are collected, and relational expressions are represented by capital letters; the newly generated subtree is called the abstract analysis tree. Each decision has one abstract analysis tree that represents its logical structure. After generating the abstract analysis tree, the analysis tool automatically generates the control flow graph and the data flow graph.

For the program fragment under test "if (x > y && x > z || x > y + z)", Fig. 2 shows the abstract syntax tree generated from the fragment and Fig. 3 shows the abstract analysis tree extracted from the abstract syntax tree, in which AND and OR are condition nodes and A, B and C are the leaves of the abstract analysis tree.
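The extraction of an abstract analysis tree can be illustrated with Python's standard `ast` module standing in for the Flex/Bison front end named above. This is a sketch under that substitution: the decision is written as the equivalent Python expression, and the function name and tuple-based tree format are hypothetical.

```python
import ast

def analysis_tree(expr_source):
    """Return (tree, leaves) for a Boolean decision written as a Python expression."""
    leaves = {}                                    # capital letter -> relational expression

    def build(node):
        if isinstance(node, ast.BoolOp):           # AND / OR condition node
            op = "AND" if isinstance(node.op, ast.And) else "OR"
            return (op, [build(v) for v in node.values])
        letter = chr(ord("A") + len(leaves))       # next unused capital letter
        leaves[letter] = ast.unparse(node)         # requires Python 3.9 or later
        return letter

    tree = build(ast.parse(expr_source, mode="eval").body)
    return tree, leaves
```

For the expression `x > y and x > z or x > y + z`, the sketch yields an OR node whose children are an AND node over A and B and the leaf C, which mirrors the structure described for Fig. 3.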

Step 2: Generate the expected MC/DC test case set;

The abstract analysis tree is the input for generating the expected MC/DC test case set. Each node in the abstract analysis tree can be assigned a Boolean value and a Boolean variable evaluation; the Boolean variable evaluation indicates whether the Boolean value of the node has already been computed. Taking the program fragment "if (x > y && x > z || x > y + z)" as an example, the generation process of the expected MC/DC test case set is described below, as shown in Fig. 4. The specific steps are as follows:

Step 2.1: Extract nodes from the abstract analysis tree; these condition nodes form the leaves of the abstract analysis tree. Each leaf is a variable, and N denotes the number of leaf variables extracted. As in Fig. 5, the program fragment "if (x > y && x > z || x > y + z)" has 3 leaf variables, i.e. N = 3.

Step 2.2: Build a truth table; for N leaf variables there are 2^N combinations. As in Fig. 5, the truth table of the program fragment "if (x > y && x > z || x > y + z)" has 8 combinations.

Step 2.3: Fill the truth table with the digits 0 and 1;

Step 2.4: For each row in the truth table:

Step 2.4.1: Assign the Boolean values in the current row of the truth table to the corresponding leaf variables of the abstract analysis tree. In Fig. 5, the fourth row of the truth table has A = 0, B = 1, C = 1, and these values are assigned to the respective leaf variables of the abstract analysis tree.

Step 2.4.2: Evaluate the Boolean value of each condition node bottom-up until the top of the abstract analysis tree is reached. The Boolean value finally obtained for the condition node at the top of the tree is the output result value of the decision statement. As in Fig. 5, in the fourth row of the truth table A = 0, B = 1, C = 1; A && B is false and A && B || C is true, so the final output of the decision for this row is true.

Step 2.4.3: Fill the output column of the truth table with the output result value.

The above steps complete the truth table of a given decision statement (program under test). Test case extraction then begins. Given the characteristics of MC/DC, for each leaf variable two rows must be found that demonstrate the independent effect of that leaf variable on the decision outcome.

Step 2.5.2: In the truth table, find two rows in which the values of the other leaf variables are fixed and only the target leaf variable changes. As in Fig. 5, when the target leaf variable is A, take two rows in which the values of leaf variables B and C are fixed, for example the third row and the seventh row, where leaf variables B and C are fixed and only the target leaf variable A changes;

Step 2.5.3: Compare the output result values of the two rows found in step 2.5.2 (the output result value is formed from the condition nodes of each row). If the output result values of the two rows differ, the two rows are a pair of test cases for the target leaf variable, and the pair is valid because it demonstrates the influence of the target leaf variable. Add the two rows, as a pair, to the MC/DC test case set created in step 2.5.1.

As in Fig. 4, when the target variable is A, the third row is (010) and the seventh row is (110). From the truth table, the output of the third row is 0 and the output of the seventh row is 1. The outputs of the two rows differ, so (010) and (110) are a pair of test cases for variable A; the pair is valid and is added, as a pair, to the test case set of leaf variable A. When another leaf variable is the target leaf variable, its test cases are found in the same way as for leaf variable A and added to the MC/DC test case set of that leaf variable.

Step 2.6: Merge the MC/DC test case sets of all leaf variables to obtain the test case set, and minimize it. This test case set is the target set that the genetic algorithm search must realize.

As in Fig. 4, the test case sets of leaf variables A, B and C are (010, 110), (100, 110) and (000, 001, 010, 011, 100, 101), respectively. Minimizing this test case set yields (010, 100, 110, 101), which satisfies the MC/DC coverage requirement of every leaf variable.
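Steps 2.2 to 2.6 can be reproduced for the example decision with a short Python sketch. The function names are hypothetical, and the greedy minimization is only one possible strategy, since the text does not say how the set is minimized.

```python
from itertools import product

def mcdc_pairs(decision, n):
    """Return, for each condition index, its independence pairs from the truth table."""
    rows = list(product((0, 1), repeat=n))                  # steps 2.2 and 2.3
    output = {row: bool(decision(*row)) for row in rows}    # step 2.4
    pairs = {i: [] for i in range(n)}                       # step 2.5.1
    for i in range(n):
        for row in rows:
            if row[i] == 0:
                flipped = row[:i] + (1,) + row[i + 1:]      # only condition i changes
                if output[row] != output[flipped]:          # step 2.5.3
                    pairs[i].append((row, flipped))
    return pairs

def minimize(pairs):
    """Greedy minimization: per condition, pick the pair that adds the fewest new rows."""
    chosen = set()
    for i in sorted(pairs, key=lambda i: len(pairs[i])):    # most constrained first
        if not pairs[i]:
            continue                                        # no independence pair exists
        best = min(pairs[i], key=lambda p: len(set(p) - chosen))
        chosen.update(best)
    return chosen

example_pairs = mcdc_pairs(lambda a, b, c: (a and b) or c, 3)
example_set = minimize(example_pairs)
```

For the decision (A && B) || C the pairs found for A, B and C match the sets listed above. The greedy choice yields four test cases, the same size as the minimized set above, although it may select a different pair for C, such as (010, 011) instead of (100, 101); both choices satisfy the coverage requirement.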

Step 3: Instrument the code of the program under test;

Program instrumentation is one of the steps in the automatic generation of test data. During testing, all test result data is obtained and recorded through program instrumentation. The main process is to insert probe statements while preserving the original logical integrity of the program under test; when the program runs, executing the probe statements captures its run-time characteristic data. Analyzing this data yields dynamic information such as the logic coverage of the program, from which the fitness of an individual is computed. The detailed process is shown in Fig. 5. After the abstract syntax tree (AST) is generated, the AST is first traversed to find a candidate instrumentation position, and it is determined whether the position is an instrumentation point. If it is not, traversal of the AST continues to find the next position. If it is, a probe statement is implanted by inserting the syntax tree fragment of the probe code directly into the AST as a subtree, and it is then checked whether instrumentation is complete. If it is complete, the program under test is compiled and run; if not, traversal of the AST continues to find further instrumentation positions until instrumentation is complete.
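AST-level instrumentation can be illustrated with Python's standard `ast` module, again standing in for the Flex/Bison-based tooling. In this sketch, which is an assumption-laden illustration rather than the patented implementation, every `if` statement is treated as an instrumentation point and its test is wrapped in a call to a hypothetical `_probe` function that records the outcome and passes it through unchanged.

```python
import ast

class ProbeInserter(ast.NodeTransformer):
    """Wrap the test of every `if` statement in a call to a probe function."""

    def __init__(self):
        self.count = 0

    def visit_If(self, node):
        self.generic_visit(node)                      # instrument nested ifs first
        self.count += 1
        node.test = ast.Call(                         # probe subtree replaces the test
            func=ast.Name(id="_probe", ctx=ast.Load()),
            args=[ast.Constant(value=self.count), node.test],
            keywords=[],
        )
        return node

def run_instrumented(source, func_name, *args):
    """Instrument `source`, call `func_name(*args)`, return (result, trace)."""
    trace = []

    def _probe(decision_id, outcome):
        trace.append((decision_id, bool(outcome)))    # record, then pass the value through
        return outcome

    tree = ast.fix_missing_locations(ProbeInserter().visit(ast.parse(source)))
    namespace = {"_probe": _probe}
    exec(compile(tree, "<instrumented>", "exec"), namespace)
    return namespace[func_name](*args), trace
```

Because the probe returns the original value of the test, the logic of the program under test is preserved while the trace collects the decision outcomes needed for the fitness computation.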

Step 4: construct the fitness function;

The fitness function is the only interface between the genetic algorithm and the practical problem. It is a quantitative measure of how good each individual in the population is, and its construction directly affects the efficiency of problem solving. The traditional fitness function f(x) consists of two parts:

f(x) = approach_level + branch_distance

The first part, approach_level, reflects control dependence and is usually called the approach level. The second part, branch_distance, is called the branch distance. It overcomes the limitation of evaluating fitness by approach level alone: on the basis of the current input test data, the branch distance measures how close the execution is to satisfying the target, or to taking the required branch at the point where it diverged.

For the practical problem of generating MC/DC test data, the goal of the present invention is to find test data that satisfies the intended MC/DC target. The invention makes full use of the strength of the chaining approach in characterising data dependences among test data: the chaining idea is used to collect the control nodes that directly or indirectly, through data dependence, influence the traversal of the problem node. This is combined with the traditional fitness function, which is good at characterising control dependence, to construct a new fitness function. The new function overcomes the lack of guidance and the degenerate search that the traditional fitness function suffers because it ignores the data dependences inside the program.

Specifically, the fitness function is constructed from three factors: the control dependence fitness function, the data dependence fitness function and the branch fitness.

Step 4.1: establish the control dependence fitness function

Step 4.1.1: obtain the control dependence set of each decision from the control flow graph of the program under test. Control dependence describes how the execution of a target node y depends on the outcomes of the branch nodes that precede it. When every path from the target node y to the exit node e contains node z, node z is said to post-dominate y. Let x be an arbitrary node such that the two nodes form a branch (y, x); when every path from y to the exit node e that passes through the branch (y, x) contains node z, node z is said to post-dominate the branch (y, x). When node z post-dominates a branch of y but does not post-dominate y itself, node z is said to be control dependent on the target node y. Control dependence measures, from the structural point of view, how close the current input test data is to reaching the target; it is calculated from the critical nodes that lie in the control flow graph between the target and the node at which execution diverged.

Fig. 6 shows the control flow graph of a fragment of a program under test. As the figure shows, the decision on line 16 depends on the decision on line 12 evaluating to true and the decision on line 13 evaluating to false. The target decision on line 16 therefore depends on the control flow through lines 12 and 13. These nodes are called critical branches, because they determine whether the control flow moves toward or away from the target. Hence the control dependence set of the decision on line 16 contains the decisions on lines 12 and 13; that is, to reach the target node these branch nodes must be executed in turn, each with a specific outcome. This gives the control dependence set ControlDep(16) = {12, -13}, where a positive number means the true branch must be taken and a negative number means the false branch must be taken.

Step 4.1.2: establish the control dependence fitness function.

Once the control dependence set has been built, the search must determine which test data executes the largest number of control nodes. In Fig. 6, for example, test data that diverges at line 13 is closer to the target than test data that diverges at line 12. An evaluation function is therefore needed to judge which test data brings the execution flow closer to the target, and the control dependence fitness function is established for this purpose.

The control dependence fitness function covers the objective of every branch node in the control dependence set. It is defined as follows.

ControlDepFit_{testdata} = dependent_{decisions} - executed_{decisions} \qquad (1)

where dependent_{decisions} is the number of control nodes in the control dependence set of the target; executed_{decisions} is the number of those control nodes executed as required when the current test data is the input; and ControlDepFit_{testdata} is the value of the control dependence fitness function.

If the value of the control dependence fitness function is 0, the test data reaches the target decision. If the value is greater than 0, the test data diverged from the path to the target node somewhere, and the value identifies exactly which control node before the target decision was the point of divergence; this node is called the diverged node, diverged_node. Taking Fig. 6 as an example, if some input makes the execution flow diverge at line 12, the control dependence fitness value is 2 - 0 = 2; if the true branch is taken at line 12 and execution diverges at line 13, the value is 2 - 1 = 1. In this way test data can be distinguished by its approach level with respect to the target decision, and the search is guided toward the closest test data.
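Formula (1) and the Fig. 6 example can be reproduced with a short sketch. The representation is an assumption made for illustration: the control dependence set is an ordered list of signed line numbers, and `executed` is the set of signed branch outcomes recorded by the probes; the function name is hypothetical.

```python
def control_dep_fitness(control_dep, executed):
    """Formula (1): dependent decisions minus decisions executed as required."""
    satisfied = 0
    for node in control_dep:
        if node not in executed:
            break  # execution diverged here; later nodes cannot count
        satisfied += 1
    return len(control_dep) - satisfied

CONTROL_DEP_16 = [12, -13]  # ControlDep(16) from Fig. 6

diverged_at_12 = control_dep_fitness(CONTROL_DEP_16, {-12})
diverged_at_13 = control_dep_fitness(CONTROL_DEP_16, {12, 13})
reached_target = control_dep_fitness(CONTROL_DEP_16, {12, -13})
```

The three calls correspond to the cases discussed above: divergence at line 12, divergence at line 13, and arrival at the target decision.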

Step 4.2: establish the data dependence fitness function

In computing dependences, the present invention focuses on control dependence and, by inserting data dependences, extends the guidance that the approach-level fitness gives to the search. This extension of the approach level incorporates data dependence and directs the search toward more promising regions of the search space. The invention aims to overcome the lack of guidance and the plateau search that arise when flag variables are used in predicates or when strong data dependences exist between the predicates of the code.

The data dependence fitness function is obtained as follows:

Let pn be the problem node and S the set that stores the dependences of a given node; pn and S are the inputs of the method. DepSets stores the sets of nodes with dependences collected so far, and PV stores the variables used by the problem node. S and DepSets are initialised to the empty set. The dependence fitness function of a node is obtained as follows:

(b) for each control node cd in ControlDep(ld), obtain the variables it uses, UsedVariables(ld), and add them to the set PVnew created in step (a).

Step 4.2.3: for each variable pv in PV, iteratively obtain its dependence set S(ld). Take the first variable in PV, obtain its dependence set S(ld) and return to step 4.2.2; then take the next variable in PV, and so on until all variables in PV have been processed. End the iteration and add S(ld) to DepSets.

Step 4.2.4: establish the dependence set used to define the fitness function:

(a) add s_i to DepFit;

(b) for each subset s_j in DepSets other than s_i, check whether s_j and s_i conflict on a then/else branch. If there is no conflict, merge the two subsets s_i and s_j into a new subset S_{i,j} and add S_{i,j} to DepFit; if there is a then/else conflict, the subsets cannot be merged, and both s_i and s_j are added to DepFit.

Step 4.2.5: establish the data dependence fitness function

Once the data dependence set has been built, the data dependence fitness function is computed in the same way as the control dependence fitness function.

Step 4.3: calculate the branch distance

Once test data reaches the target, the branch distance measures whether the test data satisfies a test case, that is, how close the test data is to the expected test case. If the test data reaches the target decision but satisfies no MC/DC test case, its approach level is 0 but its branch distance is not 0. If the test data reaches the target and achieves a test case, both its branch distance and its approach level are 0. The size of the computed branch distance is used to assess which test data is closer to satisfying the expected test case.

The branch distance of a decision is calculated from the structure of that decision:

(1) if the decision contains the expression a == b: when a == b should be true, the branch distance is abs(a - b); when a == b should be false, the branch distance is a == b ? k : 0;

(2) if the decision contains the expression a ≠ b: when a ≠ b should be true, the branch distance is a ≠ b ? k : 0; when a ≠ b should be false, the branch distance is a ≠ b ? abs(a - b) : 0;

(3) if the decision contains the expression a < b: when a < b should be true, the branch distance is a < b ? 0 : a - b + k; when a < b should be false, the branch distance is a < b ? a - b + k : 0;

(4) if the decision contains the expression a <= b: when a <= b should be true, the branch distance is a <= b ? 0 : a - b; when a <= b should be false, the branch distance is a <= b ? a - b + k : 0;

(5) if the decision contains the expression a > b: when a > b should be true, the branch distance is a > b ? 0 : a - b; when a > b should be false, the branch distance is a > b ? a - b + k : 0;

(6) if the decision contains the expression a >= b: when a >= b should be true, the branch distance is a >= b ? 0 : a - b; when a >= b should be false, the branch distance is a >= b ? a - b + k : 0;

(7) if the decision contains the expression a || b: when a || b should be true, the branch distance is min[fit(a), fit(b)]; when a || b should be false, the branch distance is fit(a) + fit(b);

(8) if the decision contains the expression a && b: when a && b should be true, the branch distance is fit(a) + fit(b); when a && b should be false, the branch distance is max[fit(a), fit(b)];

When trying to achieve a test case, what is compared is how close each test data is to achieving the test case, not the test data itself, so the branch distance is always a positive value. A negative fitness value therefore adds no valuable information to the search, and the absolute value of the function value is returned. The total branch distance is the sum of the branch distances of all conditions in the decision.
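A few of the rules above can be written out directly. The sketch below covers only rules (1), (3) and (7); the function names, the `want_true` flag and the default value of `k` are assumptions made for illustration (0.1 is the value of k chosen later in the worked example).

```python
K = 0.1  # the constant k; 0.1 is the value chosen in the worked example

def fit_eq(a, b, want_true, k=K):
    """Rule (1): branch distance for a == b."""
    if want_true:
        return abs(a - b)
    return k if a == b else 0

def fit_lt(a, b, want_true, k=K):
    """Rule (3): branch distance for a < b, returned as an absolute value."""
    if want_true:
        return 0 if a < b else abs(a - b + k)
    return abs(a - b + k) if a < b else 0

def fit_or(fit_a, fit_b, want_true):
    """Rule (7): a || b, combining operand distances computed for the required outcome."""
    return min(fit_a, fit_b) if want_true else fit_a + fit_b

# z == 0 holds but the false branch is required, so the distance is k.
distance_false_branch = fit_eq(0, 0, want_true=False)
```

Taking the absolute value inside `fit_lt` follows the statement above that a negative value carries no useful information for the search.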

Step 5: on the basis of the valid MC/DC test cases, the fitness function formula and the instrumented code, generate test data at random, run the instrumented program under test on this test data to obtain fitness values, and check whether the expected target path is executed (that is, whether the test data reaches the target). If so, go to step 7; otherwise go to step 6;

Calculate the approach-level fitness function ApproachLevelFitness. If ApproachLevelFitness is 0, the test data reaches the target, and the branch distance BranchFitness is then calculated at the target decision. If both ApproachLevelFitness and BranchFitness are 0, the test data achieves the MC/DC test target. Otherwise, the fitness value of the test data equals ApproachLevelFitness + normalized(BranchFitness), where normalized(BranchFitness) denotes the normalisation of the branch distance BranchFitness to a value between 0 and 1. The calculation comprises the following steps:

Step 5.1: generate the set of expected MC/DC test case results by the method of step 2;

Step 5.2: obtain the dependence sets, comprising the control dependence set and the data dependence set, by the methods of step 4.1 and step 4.2;

Step 5.3: for each target decision, randomly generate two test data T_a and T_b;

Step 5.4: using the computing method of step 4.1 and the dependence sets, calculate the control dependence fitness function and the data dependence fitness function of each of the two test data;

Step 5.5: using the computing method of step 4.3, calculate the branch distance of each of the two test data;

BranchFitness(T_a)_{normalised} = \frac{BranchFitness(T_a)}{BranchFitness(T_a) + BranchFitness(T_b)}

where BranchFitness(T_a) is the branch distance of test data T_a and BranchFitness(T_b) is the branch distance of test data T_b;

Step 5.7: the total fitness function value Fitness(T) is:

Fitness(T) = ApproachLevelFitness(T) + BranchFitness(T)_{normalised}

Step 5.9:

If both ApproachLevelFitness and BranchFitness are 0, the test data achieves the MC/DC test target and the method goes to step 7; otherwise it goes to step 6;


The program fragment of Fig. 6 is again used as an example to show how the fitness value is obtained. Assume the target is line 16. Preliminary work:

(1) the dependence sets obtained are

{{2, 12, -13}, {2, 3, 4, 8, 12, -13}, {2, 3, -4, 8, 12, -13}};

(2) the MC/DC test cases of the target decision on line 16 are extracted from the MC/DC test case generation module: {(010), (110), (100), (011)}. Suppose, for example, that the aim is to achieve test case (010).

Calculation:

Suppose that, at run time, the test data generator outputs two sets of test data for the parameters x, y, z: (12, -2, 3) and (1, 2, 0). To determine whether each satisfies or approaches test case (010), the fitness function of each test data is evaluated.

(1) T1 = (12, -2, 3)

The target requires passing through the false branch of the decision on line 4, therefore:

ApproachLevelFitness(T1, 16) = Count({2, 3, -4, 8, 12, -13} - {2, 3, 4})

= Count({-4, 8, 12, -13}) = 4

BranchFitness(T1, -4) = (Fit(x > 0) + Fit(y > 0) + Fit(z > 0))_{T1} = 0 + 2 + 0 = 2

(2) T2 = (1, 2, 0)

The target requires passing through the false branch of the decision on line 13, so the false-branch formula applies. We choose k = 0.1.

ApproachLevelFitness(T2, 16) = Count({2, 3, -4, 8, 12, -13} - {2, 3, -4, 8, 12, 13})

= Count({-13}) = 1

BranchFitness(T2, -13) = (Fit(z == 0))_{T2} = k = 0.1

(3) normalisation of the branch fitness:

BranchFitness(T1)_{normalised} = \frac{BranchFitness(T1)}{BranchFitness(T1) + BranchFitness(T2)} = \frac{2}{2.1} \approx 0.952

BranchFitness(T2)_{normalised} = \frac{BranchFitness(T2)}{BranchFitness(T1) + BranchFitness(T2)} = \frac{0.1}{2.1} \approx 0.048

(4) comparison of the test data:

Fitness(T, d) = ApproachLevelFitness(T, d) + BranchFitness(T)_{normalised}

Fitness(T1, 16) = 4 + 0.952 = 4.952

Fitness(T2, 16) = 1 + 0.048 = 1.048

The comparison shows that T2 is closer to reaching the target decision.
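The normalisation and the total fitness of the worked example can be checked with a few lines of code. The function names are hypothetical, and returning 0 for both values when the two branch distances sum to 0 is an assumption, since the text does not cover that case.

```python
def normalise(branch_a, branch_b):
    """Normalise two branch distances against their sum."""
    total = branch_a + branch_b
    if total == 0:
        return 0.0, 0.0  # assumption: both test data already satisfy the decision
    return branch_a / total, branch_b / total

def total_fitness(approach_level, branch_normalised):
    """Total fitness: approach level plus normalised branch distance."""
    return approach_level + branch_normalised

# Values from the worked example: T1 has approach level 4 and branch distance 2,
# T2 has approach level 1 and branch distance 0.1.
norm_t1, norm_t2 = normalise(2, 0.1)
fitness_t1 = total_fitness(4, norm_t1)
fitness_t2 = total_fitness(1, norm_t2)
```

Since 2 / 2.1 is about 0.952 and 0.1 / 2.1 is about 0.048, T2 ends up with the smaller total fitness and is therefore the test data closer to the target.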

The terms of biological evolution used in genetic algorithms correspond to the software test case generation process as follows. Chromosome: each test data in the genetic algorithm; a chromosome carrying characteristics is also called an individual. Population: the set of randomly generated test data. Evolution: the process of iteratively generating new test data from old test data with the genetic algorithm. Encoding: the process of converting chromosomes, by certain rules, into characters that the computer can operate on.

The genetic algorithm replaces the parameter space of the problem with a coding space, uses the fitness function as the basis of evaluation and the encoded population as the basis of evolution, and implements selection and inheritance through genetic operations on the bit strings of the individuals in the population, thereby establishing an iterative process. It comprises the following steps, as shown in Fig. 7.

Step 6.1: choose the coding strategy and convert the parameter set X and its domain into the bit-string structure space S;

In the automatic test data generation problem, each gene of a chromosome may belong to a different test data type. The present invention therefore encodes chromosomes with a mixture of integers and real numbers: the parameters of the problem are encoded as genes on the chromosome, the number of parameters becomes the chromosome length, and the interval of each parameter is mapped to the value range of the corresponding gene. It follows that chromosomes and solutions of the problem share the same space. The detailed process is as follows:

Suppose the problem to be solved has n input variables X_1, X_2, ..., X_n. First, the value domain of each parameter is processed by equivalence class partitioning and boundary value analysis, where Y_i (1 ≤ i ≤ n) denotes the finite set of discrete points that parameter X_i (1 ≤ i ≤ n) can take and |Y_i| denotes the size of the set. A mapping between the solution space and the chromosome space is established, and the chromosome is expressed as:

X = (X_1, X_2, \ldots, X_n) \rightarrow C = (C_1, C_2, \ldots, C_n) \qquad (2)

where C is the solution in the chromosome space and X is the solution in the problem space.

Step 6.2: design and select the genetic operations, including the population size and the selection, crossover and mutation methods, and determine genetic parameters such as the crossover probability p_c and the mutation probability p_m;

The elitist retention strategy guarantees that crossover and mutation do not destroy the best individual found so far, effectively improves the convergence speed, and is a strong precondition for the convergence of the genetic algorithm. On the other hand, although its results are good, it generally has difficulty reaching the optimal solution. The present invention replicates the population with a combination of rank-based selection and elitist retention: before elitist retention is applied, rank-based selection is used first to choose suitable individuals for crossover and mutation.

The rank-based selection strategy sorts all individuals in the population in ascending or descending order of fitness value and assigns selection probabilities on the basis of that order, specifically:

(a) sort all individuals in the population in descending or ascending order of fitness value;

(b) design a probability allocation table and assign a probability value to each individual according to its fitness value: in the table, as the fitness values of the individuals run from large to small, the probability values run from small to large;

(c) the probability that each individual is passed on to the next generation is determined by the probability value assigned in step (b); on the basis of these values, roulette wheel selection decides which chromosomes are eliminated and which are replicated.

One round of rank-based selection yields a new population, on which the elitist retention strategy is then carried out.
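Steps (a) to (c) can be sketched as follows. The linear ranking used for the allocation table is an assumption, since the text does not specify how the table is filled; the function names are hypothetical. Smaller fitness values are treated as better, in line with step (b).

```python
import random

def rank_probabilities(fitness_values):
    """Assumed linear ranking: the smaller the fitness, the larger the probability."""
    n = len(fitness_values)
    # Step (a): sort indices by fitness, largest (worst) first.
    order = sorted(range(n), key=lambda i: fitness_values[i], reverse=True)
    total = n * (n + 1) / 2
    probs = [0.0] * n
    # Step (b): the worst individual gets rank 1, the best gets rank n.
    for rank, idx in enumerate(order, start=1):
        probs[idx] = rank / total
    return probs

def roulette_select(population, probs, count, rng=random):
    """Step (c): roulette wheel selection with the rank-based probabilities."""
    return rng.choices(population, weights=probs, k=count)

population = ["T1", "T2", "T3"]
probs = rank_probabilities([4.952, 1.048, 3.0])
next_population = roulette_select(population, probs, count=3, rng=random.Random(0))
```

Ranking makes the selection pressure independent of the absolute fitness values, so one very poor individual does not distort the probabilities of the others.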

The elitist retention strategy proceeds as follows:

(a) from the new population obtained after rank-based selection (called the "current population"), determine the best individual and the worst individual according to the fitness value;

Replicating the population through these two strategies forms the final population.

Step 6.2.2: crossover and mutation operations

Following the idea of Srinivas, an adaptive crossover probability p_c and mutation probability p_m are adopted: they are adjusted automatically according to the average fitness of the population and the fitness of the best individual in the current population.

Let f_max denote the fitness of the best individual in a given generation and f_avg the average fitness of that generation; their difference is Δ = f_max - f_avg. A small Δ means that the fitness differences between individuals are small, which indicates that the population is more likely to have reached a local optimum and that premature convergence is more likely. A large Δ means that the fitness differences between individuals are large. The crossover probability p_c and the mutation probability p_m can therefore be determined by Δ, so that they adapt to the actual state of the population during evolution. When the population tends to converge, p_c and p_m are raised: the frequency of crossover and mutation increases, the current stability is broken, and the genetic algorithm gains stronger exploration ability and overcomes premature convergence. Conversely, when the individuals are dispersed, the frequency of crossover and mutation is lowered to strengthen exploitation, so that the individuals tend toward convergence. p_c and p_m are calculated as:

p_c = k_1 / \Delta \qquad (3)

p_m = k_2 / \Delta \qquad (4)

where k_1 and k_2 are the crossover probability adjustment coefficient and the mutation probability adjustment coefficient, respectively. To guard against unsuitable values of k_1 and k_2, correction bounds are defined for the crossover and mutation probabilities: when p_c is greater than the upper crossover correction value k_c1, p_c is set to k_c1; when p_c is less than the lower crossover correction value k_c2, p_c is set to k_c2; when p_m is greater than the upper mutation correction value k_m1, p_m is set to k_m1; when p_m is less than the lower mutation correction value k_m2, p_m is set to k_m2.
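Formulas (3) and (4) together with the correction bounds amount to a clamped division. The sketch below is illustrative only: the numeric coefficients and bounds are made-up values, and returning the upper bound when Δ is 0 is an assumption, since the text does not say how a fully converged population is handled.

```python
def adaptive_probability(k, delta, lower, upper):
    """Compute k / delta and clamp the result to [lower, upper]."""
    if delta <= 0:
        return upper  # assumption: treat a fully converged population as maximal need
    return min(upper, max(lower, k / delta))

# Hypothetical coefficients and bounds, chosen only for illustration.
K1, KC2, KC1 = 0.5, 0.4, 0.9    # crossover: coefficient, lower bound, upper bound
K2, KM2, KM1 = 0.01, 0.001, 0.1  # mutation: coefficient, lower bound, upper bound

delta = 0.2  # f_max - f_avg for the current generation
p_c = adaptive_probability(K1, delta, KC2, KC1)
p_m = adaptive_probability(K2, delta, KM2, KM1)
```

With these illustrative numbers, the raw crossover value 0.5 / 0.2 = 2.5 exceeds the upper bound and is clamped to 0.9, while the mutation value 0.01 / 0.2 = 0.05 lies inside its bounds and is used as is.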

Step 6.3: randomly initialise and generate the population P;

Step 6.4: calculate the fitness value of each individual in the population P after its bit string has been decoded;

Step 6.5: according to the genetic strategy, apply the genetic operations designed in step 6.2 to the population; selection, crossover and mutation produce a new generation of the population;

Step 6.6: return to step 5 with the newly produced chromosomes (that is, test data), calculate their fitness values, and check whether their performance meets the target or whether the predetermined number of iterations has been completed. If the target is not met and the iterations are not complete, go to step 6.1: starting from the encoding operation, the genetic algorithm again applies selection (replication), crossover and mutation to the new population, iterating continuously. If the target is met or the iterations are complete, go directly to step 7.

Step 7: the run ends and suitable test data is obtained.

Claims

1. A method for automatically generating MC/DC test data based on a genetic algorithm, characterised by comprising the following steps:

Step 1: perform static analysis of the program under test to produce a control flow graph, a data flow graph, an abstract syntax tree and an abstract analysis tree;

Step 2: generate the set of expected MC/DC test case results;

Step 2.1: extract the condition nodes from the abstract analysis tree; these condition nodes form the leaves of the abstract analysis tree, each leaf being a variable, and N denotes the number of leaf variables extracted;

Step 2.2: build a truth table; N leaf variables have 2^N combinations;

Step 2.3: fill the truth table with the digits 0 and 1;

Step 2.4: for each row of the truth table:

Step 2.4.3: fill the output column of the truth table with the output value;

Step 2.5.3: compare the output values of the two rows in step 2.5.2; if they differ, the two rows are a pair of test cases for the target leaf variable and are added, in paired form, to the MC/DC test case set created in step 2.5.1;

Step 2.6: merge the MC/DC test case sets of all leaf variables to obtain the overall test case set, and minimise it;

Step 3: instrument the code of the program under test;

First traverse the abstract syntax tree to locate candidate instrumentation positions and check whether each is an instrumentation point. If it is not, continue traversing the abstract syntax tree. If it is, implant a probe statement: the code is inserted directly into the abstract syntax tree as a syntax tree fragment in the form of a subtree. Then check whether instrumentation is complete. If so, compile and run the program under test; if not, continue traversing the abstract syntax tree until instrumentation is complete;

Step 4: construct the fitness function;

Step 4.1: establish the control dependence fitness function:

Step 4.1.1: obtain the control dependence set of each decision from the control flow graph of the program under test. Control dependence describes how the execution of a target node y depends on the outcomes of the branch nodes that precede it. When every path from the target node y to the exit node e contains node z, node z is said to post-dominate y. For an arbitrary node x such that the two nodes form a branch (y, x), when every path from y to the exit node e that passes through the branch (y, x) contains node z, node z is said to post-dominate the branch (y, x). When node z post-dominates a branch of y but does not post-dominate node y itself, node z is said to be control dependent on the target node y. Control dependence measures, from the structural point of view, how close the current input test data is to reaching the target, and is calculated from the critical nodes that lie in the control flow graph between the target and the node at which execution diverged;

Step 4.1.2: establish the control dependence fitness function:

ControlDepFit_{testdata} = dependent_{decisions} - executed_{decisions} \qquad (1)

Step 4.2: establish the data dependence fitness function:

Step 4.2.4: establish the dependence set used to define the fitness function:

(a) add s_i to DepFit;

Step 4.2.5: establish the data dependence fitness function:

Once the data dependence set has been built, the data dependence fitness function is computed in the same way as the control dependence fitness function;

Step 4.3: establish the branch fitness function:

Once test data reaches the target, the branch distance measures whether the test data satisfies the expected test case, that is, how close the test data is to the expected test case. If the test data reaches the target decision but satisfies no MC/DC test case, its approach level is 0 but its branch distance is not 0. If the test data reaches the target and achieves a test case, both its branch distance and its approach level are 0. The size of the computed branch distance is used to assess which test data is closer to satisfying the expected test case;

Step 5: on the basis of the MC/DC test cases, the fitness function formula and the instrumented code, generate test data at random, run the instrumented program under test on this test data to obtain fitness values, and check whether the expected path is executed. If so, go to step 7; otherwise go to step 6;

Step 5.3: for each target decision, randomly generate test data T_a and test data T_b;

Step 5.4: using the computing method of step 4.1 and the dependence sets, calculate the control dependence fitness function and the data dependence fitness function of test data T_a and of test data T_b;

BranchFitness(T_a)_{normalised} = \frac{BranchFitness(T_a)}{BranchFitness(T_a) + BranchFitness(T_b)}

where BranchFitness(T_a) is the branch distance of test data T_a and BranchFitness(T_b) is the branch distance of test data T_b;

Step 5.7: the total fitness function value Fitness(T) is:

Fitness(T) = ApproachLevelFitness(T) + BranchFitness(T)_{normalised}

where BranchFitness(T)_{normalised} is the normalised branch distance of test data T;

Step 5.8: compare, according to the total fitness function values, which of the two test data is closer to reaching the target decision;

Step 5.9:

Step 6: according to the fitness values obtained, generate new test data with the genetic operations of the genetic algorithm, such as selection, crossover and mutation, and return to step 5 to calculate the fitness values of the newly generated test data;

X = (X_1, X_2, \ldots, X_n) \rightarrow C = (C_1, C_2, \ldots, C_n) \qquad (2)

The rank-based selection strategy proceeds as follows:

(c) keep the best individual found so far unchanged and pass it intact to the population of the next generation;

Step 6.2.2: crossover and mutation operations

p_c = k_1 / \Delta \qquad (3)

p_m = k_2 / \Delta \qquad (4)

where k_1 and k_2 are the crossover probability adjustment coefficient and the mutation probability adjustment coefficient, respectively;

Step 6.3: randomly initialise and generate the population P;

Step 6.4: calculate the fitness value of each individual in the population P after its bit string has been decoded;

Step 6.5: according to the genetic strategy, apply the genetic operations designed in step 6.2 to the population; selection, crossover and mutation produce a new generation of the population;

Step 6.6: return to step 5 with the newly produced chromosomes, that is, test data, calculate their fitness values, and check whether their performance meets the target or whether the predetermined number of iterations has been completed. If the target is not met and the iterations are not complete, go to step 6.1: starting from the encoding operation, the genetic algorithm again applies selection (replication), crossover and mutation to the new population, iterating continuously. If the target is met or the iterations are complete, go directly to step 7;

Step 7: the run ends and suitable test data is obtained.