CN102323906B - MC/DC test data automatic generation method based on genetic algorithm - Google Patents

Info

Publication number
CN102323906B
Authority
CN
China
Prior art keywords
dependence
test data
node
fitness function
fitness
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110265194.4A
Other languages
Chinese (zh)
Other versions
CN102323906A (en)
Inventor
高峰
刘厂
赵玉新
李刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention discloses an MC/DC (Modified Condition/Decision Coverage) test data automatic generation method based on a genetic algorithm, comprising the following steps: statically analyzing the tested program to generate a control flow graph, a data flow graph, an abstract syntax tree and an abstract analysis tree; generating the expected result set of MC/DC test cases; instrumenting the code of the tested program; constructing a fitness function; randomly generating test data and checking whether the test data follows the expected execution path; and obtaining suitable test data through the genetic operations of the genetic algorithm, such as selection, crossover and mutation. For the construction of the fitness function, the method follows the idea of the chaining method: starting from the conventional fitness function, it uses data dependence to find the control nodes that directly or indirectly influence whether the problem node is traversed, and thereby improves the approach-level fitness evaluation. The method has considerable practical value for testing systems with complex logical relations.

Description

An MC/DC test data automatic generation method based on a genetic algorithm
Technical field
The invention belongs to the technical field of software testing and specifically relates to an MC/DC test data automatic generation method based on a genetic algorithm, in particular to the automatic generation of test data that satisfies MC/DC (Modified Condition/Decision Coverage) in engineering testing.
Background art
Automatic test data generation means constructing test input data automatically, by a specific algorithm, from the specification or the program structure of the software. Its purpose is to relieve testers of a large amount of work, to reduce the high cost of manual testing, and at the same time to improve confidence in the test process. A genetic algorithm is an adaptive artificial intelligence technique that solves extremum problems by simulating the process and mechanisms of biological evolution, and it has distinctive advantages for highly complex problems with large search spaces and nonlinearity.
The idea of generating test data by dynamically executing the program was first proposed by Miller and Spooner, and later scholars carried out much further research. In 1992, Xanthakis was the first to apply a genetic algorithm to test case generation: an encoding maps the domain D to a gene space G, and the search direction is determined by genetic operations such as selection, crossover and mutation together with survival of the fittest. Crossover and mutation introduce new information into the population, which makes it easier to find the global optimum and avoids the tendency of earlier algorithms to become trapped in local extrema, so that test data satisfying the all-branches criterion can be generated. In China, from 1996 to 1998, Jia Wei, Gao Zhongyi and others published a number of papers on applying genetic algorithms to the automatic generation of path-coverage-based structural test data for Ada software, and concluded that genetic algorithms generate test data more efficiently than hill climbing and random testing. In 2001, Wang Hao and others gave a formal description of the genetic algorithm and a prototype test data generation system based on it. In 2003, Jing Zhiyuan analyzed, from a mathematical point of view, the application of improved algorithms such as MGA and genetic K-means to automatic test case generation. In 2006, Cheng Ye applied a genetic algorithm to automatic path-coverage test data generation in a degree thesis and developed a corresponding tool prototype. In 2008, Lv Shanshan used artificial neural networks and genetic algorithms in her degree thesis to generate test data based on the input domain. In general, researchers in China and abroad now mostly use a genetic algorithm or an improved genetic algorithm as the core algorithm for test data generation. However, all of the above work generates test cases for statement coverage or branch coverage, whose logical structure is comparatively simple, and it cannot be widely applied in practical engineering.
Summary of the invention
To address the problems in the prior art, the present invention proposes an MC/DC test data automatic generation method based on a genetic algorithm. The traditional fitness function in genetic algorithms relies only on the control dependences between nodes of the control flow graph and does not consider the data dependences inside the tested program. To remedy this defect, the invention uses the idea of the chaining method to collect the control nodes that influence the traversal of the problem node, either directly or indirectly through data dependence, and combines them with the traditional fitness function, which is good at characterizing control dependence, to construct a new fitness function. This overcomes the lack of guidance information, and the resulting degeneration of the search, that the traditional fitness function suffers from because it ignores data dependences inside the program. On this basis, automatic generation of test data satisfying the MC/DC criterion based on a genetic algorithm is designed and implemented. The MC/DC criterion emphasizes the effect of the truth value of each single variable on the final expression and requires every condition to independently affect the outcome of the decision expression; it has great practical value when testing software systems with complex logical relations and high safety requirements.
The MC/DC test data automatic generation method based on a genetic algorithm proposed by the present invention comprises the following steps:
Step 1: Perform static analysis on the tested program to produce the control flow graph, the data flow graph, the abstract syntax tree and the abstract analysis tree.
Step 2: Generate the expected result set of MC/DC test cases.
Step 2.1: Extract the condition nodes from the abstract analysis tree. These condition nodes form the leaves of the abstract analysis tree, and each leaf is a variable; let N denote the number of leaf variables extracted.
Step 2.2: Build the truth table; N leaf variables give $2^N$ combinations.
Step 2.3: Fill the truth table with the digits 0 and 1.
Step 2.4: For each row of the truth table:
Step 2.4.1: Assign the Boolean values in the current row of the truth table to the corresponding leaf variables of the abstract analysis tree;
Step 2.4.2: Evaluate the Boolean value of each condition node bottom-up until the top of the abstract analysis tree is reached; the Boolean value finally obtained for the top condition node is the output value of the decision statement;
Step 2.4.3: Fill the output column of the truth table with the output value.
Step 2.5: For each leaf variable, find its test cases and add them to the MC/DC test case set of that leaf variable:
Step 2.5.1: Create an empty MC/DC test case set of pairs for each leaf variable;
Step 2.5.2: In the truth table, find two rows in which the values of all other leaf variables are fixed and only the target leaf variable changes;
Step 2.5.3: Compare the output values of the two rows from Step 2.5.2; if the output values differ, the two rows are a pair of test cases for the target leaf variable, and they are added as a pair to the MC/DC test case set created in Step 2.5.1.
Step 2.6: Merge the MC/DC test case sets of all leaf variables to obtain the test case set, and minimize it.
Step 3: Instrument the code of the tested program.
First, traverse the abstract syntax tree to find a candidate instrumentation position and decide whether it is an instrumentation point. If it is not, continue traversing the abstract syntax tree to find the next candidate position. If it is, insert a probe statement by implanting the syntax tree fragment of the probe code directly into the abstract syntax tree as a subtree. Then check whether instrumentation is complete: if so, compile and run the tested program; if not, continue traversing the abstract syntax tree for further instrumentation positions until instrumentation is complete.
Step 4: Construct the fitness function.
Step 4.1: Establish the control dependence fitness function:
Step 4.1.1: Obtain the control dependence set of each decision from the control flow graph of the tested program. Control dependence describes how the execution of a target node depends on the outcomes of the branch nodes that precede it. If every path from a node y to the exit node e contains node z, then z post-dominates y. For an arbitrary node x, the two nodes (y, x) form a branch; if every path from y to the exit node e that passes through branch (y, x) contains node z, then z post-dominates branch (y, x). If node z post-dominates a branch of y but does not post-dominate y itself, then z is control dependent on y. Control dependence measures, from the point of view of program structure, how close the current input test data comes to reaching the target; it is computed from the critical nodes in the control flow graph that lie between the target and the node at which execution diverged.
Step 4.1.2: Establish the control dependence fitness function:
The control dependence fitness function covers the objective of every branch node in the control dependence set. It is defined as:
$$\mathit{ControlDepFit}_{\mathit{testdata}} = \mathit{dependent}_{\mathit{decisions}} - \mathit{executed}_{\mathit{decisions}} \qquad (1)$$
where $\mathit{dependent}_{\mathit{decisions}}$ is the number of control nodes in the control dependence set of the target; $\mathit{executed}_{\mathit{decisions}}$ is the number of those control nodes executed when the current test data is used as input; and $\mathit{ControlDepFit}_{\mathit{testdata}}$ is the value of the control dependence fitness function.
If the value of the control dependence fitness function is 0, the test data reaches the target decision. If the value is greater than 0, the test data diverged from the path to the target node at some point, and the value of the control dependence fitness function identifies the node at which execution diverged.
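Formula (1) can be illustrated with a minimal Python sketch. It assumes, hypothetically, that the instrumented program reports which control nodes were executed for a given input; the function name `control_dep_fit` and the node names `d1` to `d3` are illustrative and do not come from the patent.

```python
def control_dep_fit(dependent_decisions, executed_decisions):
    """Formula (1): control nodes the target depends on, minus those executed."""
    dependent = set(dependent_decisions)
    # Only control nodes in the target's control dependence set are counted.
    executed = set(executed_decisions) & dependent
    return len(dependent) - len(executed)


# Hypothetical target that is control dependent on three decisions.
dependent = ["d1", "d2", "d3"]

reached = control_dep_fit(dependent, ["d1", "d2", "d3"])  # every decision executed
diverged = control_dep_fit(dependent, ["d1"])             # execution left the path after d1
```

A value of 0 corresponds to test data that reaches the target decision; a positive value counts the control nodes still separating the test data from the target.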
Step 4.2: Establish the data dependence fitness function:
Let pn be the problem node and S the set that stores the dependences of a given node; pn and S are the inputs to the procedure that obtains the data dependence fitness function. DepSets stores the sets of nodes with dependence relations collected so far, and PV stores the variables used by the problem node. S and DepSets are initialized to the empty set. The dependence fitness function of a node is obtained as follows:
Step 4.2.1: Obtain the control dependence set ControlDep(pn) of the problem node pn and assign it to S, and add the variables UsedVariables(pn) used by pn to PV;
Step 4.2.2: For each variable pv in PV, obtain the set lastDefs of last definitions of pv;
(a) for each last definition ld in lastDefs, obtain the control dependence set ControlDep(ld) of ld and merge it with S to obtain the extended S, then add the variables UsedVariables(ld) used by ld to a newly created set PVnew;
(b) for each control node cd in ControlDep(ld), obtain the variables UsedVariables(cd) that it uses and add them to the set PVnew created in Step 4.2.2(a);
Step 4.2.3: Iterate over the variables in PV to obtain the dependence set S(ld) of each: take the first variable in PV and obtain its dependence set S(ld) by applying Step 4.2.2, then take the next variable in PV, and so on until all variables in PV have been processed; then end the iteration and add S(ld) to DepSets;
Step 4.2.4: Build the dependence sets used to define the fitness function:
Let DepFit be the collection of dependence sets for the fitness function. For each subset $s_i$ in DepSets:
(a) add $s_i$ to DepFit;
(b) for each other subset $s_j$ in DepSets ($j \neq i$), check whether $s_j$ conflicts with $s_i$ on a then/else branch. If there is no conflict, merge the two subsets $s_i$ and $s_j$ into a new subset $S_{i,j}$ and add $S_{i,j}$ to DepFit; if there is a then/else conflict, the subsets cannot be merged, and both $s_i$ and $s_j$ are added to DepFit;
Step 4.2.5: Establish the data dependence fitness function:
Once the data dependence sets have been built, the data dependence fitness function is obtained by the same method used to compute the control dependence fitness function.
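Steps 4.2.1 to 4.2.4 can be sketched in Python on a toy program model. Everything about the model is an assumption made for illustration: the program is described by three hand-written dictionaries (`control_dep`, `used_vars`, `last_defs`), control nodes are stored as `(decision, outcome)` pairs so that a then/else conflict can be detected, and the function names are hypothetical. A real implementation would derive these tables from the control flow graph and the data flow graph.

```python
def dependence_set(start_var, control_dep, used_vars, last_defs):
    """Steps 4.2.2-4.2.3: follow one variable back through its last definitions."""
    deps, pending, seen = set(), [start_var], set()
    while pending:
        pv = pending.pop()
        if pv in seen:
            continue
        seen.add(pv)
        for ld in last_defs.get(pv, ()):
            deps |= control_dep.get(ld, set())
            # PVnew: variables used by the definition and by its control nodes.
            pending.extend(used_vars.get(ld, ()))
            for decision, _outcome in control_dep.get(ld, ()):
                pending.extend(used_vars.get(decision, ()))
    return frozenset(deps)


def conflicts(s_i, s_j):
    """Then/else conflict: the same decision is required with opposite outcomes."""
    wanted = dict(s_i)
    return any(d in wanted and wanted[d] != o for d, o in s_j)


def build_dep_fit(pn, control_dep, used_vars, last_defs):
    """Steps 4.2.1 and 4.2.4: S for the problem node plus the merged sets."""
    s = frozenset(control_dep.get(pn, set()))
    dep_sets = [dependence_set(v, control_dep, used_vars, last_defs)
                for v in sorted(used_vars.get(pn, ()))]
    dep_fit = set()
    for i, s_i in enumerate(dep_sets):
        dep_fit.add(s_i)
        for j, s_j in enumerate(dep_sets):
            if i == j:
                continue
            if conflicts(s_i, s_j):
                dep_fit.add(s_j)          # cannot merge: keep both subsets
            else:
                dep_fit.add(s_i | s_j)    # merged subset S_{i,j}
    return s, dep_fit


# Toy program (hypothetical):
#   n1: if a > 0:   n2: flag = True
#   n3: if b > 0:   n4: count = b
#   n5: if flag and count > 3:  ...      <- problem node pn
control_dep = {"n2": {("n1", True)}, "n4": {("n3", True)}}
used_vars = {"n1": {"a"}, "n3": {"b"}, "n4": {"b"}, "n5": {"flag", "count"}}
last_defs = {"flag": {"n2"}, "count": {"n4"}}

S, DEP_FIT = build_dep_fit("n5", control_dep, used_vars, last_defs)
```

In this toy case the problem node has no control dependences of its own, so the useful guidance comes entirely from the data dependence sets: the search learns that decisions n1 and n3 must both take their true branch.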
Step 4.3: Establish the branch fitness function:
Once the test data reaches the target, the branch distance measures whether the test data satisfies an expected test case, that is, how close the test data is to the expected test case. If the test data reaches the target decision but satisfies none of the MC/DC test cases, its approach level is 0 but its branch distance is not 0. If the test data reaches the target and realizes a test case, both its branch distance and its approach level are 0. The computed branch distance is used to judge which test data is closer to satisfying the expected branch test case.
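The text does not give a formula for the branch distance. The Python sketch below therefore assumes the conventional distance for a relational predicate `lhs > rhs` (0 when the wanted truth value holds, otherwise the numeric gap, plus a constant `K` for a strict inequality) and sums it over the three conditions of the example decision `x > y && x > z || x > y + z`. The names `gt_distance`, `branch_distance` and `K` are illustrative.

```python
K = 1.0  # assumed positive offset for an unsatisfied strict inequality


def gt_distance(lhs, rhs, want_true):
    """Distance of `lhs > rhs` from the wanted truth value (0 means satisfied)."""
    if want_true:
        return 0.0 if lhs > rhs else (rhs - lhs) + K
    return 0.0 if lhs <= rhs else float(lhs - rhs)


def branch_distance(x, y, z, expected):
    """Distance of input (x, y, z) from an expected truth-table row (A, B, C)."""
    conditions = [(x, y), (x, z), (x, y + z)]  # A: x > y, B: x > z, C: x > y + z
    return sum(gt_distance(lhs, rhs, bool(want))
               for (lhs, rhs), want in zip(conditions, expected))


on_target = branch_distance(3, 1, 2, (1, 1, 0))  # realizes row 110
near_miss = branch_distance(5, 1, 2, (1, 1, 0))  # C is true but should be false
far_away = branch_distance(0, 1, 2, (1, 1, 0))   # A and B are both false
```

All three inputs reach the decision, so their approach level is 0; only the branch distance tells the search that the second input is closer to row 110 than the third.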
Step 5: On the basis of the MC/DC test cases, the fitness function formulas and the instrumented code, randomly generate test data, run the instrumented tested program on this test data, obtain the fitness values, and at the same time check whether the expected execution path is followed. If it is, go to Step 7; otherwise go to Step 6.
Compute the approach-level fitness ApproachLevelFitness; if ApproachLevelFitness is 0, the test data reaches the target. Then compute the branch distance BranchFitness at the target decision. If ApproachLevelFitness and BranchFitness are both 0, the test data achieves the MC/DC test target; otherwise the fitness value of the test data equals ApproachLevelFitness + normalized(BranchFitness), where normalized(BranchFitness) denotes the normalized branch distance BranchFitness. The detailed computation is:
Step 5.1: Generate the expected result set of MC/DC test cases by the method of Step 2.
Step 5.2: Obtain the dependence sets, comprising the control dependence sets and the data dependence sets, by the methods of Step 4.1 and Step 4.2.
Step 5.3: For each target decision, randomly generate test data $T_a$ and test data $T_b$.
Step 5.4: Using the method of Step 4.1 and the dependence sets, compute the control dependence fitness function and the data dependence fitness function of $T_a$ and of $T_b$.
Step 5.5: Using the method of Step 4.3, compute the branch distance of $T_a$ and of $T_b$.
Step 5.6: Normalize the branch distance to obtain $\mathrm{BranchFitness}(T_a)_{\mathrm{normalized}}$, computed as:
$$\mathrm{BranchFitness}(T_a)_{\mathrm{normalized}} = \frac{\mathrm{BranchFitness}(T_a)}{\mathrm{BranchFitness}(T_a) + \mathrm{BranchFitness}(T_b)}$$
where $\mathrm{BranchFitness}(T_a)$ is the branch distance of test data $T_a$ and $\mathrm{BranchFitness}(T_b)$ is the branch distance of test data $T_b$.
Step 5.7: The total fitness value Fitness(T) is:
$$\mathrm{Fitness}(T) = \mathrm{ApproachLevelFitness}(T) + \mathrm{BranchFitness}(T)_{\mathrm{normalized}}$$
where ApproachLevelFitness(T) is the approach-level fitness of test data T and $\mathrm{BranchFitness}(T)_{\mathrm{normalized}}$ is the normalized branch distance of test data T.
Step 5.8: Use the total fitness values to compare which of the two test data is closer to reaching the decision target.
Step 5.9: If ApproachLevelFitness and BranchFitness are both 0, the test data achieves the MC/DC test target and the method goes to Step 7; otherwise it goes to Step 6.
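Steps 5.6 to 5.8 amount to a few lines of arithmetic, sketched here in Python. The function names are illustrative, and the handling of the case where both branch distances are 0 (the denominator would vanish) is an assumption, since the text does not cover it.

```python
def normalize(bd_self, bd_other):
    """Step 5.6: branch distance of one test datum relative to the pair."""
    total = bd_self + bd_other
    # Assumption: if both distances are 0 the normalized value is taken as 0.
    return bd_self / total if total else 0.0


def total_fitness(approach_level, bd_self, bd_other):
    """Step 5.7: approach level plus normalized branch distance."""
    return approach_level + normalize(bd_self, bd_other)


# Hypothetical measurements for two test data T_a and T_b.
fit_a = total_fitness(0, 2.0, 6.0)  # T_a reaches the decision, distance 2
fit_b = total_fitness(1, 6.0, 2.0)  # T_b misses one control node, distance 6
closer = "T_a" if fit_a < fit_b else "T_b"  # Step 5.8: smaller is closer
```

Because the normalized branch distance always lies in [0, 1], it can only break ties between test data with the same approach level; a difference of one control node always dominates.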
Step 6: According to the fitness values obtained, apply the genetic operations of the genetic algorithm, such as selection, crossover and mutation, to generate new test data, then return to Step 5 and compute the fitness values of the newly generated test data.
Step 6.1: Choose the encoding strategy, converting the parameter set X and its domain into the bit-string structure space S.
Chromosomes are encoded with a mixture of integers and reals: the parameters of the problem are encoded as genes on the chromosome, the number of parameters becomes the chromosome length, and the value interval of each parameter is mapped to the value range of the corresponding gene. The detailed process is as follows:
Suppose the problem to be solved has n input variables $X_1, X_2, \ldots, X_n$. First, the value domain of each parameter is processed by equivalence class partitioning and boundary value analysis, where $Y_i$ ($1 \le i \le n$) denotes the finite set of discrete points that parameter $X_i$ ($1 \le i \le n$) can take and $|Y_i|$ denotes the size of that set. The mapping between the solution space and the chromosome space is then established, and the chromosome is expressed as:
$$X = (X_1, X_2, \ldots, X_n) \rightarrow C = (C_1, C_2, \ldots, C_n) \qquad (2)$$
where C is a solution in the chromosome space and X is a solution in the problem space;
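One possible reading of mapping (2) is sketched below in Python: each gene $C_i$ is an index into the discrete value set $Y_i$, so its range is 0 to $|Y_i| - 1$. This index-based representation, the sample value sets and the function names are assumptions for illustration; the text only states that the parameter intervals are mapped to gene ranges.

```python
import random

# Hypothetical value sets Y_1..Y_3, e.g. from equivalence classes and boundaries.
Y = [
    [-10, -1, 0, 1, 10],  # Y_1: integer parameter X_1
    [0.0, 0.5, 1.0],      # Y_2: real parameter X_2
    [1, 100],             # Y_3: integer parameter X_3
]


def encode(values, value_sets):
    """X -> C: replace each parameter value by its index in Y_i."""
    return tuple(vs.index(v) for v, vs in zip(values, value_sets))


def decode(chromosome, value_sets):
    """C -> X: look each gene up in its value set."""
    return tuple(vs[gene] for gene, vs in zip(chromosome, value_sets))


def random_chromosome(value_sets, rng):
    """Draw each gene uniformly from its range 0 .. |Y_i| - 1."""
    return tuple(rng.randrange(len(vs)) for vs in value_sets)


chromosome = encode((-10, 1.0, 100), Y)
inputs = decode(chromosome, Y)
sample = random_chromosome(Y, random.Random(0))
```

The chromosome length equals the number of parameters, and crossover or mutation on the indices can never produce a value outside $Y_i$ as long as each gene stays within its range.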
Step 6.2: Design and select the genetic operations, including the population size and the selection, crossover and mutation methods, and determine genetic parameters such as the crossover probability $p_c$ and the mutation probability $p_m$.
Step 6.2.1: Apply the rank-based selection strategy and the elitist retention strategy:
The rank-based selection strategy proceeds as follows:
(a) sort all individuals in the population in descending order of fitness;
(b) design a probability allocation table and, according to fitness, assign each individual its probability value in ascending order;
(c) the probability with which each individual is passed on to the next generation is the probability value assigned in step (b); based on these probability values, roulette wheel selection determines which chromosomes are eliminated and which are copied.
One round of rank-based selection yields a new population. The elitist retention strategy is then applied to this new population as follows:
(a) according to the fitness values, find the best individual and the worst individual in the current population, that is, the new population obtained after rank-based selection;
(b) if the fitness of the best individual of the current population is higher than that of the best individual found so far, replace the best individual found so far with the best individual of the current population;
(c) keep the best individual found so far unchanged and pass it intact into the next generation.
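The two strategies can be sketched in Python as follows. The sketch treats a higher fitness as better, matching the wording of this step; a distance-style fitness such as the one in Step 5, where 0 is best, would be passed in negated. The linear rank weights and the choice to overwrite the worst individual with the elite are assumptions, since the text does not fix the allocation table or say which individual gives up its place.

```python
import random


def rank_select(population, fitness, rng):
    """Rank-based roulette wheel selection; returns a population of equal size."""
    ranked = sorted(population, key=fitness, reverse=True)  # (a) best first
    n = len(ranked)
    weights = [n - i for i in range(n)]  # (b) assumed linear table: n, n-1, ..., 1
    # (c) roulette wheel: draw n individuals with probability proportional to weight
    return rng.choices(ranked, weights=weights, k=n)


def keep_elite(new_population, fitness, best_so_far):
    """Elitist retention: carry the best individual found so far into the population."""
    best_now = max(new_population, key=fitness)
    if best_so_far is None or fitness(best_now) > fitness(best_so_far):
        best_so_far = best_now
    survivors = list(new_population)
    # Assumption: the elite takes the place of the worst individual.
    worst = min(range(len(survivors)), key=lambda i: fitness(survivors[i]))
    survivors[worst] = best_so_far
    return survivors, best_so_far


rng = random.Random(0)
population = [1, 5, 3, 9, 7]            # toy individuals whose fitness is their value
selected = rank_select(population, lambda v: v, rng)
survivors, best = keep_elite(selected, lambda v: v, best_so_far=9)
```

Rank-based weights depend only on the ordering of the individuals, not on the size of the fitness gaps, which keeps one very fit individual from taking over the roulette wheel.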
Step 6.2.2: Crossover and mutation operations:
An adaptive crossover probability $p_c$ and mutation probability $p_m$ are used: $p_c$ and $p_m$ are adjusted automatically according to the average fitness of the population and the fitness of the best individual in the current population. Let $f_{max}$ denote the fitness of the best individual in a given generation and $F_{avg}$ the average fitness of the population in that generation; their difference is $\Delta = f_{max} - F_{avg}$. A small $\Delta$ means that the fitness differences between individuals are small, so the population is more likely to have reached a local optimum and premature convergence is more likely. A large $\Delta$ means that the fitness differences between individuals are large. The crossover probability $p_c$ and the mutation probability $p_m$ are determined by $\Delta$:
$$p_c = k_1 / \Delta \qquad (3)$$
$$p_m = k_2 / \Delta \qquad (4)$$
where $k_1$ and $k_2$ are the crossover probability adjustment coefficient and the mutation probability adjustment coefficient, respectively.
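Formulas (3) and (4) are sketched below in Python. Two guards are added as assumptions because the formulas alone are undefined or exceed 1 in edge cases: $\Delta$ is bounded below by a small `eps` so that a fully converged population does not divide by zero, and both results are clamped to 1.0 so that they remain valid probabilities. The coefficient values are illustrative.

```python
def adaptive_probabilities(fitness_values, k1, k2, eps=1e-9):
    """Formulas (3) and (4): p_c = k1 / delta, p_m = k2 / delta."""
    f_max = max(fitness_values)
    f_avg = sum(fitness_values) / len(fitness_values)
    delta = max(f_max - f_avg, eps)  # assumed guard against division by zero
    p_c = min(1.0, k1 / delta)       # assumed clamp to a valid probability
    p_m = min(1.0, k2 / delta)
    return p_c, p_m


diverse = adaptive_probabilities([0, 4, 8], k1=2.0, k2=0.5)    # delta = 4
converged = adaptive_probabilities([5, 5, 5], k1=2.0, k2=0.5)  # delta = 0
```

A converged population yields the maximum crossover and mutation rates, which injects new genetic material exactly when premature convergence is most likely.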
Step 6.3: Randomly initialize the population P.
Step 6.4: Compute the fitness value of each individual bit string in population P after decoding.
Step 6.5: According to the genetic strategy, apply the genetic operations designed in Step 6.2 to the population; selection, crossover and mutation produce the new generation.
Step 6.6: Take the newly produced chromosomes as test data and return to Step 5 to compute their fitness values, then check whether they meet the target or whether the predetermined number of iterations has been completed. If neither holds, go to Step 6.1: the genetic algorithm starts again from the encoding operation and applies selection and copying, crossover and mutation to the new generation, iterating continuously. If the target is met or the iterations are complete, go directly to Step 7.
Step 7: The run ends and suitable test data is obtained.
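The loop formed by Steps 5 to 7 can be sketched in Python for a single decision, `x > y && x > z || x > y + z`, and one expected truth-table row. This is a simplified illustration under several assumptions, not the patented procedure: the approach level is taken as 0 because there is only one decision, the branch distance uses a conventional relational-predicate formula, genes are plain integers, crossover is uniform, mutation adds a small random offset, probabilities are fixed rather than adaptive, and the elite simply overwrites the first individual. All names and parameter values are hypothetical.

```python
import random

K = 1.0  # assumed offset for an unsatisfied strict inequality


def gt_distance(lhs, rhs, want_true):
    if want_true:
        return 0.0 if lhs > rhs else (rhs - lhs) + K
    return 0.0 if lhs <= rhs else float(lhs - rhs)


def distance(ind, expected):
    x, y, z = ind
    pairs = [(x, y), (x, z), (x, y + z)]  # conditions A, B, C
    return sum(gt_distance(l, r, bool(w)) for (l, r), w in zip(pairs, expected))


def evolve(expected, rng, pop_size=30, generations=200,
           p_c=0.8, p_m=0.2, low=-20, high=20):
    """Search for an input (x, y, z) that realizes one expected row (A, B, C)."""
    pop = [tuple(rng.randint(low, high) for _ in range(3)) for _ in range(pop_size)]
    best = min(pop, key=lambda ind: distance(ind, expected))
    for _ in range(generations):
        if distance(best, expected) == 0:  # Step 7: expected test case realized
            break
        ranked = sorted(pop, key=lambda ind: distance(ind, expected))
        weights = [pop_size - i for i in range(pop_size)]  # rank-based roulette
        parents = rng.choices(ranked, weights=weights, k=pop_size)
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            if rng.random() < p_c:  # uniform crossover
                mask = [rng.random() < 0.5 for _ in range(3)]
                a, b = (tuple(u if m else v for u, v, m in zip(a, b, mask)),
                        tuple(v if m else u for u, v, m in zip(a, b, mask)))
            children += [a, b]
        pop = [tuple(min(high, max(low, g + rng.randint(-3, 3)))
                     if rng.random() < p_m else g for g in child)
               for child in children]
        pop[0] = best  # elitism: the best individual so far always survives
        best = min(pop, key=lambda ind: distance(ind, expected))
    return best


EXPECTED = (0, 1, 0)  # A false, B true, C false
best = evolve(EXPECTED, random.Random(1))
```

Running the search once per row of the minimized MC/DC test case set would yield one concrete input per expected test case.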
The advantages of the present invention are:
1. The traditional fitness function in genetic algorithms relies only on the control dependences between nodes of the control flow graph and does not consider the data dependences inside the tested program. The proposed method uses the idea of the chaining method to collect the control nodes that influence the traversal of the problem node, either directly or indirectly through data dependence, and combines them with the traditional fitness function, which is good at characterizing control dependence, to construct a new fitness function. This overcomes the lack of guidance information, and the resulting degeneration of the search, caused by ignoring the data dependences inside the program.
2. The proposed method realizes genetic-algorithm-based automatic generation of test data that satisfies the MC/DC criterion, which has great practical value when testing software systems with complex logical relations and high safety requirements.
Description of drawings
Fig. 1: flowchart of the MC/DC test data automatic generation method based on a genetic algorithm proposed by the present invention;
Fig. 2: flowchart of the generation of the abstract syntax tree in the present invention;
Fig. 3: flowchart of the generation of the abstract analysis tree in the present invention;
Fig. 4: flowchart of the generation of the expected result set of MC/DC test cases in the present invention;
Fig. 5: flowchart of code instrumentation in the present invention;
Fig. 6: a program fragment used to establish the control dependence fitness function in the present invention;
Fig. 7: basic flowchart of the genetic algorithm in the present invention.
Detailed description of embodiments
The present invention is described in detail below with reference to the accompanying drawings.
The MC/DC test data automatic generation method based on a genetic algorithm proposed by the present invention comprises key parts such as generating the abstract syntax tree, generating the abstract analysis tree, generating the expected result set of MC/DC test cases, code instrumentation, constructing the fitness function and designing the genetic algorithm. The overall flow is shown in Fig. 1 and comprises the following steps:
Step 1: Perform static analysis on the tested program to produce information such as the control flow graph, the data flow graph, the abstract syntax tree and the abstract analysis tree. The tested program is the software code to be tested.
An analysis tool (such as the testing tool Testbed) performs lexical analysis, syntax analysis and semantic analysis on the tested program and generates the abstract syntax tree. The abstract syntax tree is another representation of the source program that a compiler obtains after syntax analysis. Each grammar rule has a corresponding processing function and is attached to the abstract syntax tree as a node; an external interface is provided, which prepares for the code instrumentation in a later step.
The generation of the abstract syntax tree is shown in Fig. 2. First, lexical analysis and syntax analysis are performed on the code of the tested program. Lexical analysis provides the symbol nodes that the abstract syntax tree needs, such as constants and names; syntax analysis provides an abstract syntax tree containing the intermediate nodes that represent the corresponding syntactic structures. Semantic analysis is then performed on the code: names, symbols and so on are processed, the syntax tree is converted into a canonical form that includes expression type information and the symbol table, and these are linked into a tree structure, finally yielding the abstract syntax tree. During code analysis, the abstract syntax tree of the tested program is built with the help of the tools Flex and Bison.
The decision statement subtrees are collected by visiting the completed abstract syntax tree, and each relational expression is represented by a capital letter; the newly generated subtree is called the abstract analysis tree. Every decision has an abstract analysis tree that represents its logical structure. After generating the abstract analysis tree, the analysis tool automatically generates the control flow graph and the data flow graph.
The tested program fragment is "if (x > y && x > z || x > y + z)". Fig. 2 is the abstract syntax tree generated from the tested program fragment. Fig. 3 shows the abstract analysis tree extracted from the abstract syntax tree, in which AND and OR are condition nodes and A, B and C are the leaves of the abstract analysis tree.
Step 2: Generate the expected result set of MC/DC test cases.
The abstract analysis tree is the input for generating the expected result set of MC/DC test cases. Each node of the abstract analysis tree is assigned a Boolean value and a Boolean variable evaluation, which marks whether the Boolean value of the node has already been computed. The generation of the expected result set is introduced using the tested program fragment "if (x > y && x > z || x > y + z)" as an example, as in Fig. 4. The concrete steps are as follows:
Step 2.1: Extract the condition nodes from the abstract analysis tree; these condition nodes form the leaves of the abstract analysis tree. Each leaf is a variable, and N denotes the number of leaf variables extracted. As in Fig. 5, the program fragment "if (x > y && x > z || x > y + z)" has 3 leaf variables, so N = 3.
Step 2.2: Build the truth table; N leaf variables give $2^N$ combinations. As in Fig. 5, the truth table of the tested program fragment "if (x > y && x > z || x > y + z)" has 8 combinations.
Step 2.3: Fill the truth table with the digits 0 and 1.
Step 2.4: For each row of the truth table:
Step 2.4.1: Assign the Boolean values in the current row of the truth table to the corresponding leaf variables of the abstract analysis tree. In Fig. 5, the fourth row of the truth table has A=0, B=1, C=1, and these values are assigned to the leaf variables of the abstract analysis tree.
Step 2.4.2: Evaluate the Boolean value of each condition node bottom-up until the top of the abstract analysis tree is reached. The Boolean value finally obtained for the top condition node is the output value of the decision statement. As in Fig. 5, for the fourth row of the truth table, A=0, B=1, C=1: A && B is false and A && B || C is true, so the final output of the decision for this row is true.
Step 2.4.3: Fill the output column of the truth table with the output value.
The steps above complete the truth table of a given decision statement (of the tested program). Test case extraction then begins. Given the characteristics of MC/DC, for each leaf variable it is necessary to find two rows that show the leaf variable independently affecting the decision outcome.
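The truth table construction of Steps 2.1 to 2.4 can be sketched in Python. The nested-tuple representation of the abstract analysis tree and the function names are assumptions chosen for brevity; the tree itself is the one described for the example decision, with OR at the top, AND below it, and leaves A, B and C.

```python
from itertools import product

# Abstract analysis tree of `x > y && x > z || x > y + z` with leaves A, B, C.
TREE = ("OR", ("AND", "A", "B"), "C")


def leaves(node):
    """Step 2.1: collect the leaf variables from left to right, without repeats."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    found = leaves(left)
    return found + [name for name in leaves(right) if name not in found]


def evaluate(node, row):
    """Step 2.4.2: evaluate the condition nodes bottom-up for one row."""
    if isinstance(node, str):
        return row[node]
    op, left, right = node
    l, r = evaluate(left, row), evaluate(right, row)
    return (l and r) if op == "AND" else (l or r)


def truth_table(tree):
    """Steps 2.2-2.4: one (inputs, output) entry per combination, 000 first."""
    names = leaves(tree)
    table = []
    for bits in product((0, 1), repeat=len(names)):  # 2**N combinations
        row = dict(zip(names, bits))
        table.append((bits, int(evaluate(tree, row))))
    return names, table


NAMES, TABLE = truth_table(TREE)
```

With this row order, `TABLE[3]` is the fourth row, A=0, B=1, C=1, whose output is 1, in line with the worked example in Step 2.4.2.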
Step 2.5: For each leaf variable, find its test cases and add them to the MC/DC test case set of that leaf variable:
Step 2.5.1: Create an empty MC/DC test case set of pairs for each leaf variable;
Step 2.5.2: In the truth table, find two rows in which the values of all other leaf variables are fixed and only the target leaf variable changes. As in Fig. 5, when the target leaf variable is A, take two rows in which the values of leaf variables B and C are fixed, for example the third row and the seventh row, in which B and C are fixed and only the target leaf variable A changes;
Step 2.5.3: Compare the output values of the two rows from Step 2.5.2 (the output value of each row is formed by the condition nodes of that row). If the output values of the two rows differ, the two rows are a pair of test cases for the target leaf variable, and the pair is valid because it demonstrates the influence of the target leaf variable; the two rows are added as a pair to the MC/DC test case set created in Step 2.5.1.
As in Fig. 4, when the target variable is A, the third row is (010) and the seventh row is (110). According to the truth table, the third row outputs 0 and the seventh row outputs 1; the two output values differ, so (010) and (110) are a pair of test cases for variable A. The pair is valid and is added, as a pair, to the test case set of leaf variable A. When another leaf variable is the target leaf variable, its test cases are found in the same way as for leaf variable A above and are added to the MC/DC test case set of that leaf variable.
Step 2.6: Merge the MC/DC test case sets of all leaf variables to obtain the test case set, and minimize it. This test case set is the set of target test cases that the genetic algorithm search must realize.
As in Fig. 4, the test case sets of leaf variables A, B and C are (010, 110), (100, 110) and (000, 001, 010, 011, 100, 101) respectively. Minimizing gives (010, 100, 110, 101), and this set of test cases satisfies the MC/DC coverage requirement for every leaf variable.
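Steps 2.5 and 2.6 can be sketched in Python for the example decision A && B || C. The pair search follows the text directly. The minimization is an assumption: the text does not say how the set is minimized, so the sketch uses an exhaustive search for the smallest set of rows that still contains at least one complete pair per leaf variable, which is only practical for small N. Function names are illustrative.

```python
from itertools import combinations, product


def decision(a, b, c):
    """Output of the example decision A && B || C."""
    return int((a and b) or c)


ROWS = list(product((0, 1), repeat=3))  # truth-table inputs, 000 first


def independence_pairs(index):
    """Steps 2.5.2-2.5.3: rows differing only in one variable with different output."""
    pairs = []
    for row in ROWS:
        if row[index] == 0:
            flipped = row[:index] + (1,) + row[index + 1:]
            if decision(*row) != decision(*flipped):
                pairs.append((row, flipped))
    return pairs


def minimize(pairs_per_variable):
    """Step 2.6 (assumed strategy): smallest row set with one full pair per variable."""
    candidates = sorted({r for pairs in pairs_per_variable for p in pairs for r in p})
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            chosen = set(subset)
            if all(any(a in chosen and b in chosen for a, b in pairs)
                   for pairs in pairs_per_variable):
                return chosen
    return set(candidates)


PAIRS = [independence_pairs(i) for i in range(3)]  # for A, B, C
SUITE = minimize(PAIRS)
```

The pairs found match the sets listed above for A, B and C. More than one minimal set of four rows exists for this decision, so the search may return a different but equally valid set than (010, 100, 110, 101); rows 010, 100 and 110 appear in every minimal set because A and B each have only one independence pair.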
Step 3: Instrument the code of the tested program;
Program instrumentation is one of the steps of automatic test data generation. During testing, all test result data are obtained and recorded through instrumentation. The main idea is to insert probe statements while preserving the original logic of the tested program; when the tested program runs, the probe statements capture its runtime characteristics. Analyzing these data yields dynamic information such as the logic coverage of the program, from which the fitness of each individual is calculated. The detailed process is shown in Fig. 5. After the abstract syntax tree (AST) is generated, traverse it to find an instrumentation position and judge whether the position is an instrumentation point. If it is not, continue traversing the AST to find an instrumentation position. If it is, implant a probe statement: the code is implanted directly into the AST as a syntax tree fragment in the form of a subtree. Then judge whether instrumentation is complete. If so, compile and run the tested program; if not, continue traversing the AST to find further instrumentation positions until instrumentation is complete.
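As an illustration of AST-level instrumentation, the following Python sketch uses the standard `ast` module to wrap the condition of every `if` statement in a probe call. It is only a sketch under stated assumptions: the tested program `classify`, the probe function `probe` and the class `ProbeInserter` are hypothetical, and the patent does not prescribe Python or this probe format.

```python
import ast

# Hypothetical tested program (line 1 of SOURCE is blank, so `def` is line 2).
SOURCE = """
def classify(x, y):
    if x > 0:
        if y > 0:
            return "both"
        return "x only"
    return "neither"
"""

class ProbeInserter(ast.NodeTransformer):
    """Replace the test of every `if` with probe(line, test) as a subtree."""
    def visit_If(self, node):
        self.generic_visit(node)  # instrument nested decisions as well
        node.test = ast.Call(
            func=ast.Name(id="probe", ctx=ast.Load()),
            args=[ast.Constant(value=node.lineno), node.test],
            keywords=[],
        )
        return node

trace = []

def probe(line, outcome):
    # Record which decision was evaluated and its outcome, then pass it through.
    trace.append((line, bool(outcome)))
    return outcome

tree = ProbeInserter().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)          # give the new nodes line/column info
namespace = {"probe": probe}
exec(compile(tree, "<instrumented>", "exec"), namespace)
result = namespace["classify"](3, -1)
print(result, trace)
```

Because the probe returns the original outcome unchanged, the logic of the tested program is preserved while the trace records which branch each decision took.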
Step 4: Construct the fitness function;
The fitness function is the only interface between the genetic algorithm and the practical problem. It quantifies how good or bad each individual in the population is, and its construction directly affects the efficiency of problem solving. The traditional fitness function f(x) consists of two parts:
f(x) = approachlevel + branchdistance
The first part, approachlevel, reflects control dependence and is usually called the approach level. The second part, branchdistance, is called the branch distance. It overcomes the limitation of evaluating fitness with the approach level alone: for the current input test data, the branch distance measures how close the diverging branch is to satisfying the target.
For the practical problem of generating MC/DC test data, the goal of the present invention is to find test data that satisfy the intended MC/DC targets. The invention makes full use of the strength of the chaining method in characterizing data dependence among test data: it uses the chaining idea to collect the control nodes that, through data dependence, directly or indirectly affect traversal of the problem node, and combines them with the traditional fitness function, which is good at characterizing control dependence, to construct a new fitness function. This overcomes the lack of guidance and the degraded search that the traditional fitness function suffers because it ignores data dependence inside the program.
Specifically, the fitness function is constructed from three factors: the control dependence fitness function, the data dependence fitness function and the branch fitness.
Step 4.1: Establish the control dependence fitness function
Step 4.1.1: Obtain the control dependence set of each decision from the control flow graph of the tested program. Control dependence describes how the execution of a node depends on the outcomes of the branch nodes that precede it. When every path from node y to the exit node e contains node z, node z is said to post-dominate node y. Let x be an arbitrary node such that (y, x) forms a branch; when every path from node y to the exit node e that passes through branch (y, x) contains node z, node z is said to post-dominate branch (y, x). When node z post-dominates a branch of y but does not post-dominate y itself, node z is said to be control dependent on node y. Control dependence measures, from a structural point of view, how close the current input test data come to reaching the target; it is computed from the critical nodes in the control flow graph between the target and the node where execution diverges.
Consider the control flow graph of a tested program fragment shown in Fig. 6. The decision at line 16 is reached only when the decision at line 12 evaluates to true and the decision at line 13 evaluates to false, so the target decision at line 16 depends on the control flow through lines 12 and 13. These nodes are called critical branches because they determine whether control flows toward the target or away from it. The control dependence set of the decision at line 16 therefore contains the decisions at lines 12 and 13; that is, to reach the target node these branch nodes must be executed in turn and each must produce a specific outcome. This gives the control dependence set ControlDep(16) = {12, -13}, where a positive number means the true branch must be taken and a negative number means the false branch must be taken.
Step 4.1.2: Establish the control dependence fitness function.
Once the control dependence set is established, the search must judge which test case executes the largest number of control nodes. In Fig. 6, for example, test data that diverge at line 13 are closer to the target than test data that diverge at line 12. An evaluation function is needed to judge which test data bring the execution flow closer to the target, so the control dependence fitness function is established.
The control dependence fitness function covers the objective of every branch node in the control dependence set. It is given by:
$$\mathrm{ControlDepFit}_{testdata} = \mathrm{dependent}_{decisions} - \mathrm{executed}_{decisions} \qquad (1)$$
where dependent_decisions is the number of control nodes in the control dependence set of the target; executed_decisions is the number of those control nodes executed with the current test data as input; and ControlDepFit_testdata is the control dependence fitness function.
If the value of the control dependence fitness function is 0, the test data reach the target decision. If the value is greater than 0, the test data have diverged from the target somewhere, and the value identifies exactly which control node before the target decision the execution diverged at; this node is called the diverged node. Taking Fig. 6 as an example, if some input data cause execution to diverge at line 12, the control dependence fitness value is 2 - 0 = 2; if execution takes the true branch at line 12 but diverges at line 13, the value is 2 - 1 = 1. In this way test data can be distinguished by their approach level relative to the target decision, and the search is guided toward the closest test data.
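Formula (1) and the Fig. 6 example can be sketched in a few lines of Python. The function name `control_dep_fitness` and the representation of the executed branches as a set of signed line numbers are assumptions made for illustration; the sign convention (positive for the true branch, negative for the false branch) follows the text.

```python
def control_dep_fitness(control_dep, taken):
    """dependent decisions minus executed decisions, as in formula (1).

    control_dep: required branch outcomes in execution order, e.g. [12, -13].
    taken: signed outcomes actually observed for the current test data.
    Counting stops at the first required outcome that was not taken,
    which is the diverged node.
    """
    executed = 0
    for required in control_dep:
        if required not in taken:
            break
        executed += 1
    return len(control_dep) - executed

CONTROL_DEP_16 = [12, -13]
print(control_dep_fitness(CONTROL_DEP_16, {-12}))      # diverges at line 12
print(control_dep_fitness(CONTROL_DEP_16, {12, 13}))   # diverges at line 13
print(control_dep_fitness(CONTROL_DEP_16, {12, -13}))  # reaches the target
```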
Step 4.2: Establish the data dependence fitness function
In computing dependence, the invention focuses on control dependence and on inserting data dependence into the guidance that the approach-level fitness gives the search. This extension of the approach level to include data dependence guides the search toward more promising regions of the search space. It is intended to overcome the lack of guidance and the plateaus in the search that arise when flag variables are used in predicates or when strong data dependence exists between predicates in the code.
The data dependence fitness function is obtained as follows:
Let pn be the problem node and let S be the set that stores the dependences of a given node; pn and S are the inputs of the method. DepSets stores the sets of nodes with dependences collected so far, and PV stores the variables used by the problem node. S and DepSets are initialized to the empty set. The dependence fitness function of a node is obtained as follows:
Step 4.2.1: Obtain the control dependence set ControlDep(pn) of the problem node pn and assign it to S, and add the variables UsedVariables(pn) used by pn to PV;
Step 4.2.2: For each variable pv in PV, obtain the last-definition set lastDefs of pv;
(a) For each last definition ld in lastDefs, obtain the control dependence set ControlDep(ld) of ld and merge it with S to obtain an expanded S, then add the variables UsedVariables(ld) used by ld to a newly created set PVnew;
(b) For each control node cd in ControlDep(ld), obtain the variables UsedVariables(cd) that it uses and add them to the set PVnew created in step (a).
Step 4.2.3: For each variable pv in PV, iteratively obtain the dependence set S(ld). First take the first variable in PV, obtain its dependence set S(ld) and return to Step 4.2.2; then take the next variable in PV, and so on until all variables in PV have been processed. End the iteration and add S(ld) to DepSets.
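The collection in Steps 4.2.1 to 4.2.3 can be approximated by a worklist over variables. The sketch below is a deliberate simplification and not the method as claimed: it merges everything into a single set instead of keeping one set per last definition in DepSets, and it looks up last definitions by variable name only. The dictionaries `control_dep`, `used_vars` and `last_defs` and the three-line program fragment are hypothetical.

```python
def collect_dependences(pn, control_dep, used_vars, last_defs):
    """Expand ControlDep(pn) with the control dependences of every definition
    that can feed a variable used at pn (simplified to one merged set)."""
    s = set(control_dep.get(pn, ()))        # Step 4.2.1: S = ControlDep(pn)
    pending = list(used_vars.get(pn, ()))   # PV = UsedVariables(pn)
    seen = set()
    while pending:
        pv = pending.pop()
        if pv in seen:
            continue
        seen.add(pv)
        for ld in last_defs.get(pv, ()):    # Step 4.2.2: last definitions of pv
            deps = control_dep.get(ld, ())
            s.update(deps)                                  # (a) merge into S
            pending.extend(used_vars.get(ld, ()))           # (a) PVnew
            for cd in deps:
                pending.extend(used_vars.get(abs(cd), ()))  # (b) PVnew
    return s

# Hypothetical fragment: line 2 is `if a > 0:`, line 3 is `flag = 1` inside it,
# and line 5, `if flag:`, is the problem node.
control_dep = {3: [2], 5: []}
used_vars = {2: ["a"], 3: [], 5: ["flag"]}
last_defs = {"flag": [3], "a": []}
deps = collect_dependences(5, control_dep, used_vars, last_defs)
print(deps)
```

In this fragment the problem node has no control dependence of its own, yet the search learns that the true branch at line 2 must be taken, because that is where `flag` is defined.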
Step 4.2.4: Establish the dependence set used to define the fitness function:
Let DepFit be the dependence set of the fitness function. For each subset s_i in DepSets:
(a) Add s_i to DepFit;
(b) For each subset s_j in DepSets other than s_i, judge whether s_j conflicts with s_i on a then/else branch. If there is no conflict, merge the two subsets s_i and s_j to obtain a new subset S_{i,j} and add S_{i,j} to DepFit. If there is a then/else branch conflict, the subsets cannot be merged, and both s_i and s_j are added to DepFit.
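Step 4.2.4 can be sketched as follows. The sketch assumes that a then/else conflict means one set requires the true branch of a node (n) while the other requires its false branch (-n); this reading and the names `conflicts` and `build_depfit` are assumptions, not statements from the text.

```python
def conflicts(si, sj):
    """Assumed meaning of a then/else conflict: n in one set, -n in the other."""
    return any(-n in sj for n in si)

def build_depfit(dep_sets):
    """Build DepFit from DepSets by merging subsets that do not conflict."""
    depfit = []

    def add(s):
        fs = frozenset(s)
        if fs not in depfit:        # keep each distinct set once
            depfit.append(fs)

    for i, si in enumerate(dep_sets):
        add(si)                     # (a) add s_i
        for j, sj in enumerate(dep_sets):
            if i == j:
                continue
            if conflicts(si, sj):
                add(sj)             # (b) cannot merge: keep both
            else:
                add(si | sj)        # (b) merge s_i and s_j
    return depfit

dep_sets = [{2, 12, -13}, {2, 3, 4, 8, 12, -13}, {2, 3, -4, 8, 12, -13}]
depfit = build_depfit(dep_sets)
print([sorted(s, key=abs) for s in depfit])
```

For these three hypothetical input sets the second and third conflict at node 4, so they stay separate, while merging either with the first set changes nothing.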
Step 4.2.5: Establish the data dependence fitness function
After the data dependence set is established, the data dependence fitness function is computed in the same way as the control dependence fitness function.
Step 4.3: Compute the branch distance
When test data reach the target, the branch distance measures whether the test data satisfy a test case, that is, how close the test data are to the expected test case. If test data reach the target decision but satisfy no MC/DC test case, the approach level of the test data is 0 but the branch distance is not 0. If test data reach the target and achieve a test case, both the branch distance and the approach level are 0. The computed branch distance is used to evaluate which test data are closer to satisfying the expected branch test case.
The branch distance of a decision is computed from the structure of the decision:
(1) If the decision contains the expression a == b: when a == b is required to be true, the branch distance is abs(a - b); when a == b is required to be false, the branch distance is a == b ? k : 0;
(2) If the decision contains the expression a != b: when a != b is required to be true, the branch distance is a != b ? 0 : k; when a != b is required to be false, the branch distance is a != b ? abs(a - b) : 0;
(3) If the decision contains the expression a < b: when a < b is required to be true, the branch distance is a < b ? 0 : a - b + k; when a < b is required to be false, the branch distance is a < b ? a - b + k : 0;
(4) If the decision contains the expression a <= b: when a <= b is required to be true, the branch distance is a <= b ? 0 : a - b; when a <= b is required to be false, the branch distance is a <= b ? a - b + k : 0;
(5) If the decision contains the expression a > b: when a > b is required to be true, the branch distance is a > b ? 0 : a - b; when a > b is required to be false, the branch distance is a > b ? a - b + k : 0;
(6) If the decision contains the expression a >= b: when a >= b is required to be true, the branch distance is a >= b ? 0 : a - b; when a >= b is required to be false, the branch distance is a >= b ? a - b + k : 0;
(7) If the decision contains the expression a || b: when a || b is required to be true, the branch distance is min[fit(a), fit(b)]; when a || b is required to be false, the branch distance is fit(a) + fit(b);
(8) If the decision contains the expression a && b: when a && b is required to be true, the branch distance is fit(a) + fit(b); when a && b is required to be false, the branch distance is max[fit(a), fit(b)];
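The rules above can be transcribed into a small Python sketch. It makes several assumptions that should be read as such: "true" and "false" are taken to mean the required outcome of the expression; for a != b with a required true outcome the sketch returns 0 when the operands differ and k otherwise; the absolute value is applied to every result; and k defaults to 0.1. The names `fit`, `fit_or` and `fit_and` are hypothetical.

```python
K = 0.1  # assumed constant penalty

def fit(op, a, b, want_true, k=K):
    """Branch distance of a relational expression `a op b` for a required outcome."""
    if op == "==":
        d = abs(a - b) if want_true else (k if a == b else 0)
    elif op == "!=":
        d = (0 if a != b else k) if want_true else (abs(a - b) if a != b else 0)
    elif op == "<":
        d = (0 if a < b else a - b + k) if want_true else (a - b + k if a < b else 0)
    elif op == "<=":
        d = (0 if a <= b else a - b) if want_true else (a - b + k if a <= b else 0)
    elif op == ">":
        d = (0 if a > b else a - b) if want_true else (a - b + k if a > b else 0)
    elif op == ">=":
        d = (0 if a >= b else a - b) if want_true else (a - b + k if a >= b else 0)
    else:
        raise ValueError(f"unsupported operator: {op}")
    return abs(d)  # a negative distance carries no extra information

def fit_or(fa, fb, want_true):
    """Rule (7): a || b, given the distances of the two operands."""
    return min(fa, fb) if want_true else fa + fb

def fit_and(fa, fb, want_true):
    """Rule (8): a && b, given the distances of the two operands."""
    return fa + fb if want_true else max(fa, fb)

# Hypothetical three-way conjunction of `> 0` tests on the inputs (12, -2, 3).
parts = [fit(">", v, 0, True) for v in (12, -2, 3)]
total = fit_and(fit_and(parts[0], parts[1], True), parts[2], True)
print(parts, total)
print(fit("==", 0, 0, False))  # equality holds but the false outcome is required
```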
When attempting to achieve a test case, what must be compared is how close each test data comes to achieving the test case, not the test data themselves, so the branch distance is normally a positive value. A negative fitness value adds no useful information to the search, so the absolute value of the function value is returned. The total branch distance is the sum of the branch distances of all conditions in the decision.
Step 5: On the basis of the valid MC/DC test cases, the fitness function formula and the instrumented code, generate test data at random, run the instrumented tested program on these test data to obtain fitness values, and check whether the expected target path is followed (that is, whether the test data reach the target). If so, go to Step 7; otherwise go to Step 6;
Compute the approach-level fitness function ApproachLevelFitness. If ApproachLevelFitness is 0, the test data reach the target; then compute the branch distance BranchFitness at the target decision. If both ApproachLevelFitness and BranchFitness are 0, the test data achieve the MC/DC test target. Otherwise the fitness value of the test data equals ApproachLevelFitness + normalized(BranchFitness), where normalized(BranchFitness) is the normalized branch distance BranchFitness, with a value between 0 and 1. The computation comprises the following steps:
Step 5.1: Generate the MC/DC test case expected result set by the method of Step 2;
Step 5.2: Obtain the dependence sets, comprising the control dependence set and the data dependence set, by the methods of Step 4.1 and Step 4.2;
Step 5.3: For each target decision, generate 2 test data T_a and T_b at random;
Step 5.4: Using the computing method of Step 4.1 and the dependence sets, compute the control dependence fitness function and the data dependence fitness function of each of the 2 test data;
Step 5.5: Using the computing method of Step 4.3, compute the branch distance of each of the 2 test data;
Step 5.6: Normalize the branch distance to obtain BranchFitness(T_a)_{normalized}, computed as:
$$\mathrm{BranchFitness}(T_a)_{normalized} = \frac{\mathrm{BranchFitness}(T_a)}{\mathrm{BranchFitness}(T_a) + \mathrm{BranchFitness}(T_b)}$$
where BranchFitness(T_a) is the branch distance of test data T_a and BranchFitness(T_b) is the branch distance of test data T_b;
Step 5.7: The total fitness function value Fitness(T) is:
$$\mathrm{Fitness}(T) = \mathrm{ApproachLevelFitness}(T) + \mathrm{BranchFitness}(T)_{normalized}$$
where ApproachLevelFitness(T) is the approach-level fitness function of test data T;
BranchFitness(T)_{normalized} is the normalized value of the branch distance of test data T.
Step 5.8: Compare the total fitness function values to determine which of the two test data is closer to achieving the decision target.
Step 5.9: If both ApproachLevelFitness and BranchFitness are 0, the test data achieve the MC/DC test target; go to Step 7. Otherwise go to Step 6;
The tested program fragment in Fig. 6 is again used as an example to show how the fitness function value is obtained. Assume the target is line 16. Preliminary work:
(1) The dependence set obtained is
[Figure GDA0000386393440000143]
, {2, 12, -13}, {2, 3, 4, 8, 12, -13}, {2, 3, -4, 8, 12, -13}};
(2) Extract the MC/DC test cases of the target decision at line 16 from the MC/DC test case generation module: {(010), (110), (100), (011. For example, the plan is to achieve test case (010).
Calculation:
Suppose that at run time the test data generator outputs two sets of test data for the parameters x, y, z: (12, -2, 3) and (1, 2, 0). To determine whether they satisfy or approach test case (010), the fitness function of each test data is evaluated.
(1) T1 = (12, -2, 3)
The target is the false branch of the decision at line 4, therefore:
ApproachLevelFitness(T1, 16) = Count({2, 3, -4, 8, 12, -13} - {2, 3, 4})
= Count({-4, 8, 12, -13}) = 4
BranchFitness(T1, -4) = (Fit(x>0) + Fit(y>0) + Fit(z>0))_{T1} = 0 + 2 + 0 = 2
(2) T2 = (1, 2, 0)
The target is the false branch of the decision at line 13, so the false-branch formula applies. We choose k = 0.1.
ApproachLevelFitness(T2, 16) = Count({2, 3, -4, 8, 12, -13} - {2, 3, -4, 8, 12, 13})
= Count({-13}) = 1
BranchFitness(T2, -13) = (Fit(z==0))_{T2} = k = 0.1
(3) Normalization of the branch fitness:
$$\mathrm{BranchFitness}(T1)_{normalized} = \frac{\mathrm{BranchFitness}(T1)}{\mathrm{BranchFitness}(T1) + \mathrm{BranchFitness}(T2)} = \frac{2}{2.1} \approx 0.952$$
$$\mathrm{BranchFitness}(T2)_{normalized} = \frac{\mathrm{BranchFitness}(T2)}{\mathrm{BranchFitness}(T1) + \mathrm{BranchFitness}(T2)} = \frac{0.1}{2.1} \approx 0.048$$
(4) Comparison of the test data:
Fitness(T, d) = ApproachLevelFitness(T, d) + BranchFitness(T)_{normalized}
Fitness(T1, 16) = 4 + 0.952 = 4.952
Fitness(T2, 16) = 1 + 0.048 = 1.048
The comparison shows that T2 is closer to achieving the decision target.
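The normalization of Step 5.6 and the total fitness of Step 5.7 can be checked with a short Python sketch using the approach levels (4 and 1) and branch distances (2 and 0.1) of the example. The function names are hypothetical, and returning 0 when both branch distances are 0 is an assumption added to avoid division by zero.

```python
def normalized_branch(branch_a, branch_b):
    """Branch distance of test data a, normalized against the pair (a, b)."""
    total = branch_a + branch_b
    return branch_a / total if total else 0.0  # assumed guard for 0 / 0

def total_fitness(approach_level, branch_norm):
    """Approach level plus normalized branch distance (smaller is closer)."""
    return approach_level + branch_norm

f1 = total_fitness(4, normalized_branch(2, 0.1))   # T1
f2 = total_fitness(1, normalized_branch(0.1, 2))   # T2
print(round(f1, 3), round(f2, 3))
print("T2 closer" if f2 < f1 else "T1 closer")
```

Because the normalized branch distance always lies between 0 and 1, it can only break ties between test data with the same approach level; it never outweighs a difference of one control node.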
Step 6: According to the fitness values obtained, apply the genetic operations of the genetic algorithm, such as selection, crossover and mutation, to generate new test data, then return to Step 5 and compute the fitness values of the newly generated test data.
The terms of biological evolution used in genetic algorithms correspond to the process of generating software test cases as follows. Chromosome: each test data in the genetic algorithm, a chromosome carrying characteristics, also called an individual. Population: the set of randomly generated test data. Evolution: the process of iteratively generating new test data from old test data with the genetic algorithm. Encoding: the process of encoding a chromosome, by some rule, into characters that a computer can operate on.
The genetic algorithm replaces the parameter space of the problem with an encoding space, takes the fitness function as the basis of evaluation and the encoded population as the basis of evolution, and realizes the selection and inheritance mechanisms through genetic operations on the bit strings of the individuals in the population, thereby establishing an iterative process. It comprises the following steps, as shown in Fig. 7.
Step 6.1: Select the encoding strategy and convert the parameter set X and its domain into the bit string structure space S;
In the automatic test data generation problem, each gene of a chromosome may belong to a different test data type. The invention therefore encodes chromosomes with a mixture of integer and real values: each parameter of the problem is encoded as a gene on the chromosome, the number of parameters becomes the chromosome length, and the value interval of each parameter is mapped to the value range of the corresponding gene. This means that the chromosomes and the solutions of the problem share the same space. The detailed process is as follows:
Suppose the problem to be solved has n input variables X_1, X_2, ..., X_n. First, process the value domains of the parameters by equivalence class partitioning and boundary value analysis, where Y_i (1 ≤ i ≤ n) denotes the set of finitely many discrete points that parameter X_i (1 ≤ i ≤ n) can take and |Y_i| denotes the size of that set. Establish the mapping between the solution space and the chromosome space; the chromosome is expressed as:
X = (X_1, X_2, ..., X_n) → C = (C_1, C_2, ..., C_n)    (2)
where C is a solution in the chromosome space and X is a solution in the problem space.
Step 6.2: Design and select the genetic operations, including the population size and the selection, crossover and mutation methods, and determine genetic parameters such as the crossover probability p_c and the mutation probability p_m;
Step 6.2.1: Apply the rank selection strategy and the elitist retention strategy:
The elitist retention strategy ensures that crossover and mutation do not destroy the best individual found so far, which effectively improves convergence speed and is a strong precondition for the convergence of the genetic algorithm. On the other hand, although its results are good, the optimal solution is generally hard to obtain with it alone. The invention copies the population using rank selection combined with elitist retention: before elitist retention is applied, rank selection is used to choose suitable individuals for crossover and mutation.
The rank selection strategy proceeds as follows: all individuals in the population are sorted in ascending or descending order of fitness function value, and the selection probability of each individual is assigned on the basis of this ordering. Specifically:
(a) Sort all individuals in the population in descending or ascending order of fitness value;
(b) Using a designed allocation table, assign each individual a probability value according to its fitness value: in the table the fitness values run from large to small and the probability values run from small to large;
(c) The probability that each individual is passed on to the next generation is determined by the probability value assigned in step (b); then, based on these probability values, roulette wheel selection is used to choose the chromosomes that are eliminated and those that are copied.
After one round of rank selection a new population is obtained, and the elitist retention strategy is then applied to this new population.
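The rank selection of steps (a) to (c) can be sketched as below. The linear allocation rule (probability proportional to rank) is an assumption, since the text only speaks of a designed allocation table; the names `rank_probabilities` and `rank_select` are hypothetical, and the roulette wheel is realized with the standard library call `random.Random.choices`.

```python
import random

def rank_probabilities(fitnesses):
    """Assign selection probabilities by rank; smaller fitness is better.

    The worst individual (largest fitness) gets rank 1 and the best gets
    rank n, so probabilities grow as fitness values shrink.
    """
    n = len(fitnesses)
    worst_first = sorted(range(n), key=lambda i: fitnesses[i], reverse=True)
    total = n * (n + 1) / 2
    probs = [0.0] * n
    for rank, idx in enumerate(worst_first, start=1):
        probs[idx] = rank / total   # assumed linear allocation table
    return probs

def rank_select(population, fitnesses, rng):
    """Roulette wheel selection over the rank-based probabilities."""
    probs = rank_probabilities(fitnesses)
    return rng.choices(population, weights=probs, k=len(population))

population = [(12, -2, 3), (1, 2, 0), (5, 5, 5)]   # hypothetical test data
fitnesses = [4.9, 1.0, 3.0]                        # hypothetical fitness values
probs = rank_probabilities(fitnesses)
new_population = rank_select(population, fitnesses, random.Random(0))
print(probs)
print(new_population)
```

Ranking decouples selection pressure from the raw fitness scale, so one individual with a very small fitness value cannot dominate the wheel.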
The elitist retention strategy proceeds as follows:
(a) From the new population obtained after rank selection (called the "current population"), find the best individual and the worst individual according to fitness function value;
(b) If the fitness of the best individual of the current population is higher than that of the best individual found so far, replace the best individual found so far with the best individual of the current population;
(c) Keep the best individual found so far unchanged and pass it intact into the next generation.
After the population has been copied by these two strategies, the final population is formed.
Step 6.2.2: Crossover and mutation operations
Based on the idea of Srinivas, an adaptive crossover probability p_c and mutation probability p_m are adopted: they are adjusted automatically according to the average fitness of the population and the fitness of the best individual in the current population.
F_max denotes the fitness of the best individual in a given generation, F_avg denotes the average fitness of that generation, and Δ = F_max - F_avg is their difference. When Δ is small, the fitness differences among the individuals are small, so the population is more likely to have reached a local optimum and premature convergence is more likely. When Δ is large, the fitness differences among the individuals are large. The crossover probability p_c and the mutation probability p_m can therefore be determined from Δ, so that they adapt to the actual state of the population during evolution. When the population tends to converge, p_c and p_m are raised to increase the frequency of crossover and mutation and disturb the current stability, which gives the genetic algorithm stronger exploration ability and overcomes premature convergence. Conversely, when the individuals are dispersed, the crossover and mutation frequencies are lowered to strengthen exploitation so that the individuals tend to converge. p_c and p_m are computed as:
p_c = k_1 / Δ    (3)
p_m = k_2 / Δ    (4)
where k_1 and k_2 are the crossover probability adjustment coefficient and the mutation probability adjustment coefficient respectively. To guard against unsuitable values of k_1 and k_2, correction bounds are defined for the crossover and mutation probabilities: when p_c is greater than the upper crossover correction value k_c1, p_c is set to k_c1; when p_c is less than the lower crossover correction value k_c2, p_c is set to k_c2; when p_m is greater than the upper mutation correction value k_m1, p_m is set to k_m1; and when p_m is less than the lower mutation correction value k_m2, p_m is set to k_m2.
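Formulas (3) and (4) together with the correction bounds can be sketched as follows. Two points are assumptions rather than statements from the text: Δ is taken as an absolute difference, and when Δ is 0 the upper bounds are returned (the limit of k/Δ after clamping). All numeric values in the usage lines are hypothetical.

```python
def clamp(value, lower, upper):
    """Restrict value to the interval [lower, upper]."""
    return max(lower, min(upper, value))

def adaptive_rates(f_max, f_avg, k1, k2, kc1, kc2, km1, km2):
    """Return (p_c, p_m) from formulas (3) and (4), clamped to the bounds.

    kc1/kc2 are the upper/lower crossover bounds, km1/km2 the upper/lower
    mutation bounds.
    """
    delta = abs(f_max - f_avg)      # assumed: magnitude of the difference
    if delta == 0:
        return kc1, km1             # assumed: fully converged population
    p_c = clamp(k1 / delta, kc2, kc1)
    p_m = clamp(k2 / delta, km2, km1)
    return p_c, p_m

bounds = dict(kc1=0.9, kc2=0.4, km1=0.1, km2=0.01)
print(adaptive_rates(10.0, 5.0, k1=2.0, k2=0.1, **bounds))   # dispersed
print(adaptive_rates(10.0, 9.5, k1=2.0, k2=0.1, **bounds))   # nearly converged
```

A small Δ drives both rates to their upper bounds, which matches the stated intent of increasing crossover and mutation when the population is close to converging.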
Step 6.3: Generate the population P by random initialization;
Step 6.4: Compute the fitness value of each individual bit string in population P after decoding;
Step 6.5: According to the genetic strategy, apply each genetic operation designed in Step 6.2 to the population; after selection, crossover and mutation a new generation of the population is formed;
Step 6.6: Return to Step 5 with the newly produced chromosomes (that is, test data), compute their fitness values, and judge whether they meet the criterion or whether the predetermined number of iterations has been completed. If the criterion is not met and the iterations are not complete, go to Step 6.1: the genetic algorithm starts again from the encoding operation, and selection, copying, crossover and mutation are applied to the new generation of the population, iterating continually. If the criterion is met or the iterations are complete, go directly to Step 7.
Step 7: The run ends and suitable test data are obtained.

Claims (1)

1. A method for automatically generating MC/DC test data based on a genetic algorithm, characterized in that it comprises the following steps:
Step 1: Perform static analysis of the tested program to produce the control flow graph, data flow graph, abstract syntax tree and abstract analysis tree;
Step 2: Generate the MC/DC test case expected result set;
Step 2.1: Extract the condition nodes from the abstract analysis tree; these condition nodes form the leaves of the abstract analysis tree, each leaf is a variable, and N denotes the number of leaf variables extracted;
Step 2.2: Build the truth table; for N leaf variables there are 2^N combinations;
Step 2.3: Fill the truth table with the digits 0 and 1;
Step 2.4: For each row of the truth table:
Step 2.4.1: Assign the Boolean values in the current row of the truth table to the corresponding leaf variables of the abstract analysis tree;
Step 2.4.2: Evaluate the Boolean value of each condition node bottom-up until the top of the abstract analysis tree is reached; the Boolean value of the condition node at the top of the abstract analysis tree is the output result value of the decision statement;
Step 2.4.3: Fill the output column of the truth table with the output result value;
Step 2.5: For each leaf variable, find its test cases and add them to the MC/DC test case set of that leaf variable:
Step 2.5.1: Create an empty set of MC/DC test case pairs for each leaf variable;
Step 2.5.2: In the truth table, find two rows in which the values of all other leaf variables are fixed and only the target leaf variable changes;
Step 2.5.3: Compare the output result values of the two rows found in Step 2.5.2; if the output result values of the two rows differ, the two rows form a pair of test cases for the target leaf variable, and the two rows are added, as a pair, to the MC/DC test case set created in Step 2.5.1;
Step 2.6: Merge the MC/DC test case sets of all leaf variables to obtain the test case set, and minimize it;
Step 3: Instrument the code of the tested program;
First traverse the abstract syntax tree to find an instrumentation position and judge whether the position is an instrumentation point; if it is not, continue traversing the abstract syntax tree to find an instrumentation position; if it is, implant a probe statement, the code being implanted directly into the abstract syntax tree as a syntax tree fragment in the form of a subtree; then judge whether instrumentation is complete; if so, compile and run the tested program; if not, continue traversing the abstract syntax tree to find further instrumentation positions until instrumentation is complete;
Step 4: Construct the fitness function;
Step 4.1: Establish the control dependence fitness function:
Step 4.1.1: Obtain the control dependence set of each decision from the control flow graph of the tested program; control dependence describes how the execution of a node depends on the outcomes of the branch nodes that precede it; when every path from node y to the exit node e contains node z, node z is said to post-dominate node y; for an arbitrary node x such that (y, x) forms a branch, when every path from node y to the exit node e that passes through branch (y, x) contains node z, node z is said to post-dominate branch (y, x); when node z post-dominates a branch of y and node z does not post-dominate node y, node z is said to be control dependent on node y; control dependence measures, from a structural point of view, how close the current input test data come to reaching the target, and is computed from the critical nodes in the control flow graph between the target and the node where execution diverges;
Step 4.1.2: Establish the control dependence fitness function:
The control dependence fitness function covers the objective of every branch node in the control dependence set, and is established as:
$$\mathrm{ControlDepFit}_{testdata} = \mathrm{dependent}_{decisions} - \mathrm{executed}_{decisions} \qquad (1)$$
where dependent_decisions is the number of control nodes in the control dependence set of the target; executed_decisions is the number of those control nodes executed with the current test data as input; and ControlDepFit_testdata is the control dependence fitness function;
If the value of the control dependence fitness function is 0, the test data reach the target decision; if the value is greater than 0, the test data have diverged from the target somewhere, and the diverged node is obtained from the value of the control dependence fitness function;
Step 4.2: Establish the data dependence fitness function:
Define pn as the problem node and S as the set that stores the dependences of a given node; pn and S are the inputs to the method for obtaining the data dependence fitness function. DepSets stores the sets of nodes with dependence relations collected so far, and PV stores the variables used by the problem node. S and DepSets are initialized to the empty set. The method for obtaining the dependence fitness function of a node is:
Step 4.2.1: Obtain the control dependence set ControlDep(pn) of the problem node pn and assign it to S, and add the variables UsedVariables(pn) used by pn to PV;
Step 4.2.2: For each variable pv in PV, obtain the set lastDefs of the last definitions of pv;
(a) for each last definition ld in lastDefs, obtain the control dependence set ControlDep(ld) of ld and merge it with S to obtain the expanded S, then add the variables UsedVariables(ld) used by ld to a newly created set PVnew;
(b) for each control node cd in ControlDep(ld), obtain the variables UsedVariables(ld) that it uses and add them to the set PVnew created in step 4.2.2(a);
Step 4.2.3: For each variable pv in PV, iteratively obtain the dependence set S(ld) of PV: first take the first variable in PV, obtain its dependence set S(ld), and return to step 4.2.2; then take the next variable in PV, and continue until all variables in PV have been processed; end the iteration and add S(ld) to DepSets;
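Steps 4.2.1 to 4.2.3 can be sketched as a worklist traversal. The sketch below is one possible reading, not the patented implementation: it assumes the static analysis results are available as three hypothetical dictionaries (`control_dep`, `used_vars`, `last_defs`), and it reads step 4.2.2(b) as collecting the variables used by each control node cd.

```python
def dependence_set_for(pv, control_dep, used_vars, last_defs):
    """Collect the control nodes that variable pv depends on (steps 4.2.2, 4.2.3)."""
    deps = set()
    pending = [pv]          # plays the role of PV / PVnew
    seen = set()
    while pending:
        var = pending.pop()
        if var in seen:
            continue
        seen.add(var)
        for ld in last_defs.get(var, ()):
            cds = control_dep.get(ld, set())
            deps |= cds                              # (a) merge ControlDep(ld)
            pending.extend(used_vars.get(ld, ()))    # (a) variables used by ld
            for cd in cds:                           # (b) variables used by cd (assumed reading)
                pending.extend(used_vars.get(cd, ()))
    return deps


def collect_dep_sets(pn, control_dep, used_vars, last_defs):
    """Return S (step 4.2.1) and DepSets, one dependence set per variable used by pn."""
    s = set(control_dep.get(pn, set()))
    pv = sorted(used_vars.get(pn, ()))
    dep_sets = [dependence_set_for(v, control_dep, used_vars, last_defs) for v in pv]
    return s, dep_sets


# Hypothetical program: n2 defines flag under control node c1, n4 defines x under c2,
# and the problem node pn tests both flag and x.
control_dep = {"pn": set(), "n2": {"c1"}, "n4": {"c2"}}
used_vars = {"pn": {"flag", "x"}, "c1": {"a"}, "c2": {"b"}}
last_defs = {"flag": {"n2"}, "x": {"n4"}}
print(collect_dep_sets("pn", control_dep, used_vars, last_defs))
```

In this example the problem node has no control dependences of its own, but reaching it with the right values of `flag` and `x` depends on the control nodes `c1` and `c2`, which is exactly the information a purely control-based approach level would miss.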
Step 4.2.4: Establish the dependence sets used to define the fitness function:
Define DepFit as the set of fitness function dependences; for each subset $s_i$ in DepSets:
(a) add $s_i$ to DepFit;
(b) for each subset $s_j$ in DepSets other than $s_i$, determine whether it has a then/else branch conflict with $s_i$; if not, merge the two subsets $s_i$ and $s_j$ to obtain a new subset $S_{i,j}$ and add $S_{i,j}$ to DepFit; if a then/else branch conflict exists, the subsets cannot be merged, and both $s_i$ and $s_j$ are added to DepFit;
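The merging rule of step 4.2.4 can be sketched as follows. The sketch assumes that each dependence is represented as a hypothetical pair `(control_node, branch)` with `branch` equal to `"then"` or `"else"`, and that two subsets conflict when they require opposite branches of the same control node; the patent does not prescribe this representation.

```python
def conflicts(s_i, s_j):
    """True if the two subsets require opposite branches of the same control node."""
    branch_of = dict(s_i)
    return any(node in branch_of and branch_of[node] != branch for node, branch in s_j)


def build_dep_fit(dep_sets):
    """Build DepFit from DepSets following steps (a) and (b)."""
    dep_fit = []

    def add(s):
        s = frozenset(s)
        if s not in dep_fit:        # avoid storing the same subset twice
            dep_fit.append(s)

    for i, s_i in enumerate(dep_sets):
        add(s_i)                                   # (a)
        for j, s_j in enumerate(dep_sets):
            if i == j:
                continue
            if conflicts(s_i, s_j):
                add(s_j)                           # (b) cannot merge, keep both
            else:
                add(set(s_i) | set(s_j))           # (b) merge into S_ij
    return dep_fit


a = {("c1", "then")}
b = {("c2", "then")}
c = {("c1", "else")}
for subset in build_dep_fit([a, b, c]):
    print(sorted(subset))
```

Here `a` and `c` are never merged because they need different outcomes of `c1`, while `b` is merged with each of them.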
Step 4.2.5: Establish the data dependence fitness function:
After the data dependence sets are established, the data dependence fitness function is obtained with the same method used to calculate the control dependence fitness function;
Step 4.3: Establish the branch fitness function:
When the test data reaches the target, the branch distance measures whether the test data satisfies an expected test case, that is, how close the test data is to the expected test case. If the test data reaches the target decision but satisfies no MC/DC test case, the approach level of the test data is 0 but its branch distance is not 0; if the test data reaches the target and realizes a test case, both its branch distance and its approach level are 0. The magnitude of the computed branch distance is used to evaluate which test data is closer to satisfying the expected branch test case;
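The text does not give a formula for the branch distance. The sketch below therefore assumes the relational-predicate distances commonly used in search-based testing (absolute difference for equality, difference plus a constant K for inequalities) and sums them over the conditions of a decision, each compared against the truth value required by one expected MC/DC test case. The names `condition_distance`, `branch_distance`, `NEGATE` and the constant `K` are hypothetical.

```python
K = 1.0  # assumed penalty constant for a predicate that is not yet satisfied

NEGATE = {"==": "!=", "!=": "==", "<": ">=", ">=": "<", ">": "<=", "<=": ">"}


def condition_distance(op, lhs, rhs, want_true):
    """Distance of one condition 'lhs op rhs' from the required truth value."""
    if not want_true:
        op = NEGATE[op]          # requiring False is requiring the negated predicate
    if op == "==":
        return float(abs(lhs - rhs))
    if op == "!=":
        return 0.0 if lhs != rhs else K
    if op == "<":
        return 0.0 if lhs < rhs else lhs - rhs + K
    if op == "<=":
        return 0.0 if lhs <= rhs else lhs - rhs + K
    if op == ">":
        return 0.0 if lhs > rhs else rhs - lhs + K
    if op == ">=":
        return 0.0 if lhs >= rhs else rhs - lhs + K
    raise ValueError(f"unsupported operator: {op}")


def branch_distance(conditions, expected):
    """Sum of condition distances for one expected MC/DC test case."""
    return sum(condition_distance(op, lhs, rhs, want)
               for (op, lhs, rhs), want in zip(conditions, expected))


# Decision (x > 10) and (y == 5), expected MC/DC test case (True, True).
x, y = 7, 5
print(branch_distance([(">", x, 10), ("==", y, 5)], (True, True)))
```

With x = 7 the first condition is three units short of becoming true, so the distance is positive even though the test data reached the decision; with x = 11 and y = 5 the distance would be 0.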
Step 5: On the basis of the MC/DC test cases, the fitness function formulas and the instrumented code, randomly generate test data, execute the instrumented program under test on the test data, obtain the fitness values, and check whether the expected execution path is satisfied; if it is, go to step 7; otherwise go to step 6;
Calculate the approach-level fitness function ApproachLevelFitness; if ApproachLevelFitness is 0, the test data reaches the target. Then calculate the branch distance BranchFitness at the target decision; if both ApproachLevelFitness and BranchFitness are 0, the test data achieves the MC/DC test target; otherwise the fitness value of the test data equals ApproachLevelFitness + normalized(BranchFitness), where normalized(BranchFitness) denotes the normalization of the branch distance BranchFitness. The detailed calculation is:
Step 5.1: Generate the MC/DC test case expected result set by the method of step 2;
Step 5.2: Obtain the dependence sets, comprising the control dependence set and the data dependence set, by the methods of step 4.1 and step 4.2;
Step 5.3: For each target decision, randomly generate test data $T_a$ and test data $T_b$;
Step 5.4: Using the calculation method of step 4.1 and the dependence sets, calculate the control dependence fitness function and the data dependence fitness function of test data $T_a$ and of test data $T_b$;
Step 5.5: Using the calculation method of step 4.3, calculate the branch distance of test data $T_a$ and of test data $T_b$;
Step 5.6: Normalize the branch distance to obtain $BranchFitness(T_a)_{normalized}$, computed as:
$BranchFitness(T_a)_{normalized} = \dfrac{BranchFitness(T_a)}{BranchFitness(T_a) + BranchFitness(T_b)}$
where $BranchFitness(T_a)$ denotes the branch distance of test data $T_a$ and $BranchFitness(T_b)$ denotes the branch distance of test data $T_b$;
Step 5.7: The total fitness function value Fitness(T) is:
$Fitness(T) = ApproachLevelFitness(T) + BranchFitness(T)_{normalized}$
where $ApproachLevelFitness(T)$ denotes the approach-level fitness function of test data T;
$BranchFitness(T)_{normalized}$ denotes the normalized value of the branch distance of test data T;
Step 5.8: Compare the total fitness function values of the two test data to determine which is closer to reaching the target decision;
Step 5.9: If ApproachLevelFitness and BranchFitness are both 0, the test data achieves the MC/DC test target and the method goes to step 7; otherwise it goes to step 6;
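Steps 5.6 to 5.8 can be illustrated with a short Python sketch. It follows the two formulas given in the text; the handling of the case where both branch distances are 0, which the text leaves undefined, is an assumption, and the function names are hypothetical.

```python
def normalized_branch(bf_self, bf_other):
    """Step 5.6: branch distance of one test datum relative to the pair."""
    total = bf_self + bf_other
    if total == 0:
        return 0.0  # assumption: both distances are 0, so nothing is left to normalize
    return bf_self / total


def total_fitness(approach_level, bf_self, bf_other):
    """Step 5.7: approach level plus normalized branch distance."""
    return approach_level + normalized_branch(bf_self, bf_other)


# Hypothetical pair: T_a reaches the decision (approach level 0) with branch distance 4,
# T_b diverges one control node early (approach level 1) with branch distance 1.
fit_a = total_fitness(0, 4.0, 1.0)
fit_b = total_fitness(1, 1.0, 4.0)
print(fit_a, fit_b)
print("T_a is closer" if fit_a < fit_b else "T_b is closer")
```

Because the normalized branch distance always lies between 0 and 1, it can only break ties between test data with the same approach level; a test datum that diverges earlier never scores better than one that reaches the target decision.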
Step 6: According to the fitness values obtained, apply the genetic operations of the genetic algorithm, such as selection, crossover and mutation, to generate new test data, then return to step 5 to calculate the fitness values of the newly generated test data;
Step 6.1: Select the encoding strategy and convert the parameter set X and its domain into the bit-string structure space S;
Chromosomes are encoded with a mixture of integer and real types: the parameters of the problem are encoded as genes on the chromosome, the number of parameters determines the chromosome length, and the value interval of each parameter is mapped to the value range of the corresponding gene. The detailed process is as follows:
Suppose the problem to be solved has n input variables $X_1, X_2, ..., X_n$. First, process the domains of the parameters by equivalence class partitioning and boundary value analysis, where $Y_i$ (1≤i≤n) denotes the set of finitely many discrete points that parameter $X_i$ (1≤i≤n) can take and $|Y_i|$ denotes the size of that set. Then establish the mapping between the solution space and the chromosome space; the chromosome is expressed as:
$X = (X_1, X_2, ..., X_n) \rightarrow C = (C_1, C_2, ..., C_n)$ (2)
where C is a solution in the chromosome space and X is a solution in the problem space;
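One possible reading of the mapping in equation (2) is sketched below: each gene $C_i$ is an integer index into the finite set $Y_i$ of candidate values for $X_i$, so that integer and real parameters can share one chromosome. This index-based representation, the example domains and the function names are assumptions made for illustration.

```python
import random


def random_chromosome(domains, rng):
    """Draw one gene per parameter; gene i is an index into Y_i."""
    return [rng.randrange(len(y_i)) for y_i in domains]


def decode(chromosome, domains):
    """Map chromosome C back to a problem-space solution X."""
    return [y_i[c_i] for c_i, y_i in zip(chromosome, domains)]


# Hypothetical domains after equivalence class partitioning and boundary value analysis:
# X_1 is an integer parameter, X_2 is a real parameter.
domains = [[-1, 0, 1, 100], [0.0, 0.5, 1.0]]

rng = random.Random(42)                 # seeded only to make the sketch repeatable
c = random_chromosome(domains, rng)
print(c, decode(c, domains))
print(decode([3, 1], domains))          # explicit chromosome
```

The chromosome length equals the number of parameters, and the range of gene i is 0 to $|Y_i| - 1$.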
Step 6.2: Design and select the genetic operations, including the population size and the selection, crossover and mutation methods, and determine genetic parameters such as the crossover probability $p_c$ and the mutation probability $p_m$;
Step 6.2.1: Apply the rank-based selection strategy and the elitist retention strategy:
The detailed process of the rank-based selection strategy is:
(a) sort all individuals in the population in descending order of fitness value;
(b) according to a designed allocation table, assign each individual a probability value in ascending order of fitness value;
(c) the probability that each individual is passed on to the next generation is determined by the probability value assigned in step (b); then, based on these probability values, roulette-wheel selection chooses which chromosomes are eliminated and which are copied. One round of the rank-based selection strategy yields a new population, to which the elitist retention strategy is then applied; the detailed process is:
(a) according to the fitness function values, find the best individual and the worst individual in the current population, that is, the new population obtained after the rank-based selection strategy;
(b) if the fitness of the best individual in the current population is higher than the fitness of the best individual found so far, replace the best individual found so far with the best individual of the current population;
(c) keep the best individual found so far unchanged and pass it intact into the next generation;
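The two strategies of step 6.2.1 can be sketched as follows. The sketch assumes linear rank weights (the text only mentions a designed allocation table), a score where higher is better as worded in the elitism steps (a fitness that is minimized, as in step 5, would have to be negated by the caller), and that the preserved elite replaces the worst individual. All function names are hypothetical.

```python
import random


def rank_select(population, scores, rng):
    """Rank-based roulette selection: probability depends on rank, not raw score."""
    n = len(population)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)  # (a) best first
    weights = [n - rank for rank in range(n)]                        # (b) best gets n, worst gets 1
    picked = rng.choices(order, weights=weights, k=n)                # (c) roulette wheel
    return [population[i] for i in picked]


def apply_elitism(population, scores, best_so_far):
    """Keep the best individual seen so far alive in the new population."""
    population, scores = list(population), list(scores)
    best_i = max(range(len(scores)), key=scores.__getitem__)
    worst_i = min(range(len(scores)), key=scores.__getitem__)
    if best_so_far is None or scores[best_i] > best_so_far[1]:
        best_so_far = (population[best_i], scores[best_i])           # new elite found
    else:
        population[worst_i], scores[worst_i] = best_so_far           # assumed: elite replaces worst
    return population, scores, best_so_far


rng = random.Random(0)
print(rank_select(["a", "b", "c", "d"], [1.0, 5.0, 3.0, 2.0], rng))
print(apply_elitism(["a", "b", "c"], [1, 5, 3], ("z", 9)))
```

Selecting by rank rather than by raw score keeps a single very fit individual from dominating the roulette wheel, which is the usual motivation for this strategy.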
Step 6.2.2: Crossover operation and mutation operation:
An adaptive crossover probability $p_c$ and mutation probability $p_m$ are adopted and adjusted automatically according to the average fitness of the population and the fitness of the best individual in the current population. Let $f_{max}$ denote the fitness of the best individual in a given generation and $F_{avg}$ the average fitness of that generation; their difference is $\Delta = f_{max} - F_{avg}$. When $\Delta$ is small, the fitness differences among individuals are small, so the population is more likely to have reached a local optimum and premature convergence is more likely; when $\Delta$ is large, the fitness differences among individuals are large. The crossover probability $p_c$ and the mutation probability $p_m$ are determined by $\Delta$ and computed as:
$p_c = k_1 / \Delta$ (3)
$p_m = k_2 / \Delta$ (4)
where $k_1$ and $k_2$ are the crossover probability adjustment coefficient and the mutation probability adjustment coefficient, respectively;
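Equations (3) and (4) can be sketched directly. The cap `p_max` and the handling of $\Delta = 0$ (a fully converged population) are assumptions added so that the results remain valid probabilities; the text does not specify them, and the coefficient values in the example are arbitrary.

```python
def adaptive_probabilities(fitnesses, k1, k2, p_max=1.0):
    """Compute (p_c, p_m) from the spread between best and average fitness."""
    f_max = max(fitnesses)
    f_avg = sum(fitnesses) / len(fitnesses)
    delta = f_max - f_avg
    if delta == 0:
        return p_max, p_max                 # assumption: no diversity left, use the cap
    p_c = min(p_max, k1 / delta)            # equation (3), capped at p_max
    p_m = min(p_max, k2 / delta)            # equation (4), capped at p_max
    return p_c, p_m


# f_max = 6, F_avg = 4, so delta = 2.
print(adaptive_probabilities([2.0, 4.0, 6.0], k1=1.0, k2=0.1))
```

A small $\Delta$ yields larger probabilities, which pushes a nearly converged population to explore more, while a large $\Delta$ lowers both probabilities.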
Step 6.3: Randomly initialize and generate population P;
Step 6.4: Calculate the fitness value of each individual bit string in population P after decoding;
Step 6.5: According to the genetic strategy, apply the genetic operations designed in step 6.2 to the population; selection, crossover and mutation produce a new generation of the population;
Step 6.6: Take the newly produced chromosomes as test data and return to step 5 to calculate their fitness values, then determine whether their performance meets the target or whether the predetermined number of iterations has been completed. If the target is not met and the iterations are not complete, go to step 6.1: the genetic algorithm starts again from the encoding operation and repeats selection, reproduction, crossover and mutation on the new generation, iterating continuously. If the target is met or the iterations are complete, go directly to step 7;
Step 7: End the run and obtain suitable test data.
CN201110265194.4A 2011-09-08 2011-09-08 MC/DC test data automatic generation method based on genetic algorithm Active CN102323906B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110265194.4A CN102323906B (en) 2011-09-08 2011-09-08 MC/DC test data automatic generation method based on genetic algorithm

Publications (2)

Publication Number Publication Date
CN102323906A CN102323906A (en) 2012-01-18
CN102323906B true CN102323906B (en) 2014-01-08

Family

ID=45451651

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110265194.4A Active CN102323906B (en) 2011-09-08 2011-09-08 MC/DC test data automatic generation method based on genetic algorithm

Country Status (1)

Country Link
CN (1) CN102323906B (en)

Families Citing this family (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103246597B (en) * 2012-02-07 2017-03-15 腾讯科技(深圳)有限公司 A kind of method of testing of parameter and equipment
CN102622558B (en) * 2012-03-01 2014-10-08 北京邮电大学 Excavating device and excavating method of binary system program loopholes
CN102708047B (en) * 2012-04-23 2014-12-10 福建师范大学 Data flow test case generating method
CN103294594B (en) * 2013-05-08 2016-01-06 南京大学 A kind of wrong report of the static analysis based on test removing method
CN103218297B (en) * 2013-05-15 2018-05-04 百度在线网络技术(北京)有限公司 The screening technique and device of test data
CN104142819B (en) * 2013-07-10 2016-08-24 腾讯科技(深圳)有限公司 A kind of document handling method and device
CN103593287B (en) * 2013-10-30 2016-08-17 北京信息控制研究所 A kind of data links case automatic generating method based on genetic algorithm
CN103559129B (en) * 2013-10-31 2016-08-17 中国矿业大学 Statistical regression test data generating method based on genetic algorithm
CN103729297A (en) * 2013-12-31 2014-04-16 北京理工大学 Test case generation method based on hierarchic genetic algorithm
JP6070594B2 (en) * 2014-01-31 2017-02-01 Jfeスチール株式会社 Slab knitting method and slab knitting apparatus including deformed steel plate
CN103810104B (en) * 2014-03-04 2017-08-25 中国人民解放军63863部队 A kind of software test case optimization method and system
CN105607989A (en) * 2014-11-18 2016-05-25 阿里巴巴集团控股有限公司 Sampling method and system of software test data
CN104536877B (en) * 2014-11-28 2017-09-12 江苏苏测软件检测技术有限公司 A kind of test data generating method based on mixed strategy
CN105260317B (en) * 2015-11-19 2017-10-13 上海斐讯数据通信技术有限公司 A kind of choosing method of test case
US9792204B2 (en) 2016-02-02 2017-10-17 General Electric Company System and method for coverage-based automated test case augmentation for design models
CN107103213B (en) * 2017-03-23 2018-08-31 中国航天系统科学与工程研究院 A kind of software code based on genetic algorithm obscures operation selection method
CN107229565B (en) * 2017-05-31 2020-05-01 北京京东尚科信息技术有限公司 Test method and device
CN107748721A (en) * 2017-11-27 2018-03-02 中国航空无线电电子研究所 A kind of test use cases automatic generation method
CN109901987B (en) * 2017-12-11 2022-07-05 北京京东尚科信息技术有限公司 Method and device for generating test data
CN108171413B (en) * 2017-12-26 2021-08-10 杭州电子科技大学 Chemical industry park emergency resource allocation optimization method
CN108304625B (en) * 2018-01-15 2021-10-08 北京航空航天大学 Genetic programming decision-making method for writing digital aircraft code by artificial intelligence programmer
CN108399127B (en) * 2018-02-09 2020-06-23 中国矿业大学 Class integration test sequence generation method
CN108536606B (en) * 2018-04-22 2021-01-19 北京化工大学 EFSM test method based on composite dependency coverage criterion
CN108647146B (en) * 2018-05-11 2021-06-08 北京信息科技大学 Test case generation method for judging combination coverage based on correction condition
CN108710575B (en) * 2018-05-23 2020-11-24 华南理工大学 Unit test method based on automatic generation of path coverage test case
CN108716953B (en) * 2018-06-15 2020-04-07 哈尔滨工程大学 On-site performance evaluation method for shipborne non-contact sea surface temperature measuring device
GB2577102B (en) * 2018-09-14 2021-03-03 Advanced Risc Mach Ltd Generation of code coverage information during testing of a code sequence
CN109376075B (en) * 2018-09-19 2022-04-22 奇安信科技集团股份有限公司 Processing method and device for generating optimal test coverage path of test object
CN109669436B (en) * 2018-12-06 2021-04-13 广州小鹏汽车科技有限公司 Test case generation method and device based on functional requirements of electric automobile
CN109902007B (en) * 2019-02-21 2022-04-29 南京信息工程大学 Test case generation method based on point dyeing model
CN109918304B (en) * 2019-03-06 2022-04-12 牡丹江师范学院 Rapid high-path coverage test case generation method
CN109977030B (en) * 2019-04-26 2022-04-19 北京信息科技大学 Method and device for testing deep random forest program
CN110879778B (en) * 2019-10-14 2023-09-26 杭州电子科技大学 Novel dynamic feedback and improved patch evaluation software automatic repair method
CN111144540A (en) * 2019-12-05 2020-05-12 国网山东省电力公司电力科学研究院 Generation method of anti-electricity-stealing simulation data set
CN111221741B (en) * 2020-01-17 2023-10-10 北京工业大学 Method for automatically generating abnormal unit test based on genetic algorithm and log analysis
CN112216341B (en) * 2020-09-16 2022-05-17 中国人民解放军国防科技大学 Group behavior logic optimization method and computer readable storage medium
CN112463629B (en) * 2020-12-11 2022-03-29 北京航空航天大学 Method for adjusting software configuration items of autonomous unmanned system based on genetic evolution
CN113672503A (en) * 2021-08-03 2021-11-19 中移(杭州)信息技术有限公司 Test case generation method, system, terminal device and storage medium
CN113778876A (en) * 2021-09-09 2021-12-10 南京大学 Method and device for generating program variation of source code level
CN114282261A (en) * 2021-12-10 2022-04-05 天津大学 Fine-grained privacy policy and mobile application behavior consistency checking method
US11803462B1 (en) 2022-04-27 2023-10-31 Agora Lab, Inc. System and method for automatically generating test cases for testing SDKS
CN115617700B (en) * 2022-12-19 2023-04-07 华东交通大学 Test case design and generation method and system based on relational analysis
CN116383070B (en) * 2023-04-07 2023-12-05 南京航空航天大学 Symbol execution method for high MC/DC
CN116578498B (en) * 2023-07-12 2023-09-29 西南交通大学 Automatic generation method and system for unit test cases
CN116775499A (en) * 2023-08-21 2023-09-19 中国电信股份有限公司 Test data generation method, device, computer equipment and storage medium
CN117349837A (en) * 2023-09-28 2024-01-05 广西卓梵智能科技有限公司 IAST-based quick positioning detection method and system for stain data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002099890A (en) * 2000-09-26 2002-04-05 Zexel Valeo Climate Control Corp Automatic forming method of program and automatic forming device of program
WO2009017231A2 (en) * 2007-08-02 2009-02-05 Nec Corporation Pattern examination system, pattern examination device, method, and pattern examination program
CN101710305A (en) * 2009-12-14 2010-05-19 中国科学院计算技术研究所 Method and system for realizing white box testing of computer software
CN102073589A (en) * 2010-12-29 2011-05-25 北京邮电大学 Code static analysis-based data race detecting method and system thereof

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
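"Research on Genetic Algorithms and Their Application in Software Test Data Generation" (《遗传算法及其在软件测试数据生成中的应用研究》); Wang Hao et al.; Computer Engineering and Applications (《计算机工程与应用》); 20011231 (No. 12); 64-68 *
"Application of Genetic Algorithms in Software Test Data Generation" (《遗传算法在软件测试数据生成中的应用》); Jia Wei et al.; Journal of Beijing University of Aeronautics and Astronautics (《北京航空航天大学学报》); 19980831; Vol. 24 (No. 4); 68-71 *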

Also Published As

Publication number Publication date
CN102323906A (en) 2012-01-18

Similar Documents

Publication Publication Date Title
CN102323906B (en) MC/DC test data automatic generation method based on genetic algorithm
Cremer et al. From optimization-based machine learning to interpretable security rules for operation
Chen Real coded genetic algorithm optimization of long term reservoir operation 1
CN111148118B (en) Flow prediction and carrier wave turn-off method and system based on time sequence
CN101093559B (en) Method for constructing expert system based on knowledge discovery
CN105260786B (en) A kind of simulation credibility of electric propulsion system assessment models comprehensive optimization method
CN102331966A (en) Software test data evolution generation system facing path
CN114969953B (en) Optimized shield underpass tunnel design method and equipment based on Catboost-NSGA-III
CN108564205A (en) A kind of load model and parameter identification optimization method based on measured data
Ning et al. GA-BP air quality evaluation method based on fuzzy theory.
CN110751176A (en) Lake water quality prediction method based on decision tree algorithm
CN112560327B (en) Bearing residual life prediction method based on depth gradient descent forest
Hernández-Lobato et al. Designing neural network hardware accelerators with decoupled objective evaluations
CN113505477A (en) Process industry soft measurement data supplementing method based on SVAE-WGAN
CN111008790A (en) Hydropower station group power generation electric scheduling rule extraction method
CN109214500A (en) A kind of transformer fault recognition methods based on integrated intelligent algorithm
Jiang et al. Parameters calibration of traffic simulation model based on data mining
CN113988083A (en) Factual information coding and evaluating method for shipping news abstract generation
Rao et al. Optimization of machinery noise in a bauxite mine using Genetic Algorithm
CN105022798A (en) Categorical data mining method of discrete Bayesian network on the basis of prediction relationship
YANG Benefits of a metamodel for automatic calibration of 1D and 2D fluvial models
Phong et al. A fuzzy rule-based classification system using Hedge Algebraic Type-2 Fuzzy Sets
Rychtyckyj et al. Assessing the performance of cultural algorithms for semantic network re-engineering
Rychtyckyj et al. Using cultural algorithms to improve knowledge base maintainability
Rueda et al. Preliminary evaluation of symbolic regression methods for energy consumption modelling

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant