CN114968824B - Testing method and system based on chain multi-path coverage - Google Patents
Testing method and system based on chain multi-path coverage
- Publication number
- CN114968824B (application CN202210894579.5A)
- Authority
- CN
- China
- Prior art keywords
- path
- model
- sub
- submodel
- support vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3676—Test management for coverage analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3688—Test management for test execution, e.g. scheduling of test suites
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Quality & Reliability (AREA)
- Computer Hardware Design (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a testing method and a testing system based on chain multi-path coverage. A support vector machine-extreme gradient boosting (SVM-XGBoost) chain model for path prediction is constructed to replace instrumentation when simulating the coverage path of test data, which reduces instrumentation time; at the same time, as many similar target paths as possible are screened out, which improves test case utilization and path coverage efficiency. Compared with other models, the SVM-XGBoost chain model provided by the invention has clear advantages in accuracy and time. In addition, the chain model yields more similar target paths, so that subsequent test case generation can cover as many paths as possible and path coverage efficiency is improved.
Description
Technical Field
The invention relates to the technical field of computer genetic algorithms, and in particular to a testing method and system based on chain multi-path coverage.
Background
At present, testers spend a great deal of time manually generating data that meets a test target, and there are often multiple target paths to be tested. Automatically generating test data that satisfies the conditions, and attempting to cover multiple target paths with existing data, therefore improves test case generation efficiency and avoids much repetitive work.
In multi-path coverage test generation, mining the correlation between coverage paths and test cases and analyzing the similarity between paths help improve test case quality. Meanwhile, the genetic algorithm (GA), with mechanisms such as biological evolution, genetic variation and global probabilistic search, can generate rich and diverse test data and is widely applied to automatic test data generation. In addition, as machine learning methods mature, many researchers have combined machine learning models with testing theory. Among the many machine learning models, SVM (Support Vector Machine) and XGBoost (eXtreme Gradient Boosting) are widely used because they work with small samples, are fast, and are accurate, and each has its own strengths for different data types.
In real test scenarios, the multiple target paths contained in a test target are related to some extent. Each path node of each path can be in one of two states (passed or not passed), so predicting a node state can be treated as a binary classification problem. Because the input data types of programs under test differ, path prediction with a single model has limitations. The SVM model classifies numerical samples well and suits small-sample test data; the XGBoost model scales well and classifies non-numerical samples better.
However, in the prior art, an effective method capable of simultaneously combining the SVM model and the XGBoost model to realize the chain multi-path coverage test is still lacking, and the actual application requirements cannot be well met.
Disclosure of Invention
In view of the above situation, the main objective of the present invention is to provide a testing method based on chain multi-path coverage to solve the above technical problems.
The embodiment of the invention provides a testing method based on chain multi-path coverage, wherein the method comprises the following steps:
step one, constructing a support vector machine-extreme gradient boosting chain model:
inputting randomly generated test data into an instrumented program to obtain a test path, and calculating the corresponding path level depth from the number of path nodes of the test path and the path node states;
selecting a corresponding pre-training model according to the type of the test data, the pre-training model comprising a support vector machine model and an extreme gradient boosting model;
training the submodel corresponding to each path node in the test path according to the selected pre-training model, and calculating the corresponding submodel accuracy, the submodels comprising a support vector machine submodel and an extreme gradient boosting submodel;
when the submodel accuracy reaches a preset optimal submodel threshold, storing the corresponding submodel and incrementing the number of submodels by 1, and stopping the construction of submodels for the path nodes once the number of submodels equals the number of path nodes, so as to obtain an optimal support vector machine submodel and an optimal extreme gradient boosting submodel;
linking the obtained optimal support vector machine submodel and optimal extreme gradient boosting submodel in the order of the path nodes of the test path to obtain the support vector machine-extreme gradient boosting chain model;
step two, using the constructed support vector machine-extreme gradient boosting chain model to generate test cases through a genetic algorithm:
initializing the genetic parameters of the genetic algorithm, and converting the genetic parameters into decimal form in order to obtain the corresponding coverage paths;
inputting test data into the support vector machine-extreme gradient boosting chain model to obtain a corresponding predicted path, searching the target paths for paths similar to the predicted path, inputting the current test data into the instrumented program once a plurality of similar paths has been found so as to obtain the exact path, and calculating a fitness value from the exact path according to a fitness function;
if the calculated fitness value is 1, determining that the target path is covered, deleting the target path, and storing the current test data;
and when all the target paths are determined to be covered, completing the test and outputting the corresponding new test data and target path coverage information.
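The screening idea in step two (predict first, instrument only when the prediction looks promising) can be sketched as follows. This is a minimal illustration and not the patented implementation: the function names `screen_and_verify`, `predict_path`, `run_instrumented`, `is_similar` and `fitness` are hypothetical, and the similarity test and fitness function are passed in as parameters because their exact formulas are not reproduced here.

```python
from typing import Callable, List, Sequence, Tuple

Path = Tuple[int, ...]  # one 0/1 state per path node


def screen_and_verify(
    data: Sequence[float],
    predict_path: Callable[[Sequence[float]], Path],      # chain model (cheap)
    run_instrumented: Callable[[Sequence[float]], Path],  # instrumented program (costly)
    targets: List[Path],
    is_similar: Callable[[Path, Path], bool],
    fitness: Callable[[Path, Path], float],
) -> List[Path]:
    """Return the target paths covered by `data` and remove them from `targets`."""
    predicted = predict_path(data)
    similar = [t for t in targets if is_similar(predicted, t)]
    if not similar:
        return []  # no promising target: skip the instrumented run entirely
    exact = run_instrumented(data)
    # A fitness value of 1 means the target path is covered.
    covered = [t for t in similar if fitness(exact, t) == 1.0]
    for t in covered:
        targets.remove(t)
    return covered
```

The instrumented program is only executed for individuals whose predicted path resembles at least one remaining target, which is where the time saving described above would come from.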
The testing method based on chain multi-path coverage provided by the invention has the following technical advantages:
(1) A support vector machine-extreme gradient boosting chain model for path prediction is constructed to replace instrumentation when simulating the coverage path of test data, which reduces instrumentation time; at the same time, as many similar target paths as possible are screened out, which improves test case utilization and path coverage efficiency. Compared with other models, the support vector machine-extreme gradient boosting chain model provided by the invention has clear advantages in accuracy and time. In addition, the chain model yields more similar target paths, so that subsequent test case generation can cover as many paths as possible and path coverage efficiency is improved;
(2) In the genetic evolution process, the constructed support vector machine-extreme gradient boosting chain model predicts the coverage path corresponding to each test case; only individuals that meet the requirements are verified by instrumentation, after which their fitness is calculated. So that the individuals of the population evolve towards the target paths as quickly as possible, excellent test cases from the original samples are introduced during crossover and mutation. This simplifies the cumbersome instrumentation process, reduces time consumption, guides the population to evolve faster towards the target paths, and improves the efficiency of generating test cases that cover the target paths;
(3) An update rule is designed for the support vector machine-extreme gradient boosting chain model. The quality of the original samples strongly influences the accuracy of the initial path prediction model, and for some test subjects the initial model alone does not achieve a good test result. Considering the overall efficiency of the test, when the number of generations and the number of excellent individuals reach certain values, part of the original samples is replaced with the latest samples and the chain model is retrained to obtain a more accurate model, further improving the efficiency of subsequent test case generation.
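The update rule in advantage (3) can be sketched as below. The thresholds `min_generation` and `min_latest`, the choice to replace at most half of the original samples, and the function name `refresh_samples` are all assumptions made for illustration; the text only states that part of the original samples is replaced once the generation count and the number of excellent individuals are large enough.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def refresh_samples(
    original: Sequence[T],
    latest: Sequence[T],
    generation: int,
    min_generation: int = 100,  # hypothetical threshold
    min_latest: int = 20,       # hypothetical threshold
) -> Tuple[List[T], bool]:
    """Replace the oldest original samples with the newest ones.

    Returns the sample list to train on and a flag telling the caller
    whether the chain model should be retrained.
    """
    if generation < min_generation or len(latest) < min_latest:
        return list(original), False
    # Assumption: replace at most half of the original samples.
    n_replace = min(len(latest), len(original) // 2)
    kept = list(original)[n_replace:]
    newest = list(latest)[len(latest) - n_replace:]
    return kept + newest, True
```

When the flag is true, the caller would rebuild the per-node submodels on the returned samples and relink the chain.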
In the testing method based on chain multi-path coverage, in step one, the input data format of the training samples of each submodel during training is represented as follows:
where the k sample patterns correspond to the k different submodels; each sample of a given type contains q data items; the path node states corresponding to the submodels are obtained in sequence after the test data is input into the instrumented program; and the number of submodel sample patterns equals the number of path node states;
the prediction results output in turn by all submodels form a complete predicted path;
each element of the predicted path is the state of the corresponding path node obtained after the test data is input into the instrumented program, and the node index is an integer.
In the testing method based on chain multi-path coverage, in step one, the support vector machine submodel uses a hinge loss function, specifically expressed as:
$$\min_{w,b,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\xi_i \qquad \text{s.t.}\quad y_i\left(w^{T}x_i + b\right) \ge 1-\xi_i,\quad \xi_i \ge 0,\quad i = 1,2,\dots,N$$
where $w$ is the coefficient vector of the separating hyperplane, $N$ is the number of samples, $C$ is the penalty parameter with $C > 0$, $\xi_i$ is the slack variable with $\xi_i \ge 0$, $b$ is the constant parameter of the separating hyperplane, $x_i$ is the $i$-th sample, $y_i$ is the predicted value of the $i$-th sample, $T$ denotes transposition, s.t. denotes "subject to", and $i$ and $j$ are sample indices.
In the testing method based on chain multi-path coverage, the hinge loss function is an objective function with constraints; it is converted by the Lagrange multiplier method into an unconstrained objective function, expressed as:
$$\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j K(x_i,x_j) - \sum_{i=1}^{N}\alpha_i \qquad \text{s.t.}\quad \sum_{i=1}^{N}\alpha_i y_i = 0,\quad 0 \le \alpha_i \le C$$
where $\alpha_i$ is the Lagrange multiplier of the $i$-th sample in the objective function, $\alpha_j$ is the Lagrange multiplier of the $j$-th sample in the objective function, $x_i$ is the $i$-th sample, $y_j$ is the predicted value of the $j$-th sample, and $K(\cdot,\cdot)$ is a Gaussian kernel function;
the optimal solution derived from the unconstrained objective function is represented as:
$$\alpha^{*} = \left(\alpha_1^{*}, \alpha_2^{*}, \dots, \alpha_N^{*}\right)^{T}$$
where $\alpha^{*}$ is the general representation of the optimal solution of the unconstrained objective function, and $\alpha_i^{*}$ is the optimal solution corresponding to the $i$-th sample.
In the testing method based on chain multi-path coverage, in step one, the extreme gradient boosting submodel is an additive model composed of $t$ base models, and its objective function is expressed as follows:
$$Obj^{(t)} = \sum_{i=1}^{N} l\left(y_i, \hat{y}_i^{(t)}\right) + \sum_{m=1}^{t}\Omega(f_m)$$
where $Obj^{(t)}$ is the objective function corresponding to the $t$-th base model, $l\left(y_i, \hat{y}_i^{(t)}\right)$ is the loss function between the true value $y_i$ and the predicted value $\hat{y}_i^{(t)}$, and $\sum_{m=1}^{t}\Omega(f_m)$ is the sum of the complexities of all $t$ base models.
In the testing method based on chain multi-path coverage, in step one, the path level depth is expressed as:
where the formula involves the path level depth, the number of nodes of a complete path, the current path node state, a constant, and the number of path node states;
the submodel accuracy is expressed as:
where the formula defines the submodel accuracy in terms of the samples predicted as positive and the samples predicted as negative.
In the testing method based on chain multi-path coverage, the obtained optimal support vector machine submodel and optimal extreme gradient boosting submodel are linked in the order of the path nodes of the test path to obtain the support vector machine-extreme gradient boosting chain model, and the corresponding formula is represented as follows:
where the formula involves: the submodel corresponding to a given path node; the submodel with the maximum model accuracy; the accuracy of the support vector machine model at that path node; the accuracy of the extreme gradient boosting model at that path node; the chain model used for the final path prediction; the chained form of a complete path prediction model; and the chained combination, in order of appearance of the path nodes, of all submodels corresponding to the 1st through the (k-1)-th path nodes.
In the testing method based on chain multi-path coverage, in step two, the formula for path similarity is represented as follows:
where the formula involves the path similarity value, the number of nodes at which the coverage path of the test data has the same state as the target path, and the number of nodes of the target path.
In the testing method based on chain multi-path coverage, the fitness function is expressed by the following formula:
where the formula involves the fitness function value and a weight coefficient.
The invention also provides a testing system based on chain multi-path coverage, wherein the system comprises:
a construction module configured for:
inputting randomly generated test data into an instrumented program to obtain a test path, and calculating the corresponding path level depth from the number of path nodes of the test path and the path node states;
selecting a corresponding pre-training model according to the type of the test data, the pre-training model comprising a support vector machine model and an extreme gradient boosting model;
training the submodel corresponding to each path node in the test path according to the selected pre-training model, and calculating the corresponding submodel accuracy, the submodels comprising a support vector machine submodel and an extreme gradient boosting submodel;
when the submodel accuracy reaches a preset optimal submodel threshold, storing the corresponding submodel and incrementing the number of submodels by 1, and stopping the construction of submodels for the path nodes once the number of submodels equals the number of path nodes, so as to obtain an optimal support vector machine submodel and an optimal extreme gradient boosting submodel;
linking the obtained optimal support vector machine submodel and optimal extreme gradient boosting submodel in the order of the path nodes of the test path to obtain the support vector machine-extreme gradient boosting chain model;
a genetic algorithm module configured for:
initializing the genetic parameters of the genetic algorithm, and converting the genetic parameters into decimal form in order to obtain the corresponding coverage paths;
inputting test data into the support vector machine-extreme gradient boosting chain model to obtain a corresponding predicted path, searching the target paths for paths similar to the predicted path, inputting the current test data into the instrumented program once a plurality of similar paths has been found so as to obtain the exact path, and calculating a fitness value from the exact path according to a fitness function;
if the calculated fitness value is 1, determining that the target path is covered, deleting the target path, and storing the current test data;
and when all the target paths are determined to be covered, completing the test and outputting the corresponding new test data and target path coverage information.
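The genetic-algorithm module's initialization and decimal conversion can be illustrated with a small sketch. The four parameter values are the ones the embodiment states later in the description (initial population 40, crossover probability 0.9, mutation probability 0.1, at most 1000 generations). The binary chromosome encoding and the `decode` helper are assumptions, since the text only says that the genetic parameters are converted into decimal form.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GeneticParams:
    """Genetic parameters as given in the embodiment."""
    population_size: int = 40
    crossover_prob: float = 0.9
    mutation_prob: float = 0.1
    max_generations: int = 1000


def decode(bits: Sequence[int], low: float, high: float) -> float:
    """Map a binary chromosome onto a decimal value in [low, high].

    Assumption: individuals are fixed-length bit strings; the source
    does not specify the encoding.
    """
    value = int("".join(str(b) for b in bits), 2)
    return low + (high - low) * value / (2 ** len(bits) - 1)
```

For example, `decode([1, 0, 0, 0], 0.0, 15.0)` maps the 4-bit chromosome 1000 to 8.0, which could then be fed to the chain model as one input value.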
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a flow chart of the testing method based on chain multi-path coverage according to the present invention;
FIG. 2 is a schematic diagram of the support vector machine-extreme gradient boosting chain model;
FIG. 3 is a schematic diagram of test case generation by a genetic algorithm using the support vector machine-extreme gradient boosting chain model;
FIG. 4 is a structural diagram of the testing system based on chain multi-path coverage according to the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
These and other aspects of embodiments of the invention will be apparent with reference to the following description and attached drawings. In the description and drawings, particular embodiments of the invention have been disclosed in detail as being indicative of some of the ways in which the principles of the embodiments of the invention may be practiced, but it is understood that the scope of the embodiments of the invention is not limited correspondingly. On the contrary, the embodiments of the invention include all changes, modifications and equivalents coming within the spirit and terms of the claims appended hereto.
To make it easier to describe the construction of the support vector machine-extreme gradient boosting chain model of the present invention, the design of the sample pattern and the mathematics of the SVM (support vector machine) model and the XGBoost (extreme gradient boosting) model involved are introduced first.
(I) Sample pattern:
The C-SVMXGBoost model (support vector machine-extreme gradient boosting chain model) is used to simulate the coverage path of test data. It is formed by chaining together a number of submodels, each of which needs a certain number of training samples consisting of input data and the expected output. The input data format of the training samples differs between submodels and is given by formula (1); the output is the state of one path node corresponding to the test data.
For each submodel, the input data format of the training samples during training is represented as:
where the k sample patterns correspond to the k different submodels; each sample of a given type contains q data items; the path node states corresponding to the submodels are obtained in sequence after the test data is input into the instrumented program; and the number of submodel sample patterns equals the number of path node states.
In addition, the prediction results output in turn by all submodels form a complete predicted path; each element is the state of the corresponding path node obtained after the test data is input into the instrumented program, and the node index is an integer.
In practical applications, a certain amount of test data is first generated at random; the test data is then input into the instrumented program under test to obtain the corresponding coverage paths, and the test data together with their paths form the model sample set.
From this sample set, the samples of the first submodel, the second submodel, and so on up to the k-th submodel are formed according to their respective sample patterns.
In addition, so that the submodel samples are representative, the randomly generated data is screened so that the samples are distributed as uniformly as possible over the input domain of the program under test, and the sample size should be neither too large nor too small. If the sample size is too small, it cannot reflect the accuracy of the model; if it is too large, more computing resources and model-building time are consumed and testing efficiency drops.
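How per-submodel training sets might be derived from the instrumented runs is sketched below. The layout used here, in which submodel i receives the q test inputs followed by the states of the preceding i-1 nodes and predicts the state of node i, is an assumption in the style of a classifier chain; formula (1) itself is not reproduced in this text, and the name `build_submodel_samples` is hypothetical.

```python
from typing import List, Sequence, Tuple

Path = Tuple[int, ...]


def build_submodel_samples(
    test_data: Sequence[Sequence[float]],
    paths: Sequence[Path],
) -> List[Tuple[List[List[float]], List[int]]]:
    """Return one (inputs, labels) pair per path node.

    Assumed layout: inputs of submodel i are the test inputs extended
    with the states of the earlier nodes; labels are the states of node i.
    """
    k = len(paths[0])  # number of path nodes = number of submodels
    samples = []
    for i in range(k):
        inputs = [list(x) + list(p[:i]) for x, p in zip(test_data, paths)]
        labels = [p[i] for p in paths]
        samples.append((inputs, labels))
    return samples
```

With two test inputs `(1, 2)` and `(3, 4)` whose instrumented paths are `(1, 0)` and `(0, 1)`, the first submodel is trained on the raw inputs, while the second also sees the first node's state as an extra feature.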
(II) Construction of the support vector machine model (SVM model):
In actual software testing, with test cases as samples, linear separability almost never holds, so a hinge loss function is introduced here and the SVM is solved as an optimization problem.
Specifically, the support vector machine submodel uses a hinge loss function, expressed as:
$$\min_{w,b,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\xi_i \qquad \text{s.t.}\quad y_i\left(w^{T}x_i + b\right) \ge 1-\xi_i,\quad \xi_i \ge 0,\quad i = 1,2,\dots,N \tag{2}$$
where $w$ is the coefficient vector of the separating hyperplane, $N$ is the number of samples, $C$ is the penalty parameter with $C > 0$, $\xi_i$ is the slack variable with $\xi_i \ge 0$, $b$ is the constant parameter of the separating hyperplane, $x_i$ is the $i$-th sample, $y_i$ is the predicted value of the $i$-th sample, $T$ denotes transposition, s.t. denotes "subject to", and $i$ and $j$ are sample indices.
Further, the above hinge loss function is an objective function with constraints; it is converted by the Lagrange multiplier method into an unconstrained objective function, expressed as:
$$\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j K(x_i,x_j) - \sum_{i=1}^{N}\alpha_i \qquad \text{s.t.}\quad \sum_{i=1}^{N}\alpha_i y_i = 0,\quad 0 \le \alpha_i \le C \tag{3}$$
where $\alpha_i$ is the Lagrange multiplier of the $i$-th sample in the objective function, $\alpha_j$ is the Lagrange multiplier of the $j$-th sample in the objective function, $x_i$ is the $i$-th sample, $y_j$ is the predicted value of the $j$-th sample, and $K(\cdot,\cdot)$ is a Gaussian kernel function.
The optimal solution derived from the unconstrained objective function is represented as:
$$\alpha^{*} = \left(\alpha_1^{*}, \alpha_2^{*}, \dots, \alpha_N^{*}\right)^{T}$$
where $\alpha^{*}$ is the general representation of the optimal solution of the unconstrained objective function, and $\alpha_i^{*}$ is the optimal solution corresponding to the $i$-th sample.
Furthermore, substituting the optimal solution $\alpha^{*}$ into formula (4) and formula (5) gives the separating hyperplane to be solved:
$$w^{*} = \sum_{i=1}^{N}\alpha_i^{*} y_i x_i \tag{4}$$
$$b^{*} = y_j - \sum_{i=1}^{N}\alpha_i^{*} y_i K(x_i, x_j) \tag{5}$$
where $w^{*}$ is the coefficient vector of the final separating hyperplane, $b^{*}$ is the constant parameter of the final separating hyperplane, and $\alpha_j^{*}$ is one component of $\alpha^{*}$.
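As a worked illustration of the soft-margin objective with hinge loss described in this section, the sketch below evaluates it for a linear hyperplane in plain Python. It uses the equivalent unconstrained form, in which each slack variable is replaced by max(0, 1 - y(w·x + b)). The function name is hypothetical, the linear (non-kernel) form is a simplification, and labels are assumed to be coded as -1/+1, so path node states 0/1 would need to be mapped accordingly.

```python
from typing import Sequence


def soft_margin_objective(
    w: Sequence[float],
    b: float,
    X: Sequence[Sequence[float]],
    y: Sequence[int],  # labels in {-1, +1}
    C: float = 1.0,
) -> float:
    """0.5 * ||w||^2 + C * sum of hinge losses over all samples."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    hinge_total = 0.0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        hinge_total += max(0.0, 1.0 - yi * score)  # slack of this sample
    return margin_term + C * hinge_total
```

Samples that lie on the correct side of the margin contribute no loss; only samples inside the margin or misclassified add to the penalty term weighted by C.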
(III) Construction of the extreme gradient boosting model (XGBoost model):
The extreme gradient boosting submodel is an additive model composed of $t$ base models, and its objective function is expressed as:
$$Obj^{(t)} = \sum_{i=1}^{N} l\left(y_i, \hat{y}_i^{(t)}\right) + \sum_{m=1}^{t}\Omega(f_m) \tag{6}$$
where $Obj^{(t)}$ is the objective function corresponding to the $t$-th base model, $l\left(y_i, \hat{y}_i^{(t)}\right)$ is the loss function between the true value $y_i$ and the predicted value $\hat{y}_i^{(t)}$, and $\sum_{m=1}^{t}\Omega(f_m)$ is the sum of the complexities of all $t$ base models.
Taking the $t$-th base model as an example, the predicted value for the $i$-th sample $x_i$ can be expressed as shown in formula (7):
$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i) \tag{7}$$
where $\hat{y}_i^{(t)}$ is the predicted value of the step-$t$ model, $\hat{y}_i^{(t-1)}$ is the predicted value of the step-$(t-1)$ model, and $f_t(x_i)$ is the predicted value of the new model that needs to be added.
Expanding the loss function to second order around $\hat{y}_i^{(t-1)}$ gives formula (8):
$$Obj^{(t)} \approx \sum_{i=1}^{N}\left[ l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^{2}(x_i) \right] + \sum_{m=1}^{t}\Omega(f_m) \tag{8}$$
where $l\left(y_i, \hat{y}_i^{(t-1)}\right)$ is the loss function between the true value $y_i$ and the step-$(t-1)$ model prediction $\hat{y}_i^{(t-1)}$, $g_i$ is the first derivative of the loss function, and $h_i$ is the second derivative of the loss function, both taken with respect to $\hat{y}_i^{(t-1)}$;
substituting this into the objective function and dropping the terms that are constant at step $t$, the objective function simplifies to formula (9):
$$Obj^{(t)} \approx \sum_{i=1}^{N}\left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^{2}(x_i) \right] + \Omega(f_t) \tag{9}$$
where $\Omega(f_t)$ represents the complexity of the model at step $t$.
In this objective function, the function $f(x)$ of each step is obtained simply by computing the first and second derivatives of the loss function at that step and then optimizing the objective function; the complete model then follows from the additive model.
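To make the role of the first and second derivatives concrete, the sketch below computes them for the logistic loss and then the optimal weight of a single tree leaf. Two things here are assumptions not stated in the text: the choice of logistic loss for the binary node state, and the L2 regulariser (strength `lam`) that leads to the closed-form leaf weight -G / (H + lam) used in the standard XGBoost derivation. Function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def logistic_grad_hess(y_true: int, raw_pred: float) -> Tuple[float, float]:
    """First and second derivative of the logistic loss w.r.t. the raw score."""
    p = 1.0 / (1.0 + math.exp(-raw_pred))
    return p - y_true, p * (1.0 - p)


def optimal_leaf_weight(
    grads: Sequence[float], hess: Sequence[float], lam: float = 1.0
) -> float:
    """Minimiser of sum(g*w + 0.5*h*w^2) + 0.5*lam*w^2 for one leaf."""
    return -sum(grads) / (sum(hess) + lam)
```

For two positive samples whose previous raw score is 0, each has gradient -0.5 and second derivative 0.25, so with `lam = 1.0` the new leaf pushes their score up by 1.0 / 1.5.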
(IV) Construction of the chain model (support vector machine-extreme gradient boosting chain model):
Referring to FIG. 1 to FIG. 3, the present invention provides a testing method based on chain multi-path coverage, wherein the method comprises the following steps:
S101, constructing a support vector machine-extreme gradient boosting chain model.
This specifically comprises the following steps:
S1011, inputting randomly generated test data into the instrumented program to obtain a test path, and calculating the corresponding path level depth from the number of path nodes of the test path and the path node states.
The path level depth is represented as:
where the formula involves the path level depth, the number of nodes of a complete path, the current path node state (1 when the node is traversed, 0 when it is not), a constant, and the number of path node states. The constant guarantees that the path level depth of a node is not 0 when the node state is 0. In the present embodiment, the constant is set to 0.01.
S1012, selecting a corresponding pre-training model according to the type of the test data.
The pre-training model comprises a support vector machine model and an extreme gradient boosting model. The genetic parameters are: initial population size 40, crossover probability 0.9, mutation probability 0.1, and maximum number of generations 1000.
S1013, training the submodel corresponding to each path node in the test path according to the selected pre-training model, and calculating the corresponding submodel accuracy.
The submodels comprise a support vector machine submodel and an extreme gradient boosting submodel.
The submodel accuracy is expressed as:
where the formula defines the submodel accuracy in terms of the samples predicted as positive and the samples predicted as negative.
S1014, when the submodel accuracy reaches the preset optimal submodel threshold, storing the corresponding submodel and incrementing the number of submodels by 1, and stopping the construction of submodels for the path nodes once the number of submodels equals the number of path nodes, so as to obtain the optimal support vector machine submodel and the optimal extreme gradient boosting submodel.
S1015, linking the obtained optimal support vector machine submodel and optimal extreme gradient boosting submodel in the order of the path nodes of the test path to obtain the support vector machine-extreme gradient boosting chain model.
The corresponding formula is expressed as follows:
where the formula involves: the submodel corresponding to a given path node; the submodel with the maximum model accuracy; the accuracy of the support vector machine model at that path node; the accuracy of the extreme gradient boosting model at that path node; the chain model used for the final path prediction; the chained form of a complete path prediction model; and the chained combination, in order of appearance of the path nodes, of all submodels corresponding to the 1st through the (k-1)-th path nodes.
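Steps S1013 to S1015 can be sketched as follows: score each candidate submodel at a node, keep the more accurate one, and link the winners in node order. Several points are assumptions: accuracy is taken as the fraction of correctly predicted samples (the usual definition, since the formula is not reproduced here), each submodel is modelled as a plain callable returning 0 or 1, and each later submodel receives the earlier predicted states as extra features. In practice the candidates would be trained SVM and XGBoost classifiers.

```python
from typing import Callable, List, Sequence, Tuple

Submodel = Callable[[Sequence[float]], int]  # features -> node state (0 or 1)


def accuracy(model: Submodel, X: Sequence[Sequence[float]], y: Sequence[int]) -> float:
    """Fraction of samples whose node state is predicted correctly (assumed definition)."""
    return sum(1 for xi, yi in zip(X, y) if model(xi) == yi) / len(y)


def build_chain(
    candidates_per_node: Sequence[Sequence[Submodel]],
    validation_sets: Sequence[Tuple[Sequence[Sequence[float]], Sequence[int]]],
) -> List[Submodel]:
    """Keep, for every path node, the candidate with the highest accuracy."""
    chain = []
    for candidates, (X, y) in zip(candidates_per_node, validation_sets):
        chain.append(max(candidates, key=lambda m: accuracy(m, X, y)))
    return chain


def predict_path(chain: Sequence[Submodel], data: Sequence[float]) -> Tuple[int, ...]:
    """Run the submodels in node order to obtain a complete predicted path."""
    features = list(data)
    path = []
    for model in chain:
        state = model(features)
        path.append(state)
        features.append(state)  # assumed: later submodels see earlier states
    return tuple(path)
```

The preset accuracy threshold of step S1014 is omitted for brevity; a fuller sketch would retrain a node's candidates until the best one reaches that threshold before moving on.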
(V) Genetic algorithm test generation fused with the C-SVMXGBoost chain model (support vector machine-extreme gradient boosting chain model):
Designing a fitness function based on path coverage:
The fitness function is a key factor in screening excellent individuals and improving testing efficiency; designing it for the requirements at hand is an essential step of a genetic algorithm. By constructing a chain model for path prediction, more similar target paths are found, so that during evolution each piece of test data is used as fully as possible to cover more target paths.
Generating test data for easy-to-cover target paths is of little value; test data that can pass through the nodes of hard-to-cover paths is regarded as a better individual of the population. The path level depth is related to the order and state of the path nodes during instrumentation and is one of the factors that measure how difficult a path is to cover. In summary, the fitness function must jointly consider the similarity between the coverage path of the test data and the target path, and the path level depth.
The formula for path similarity is expressed as:
where the symbols of the formula denote, in order: the path similarity value; the number of nodes at which the path covered by the test data has the same state as the target path; and the number of nodes of the target path.
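The formula itself is not reproduced in this text. A minimal sketch consistent with the quantities defined above, assuming the similarity is the ratio of matching node states to the number of target-path nodes and that paths are encoded as position-aligned sequences of node states (both are assumptions), is:

```python
from typing import Sequence


def path_similarity(covered: Sequence[int], target: Sequence[int]) -> float:
    """Assumed form: matching node states divided by the number of target nodes."""
    if not target:
        raise ValueError("target path must contain at least one node")
    # Compare node states position by position (an assumption of this sketch).
    same = sum(1 for c, t in zip(covered, target) if c == t)
    return same / len(target)


print(path_similarity([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
print(path_similarity([1, 0, 0, 1], [1, 0, 0, 1]))  # 1.0
```

Under this assumed form, a value of 1 means that every node state of the target path is matched, which agrees with the statement that a similarity of 1 indicates the target path is covered.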
The fitness function is formulated as:
where the symbols of the formula denote, in order, the value of the fitness function and the weight coefficient.
A weight coefficient is set in order to balance the combined influence of the path similarity and the path level depth on individual fitness. Because the path similarity is the main factor in judging whether the target path is covered, it is given the larger weight. When the similarity of the current path is 1, the current test case covers the target path; the weight coefficient is then set to 1, so that the individual fitness value is affected only by the path similarity, i.e. the fitness value is 1. When the similarity of the current path is not 1, the current test case does not cover the target path, and the individual fitness value is affected by both the path similarity and the path level depth. Evolution therefore continues to produce superior individuals based on the fitness values.
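The behaviour described above can be sketched as follows. The fitness formula is not reproduced in this text, so the weighted sum, the default weight of 0.7, and the normalisation of the path level depth to the range [0, 1] are all assumptions of this sketch, not the patented formula.

```python
def fitness(similarity: float, depth: float, weight: float = 0.7) -> float:
    """Assumed form: weight * similarity + (1 - weight) * depth.

    similarity: path similarity in [0, 1]
    depth:      path level depth, assumed here to be normalised to [0, 1]
    weight:     weight coefficient, assumed to lie in (0.5, 1] so that the
                path similarity is the dominant factor
    """
    if not (0.0 <= similarity <= 1.0 and 0.0 <= depth <= 1.0):
        raise ValueError("similarity and depth must lie in [0, 1]")
    if not 0.5 < weight <= 1.0:
        raise ValueError("weight must favour the path similarity")
    if similarity == 1.0:
        weight = 1.0  # target path covered: only the similarity counts
    return weight * similarity + (1.0 - weight) * depth


print(fitness(1.0, 0.3))              # covered target path
print(fitness(0.5, 0.4, weight=0.7))  # partially matching path
```

With these assumptions, a test case that covers the target path always scores exactly 1, and any other test case scores below 1.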
Test generation by genetic evolution:
S102, generating test cases through a genetic algorithm by using the constructed support vector machine-extreme gradient lifting chain model.
Step S102 specifically includes:
S1021, initializing the genetic parameters of the genetic algorithm and converting them into decimal form for obtaining the corresponding coverage paths.
S1022, inputting test data into the support vector machine-extreme gradient lifting chain model to obtain the corresponding predicted path; searching the target paths for paths similar to the predicted path; after a plurality of similar paths are obtained, inputting the current test data into the instrumentation program to obtain the accurate path; and calculating the fitness value from the fitness function based on the accurate path.
S1023, if the calculated fitness value is 1, determining that the target path is covered, deleting the target path, and storing the current test data.
S1024, when all the target paths are judged to be covered, completing the test and outputting the corresponding new test data and target path coverage information.
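Steps S1021 to S1024 can be sketched as a small loop. Everything here is illustrative: `toy_program` stands in for the instrumentation program, the predictor passed in stands in for the chain model, fitness is reduced to path similarity alone, and the selection, crossover and mutation settings and the similarity threshold are assumptions rather than values from the patent.

```python
import random
from itertools import product
from typing import Callable, Dict, Iterable, Sequence, Tuple

Path = Tuple[int, ...]


def decode(bits: Sequence[int]) -> int:
    """S1021: convert a binary chromosome into a decimal test input."""
    return int("".join(str(b) for b in bits), 2)


def similarity(path: Path, target: Path) -> float:
    return sum(a == b for a, b in zip(path, target)) / len(target)


def generate_tests(predict: Callable[[int], Path],
                   instrument: Callable[[int], Path],
                   targets: Iterable[Path],
                   n_bits: int = 8, pop_size: int = 20, max_gen: int = 100,
                   sim_threshold: float = 0.5, seed: int = 0) -> Dict[Path, int]:
    rng = random.Random(seed)
    remaining = set(targets)
    saved: Dict[Path, int] = {}
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(max_gen):
        scored = []
        for bits in pop:
            if not remaining:                 # S1024: every target is covered
                return saved
            x = decode(bits)
            path = predict(x)                 # S1022: cheap model prediction
            if max(similarity(path, t) for t in remaining) >= sim_threshold:
                path = instrument(x)          # verify by instrumentation
                if path in remaining:         # S1023: fitness 1, target covered
                    remaining.discard(path)
                    saved[path] = x
            fit = max((similarity(path, t) for t in remaining), default=0.0)
            scored.append((fit, bits))
        # Truncation selection, one-point crossover, bit-flip mutation (assumed).
        scored.sort(key=lambda s: s[0], reverse=True)
        parents = [b for _, b in scored[: pop_size // 2]]
        pop = []
        while len(pop) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)
            pop.append([bit ^ int(rng.random() < 0.05) for bit in a[:cut] + b[cut:]])
    return saved


def toy_program(x: int) -> Path:
    """Hypothetical instrumented program: returns the states of three branches."""
    return (int(x > 100), int(x % 2 == 0), int(x % 3 == 0))


targets = set(product((0, 1), repeat=3))
saved = generate_tests(toy_program, toy_program, targets)
print(len(saved), "of", len(targets), "target paths covered")
```

The point of the design is that the instrumented program runs only for individuals whose predicted path resembles a target path that is still uncovered; all other individuals are scored from the model prediction alone.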
The testing method based on chain multi-path coverage provided by the invention has the following technical advantages:
(1) A support vector machine-extreme gradient lifting chain model is constructed for path prediction; it simulates the path covered by the test data in place of instrumentation, thereby reducing instrumentation time. Compared with other models, the support vector machine-extreme gradient lifting chain model provided by the invention has great advantages in precision and time. In addition, the chain model screens out as many similar target paths as possible, so that more paths are covered in subsequent test case generation, improving both the utilization of test cases and the path coverage efficiency;
(2) During genetic evolution, the constructed support vector machine-extreme gradient lifting chain model predicts the coverage path corresponding to each test case; only individuals that meet the requirements are verified by instrumentation, after which their fitness is calculated. So that the individuals of the population evolve towards the target paths as quickly as possible, excellent test cases from the original samples are introduced during crossover and mutation. The cumbersome instrumentation process is thereby simplified, time consumption is reduced, and the introduced excellent individuals accelerate the evolution of the population towards the target paths, improving the efficiency of generating test cases that cover the target paths;
(3) An update criterion is designed for the support vector machine-extreme gradient lifting chain model. The quality of the original samples strongly affects the precision of the initial path prediction model, and for some programs under test the initial model can hardly achieve a good test effect. Considering the overall efficiency of the test, when the number of generations of population evolution and the number of excellent individuals both reach set values, part of the original samples are replaced with the latest samples and the chain model is retrained to obtain a model of higher precision, further improving the efficiency of subsequent test case generation.
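The update criterion in advantage (3) can be sketched as follows. The threshold parameters, the choice to discard the oldest samples, and the function names are assumptions of this sketch; the text above states only that part of the original samples is replaced with the latest samples once both counts reach set values, after which the chain model is retrained.

```python
from typing import List, Tuple

# Hypothetical sample type: (test input, path verified by instrumentation).
Sample = Tuple[int, Tuple[int, ...]]


def should_update(generation: int, n_excellent: int,
                  gen_threshold: int, excellent_threshold: int) -> bool:
    """Update only when both the generation count and the number of
    excellent individuals have reached their (assumed) thresholds."""
    return generation >= gen_threshold and n_excellent >= excellent_threshold


def refresh_samples(original: List[Sample], latest: List[Sample],
                    n_replace: int) -> List[Sample]:
    """Replace the oldest n_replace original samples with the newest ones.

    Dropping the oldest samples is an assumption; the sample size is kept
    unchanged so that the retraining cost stays comparable.
    """
    n = min(n_replace, len(original), len(latest))
    if n == 0:
        return list(original)
    return original[n:] + latest[-n:]


old = [(1, (0, 0)), (2, (0, 1)), (3, (1, 0)), (4, (1, 1))]
new = [(50, (1, 1)), (60, (0, 1))]
if should_update(generation=30, n_excellent=12,
                 gen_threshold=20, excellent_threshold=10):
    old = refresh_samples(old, new, n_replace=2)
print(old)  # [(3, (1, 0)), (4, (1, 1)), (50, (1, 1)), (60, (0, 1))]
```

Retraining itself is left to the caller, who would rebuild the sub-models of the chain from the refreshed sample set.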
Referring to FIG. 4, the present invention further provides a testing system based on chain multi-path coverage, wherein the system includes:
a construction module configured for:
inputting randomly generated test data into the instrumentation program to obtain a test path, and calculating the corresponding path level depth from the number of path nodes of the test path and the states of the path nodes;
selecting a corresponding pre-training model according to the type of the test data, the pre-training model comprising a support vector machine model and an extreme gradient lifting model;
training, according to the selected pre-training model, the sub-model corresponding to each path node in the test path and calculating the corresponding sub-model precision, the sub-models comprising support vector machine sub-models and extreme gradient lifting sub-models;
when the sub-model precision reaches the preset optimal sub-model threshold, storing the corresponding sub-model and adding 1 to the sub-model count; when the sub-model count equals the number of path nodes, stopping the construction of sub-models for the path nodes, so as to obtain the optimal support vector machine sub-models and the optimal extreme gradient lifting sub-models;
linking the obtained optimal support vector machine sub-models and optimal extreme gradient lifting sub-models in the order of the path nodes of the test path to obtain the support vector machine-extreme gradient lifting chain model;
a genetic algorithm module configured for:
initializing the genetic parameters of the genetic algorithm and converting them into decimal form for obtaining the corresponding coverage paths;
inputting test data into the support vector machine-extreme gradient lifting chain model to obtain the corresponding predicted path; searching the target paths for paths similar to the predicted path; after a plurality of similar paths are obtained, inputting the current test data into the instrumentation program to obtain the accurate path; and calculating the fitness value from the fitness function based on the accurate path;
if the calculated fitness value is 1, determining that the target path is covered, deleting the target path, and storing the current test data;
and when all the target paths are judged to be covered, completing the test and outputting the corresponding new test data and target path coverage information.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above embodiments express only several implementations of the present invention. Their description is relatively specific and detailed, but it is not to be construed as limiting the scope of the present invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these all fall within the scope of protection of the present invention. Therefore, the scope of protection of this patent shall be subject to the appended claims.
Claims (10)
1. A testing method based on chain multi-path coverage, characterized by comprising the following steps:
step one, constructing a support vector machine-extreme gradient lifting chain model:
inputting randomly generated test data into the instrumentation program to obtain a test path, and calculating the corresponding path level depth from the number of path nodes of the test path and the states of the path nodes;
selecting a corresponding pre-training model according to the type of the test data, the pre-training model comprising a support vector machine model and an extreme gradient lifting model;
training, according to the selected pre-training model, the sub-model corresponding to each path node in the test path and calculating the corresponding sub-model precision, the sub-models comprising support vector machine sub-models and extreme gradient lifting sub-models;
when the sub-model precision reaches the preset optimal sub-model threshold, storing the corresponding sub-model and adding 1 to the sub-model count; when the sub-model count equals the number of path nodes, stopping the construction of sub-models for the path nodes, so as to obtain the optimal support vector machine sub-models and the optimal extreme gradient lifting sub-models;
linking the obtained optimal support vector machine sub-models and optimal extreme gradient lifting sub-models in the order of the path nodes of the test path to obtain the support vector machine-extreme gradient lifting chain model;
step two, generating test cases through a genetic algorithm by using the constructed support vector machine-extreme gradient lifting chain model:
initializing the genetic parameters of the genetic algorithm and converting them into decimal form for obtaining the corresponding coverage paths;
inputting test data into the support vector machine-extreme gradient lifting chain model to obtain the corresponding predicted path; searching the target paths for paths similar to the predicted path; after a plurality of similar paths are obtained, inputting the current test data into the instrumentation program to obtain the accurate path; and calculating the fitness value from the fitness function based on the accurate path;
if the calculated fitness value is 1, determining that the target path is covered, deleting the target path, and storing the current test data;
and when all the target paths are judged to be covered, completing the test and outputting the corresponding new test data and target path coverage information.
2. The testing method based on chain multi-path coverage according to claim 1, wherein in step one, the input data format of the training samples of each sub-model during training is expressed as:
where the symbols of the formula denote, in order: the k sample patterns of the different sub-models; one type of sample containing q data items; and the path node states corresponding to the sub-models, obtained in sequence after the test data are input into the instrumentation program; the number of types of sample patterns of the sub-models is the same as the number of path node states;
wherein all the sub-models output their prediction results in turn to form a complete predicted path.
3. The testing method based on chain multi-path coverage according to claim 2, wherein in step one, the support vector machine sub-model has a hinge loss function, specifically expressed as:
where the symbols of the formula denote, in order: the coefficients of the separating hyperplane; the number of samples; the penalty parameter; the slack variables; the constant parameter of the separating hyperplane; a given sample; the predicted value of that sample; the transpose operation; and the constraint ("subject to"); the indices all denote sample numbers.
4. The testing method based on chain multi-path coverage according to claim 3, wherein the hinge loss function is an objective function with constraints, which is converted into an unconstrained objective function by the Lagrange multiplier method; the corresponding unconstrained objective function is expressed as:
where the symbols of the formula denote, in order: the Lagrange multipliers in the objective function for two given samples; a given sample; the predicted value of that sample; and the Gaussian kernel function;
the optimal solution derived from the unconstrained objective function is expressed as:
where the symbols of the formula denote, in order: the general form of the optimal solution obtained from the unconstrained objective function; and the optimal solution, obtained from the unconstrained objective function, corresponding to a given sample.
5. The testing method based on chain multi-path coverage according to claim 4, wherein in step one, the extreme gradient lifting sub-model is an additive model composed of a number of base models, and the objective function of the extreme gradient lifting model is expressed as:
where the symbols of the formula denote, in order: the objective function corresponding to a given base model; the loss function between the true value and the predicted value; and the sum of the complexities of all the base models.
6. The testing method based on chain multi-path coverage according to claim 5, wherein in step one, the path level depth is expressed as:
where the symbols of the formula denote, in order: the path level depth; the number of nodes of a complete path; the current state of the path node; a constant; and the number of path node states;
sub-model precision is expressed as:
7. The testing method based on chain multi-path coverage according to claim 6, wherein, in the step of linking the obtained optimal support vector machine sub-models and optimal extreme gradient lifting sub-models in the order of the path nodes of the test path to obtain the support vector machine-extreme gradient lifting chain model, the corresponding formula is expressed as:
where the symbols of the formula denote, in order: the sub-model corresponding to a given path node; the sub-model corresponding to the maximum model precision; the precision of the support vector machine model at that path node; the precision of the extreme gradient lifting model at that path node; the chain model for the final path prediction; the model chain of the complete path prediction model; and the chain combination, in the order in which the path nodes appear, of all sub-models corresponding to the 1st through the (k-1)-th path nodes.
8. The testing method based on chain multi-path coverage according to claim 7, wherein in step two, the path similarity is expressed as:
10. A testing system based on chain multi-path coverage, the system comprising:
a construction module configured for:
inputting randomly generated test data into the instrumentation program to obtain a test path, and calculating the corresponding path level depth from the number of path nodes of the test path and the states of the path nodes;
selecting a corresponding pre-training model according to the type of the test data, the pre-training model comprising a support vector machine model and an extreme gradient lifting model;
training, according to the selected pre-training model, the sub-model corresponding to each path node in the test path and calculating the corresponding sub-model precision, the sub-models comprising support vector machine sub-models and extreme gradient lifting sub-models;
when the sub-model precision reaches the preset optimal sub-model threshold, storing the corresponding sub-model and adding 1 to the sub-model count; when the sub-model count equals the number of path nodes, stopping the construction of sub-models for the path nodes, so as to obtain the optimal support vector machine sub-models and the optimal extreme gradient lifting sub-models;
linking the obtained optimal support vector machine sub-models and optimal extreme gradient lifting sub-models in the order of the path nodes of the test path to obtain the support vector machine-extreme gradient lifting chain model;
a genetic algorithm module configured for:
initializing the genetic parameters of the genetic algorithm and converting them into decimal form for obtaining the corresponding coverage paths;
inputting test data into the support vector machine-extreme gradient lifting chain model to obtain the corresponding predicted path; searching the target paths for paths similar to the predicted path; after a plurality of similar paths are obtained, inputting the current test data into the instrumentation program to obtain the accurate path; and calculating the fitness value from the fitness function based on the accurate path;
if the calculated fitness value is 1, determining that the target path is covered, deleting the target path, and storing the current test data;
and when all the target paths are judged to be covered, completing the test and outputting the corresponding new test data and target path coverage information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210894579.5A CN114968824B (en) | 2022-07-28 | 2022-07-28 | Testing method and system based on chain multi-path coverage |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210894579.5A CN114968824B (en) | 2022-07-28 | 2022-07-28 | Testing method and system based on chain multi-path coverage |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114968824A CN114968824A (en) | 2022-08-30 |
CN114968824B true CN114968824B (en) | 2022-09-30 |
Family
ID=82970447
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210894579.5A Active CN114968824B (en) | 2022-07-28 | 2022-07-28 | Testing method and system based on chain multi-path coverage |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114968824B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117632770B (en) * | 2024-01-25 | 2024-04-19 | 江西财经大学 | Multipath coverage test case generation method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110083531A (en) * | 2019-04-12 | 2019-08-02 | 江西财经大学 | It improves the shared multi-goal path coverage test method of individual information and realizes system |
CN110533150A (en) * | 2019-07-05 | 2019-12-03 | 江西财经大学 | Self -adaptive and reuse system and method based on Support vector regression model |
CN111240995A (en) * | 2020-01-21 | 2020-06-05 | 江西财经大学 | Multi-path covering method and system combining key point probability and path similarity |
US11169288B1 (en) * | 2017-12-07 | 2021-11-09 | Triad National Security, Llc | Failure prediction and estimation of failure parameters |
CN114780439A (en) * | 2022-06-13 | 2022-07-22 | 江西财经大学 | Reuse method of test cases among similar programs facing to parameter path flow graph |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108763055B (en) * | 2018-04-19 | 2020-08-25 | 北京航空航天大学 | Construction method of test case constraint control technology based on epigenetic inheritance |
Non-Patent Citations (4)
Title |
---|
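Regression test path selection based on branch coverage; Wu Chuan et al.; Journal of Software (《软件学报》); 20160114; Vol. 27 (No. 04); full text * |
Test case generation and reuse based on a support vector machine regression model; Qian Zhongsheng et al.; Acta Electronica Sinica (《电子学报》); 20210731; Vol. 49 (No. 7); full text * |
Combined forecasting method for the cost of overhead line renovation projects; Yu Min et al.; Journal of Electric Power Science and Technology (《电力科学与技术学报》); 20200128 (No. 01); full text * |
Multi-path coverage strategy combining key-point probability and path similarity; Qian Zhongsheng et al.; Journal of Software (《软件学报》); 20220228; Vol. 33 (No. 2); full text * |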
Also Published As
Publication number | Publication date |
---|---|
CN114968824A (en) | 2022-08-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||