CN110533150B - Test generation and reuse system and method based on support vector machine regression model - Google Patents
- Publication number
- CN110533150B (application CN201910606331.2A)
- Authority
- CN
- China
- Prior art keywords
- support vector
- vector machine
- test
- fitness
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
Abstract
The invention discloses a test generation and reuse system and method based on a support vector machine regression model. The system comprises: a test case generation unit based on the support vector machine regression model, which comprises a support vector machine regression module and a genetic algorithm module, wherein the support vector machine regression module trains a model that simulates the fitness calculation, and the genetic algorithm module generates test cases with reference to the model trained by the support vector machine regression module; and a test case reuse unit integrated with the support vector machine regression model, which comprises a test case reuse module. When a genetic algorithm is used to test a program, the support vector machine regression model is trained on samples consisting of the evolving population individuals and their fitness values calculated by the instrumentation method; the model is then used to query the test case set of the program for test data with higher fitness; and, while this test data is introduced into the iteration of a new population, each selected individual undergoes a crossover operation with a randomly selected reference individual with a certain probability.
Description
Technical Field
The invention relates to the technical field of software development, in particular to a test generation and reuse system and method based on a support vector machine regression model.
Background
Software testing is an integral part of software development and runs through the entire development process; typically about 40% of the time and effort in the software lifecycle is spent on software testing. Studies have shown that the cost of software testing can reach 3 to 5 times the sum of the costs of all other software engineering stages. Effectively improving test efficiency, and reducing the cost of software testing and the workload of test engineers, is therefore one of the popular research topics in the software field.
Reuse of software tests means reusing already generated test resources in new software tests: when a test engineer performs new tests or regression tests, existing test cases are applied directly or after simple modification. The aim is to use existing test resources many times, so that a result generated once plays the largest possible role and test resources need not be regenerated for every test, which improves both the efficiency of software testing and the reliability of the software. Test cases are the core resource of software testing, and the reuse of test cases is the key part of software test reuse as a whole.
As a global search method inspired by biological evolution and the mechanism of genetic variation in nature, the genetic algorithm has produced significant research results in software testing in recent years. The genetic algorithm comprises population initialization, individual evaluation, selection, crossover, mutation, and the check of the evolution termination condition. The initial population is usually generated at random. Individual evaluation calculates the fitness value of each individual in the population through a corresponding fitness function; the conventional method usually has to input the individual into an instrumented program to calculate its fitness. The fitness determines how good an individual is, and individuals are selected to evolve the population according to the rule of survival of the fittest.
Software testing involves a great deal of repetitive work. New test cases often have to be regenerated when a new program is tested, and failing to make full use of existing test cases increases testing cost and reduces the efficiency of test engineers. When new test cases are generated with a genetic algorithm, the conventional method spends a significant amount of run time calculating individual fitness.
Disclosure of Invention
In view of the foregoing, it is necessary to provide a method and a system for generating and reusing test cases based on a support vector machine regression model, which improve the generation efficiency of test cases and reduce the workload of software testing.
A test generation and reuse system based on a support vector machine regression model, comprising:
a test case generation unit based on the support vector machine regression model, comprising a support vector machine regression module and a genetic algorithm module, wherein the support vector machine regression module is used for training a model that simulates the fitness calculation, the training samples come from the test cases output by the genetic algorithm module, and the genetic algorithm module generates test cases with reference to the model trained by the support vector machine regression module;
a test case reuse unit integrated with the support vector machine regression model, comprising a test case reuse module, wherein, when a program is tested with the genetic algorithm, the test case reuse module initializes a preset number of individuals and trains the support vector machine regression model on samples consisting of the population individuals and their fitness values calculated by the instrumentation method during population evolution, so that the support vector machine regression algorithm is integrated into the test case generation of the genetic algorithm module; the trained model is introduced into test case generation, individuals whose fitness as selected by the model is higher than a preset value are introduced into the evolution of the genetic population, and, while this test data is introduced into the iteration of a new population, each selected individual undergoes a crossover operation with a randomly selected reference individual with a preset probability, thereby realizing the reuse of test cases.
Further, in the test case generation unit based on the support vector machine regression model, when the genetic algorithm module generates test cases with the genetic algorithm, the population evolves through operations such as population initialization, selection, crossover and mutation. The fitness value of each individual in the population is the exact value output by the instrumented program, and the individuals are input into the support vector machine regression model as training samples. After the prediction model has been trained, the trained model predicts the fitness values of the individuals when the population evolves again, while the other genetic operations are still carried out in the traditional way. A range of fitness values corresponding to coverage of the target path is set according to the prediction performance of the trained model; if the fitness value of an individual falls within this range during evolution, the individual must be input into the instrumented program to calculate the exact value.
Further, the test case reuse unit integrated with the support vector machine regression model integrates the support vector machine regression module and the genetic algorithm module. The trained support vector machine regression model is used as the prediction model; once it has been trained on individual fitness, it simulates the fitness calculation for the population individuals. The support vector machine regression model is used to query the test case set of the program for test data whose fitness is higher than a preset value. When the number of times a reference individual has been referenced exceeds a predetermined value, which is set to three, the reference individual is removed so that the search does not fall into a local optimum. If the support vector machine regression model has been trained well on individual fitness, the high-fitness individuals it selects carry excellent genes that favor survival; these reference individuals carry excellent genes that favor the survival of the population, and introducing those genes by combining the reference individuals with population individuals accelerates the evolution of the population.
Further, in the process of generating test cases, a preset number of samples is needed to train the constructed support vector machine regression model. Each sample $(X, d)$ consists of the input features $X$ and the true value $d$ corresponding to those features; the feature $x$ is the input vector of the test case and $d$ is its fitness value.
Further, the test cases are expressed as $X_1, X_2, \dots, X_a$, where $a$ is the number of samples, and their fitness values $d_1, d_2, \dots, d_a$ are the true fitness values obtained by running the instrumented program, which gives a sample set of size $a$, $\{(X_i, d_i)\}_{i=1}^{a}$. The samples are input into the support vector machine regression model, which fits the weight $w$ of each input feature from the sample data and the fitness values. The relation between the predicted value and the input feature values is:

$$f(X) = w \cdot X + b$$
Further, in the process of generating test cases by the genetic algorithm, the model is trained by taking the generated test cases and their fitness obtained by the instrumentation method as samples for the support vector machine regression model, and the trained model can simulate the fitness calculation for test cases of the program under test.
A test generation and reuse method based on the support vector machine regression model predicts fitness values with the support vector machine regression model, uses the model to search the test case library of the program under test for individuals with higher fitness, and reuses the individuals found in the testing of the program. The method comprises the following steps:
Step one, design a support vector machine training model that simulates the calculation of the fitness value of a test case; during the testing of a program, calculate the fitness values of individuals by the instrumentation method, and input the population individuals and their fitness values into the support vector machine training model;
Step two, provide a method for simulating the fitness calculation with the support vector machine regression model, wherein, while the genetic algorithm generates test cases, a predetermined number of population individuals and their fitness values are used as samples to train the support vector machine regression model, and in the subsequent population evolution the model calculates individual fitness in place of the traditional instrumentation-based calculation;
Step three, provide a method for searching for test data with the support vector machine regression model and reusing it in program testing, wherein the model trained by the method for simulating individual fitness is used to search the test case library of the corresponding program for test data with higher fitness, and this test data is referenced when the genetic algorithm tests the program and generates the corresponding test cases, so that the test data undergoes crossover operations with the evolving individuals.
Further, the method for simulating the fitness calculation with the support vector machine regression model comprises the following steps:
Step a, divide the obtained samples into training samples and prediction samples in a ratio of 8:2;
Step b, calculate the weight of each feature of the training samples, which involves the risk function R in the presence of errors and the Lagrange factors α, and predict the fitness value of the test case with the kernel mapping method;
Step c, correct the weights with each sample and verify them by cross-validation;
Step d, once all training samples have been used, the training of the prediction model ends;
Step e, with model training finished, use the prediction samples to formulate the fitness range interval of individuals covering the target path.
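Steps a and c can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `split_samples` and `k_fold_indices`, the fixed seed, the toy samples and the choice of four folds are all hypothetical, and the verification in step c is read here as ordinary k-fold cross-validation.

```python
import random

def split_samples(samples, train_ratio=0.8, seed=0):
    """Shuffle the samples and split them into training and prediction sets (8:2)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def k_fold_indices(n, k=5):
    """Yield (train_idx, valid_idx) index lists for k-fold cross-validation over n items."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, valid
        start += size

# Hypothetical samples: (input vector of a test case, fitness value) pairs.
samples = [((i, i + 1), float(i)) for i in range(10)]
train, predict = split_samples(samples)
folds = list(k_fold_indices(len(train), k=4))
```

Each fold holds out a different quarter of the training samples, so the weights fitted on the remaining samples can be checked against data they have not seen; the prediction set stays untouched until step e.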
Further, the method for generating test cases with the genetic algorithm comprises the following steps:
Step S1, initialize the population;
Step S2, input the population individuals into the instrumented program to obtain the fitness value of each individual, then input the individuals and their fitness values into the support vector machine model to train the support vector machine regression model;
Step S3, use the trained model to calculate the approximate fitness of each evolving individual;
Step S4, use the instrumented program to obtain the exact value for each individual whose fitness value is within the preset threshold;
Step S5, if the algorithm termination condition is met or the maximum number of iterations is reached, terminate the algorithm; otherwise go to step S6;
Step S6, perform individual selection, crossover and mutation operations, then go to step S3.
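The loop S1 to S6 can be illustrated with a small, self-contained Python sketch. Everything concrete in it is an assumption made for illustration: `instrumented_fitness` is a toy stand-in for running the instrumented program, the target input `(3, 7)`, the interval `(0.3, 1.0)`, the tournament selection and the mutation rate are hypothetical, and `NearestNeighbourSurrogate` is a deliberately simple placeholder for the trained regression model, not an SVR.

```python
import random

def instrumented_fitness(x):
    # Hypothetical stand-in for running the instrumented program:
    # fitness is exactly 1.0 only for the target input (3, 7).
    return 1.0 / (1.0 + abs(x[0] - 3) + abs(x[1] - 7))

class NearestNeighbourSurrogate:
    """Placeholder for the trained regression model (NOT an SVR)."""
    def fit(self, xs, ds):
        self.data = list(zip(xs, ds))

    def predict(self, x):
        def dist(sample):
            return sum((a - b) ** 2 for a, b in zip(sample[0], x))
        return min(self.data, key=dist)[1]

def evolve(pop_size=20, max_gen=50, interval=(0.3, 1.0), seed=1):
    rng = random.Random(seed)
    # S1: initialise the population at random.
    pop = [(rng.randint(0, 10), rng.randint(0, 10)) for _ in range(pop_size)]
    # S2: exact fitness from the instrumented program, then train the surrogate.
    model = NearestNeighbourSurrogate()
    model.fit(pop, [instrumented_fitness(x) for x in pop])
    exact_calls = pop_size
    for gen in range(max_gen):
        # S3: approximate fitness of every individual from the surrogate.
        scores = [model.predict(x) for x in pop]
        # S4: exact value only for individuals predicted inside the interval.
        for i, x in enumerate(pop):
            if interval[0] <= scores[i] <= interval[1]:
                scores[i] = instrumented_fitness(x)
                exact_calls += 1
                # S5: terminate as soon as the target is covered.
                if scores[i] == 1.0:
                    return x, gen, exact_calls

        # S6: tournament selection, one-point crossover, mutation.
        def pick():
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[a] if scores[a] >= scores[b] else pop[b]

        next_pop = []
        while len(next_pop) < pop_size:
            p, q = pick(), pick()
            child = [p[0], q[1]]
            if rng.random() < 0.2:
                child[rng.randrange(2)] = rng.randint(0, 10)
            next_pop.append(tuple(child))
        pop = next_pop
    # S5: maximum number of iterations reached without covering the target.
    return None, max_gen, exact_calls

best, generations, exact_calls = evolve()
```

The counter `exact_calls` makes the trade-off visible: only individuals whose predicted fitness falls inside the interval cost a run of the instrumented program, which is the saving the method aims for.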
Further, the method for searching for test data with the support vector machine regression model and reusing it in program testing comprises the following steps:
Step (1), select individuals with higher fitness using the trained model, and formulate in the support vector machine regression module the fitness range interval of individuals covering the target path;
Step (2), package the selected test data as genetic individuals into a database, and introduce the database into the genetic process of the genetic algorithm;
Step (3), while the population evolves into the next generation, perform a crossover operation with a certain probability between population individuals and individuals drawn at random from the database;
Step (4), if an individual has been referenced more than three times, remove it from the database;
Step (5), if the algorithm termination condition is met or the maximum number of iterations is reached, terminate the algorithm; otherwise go to step (6);
Step (6), repeat step (3) and step (4) during the evolution of each new generation.
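Steps (2) to (4) amount to a small pool of reference individuals with a reference counter. The following Python sketch shows one possible shape of that bookkeeping; the names `ReferencePool` and `crossover_with_pool`, the one-point crossover and the tuple encoding of individuals are assumptions for illustration, not details taken from the patent.

```python
import random

class ReferencePool:
    """Database of reusable test data; an entry is dropped once it has been
    referenced more than max_refs times (three in the text)."""
    def __init__(self, individuals, max_refs=3):
        self.entries = [[ind, 0] for ind in individuals]
        self.max_refs = max_refs

    def draw(self, rng):
        """Return a random reference individual, or None if the pool is empty."""
        if not self.entries:
            return None
        i = rng.randrange(len(self.entries))
        entry = self.entries[i]
        entry[1] += 1                      # count this reference
        if entry[1] > self.max_refs:       # step (4): remove over-used individuals
            del self.entries[i]
        return entry[0]

def crossover_with_pool(population, pool, rate, rng):
    """Step (3): cross each individual with a pooled one with probability `rate`."""
    offspring = []
    for ind in population:
        ref = pool.draw(rng) if rng.random() < rate else None
        if ref is None:
            offspring.append(ind)          # no reference used this time
        else:
            cut = rng.randrange(1, len(ind))
            offspring.append(ind[:cut] + ref[cut:])
    return offspring

rng = random.Random(0)
pool = ReferencePool([(9, 9, 9)])
population = [(0, 0, 0)] * 6
offspring = crossover_with_pool(population, pool, rate=1.0, rng=rng)
```

With a single pooled individual and a crossover rate of 1.0, the reference is used four times, exceeds the limit of three on the fourth use, and is removed, so the last two individuals pass through unchanged.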
The main contributions of the invention are as follows:
1) A model that can simulate the fitness calculation for test cases is trained. While the genetic algorithm generates test cases, the generated test cases and their fitness obtained by the instrumentation method are used as samples to train the support vector machine regression model; the trained model can simulate the fitness calculation for test cases of the program under test.
2) The trained model is applied to the process of generating test cases with the genetic algorithm. The traditional instrumentation method must run the program to calculate the fitness value of each population individual, which consumes a great deal of time. The support vector machine regression model simulates the fitness value from the individual itself without running the program under test, which reduces the time spent on individual fitness evaluation. Because the fitness value calculated by the model is not exact, the software establishes a fitness interval for excellent individuals according to how the model simulates individual fitness; an individual whose fitness value lies in this interval may be an optimal individual covering the target path, so its exact fitness must be calculated. In this way the instrumentation method is used as little as possible to find test data covering the target path. Experimental results also show that the method reduces test data generation time more effectively and improves test efficiency.
3) The trained model is applied to the reuse of test cases. The trained model searches the test case database of the corresponding program for individuals with higher fitness, which carry excellent genes that benefit the survival of the population. While the genetic algorithm generates test cases, population individuals are combined with the introduced individuals with a certain probability. Experimental results show that this further reduces test case generation time and improves the test efficiency of the program.
The test generation and reuse system and method based on the support vector machine regression model provide a test case generation and reuse method integrated with the support vector machine regression model, in which the fitness value is calculated by the support vector machine regression model. Compared with a neural network model, support vector machine regression shows many particular advantages on small-sample, nonlinear and high-dimensional pattern recognition problems. The fitness value is predicted with the support vector machine regression model, and the model is used to search the test case library of the program under test for individuals with higher fitness, which are reused in the testing of the program. The efficiency of test case generation is improved both by reducing the time spent calculating individual fitness and by accelerating population evolution.
Drawings
FIG. 1 is a flow chart of a technical architecture of a test generation and reuse method based on a support vector machine regression model according to an embodiment of the present invention.
FIG. 2 is a flow chart of a test case generation framework of a test generation and reuse method based on a support vector machine regression model according to an embodiment of the present invention.
FIG. 3 is a flow chart of a test case reuse framework of a test generation and reuse method based on a support vector machine regression model according to an embodiment of the present invention.
Fig. 4 is a software running interface of a support vector machine regression model training and its application based on a test generation and reuse method of a support vector machine regression model according to an embodiment of the present invention.
Detailed Description
In this embodiment, a test generation and reuse system and method based on a support vector machine regression model are taken as an example, and the present invention will be described in detail with reference to specific embodiments and drawings.
Referring to fig. 1, fig. 2 and fig. 3, a test generation and reuse system and method based on a regression model of a support vector machine according to an embodiment of the present invention are shown.
The support vector machine (SVM) mainly involves the definition of an objective function, the handling of noise in the objective function, the further optimization of the objective function, and the solution of the nonlinear regression problem. The support vector machine regression model that predicts the fitness of test cases is trained on the basis of the following background on support vector machines.
The support vector machine is founded on the VC dimension theory of statistical learning and on the structural risk minimization principle. It is a generalized classifier that performs binary classification of data by supervised learning and aims at the best generalization capability. Support vector machine regression (SVR) is a regression method based on penalty learning. The goal of SVR is to model the regression relationship $f(x)$ between the input $x$ and the result $y$:

$$f(x) = w \cdot x + b \tag{1}$$
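When a support vector machine is used for classification, the data often contains noise points. To classify the noise points correctly, the hyperplane moves toward the samples of the other class, which narrows the geometric margin of the separating hyperplane and reduces the generalization performance of the model. To minimize the effect of noise points on the model, slack variables $\xi_i^{+}$ and $\xi_i^{-}$ are introduced to represent whether a sample exceeds the $\varepsilon$ penalty interval, and they are combined with the parameter $C$ to construct the final risk function $R$. The SVR problem is thus converted into minimizing the risk function $R$ in the presence of errors:

$$\min_{w,\,b,\,\xi^{+},\,\xi^{-}} \; R = \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left( \xi_i^{+} + \xi_i^{-} \right) \tag{2}$$

$$\text{s.t.}\quad y_i - f(x_i) \le \varepsilon + \xi_i^{+}, \qquad f(x_i) - y_i \le \varepsilon + \xi_i^{-}, \qquad \xi_i^{+},\ \xi_i^{-} \ge 0, \qquad i = 1, \dots, n$$

where $n$ is the number of training samples.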
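The value of the risk function for a candidate model can be computed directly from its definition. The Python sketch below assumes the linear form $f(x) = w \cdot x + b$ and takes each slack variable as the amount by which a sample leaves the $\varepsilon$ interval on one side; the function name `svr_risk` and the sample values are hypothetical.

```python
def svr_risk(w, b, samples, C=1.0, eps=0.1):
    """R = 0.5 * ||w||^2 + C * sum(xi_plus + xi_minus) for f(x) = w.x + b."""
    regulariser = 0.5 * sum(wj * wj for wj in w)
    slack = 0.0
    for x, y in samples:
        fx = sum(wj * xj for wj, xj in zip(w, x)) + b
        xi_plus = max(0.0, y - fx - eps)    # sample lies above the eps interval
        xi_minus = max(0.0, fx - y - eps)   # sample lies below the eps interval
        slack += xi_plus + xi_minus
    return regulariser + C * slack

# Hypothetical data: the first sample is inside the eps interval, the second is not.
samples = [((1.0,), 1.05), ((2.0,), 3.0)]
risk = svr_risk(w=[1.0], b=0.0, samples=samples, C=2.0, eps=0.1)
```

Samples inside the interval contribute nothing, so only the second sample adds to the slack term; a larger `C` penalises such deviations more heavily relative to the norm of `w`.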
This optimization problem is a constrained quadratic programming problem. It can be solved by introducing the Lagrange factors $\alpha_i^{-}$ and $\alpha_i^{+}$ to convert the problem into its dual quadratic form: the constraints are merged into the objective function through the Lagrange function, the partial derivatives of the objective function are taken and set to 0, and this yields the relationship between the Lagrange factors and the coefficient $w$, namely $w = \sum_{i=1}^{n} (\alpha_i^{+} - \alpha_i^{-})\, x_i$, where the sum runs over the $n$ training samples. The problem can then be expressed with a single function; that is, the original parameters $w$ and $b$ are reduced to the Lagrange factors $\alpha$ alone, and the initial regression objective can be expressed as:

$$f(x) = \sum_{i=1}^{n} \left( \alpha_i^{+} - \alpha_i^{-} \right) \langle x_i, x \rangle + b \tag{3}$$
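To solve the nonlinear regression problem, the kernel mapping method is introduced. A transformation function $\Phi$ maps the variable $x$ into a high-dimensional nonlinear space, and a kernel function $K(x_i, x) = \langle \Phi(x_i), \Phi(x) \rangle$ is introduced to avoid computing the inner product $\langle \Phi(x_i), \Phi(x) \rangle$ explicitly. This gives the final form of the nonlinear regression expression, with the sum running over the $n$ training samples:

$$f(x) = \sum_{i=1}^{n} \left( \alpha_i^{+} - \alpha_i^{-} \right) K(x_i, x) + b \tag{4}$$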
The accuracy of the SVR model depends on the choice of kernel function. The software selects the radial basis function (RBF) as the kernel; it simulates nonlinear data well and is suitable for predicting the fitness value of test cases.
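As an illustration of how a kernel-form prediction is evaluated, the Python sketch below computes an RBF kernel and the weighted sum over support vectors. The parametrisation $K(u, v) = \exp(-\gamma \lVert u - v \rVert^{2})$, the default `gamma`, and the support vectors and coefficients in the example are assumptions chosen for illustration; the source does not state which RBF parametrisation or values the software uses.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """K(u, v) = exp(-gamma * ||u - v||^2); gamma is a hypothetical setting."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def svr_predict(x, support_vectors, coeffs, b, gamma=0.5):
    """f(x) = sum_i coeff_i * K(x_i, x) + b, where coeff_i = alpha_i_plus - alpha_i_minus."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coeffs)) + b

# Hypothetical trained model: two support vectors with their dual coefficients.
support_vectors = [(0.0, 0.0), (10.0, 10.0)]
coeffs = [2.0, -1.0]
prediction = svr_predict((0.0, 0.0), support_vectors, coeffs, b=0.5)
```

Because the kernel decays quickly with distance, the prediction at `(0.0, 0.0)` is dominated by the first support vector and the offset `b`; the distant second support vector contributes almost nothing.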
An input sample $x_i$ whose slack variables satisfy $\xi_i^{-} = \xi_i^{+} = 0$ and which falls on the $\varepsilon$ error boundary is called a support vector. The support vectors represent the features of the regression model. Apart from the support vectors, adding or deleting other samples has little effect on the trained model, so the SVR method needs fewer samples to build a model than other machine learning methods. This is more pronounced when the feature dimension is greater than the amount of data.
Supervised learning is in fact the optimization of an empirical risk or structural risk function. The risk function measures the prediction quality of the model in the average sense, while the loss function measures the quality of a single prediction. A model $f$ is selected from the hypothesis space $F$ as the decision function; for a given input $X$, $f(X)$ gives the corresponding output, and this predicted value $f(X)$ may differ from the true value $Y$. A loss function, written $L(Y, f(X))$, measures the degree of prediction error. The loss function selected by the software is the square loss function:

$$L(Y, f(X)) = (Y - f(X))^{2}$$
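The relationship between the loss of one prediction and the risk in the average sense can be shown in a few lines of Python. The names `square_loss` and `empirical_risk`, the toy model and the two samples are hypothetical.

```python
def square_loss(y, fx):
    """L(Y, f(X)) = (Y - f(X))^2 for a single prediction."""
    return (y - fx) ** 2

def empirical_risk(model, samples):
    """Average square loss of `model` over (input, true value) samples."""
    return sum(square_loss(d, model(x)) for x, d in samples) / len(samples)

# Hypothetical model and samples: the first prediction is exact, the second is off by 1.
model = lambda x: 2 * x
samples = [(1, 2.0), (2, 5.0)]
risk = empirical_risk(model, samples)
```

The square loss grows quadratically with the error, so a few badly predicted fitness values dominate the average; that is worth keeping in mind when the prediction samples are later used to judge how far the model can be trusted.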
The software combines the machine learning method with the genetic algorithm and uses the support vector machine regression model to simulate individual fitness values, so that not every population individual has to be input into the instrumented program to calculate its fitness. This reduces the time spent evaluating fitness and improves test efficiency.
The software comprises the generation of test cases and the reuse of the test cases.
1. Test case generation based on support vector machine regression model
When test cases are generated with the genetic algorithm, the population first goes through population initialization, selection, crossover, mutation and the other operations of the traditional algorithm. The fitness value of each population individual is the exact value output by the instrumented program, and the individuals are input into the support vector machine regression model as training samples. Once the prediction model has been trained and the population evolves again, the trained model predicts the fitness values of the individuals, while the other genetic operations are still carried out in the traditional way. A range of fitness values corresponding to coverage of the target path is set according to the prediction performance of the trained model; if the fitness value of an individual falls within this range during evolution, the individual must be input into the instrumented program to calculate the exact value.
When test cases are generated, a certain number of samples is needed to train the constructed support vector machine regression model. Each sample $(X, d)$ consists of the input features $X$ and the true value $d$ corresponding to those features; the feature $x$ is the input vector of the test case and $d$ is its fitness value.
A certain number of test cases $X_1, X_2, \dots, X_a$ are generated first, where $a$ is the number of samples. The fitness values $d_1, d_2, \dots, d_a$ of this test data are the true fitness values obtained by running the instrumented program, which gives a sample set of size $a$, $\{(X_i, d_i)\}_{i=1}^{a}$. The samples are input into the support vector machine regression model, which fits the weight $w$ of each input feature from the sample data and its fitness values. The relation between the predicted value and the input feature values is expressed as:

$$f(X) = w \cdot X + b$$
the support vector machine regression model trains a model for predicting the adaptability value of the test case according to the test case of a certain program. In order to make sample data representative, test cases selected for training samples are data randomly generated by a genetic algorithm. It should be noted that the sample size should not be too large or too small. The model with small sample capacity and training is low in accuracy, and the internal linearity rule of the sample cannot be reflected; the more computing resources that are consumed with too much capacity, the longer the time it takes to reduce the efficiency of the test.
Test case generation incorporating the support vector machine model is divided into two modules: a support vector machine regression module and a genetic algorithm module. The support vector machine regression module mainly trains the model that approximates fitness; its training samples come from the test cases output by the genetic algorithm module. The genetic algorithm module uses the model trained by the support vector machine regression module to generate test cases.
(1) Support vector machine regression module
Test cases generated by the genetic algorithm are used as samples to train the support vector machine regression model. The main steps are as follows:
1) Obtain a certain number of samples and divide them into training samples and prediction samples in the ratio 8:2;
2) According to formula (2), convert the objective function into the problem of minimizing the risk function R in the presence of errors;
3) According to formula (3), incorporate the constraints into the objective function through the Lagrange function, obtaining the relation between w and the Lagrange multipliers α;
4) Introduce the kernel mapping method in formula (4), and use the kernel function in formula (5) to handle the nonlinear, high-dimensional problem;
5) As samples are added, correct the weights according to steps 2) and 3) and formula (6), validate using cross-validation, and end the training of the prediction model once all samples have been trained on;
6) With model training finished, use the prediction samples to establish the fitness interval of individuals that cover the target path.
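Steps 1) and 6) can be sketched as follows. The text does not say how the interval is derived from the prediction samples, so this sketch assumes one plausible rule: widen the fitness value that corresponds to covering the target path by the worst absolute prediction error seen on the held-out samples. The function names, the stand-in predictor and the target value 1.0 are all hypothetical.

```python
import random


def split_samples(samples, train_ratio=0.8, seed=0):
    """Shuffle (x, d) samples and split them into training and prediction sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def fitness_interval(predict, prediction_samples, target_fitness):
    """Assumed rule: interval around the target-path fitness, widened by the
    worst absolute prediction error on the held-out prediction samples."""
    worst = max(abs(predict(x) - d) for x, d in prediction_samples)
    return target_fitness - worst, target_fitness + worst


# Toy data: 100 samples whose exact fitness is x / 100.
samples = [((i,), i / 100) for i in range(100)]
train, held_out = split_samples(samples)

# Stand-in for a trained model that is off by a constant 0.02.
stand_in = lambda x: x[0] / 100 + 0.02
low, high = fitness_interval(stand_in, held_out, target_fitness=1.0)
```

With this rule, a less accurate model produces a wider interval, so more individuals are sent to the instrumented program for an exact check.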
While the genetic algorithm generates test cases, a support vector machine regression model that approximates individual fitness is trained, and the model is then applied to the subsequent evolution of the population. The genetic algorithm module therefore covers both the training of the support vector machine regression model and the use of that model.
(2) Genetic algorithm module
The genetic algorithm is used to generate test cases. The main steps are as follows:
1) Initialize the population;
2) Input the population individuals into the instrumented program to obtain the fitness value of each individual, and input the individuals and their fitness values into the support vector machine model to train the support vector machine regression model;
3) Calculate the approximate fitness of each evolving individual with the trained model;
4) For individuals whose fitness values lie within the preset threshold range, calculate the exact value with the instrumented program;
5) If the termination condition is met or the maximum number of iterations is reached, terminate the algorithm; otherwise go to step 6);
6) Perform selection, crossover and mutation on the individuals, and go to step 3).
When a program is tested with the genetic algorithm, the population evolves using the fitness calculated by the instrumentation method until the support vector machine regression model has been trained successfully. After the population has evolved for a certain number of generations and model training has succeeded, individual fitness in the subsequent generations is approximated by the trained model. Calculating fitness with the trained model instead of the instrumentation method reduces the time required for population evolution.
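The six steps above can be sketched as a surrogate-assisted loop. This is a minimal illustration under stated assumptions, not the patented implementation. The program under test is replaced by a toy `exact_fitness` function, the surrogate is a nearest-neighbour lookup standing in for the trained support vector machine regression model, and the names, the `warmup` parameter, the interval and the genetic operators are all hypothetical.

```python
import random

TARGET = (17, 42)   # hypothetical input that covers the target path


def exact_fitness(ind):
    """Stand-in for running the instrumented program; 1.0 means path covered."""
    return 1.0 / (1.0 + sum(abs(g - t) for g, t in zip(ind, TARGET)))


def train_nearest_neighbour(samples):
    """Stand-in surrogate (NOT an SVR): fitness of the closest known sample."""
    def predict(ind):
        return min(samples,
                   key=lambda s: sum(abs(a - b) for a, b in zip(s[0], ind)))[1]
    return predict


def next_generation(scored, rng, bounds, p_mut):
    """Tournament selection, single-point crossover, random-reset mutation."""
    def pick():
        a, b = rng.sample(scored, 2)
        return max(a, b, key=lambda s: s[0])[1]
    children = []
    while len(children) < len(scored):
        p1, p2 = pick(), pick()
        cut = rng.randint(1, len(p1) - 1) if len(p1) > 1 else 0
        child = p1[:cut] + p2[cut:]
        if rng.random() < p_mut:
            child[rng.randrange(len(child))] = rng.randint(*bounds)
        children.append(child)
    return children


def evolve(exact, train_surrogate, interval, pop_size=20, genes=2,
           bounds=(0, 50), max_gen=100, warmup=3, p_mut=0.2, seed=1):
    """Return (covering individual or None, number of exact evaluations)."""
    rng = random.Random(seed)
    low, high = interval
    pop = [[rng.randint(*bounds) for _ in range(genes)] for _ in range(pop_size)]
    samples, surrogate, exact_calls = [], None, 0
    for gen in range(max_gen):
        if surrogate is None and gen >= warmup:
            surrogate = train_surrogate(samples)      # step 2: train on exact data
        scored = []
        for ind in pop:
            if surrogate is None:
                fit, is_exact = exact(ind), True
                samples.append((tuple(ind), fit))
            else:
                fit, is_exact = surrogate(ind), False  # step 3: approximate
                if low <= fit <= high:                 # step 4: exact re-check
                    fit, is_exact = exact(ind), True
            if is_exact:
                exact_calls += 1
                if fit >= 1.0:                         # step 5: path covered
                    return ind, exact_calls
            scored.append((fit, ind))
        pop = next_generation(scored, rng, bounds, p_mut)   # step 6
    return None, exact_calls


best, calls = evolve(exact_fitness, train_nearest_neighbour, interval=(0.1, 1.0))
```

Only an exact evaluation can confirm that the target path is covered, which is why the loop terminates on exact values alone; `calls` counts how often the stand-in program actually had to run.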
The test case reuse method uses the support vector machine regression model once more: the trained model is applied to the reuse of test cases, with the aim of accelerating the evolution of individuals that cover the target path and improving test efficiency.
2. Test case reuse incorporating support vector machine regression model
When the support vector machine regression algorithm is incorporated into the genetic algorithm to generate test cases, using the trained model does not by itself accelerate the evolution of the population; it improves test efficiency only by reducing the time individuals spend running the instrumented program. Test case reuse applies the trained model to test case generation a second time in order to improve generation efficiency further: individuals that the model selects as having higher fitness are introduced into the evolution of the genetic population, which accelerates population evolution and further improves the efficiency of test data generation.
The test case reuse unit incorporating the support vector machine regression model comprises a test case reuse module. Test case reuse is carried out with the trained support vector machine regression model as follows:
1) Select individuals with higher fitness using the trained model. The support vector machine regression module establishes a fitness interval for individuals that can cover the target path; an individual whose fitness lies in this interval is called an individual with higher fitness, or an excellent individual;
2) Package the selected test data as genetic individuals into a database and introduce them into the genetic process of the genetic algorithm;
3) While evolving the next generation, the population performs crossover, with a certain probability, with randomly selected individuals introduced from the database;
4) To avoid local optima, remove an individual from the database once it has been referenced more than three times;
5) If the termination condition is met or the maximum number of iterations is reached, terminate the algorithm; otherwise go to step 6);
6) Repeat steps 3) and 4) in the evolution of each new generation.
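The reuse steps above can be sketched as a small pool with a reference counter. This is an illustration under assumptions: the names `select_excellent`, `ReusePool` and `crossover_with_pool`, the single-point crossover and the probability `p_reuse` are hypothetical. The sketch reads step 4) literally, so an individual is removed as soon as its reference count exceeds three, which means it can be drawn four times.

```python
import random


def select_excellent(test_cases, predict, interval):
    """Step 1: keep test cases whose predicted fitness lies in the interval."""
    low, high = interval
    return [tc for tc in test_cases if low <= predict(tc) <= high]


class ReusePool:
    """Steps 2 and 4: database of reused individuals with reference counts."""

    def __init__(self, individuals, max_refs=3):
        self.refs = {tuple(ind): 0 for ind in individuals}
        self.max_refs = max_refs

    def draw(self, rng):
        """Return a random pooled individual, or None if the pool is empty.
        The individual is removed once its count exceeds max_refs."""
        if not self.refs:
            return None
        ind = rng.choice(list(self.refs))
        self.refs[ind] += 1
        if self.refs[ind] > self.max_refs:
            del self.refs[ind]
        return ind


def crossover_with_pool(pop, pool, rng, p_reuse=0.3):
    """Step 3: with probability p_reuse, cross an individual with a pooled one."""
    out = []
    for ind in pop:
        ref = pool.draw(rng) if rng.random() < p_reuse else None
        if ref is None:
            out.append(list(ind))
            continue
        cut = rng.randint(1, len(ind) - 1) if len(ind) > 1 else 0
        out.append(list(ind[:cut]) + list(ref[cut:]))
    return out


# Usage with a stand-in predictor and a single pooled individual.
excellent = select_excellent([(1,), (5,), (9,)], lambda x: x[0] / 10, (0.4, 1.0))
pool = ReusePool([(9, 9)])
children = crossover_with_pool([[1, 1]] * 6, pool, random.Random(0), p_reuse=1.0)
```

In the usage example the pooled individual is referenced four times and then removed, so the last two members of the population pass through unchanged.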
When a program is tested with the genetic algorithm, a certain number of individuals are initialized, and during population evolution the evolving individuals, together with the fitness values calculated for them by the instrumentation method, serve as samples for training the support vector machine regression model. Once the prediction model for individual fitness has been trained, the fitness of the population individuals is approximated by the model. In addition, the model is used to query the program's test case set for test data with higher fitness, and these test data are introduced into the iterations of the new population; the introduced individuals are called reference individuals. Each selected individual performs crossover, with a certain probability, with a randomly selected reference individual. To maintain the diversity of population genes and avoid becoming trapped in a local optimum, a reference individual is removed once the number of times it has been referenced exceeds a certain value (set to 3 here). If the support vector machine regression model approximates individual fitness well, the high-fitness individuals it selects carry excellent genes that favor survival; introducing these genes accelerates the evolution of the population and improves the efficiency of test case generation.
Fig. 4 shows the operation interface for training and applying the support vector machine regression model. A plug-in prototype for training and applying the model serves the main software project, effectively extending and completing the functions of the host software. The training of the support vector machine regression prediction model, and the generation and reuse of test cases with that model, are implemented as a plug-in, which further simplifies the testing of a program. The plug-in is developed in Java, and the development environment is MyEclipse 2010. The computer runs Windows with an Intel(R) Core(TM) i5-6500 CPU at 3.20 GHz, 8.00 GB of RAM and a 64-bit operating system.
Among the menu options, "Program" gives the instrumentation method for the program under test; "SVR Model" covers the selection and analysis of the parameters for training the support vector machine regression model; "TestCase" describes the role and requirements of the test case database; "Options" provides environment settings such as language and font color and size; and "Help" describes how to use the plug-in.
When the plug-in is used, the buttons of the interface are executed in the order of their serial numbers. The operation process is as follows:
1) Click the button "1. Select a Program" and, in the file selection interface, select the file containing the source code of the program under test; the instrumented program under test is stored in this file.
2) Click the button "2. Train SVR Model" to train the support vector machine regression model for fitness prediction; when the text box displays "Successful Training of SVR Model", model training has succeeded.
3) Click the button "3. Find Test Cases" and, in the pop-up interface, select the file of the test case database of the program under test.
4) Click the button "4. Reuse and Generate Test Cases". When the genetic algorithm then generates test cases, the population individuals obtain their fitness from the trained model. The trained model also searches the file for test cases with higher fitness and reuses them as excellent individuals in test case generation; the text box below the button outputs the test result.
The main contributions of the invention are as follows:
1) A model that can approximate the fitness of test cases is trained. While the genetic algorithm generates test cases, the generated test cases and the fitness obtained by the instrumentation method are used as samples to train the support vector machine regression model; the trained model can approximate the fitness of the test cases of the program under test.
2) The trained model is applied to the generation of test cases by the genetic algorithm. When the conventional instrumentation method calculates the fitness values of population individuals, the program must be run, which consumes a great deal of time. The support vector machine regression model can approximate the fitness value from the individual alone, without running the program under test, which reduces the time spent on fitness evaluation. Because the fitness value calculated by the model is not exact, the software establishes a fitness interval for excellent individuals according to how well the model approximates individual fitness. Individuals whose fitness values fall within this interval may be the optimal individuals covering the target path, so their exact fitness must be calculated. The method thus obtains test data covering the target path while using the instrumentation method as little as possible. Experimental results also show that the method reduces the generation time of test data more effectively and improves test efficiency.
3) The trained model is applied to the reuse of test cases. The trained model is used to search the test case database of the corresponding program for individuals with higher fitness, which carry excellent genes that favor the survival of the population. While the genetic algorithm generates test cases, population individuals are combined with the introduced individuals with a certain probability. Experimental results show that this method further reduces test case generation time and improves the test efficiency of the program.
The test generation and reuse system and method based on the support vector machine regression model provide a test case generation and reuse method that incorporates the support vector machine regression model and uses it to calculate fitness values. Compared with a neural network model, support vector machine regression shows particular advantages on problems with small samples, nonlinearity and high-dimensional pattern recognition. The support vector machine regression model is used to predict fitness values and to search the test case library of the program under test for individuals with higher fitness, which are then reused in the testing of that program. The efficiency of test case generation is improved both by reducing the time spent calculating individual fitness and by accelerating population evolution.
It should be noted that the above-mentioned embodiments are merely preferred embodiments of the present invention, and are not intended to limit the present invention, but various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A test generation and reuse system based on a support vector machine regression model, comprising:
a test case generation unit based on the support vector machine regression model, comprising a support vector machine regression module and a genetic algorithm module, wherein the support vector machine regression module is used for training a model that approximates fitness, the training samples come from the test cases output by the genetic algorithm module, and the genetic algorithm module uses the model trained by the support vector machine regression module to generate test cases;
a test case reuse unit incorporating the support vector machine regression model, comprising a test case reuse module, wherein the test case reuse module is used for: initializing a preset number of individuals when a program is tested with the genetic algorithm; training the support vector machine regression model, during population evolution, with the population individuals and the fitness values calculated by the instrumentation method as samples, so that the support vector machine regression algorithm is incorporated into the test case generation of the genetic algorithm module; applying the trained model to the generation of test cases; introducing individuals whose model-selected fitness is higher than a preset value into the evolution of the genetic population; introducing the test data into the iteration of the new population; and performing a crossover operation between each selected individual and a randomly selected reference individual with a preset probability, thereby realizing the reuse of test cases.
2. The test generation and reuse system based on the support vector machine regression model according to claim 1, wherein, in the test case generation unit based on the support vector machine regression model, when test cases are generated with the genetic algorithm, the genetic algorithm module evolves the population through population initialization, selection, crossover, mutation and other operations; the fitness value of each individual in the population is the exact value output by the instrumented program to which the individual is input; the individuals in the population are input into the support vector machine regression model as training samples; when training of the prediction model is completed, the trained model is used to predict the fitness values of the individuals in the population, while the other genetic operations of the population are still carried out in the conventional way; a range of fitness values corresponding to coverage of the target path is set according to the prediction performance of the trained model; and if the fitness value of an individual falls within this interval during evolution, the individual is input into the instrumented program to calculate the exact value.
3. The test generation and reuse system based on the support vector machine regression model according to claim 1, wherein the test case reuse unit incorporating the support vector machine regression model integrates the support vector machine regression module and the genetic algorithm module and uses the trained support vector machine regression model as a prediction model; after the prediction model has been trained, individual fitness is approximated by the prediction model; test data whose fitness is higher than a preset value are queried in the test case set of the program using the support vector machine regression model; when the number of times a reference individual has been referenced exceeds a predetermined value, the predetermined value being set to three, the reference individual is removed so as to avoid falling into a local optimum; and if the support vector machine regression model approximates individual fitness well, the high-fitness individuals selected by the support vector machine regression model have excellent genes favorable to the survival of individuals, the reference individuals carry excellent genes favorable to the survival of the population, and the introduction of these excellent genes, through combination with population individuals, accelerates the evolution of the population.
4. The test generation and reuse system based on a support vector machine regression model of claim 1, wherein test case generation requires training the constructed support vector machine regression model with a predetermined number of samples; a sample (X, d) comprises the input features X and the true value d corresponding to the features; the features x are the input vector of the test case, and d is its fitness value.
5. The test generation and reuse system based on a support vector machine regression model of claim 4, wherein a is the number of samples; the fitness values of the test cases are the true fitness values obtained by inputting them into the instrumented program, giving a sample set of size a; the samples are input into the support vector machine regression model, which fits the weight of each input feature according to the sample data and the fitness values; and the relation between the predicted value and the input feature values is given by the following formula:
6. The test generation and reuse system based on a support vector machine regression model according to claim 2, wherein the trained model is obtained, in the process of generating test cases with the genetic algorithm, by training the support vector machine regression model with the generated test cases and the fitness obtained by the instrumentation method as samples, and the trained model can approximate the fitness of the test cases of the program under test.
7. A method for test generation and reuse based on a support vector machine regression model, characterized in that the method uses the test generation and reuse system based on the support vector machine regression model according to any one of claims 1-6, predicts fitness values with the support vector machine regression model, searches the test case library of the program under test for individuals with higher fitness using the support vector machine regression model, and reuses the individuals found in the testing of the program, the method comprising the following steps:
step one, designing a support vector machine training model for approximating the fitness values of test cases, calculating the fitness values of individuals by the instrumentation method during the testing of the program, and inputting the population individuals and their fitness values into the support vector machine training model;
step two, providing a method for approximating fitness with the support vector machine regression model, wherein, in the process of generating test cases with the genetic algorithm, predetermined population individuals and their fitness are used as samples to train the support vector machine regression model, and in the subsequent population evolution the model is used to calculate individual fitness in place of the conventional instrumentation-based fitness calculation;
step three, providing a method for finding test data with the support vector machine regression model and reusing them in program testing, wherein the model trained by the method for approximating individual fitness is used to search the test case library of the corresponding program for test data with higher fitness, and these test data are referenced when the program is tested with the genetic algorithm to generate the corresponding test cases, so that the test data undergo a crossover operation with the evolving individuals.
8. The method for test generation and reuse based on a support vector machine regression model according to claim 7, wherein the method for approximating fitness with the support vector machine regression model comprises the steps of:
step a, dividing the obtained samples into training samples and prediction samples in the ratio 8:2;
step b, calculating the weight of each feature of the training samples, the calculation involving the risk function R in the presence of errors and the Lagrange multipliers α, and predicting the fitness values of the test cases using the kernel mapping method;
step c, correcting the weights for each sample, and validating using cross-validation;
step d, ending the training of the prediction model after all training samples have been trained on;
step e, with model training finished, using the prediction samples to establish the fitness interval of individuals that cover the target path.
9. The method for test generation and reuse based on a support vector machine regression model according to claim 7, wherein the method for generating test cases with the genetic algorithm comprises the steps of:
step S1, initializing a population;
step S2, inputting the population individuals into the instrumented program to obtain the fitness value of each individual, and inputting the individuals and their fitness values into the support vector machine model to train the support vector machine regression model;
step S3, calculating the approximate fitness of each evolving individual with the trained model;
step S4, calculating the exact value with the instrumented program for individuals whose fitness values lie within a preset threshold range;
step S5, terminating the algorithm if the termination condition is met or the maximum number of iterations is reached, and otherwise going to step S6;
step S6, performing selection, crossover and mutation operations on the individuals, and going to step S3.
10. The method for test generation and reuse based on a support vector machine regression model according to claim 7, wherein the method for finding test data with the support vector machine regression model and reusing them in program testing comprises the steps of:
step (1), selecting individuals with higher fitness using the trained model, the support vector machine regression module establishing a fitness interval for individuals that cover the target path;
step (2), packaging the selected test data as genetic individuals into a database, and introducing them into the genetic process of the genetic algorithm;
step (3), in the process of evolving the new population of the next generation, performing, with a certain probability, a crossover operation between the population and individuals randomly introduced from the database;
step (4), removing an individual from the database if the number of times the individual has been referenced is greater than three;
step (5), terminating the algorithm if the termination condition is met or the maximum number of iterations is reached, and otherwise going to step (6);
step (6), repeating step (3) and step (4) in the evolution of each new generation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910606331.2A CN110533150B (en) | 2019-07-05 | 2019-07-05 | Test generation and reuse system and method based on support vector machine regression model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110533150A (en) | 2019-12-03 |
CN110533150B (en) | 2023-05-23 |
Family
ID=68659531
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910606331.2A Active CN110533150B (en) | 2019-07-05 | 2019-07-05 | Test generation and reuse system and method based on support vector machine regression model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110533150B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111127146B (en) * | 2019-12-19 | 2023-05-26 | 江西财经大学 | Information recommendation method and system based on convolutional neural network and noise reduction self-encoder |
CN113297060A (en) * | 2020-05-11 | 2021-08-24 | 阿里巴巴集团控股有限公司 | Data testing method and device |
CN112015636A (en) * | 2020-07-14 | 2020-12-01 | 北京淇瑀信息科技有限公司 | Decision engine testing method and device based on support vector machine and electronic equipment |
CN114968824B (en) * | 2022-07-28 | 2022-09-30 | 江西财经大学 | Testing method and system based on chain multi-path coverage |
CN116010291A (en) * | 2023-03-28 | 2023-04-25 | 江西财经大学 | Multipath coverage test method based on equalization optimization theory and gray prediction model |
CN116303094B (en) * | 2023-05-10 | 2023-07-21 | 江西财经大学 | Multipath coverage test method based on RBF neural network and individual migration |
CN117632770B (en) * | 2024-01-25 | 2024-04-19 | 江西财经大学 | Multipath coverage test case generation method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012116208A2 (en) * | 2011-02-23 | 2012-08-30 | New York University | Apparatus, method, and computer-accessible medium for explaining classifications of documents |
CN107292406A (en) * | 2016-03-30 | 2017-10-24 | 中国石油化工股份有限公司 | Seismic properties method for optimizing based on vector regression and genetic algorithm |
CN108446214A (en) * | 2018-01-31 | 2018-08-24 | 浙江理工大学 | Test case evolution generation method based on DBN |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7346209B2 (en) * | 2002-09-30 | 2008-03-18 | The Board Of Trustees Of The Leland Stanford Junior University | Three-dimensional pattern recognition method to detect shapes in medical images |
Non-Patent Citations (4)
Title |
---|
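Feature selection for support vector machines by; H. Frohlich et al.; Proceedings. 15th IEEE International Conference on Tools; 2003-12-08; pp. 1-7 * |
Using genetic algorithms for test case generation; Izzat Alsmadi; CCECE 2010; 2010-09-16; pp. 1-4 * |
Test case generation method using particle swarm optimization based on pattern combination; Jiang Shujuan et al.; Journal of Software; 2016-12-31; vol. 27, no. 4; pp. 785-801 * |
Evolutionary generation of path coverage test data incorporating a neural network; Yao Xiangjuan et al.; Journal of Software; 2016-12-31; vol. 27, no. 4; pp. 828-838 * |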
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||