CN108563555A - Fault-inducing code change prediction method based on four-objective optimization
- Publication number: CN108563555A (application CN201810021354.2A)
- Authority: CN (China)
- Prior art keywords: code, chromosome, change, value, failure
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
- G06F11/3608—Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
Abstract
The invention discloses a method for predicting fault-inducing code changes based on four-objective optimization, belonging to the field of software quality assurance. The method includes the following steps: (1) By mining the version control system and defect tracking system that host a software project, a data set for building a code change fault prediction model is collected. (2) Based on four optimization objectives, a genetic algorithm constructs multiple models that do not Pareto-dominate one another. The four objectives are: maximizing the number of faulty code changes the method identifies, minimizing the code inspection effort developers must spend, reducing developers' context switches, and reducing the method's false alarms on faulty code changes. (3) Because step (2) yields multiple mutually non-dominated models, at prediction time a model can be selected flexibly from them according to the developer's preference among the objectives.
Description
Technical field
The invention belongs to the technical field of software quality assurance, and specifically relates to a method for predicting fault-inducing code changes based on four-objective optimization.
Background art
Software fault prediction mines software history repositories (such as version control systems and defect tracking systems) to build a fault prediction model that identifies potentially faulty program modules in the project under test in advance. Allocating more testing resources to these modules optimizes the distribution of testing resources and thereby effectively improves the quality of the software product.
The present invention focuses on fault prediction for code changes submitted to a version control system. This has the following advantages:
(1) A code change that a developer submits to the version control system generally involves few lines of code (typically only a few dozen), so if it is predicted to contain a fault, repairing the fault is not very difficult.
(2) The invention can be deployed in an enterprise's internal version control system. After a user submits a code change, the invention can predict faults promptly and return the result to the developer, so that the developer can quickly locate and remove the fault while still familiar with the code design.
To maximize the number of faulty code changes the method identifies, minimize the code inspection effort developers must spend, reduce developers' context switches, and reduce the method's false alarms on faulty code changes, all four optimization objectives must be considered together when designing the construction method for the code change fault prediction model. The present invention arises from this need.
Summary of the invention
Object of the invention: The object of the present invention is to overcome deficiencies in the prior art by providing an effective method for timely prediction on code changes. After a developer submits a code change to the version control system, fault prediction can be carried out promptly, helping to locate and remove the fault quickly and ultimately improving the quality of the software product. When building the model, four different optimization objectives are considered simultaneously: maximizing the number of faulty code changes the method identifies, minimizing the code inspection effort developers must spend, reducing developers' context switches, and reducing the method's false alarms on faulty code changes.
Technical solution: The method of the present invention for predicting fault-inducing code changes based on four-objective optimization includes the following steps:
(1) By mining the version control system and defect tracking system that host the software project, collect the data set for building the code change fault prediction model. First, analyze the version control system hosting the project and extract all historical code changes of the project; second, measure the extracted code changes;
(2) Based on the four optimization objectives, use a genetic algorithm to construct multiple models that do not Pareto-dominate one another;
(3) After the multiple mutually non-dominated models have been constructed, select flexibly among them according to the developer's preference among the objectives.
Further, the metrics used in step (1) include: (a) the diffusion of the code change; (b) the size of the code change; (c) the purpose of the code change; (d) the history of the code change; (e) the experience of the developers associated with the code change.
Further, constructing multiple mutually non-dominated models with a genetic algorithm based on the four optimization objectives in step (2) includes the following steps:
2-1) Initialize the population: Logistic regression is used to build the code change fault prediction model. Assuming n metrics are used to measure a code change, the coefficients of the logistic regression model can be represented by the vector w = {w_1, w_2, ..., w_n}, where each element is a real number. Given the model's coefficient vector w and a code change m_i to be predicted, where the value of the j-th metric on that code change is v_{i,j}, the probability with which the model predicts that the code change contains a fault is calculated from w and the values v_{i,j} by the logistic regression formula.
The threshold is set to 0.5: if the predicted probability is greater than 0.5, the code change is considered likely to introduce a fault; if the predicted probability is less than 0.5, the code change is considered likely to be correctly implemented.
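As a reading aid, the probability and threshold rules described above can be written in the standard logistic regression form. This is a sketch under stated assumptions: no intercept term is included because w is defined with exactly n elements, the names P and Pred are introduced here for illustration, and a probability of exactly 0.5 is mapped to 0, which matches the row with probability 0.50 in Table 3 of the embodiment.

```latex
P(m_i, w) = \frac{1}{1 + \exp\left(-\sum_{j=1}^{n} w_j \, v_{i,j}\right)}
\qquad
\mathrm{Pred}(m_i, w) =
\begin{cases}
1, & P(m_i, w) > 0.5 \\
0, & \text{otherwise}
\end{cases}
```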
Chromosomes in the population are encoded with this vector. When the population is initialized, N chromosomes are generated at random, with the vector elements of each chromosome assigned random values. Then the fitness value of each chromosome on each of the four optimization objectives is calculated:
Optimization objective 1: Maximize the number of faulty code changes identified on the data set; the larger the value, the better. Assume all historical code changes in the data set form the set M and the candidate solution corresponding to the chromosome is w. The value is the sum, over all code changes m_i in M, of buggy(m_i) multiplied by the type the model predicts for m_i (1 for faulty, 0 otherwise), where buggy(m_i) indicates whether the code change contains a fault: its value is 1 if it does and 0 otherwise;
Optimization objective 2: Minimize the code inspection effort; the smaller the value, the better. The value is the sum of LOC(m_i) over the code changes in M that the model predicts to be faulty, where LOC(m_i) is the number of lines of code the code change involves;
All code changes in M are sorted in descending order of the probability the model predicts for them and are inspected in that order; the fitness values of the remaining two optimization objectives can then be calculated;
Optimization objective 3: Since the distribution of defects in the project under test roughly follows the 80/20 rule, this objective is the number of code changes that must be inspected once 20% of the code inspection effort has been spent; the smaller the value, the better. Here the total code inspection effort is the sum of LOC(m_i) over all code changes in M. A higher value means that more code changes must be inspected under the same inspection effort, so developers must make more context switches, which affects their development efficiency;
Optimization objective 4: When developers inspect the code changes in the returned order, this objective is the number of code changes inspected before the first truly faulty code change is encountered; the smaller the value, the better. A higher value indicates a more serious false-alarm problem in the model, which may affect developers' confidence and patience.
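For reference, the first two objectives and the total inspection effort can be written as sums that agree term by term with the worked example in the embodiment (actual type times predicted type for objective 1, predicted type times lines of code for objective 2). The names f_1, f_2, LOC_total, and Pred (the 0/1 type obtained by thresholding the predicted probability at 0.5) are assumptions introduced here for illustration.

```latex
f_1(w) = \sum_{m_i \in M} \mathrm{buggy}(m_i) \cdot \mathrm{Pred}(m_i, w)
\qquad
f_2(w) = \sum_{m_i \in M} \mathrm{Pred}(m_i, w) \cdot \mathrm{LOC}(m_i)
\qquad
\mathrm{LOC}_{\mathrm{total}} = \sum_{m_i \in M} \mathrm{LOC}(m_i)
```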
2-2) Based on the previous population, execute the crossover operator and the mutation operator in turn to generate new chromosomes. According to the crossover probability, the crossover operator randomly selects two chromosomes from the previous population, crosses them, and generates two new chromosomes; according to the mutation probability, the mutation operator randomly selects one chromosome, mutates it, and generates one new chromosome;
2-3) Merge the chromosomes of the previous population with the new chromosomes to form the set B, then compute an NDR value for each chromosome in B based on the Pareto dominance relation. Pareto dominance is first defined:
Assume two candidate solutions w_i and w_j. Then w_i Pareto-dominates w_j if and only if, on all four optimization objectives, solution w_i is no worse than solution w_j, and on at least one optimization objective solution w_i is better than solution w_j;
Chromosomes are then selected into the new population by NDR value: first the chromosomes whose NDR value is 1, then the chromosomes whose NDR value is 2, and so on. The process ends when the number of selected chromosomes equals N (the population size);
2-4) Repeat steps 2-2 and 2-3. After the specified number of population iterations is reached, return all chromosomes in the current population that are not Pareto-dominated by any other chromosome; each such chromosome corresponds to one model.
Further, the NDR values are calculated as follows: first, identify all chromosomes in B that are not Pareto-dominated by any other chromosome, set their NDR value to 1, and remove them from B. Then identify all chromosomes remaining in B that are not Pareto-dominated by any other chromosome, set their NDR value to 2, and remove them from B. Repeat this process until B is empty.
Beneficial effects: The present invention is an effective method for timely prediction on code changes. After a developer submits a code change to the version control system, fault prediction can be carried out promptly, helping to locate and remove the fault quickly and ultimately improving the quality of the software product. When building the model, four different optimization objectives are considered simultaneously: maximizing the number of faulty code changes the method identifies, minimizing the code inspection effort developers must spend, reducing developers' context switches, and reducing the method's false alarms on faulty code changes.
Description of the drawings
Fig. 1 is the overall flow chart of the present invention;
Fig. 2 is a schematic diagram of a specific code change;
Fig. 3 is a schematic diagram of the process of labeling fault-introducing code changes;
Fig. 4 is a screenshot of the data set collected by the present invention for an open-source project;
Fig. 5 is a schematic diagram of the execution of the crossover operator of the present invention;
Fig. 6 is a schematic diagram of the execution of the mutation operator of the present invention.
Detailed description
To set out the technical approach of the above invention more clearly, the inventors give the following specific embodiment to illustrate its technical effect. It should be emphasized that the embodiment is intended to illustrate the invention and not to limit its scope.
Embodiment 1
The overall flow of the fault-inducing code change prediction method based on four-objective optimization in this embodiment is shown in Fig. 1. It includes the following steps:
(1) By mining the version control system and defect tracking system that host the software project, collect the data set for building the code change fault prediction model. First, analyze the version control system hosting the project and extract all historical code changes of the project. A specific code change is shown in Fig. 2. This code change modifies password.c: a line marked with "+" is code added by the code change, and a line marked with "-" is code deleted by the code change.
Second, the extracted code changes are measured. The metrics include: (a) the diffusion of the code change, for example the number of subsystems, the number of directories, and the number of files it modifies; (b) the size of the code change, for example the number of lines of code added, the number of lines of code deleted, and the number of lines of code before the modification; (c) the purpose of the code change, i.e. whether the code change is intended to remove a fault; (d) the history of the code change, for example the number of developers the code change involves; (e) the experience of the developers associated with the code change. Table 1 lists common metrics and their meanings.
Table 1: Categories, names, and meanings of common metrics
Finally, by analyzing the defect reports and change logs associated with the code changes, each code change can be labeled as either fault-introducing or correctly implemented. Specifically, the code changes that repair faults are identified first, mainly by searching the change log for keywords such as Fixed or Bug. Whether such a code change really repairs a fault is then confirmed, mainly using information in the defect tracking system: the fault ID (for example Bug 12345) links the code change to the fault in the defect tracking system. Finally, the point at which the fault was introduced is determined. The diff command is first used to determine the specific code modified by the fault-repairing code change, and the annotate command is then used to find the most recent code change that modified that specific code. Fig. 3 briefly illustrates the labeling process. For example, if the code change from version B to version C repairs a fault, we first compare the modified code (version C) with the code before modification (version B) using the diff command to obtain the specific code the code change involves. We then search the revision history of the file for the most recent code change that modified this code (the fault-introducing code change) and label that code change as fault-introducing.
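The first two labeling steps (spotting fault-repairing code changes by keyword and extracting the fault ID that links them to the defect tracking system) can be sketched as below. This is a minimal illustration, not the patent's implementation: the function names, the regular expressions, and the exact keyword variants accepted are assumptions, and the diff/annotate step is omitted because it depends on the particular version control system.

```python
import re

# Hypothetical patterns: the text only names "Fixed" and "Bug" as example keywords.
FIX_KEYWORDS = re.compile(r"\b(?:fix(?:e[sd])?|bug)\b", re.IGNORECASE)
BUG_ID = re.compile(r"\bbug\s*#?\s*(\d+)", re.IGNORECASE)


def looks_like_fix(log_message: str) -> bool:
    """Return True if the change log mentions a fix-related keyword."""
    return bool(FIX_KEYWORDS.search(log_message))


def extract_bug_ids(log_message: str) -> list[int]:
    """Return the fault IDs mentioned in the log, e.g. 'Bug 12345' gives [12345]."""
    return [int(number) for number in BUG_ID.findall(log_message)]
```

A candidate found this way would still need to be confirmed against the defect tracking system before the change is treated as a fault repair.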
The data set is complete once the historical code changes have been measured and labeled. Fig. 4 is a screenshot of the data set collected for an open-source project.
(2) Based on the four optimization objectives, use a genetic algorithm to construct multiple models that do not Pareto-dominate one another:
2-1) Initialize the population. The present invention uses logistic regression to build the code change fault prediction model. Assuming n metrics are used to measure a code change, the coefficients of the logistic regression model can be represented by the vector w = {w_1, w_2, ..., w_n}, where each element is a real number. Given the model's coefficient vector w and a code change m_i to be predicted, where the value of the j-th metric on that code change is v_{i,j}, the probability with which the model predicts that the code change contains a fault is calculated from w and the values v_{i,j} by the logistic regression formula.
The present invention sets the threshold to 0.5: if the predicted probability is greater than 0.5, the code change is considered likely to introduce a fault; if the predicted probability is less than 0.5, the code change is considered likely to be correctly implemented.
Chromosomes in the population are encoded with this vector. When the population is initialized, N chromosomes are generated at random, with the vector elements of each chromosome assigned random values. Then the fitness value of each chromosome on each of the four optimization objectives is calculated.
Optimization objective 1: Maximize the number of faulty code changes identified on the data set; the larger the value, the better. Assume all historical code changes in the data set form the set M and the candidate solution corresponding to the chromosome is w. The value is the sum, over all code changes m_i in M, of buggy(m_i) multiplied by the type the model predicts for m_i (1 for faulty, 0 otherwise), where buggy(m_i) indicates whether the code change contains a fault: its value is 1 if it does and 0 otherwise.
Optimization objective 2: Minimize the code inspection effort; the smaller the value, the better. The value is the sum of LOC(m_i) over the code changes in M that the model predicts to be faulty, where LOC(m_i) is the number of lines of code the code change involves.
All code changes in M are sorted in descending order of the probability the model predicts for them and are inspected in that order; the fitness values of the remaining two optimization objectives can then be calculated.
Optimization objective 3: Since the distribution of defects in the project under test roughly follows the 80/20 rule, this objective is the number of code changes that must be inspected once 20% of the code inspection effort has been spent; the smaller the value, the better. Here the total code inspection effort is the sum of LOC(m_i) over all code changes in M. A higher value means that more code changes must be inspected under the same inspection effort, so developers must make more context switches, which affects their development efficiency.
Optimization objective 4: When developers inspect the code changes in the returned order, this objective is the number of code changes inspected before the first truly faulty code change is encountered; the smaller the value, the better. A higher value indicates a more serious false-alarm problem in the model, which may affect developers' confidence and patience.
We use a simplified example to introduce the calculation of the four optimization objectives in turn. The number of lines of code a code change involves is obtained by adding the value of metric LA (lines of code added) to the value of metric LD (lines of code deleted).
Assume the data set is as shown in Table 2 (for ease of analysis, the LA and LD values are summed to give the lines of code each code change involves). The predictions of the model built from some chromosome are shown in Table 3, in which all code changes are sorted in descending order of predicted probability.
Table 2: Raw data set
ID | NS | ND | …… | LA+LD | Actual type |
1 | 2 | 3 | …… | 100 | 1 |
2 | 1 | 2 | …… | 25 | 0 |
3 | 3 | 3 | …… | 50 | 1 |
4 | 1 | 2 | …… | 100 | 0 |
5 | 2 | 2 | …… | 50 | 0 |
6 | 1 | 4 | …… | 100 | 0 |
7 | 4 | 2 | …… | 30 | 1 |
8 | 3 | 3 | …… | 75 | 0 |
9 | 2 | 4 | …… | 200 | 1 |
10 | 1 | 2 | …… | 70 | 0 |
11 | 5 | 1 | …… | 200 | 1 |
Table 3: Data set after sorting
ID | NS | ND | …… | LA+LD | Actual type | Predicted probability | Predicted type |
10 | 1 | 2 | …… | 70 | 0 | 0.95 | 1 |
7 | 4 | 2 | …… | 30 | 1 | 0.90 | 1 |
1 | 2 | 3 | …… | 100 | 1 | 0.85 | 1 |
8 | 3 | 3 | …… | 75 | 0 | 0.82 | 1 |
2 | 1 | 2 | …… | 25 | 0 | 0.75 | 1 |
11 | 5 | 1 | …… | 200 | 1 | 0.65 | 1 |
3 | 3 | 3 | …… | 50 | 1 | 0.60 | 1 |
4 | 1 | 2 | …… | 100 | 0 | 0.55 | 1 |
5 | 2 | 2 | …… | 50 | 0 | 0.50 | 0 |
6 | 1 | 4 | …… | 100 | 0 | 0.40 | 0 |
9 | 2 | 4 | …… | 200 | 1 | 0.30 | 0 |
From Table 3, the fitness values of the chromosome on the four optimization objectives can be calculated in turn.
Optimization objective 1: 0×1 + 1×1 + 1×1 + 0×1 + 0×1 + 1×1 + 1×1 + 0×1 + 0×0 + 0×0 + 1×0 = 4.
Optimization objective 2: 70 + 30 + 100 + 75 + 25 + 200 + 50 + 100 = 650.
Optimization objective 3: by the time the 3rd code change has been inspected, 20% of the inspection resources have been used (200 lines of code have been inspected, out of 1000 lines in total), so the objective value is 3.
Optimization objective 4: the 2nd code change inspected is the first one that truly contains a fault, so the objective value is 1.
2-2) Based on the previous population, execute the crossover operator and the mutation operator in turn to generate new chromosomes. According to the crossover probability, the crossover operator randomly selects two chromosomes from the previous population, crosses them, and generates two new chromosomes. According to the mutation probability, the mutation operator randomly selects one chromosome, mutates it, and generates one new chromosome.
The crossover operator is illustrated in Fig. 5: the crossover takes place between element 3 and element 4 and generates two new chromosomes.
The mutation operator is illustrated in Fig. 6: the value of element 4 is mutated, generating a new chromosome.
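The two operators can be sketched as below. This is an illustration under assumptions rather than the patent's exact operators: single-point crossover and single-element replacement are inferred from the descriptions of Fig. 5 and Fig. 6, the function names are hypothetical, and the range for the mutated value (-1 to 1) is not stated in the text.

```python
import random


def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two chromosomes after the first `point` elements."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def mutate(chromosome, index, new_value):
    """Return a copy of the chromosome with one element replaced."""
    child = list(chromosome)
    child[index] = new_value
    return child


def make_offspring(population, crossover_prob, mutation_prob, rng):
    """Apply crossover, then mutation, each with its own probability."""
    offspring = []
    if rng.random() < crossover_prob:
        a, b = rng.sample(population, 2)
        point = rng.randrange(1, len(a))  # assumes chromosomes have >= 2 elements
        offspring.extend(single_point_crossover(a, b, point))
    if rng.random() < mutation_prob:
        parent = rng.choice(population)
        index = rng.randrange(len(parent))
        offspring.append(mutate(parent, index, rng.uniform(-1.0, 1.0)))
    return offspring
```

With `point=3`, `single_point_crossover` cuts between element 3 and element 4 as in Fig. 5, and `mutate(chromosome, 3, value)` changes element 4 as in Fig. 6 (indices are zero-based).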
2-3) Merge the chromosomes of the previous population with the new chromosomes to form the set B, then compute an NDR (non-dominated rank) value for each chromosome in B based on the Pareto dominance relation. Pareto dominance is first defined:
Assume two candidate solutions w_i and w_j. Then w_i Pareto-dominates w_j if and only if, on all four optimization objectives, solution w_i is no worse than solution w_j, and on at least one optimization objective solution w_i is better than solution w_j.
The Pareto dominance relation is explained with an example. Suppose there are three candidate solutions:
● Candidate solution 1 has the values 4, 650, 3, 1 on the four optimization objectives.
● Candidate solution 2 has the values 5, 630, 3, 1 on the four optimization objectives.
● Candidate solution 3 has the values 5, 630, 3, 2 on the four optimization objectives.
Note that for optimization objective 1 a higher value is better, whereas for optimization objectives 2 to 4 a smaller value is better.
By the definition, candidate solution 2 Pareto-dominates candidate solution 1, because its values on optimization objectives 1 and 2 are better and its values on the other optimization objectives are equal.
Candidate solution 3, however, does not Pareto-dominate candidate solution 1: its values on optimization objectives 1 and 2 are better, but its value on optimization objective 4 is worse.
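The dominance check in this example can be expressed directly in code. This is a minimal sketch; the function name and the tuple layout (objective 1 first, then objectives 2 to 4) are assumptions made for illustration.

```python
def dominates(a, b):
    """Return True if fitness tuple a Pareto-dominates fitness tuple b.

    Each tuple is (objective 1, objective 2, objective 3, objective 4);
    objective 1 is maximized and objectives 2 to 4 are minimized.
    """
    # Negate objective 1 so that "smaller is better" holds for every entry.
    a_cost = (-a[0], a[1], a[2], a[3])
    b_cost = (-b[0], b[1], b[2], b[3])
    no_worse = all(x <= y for x, y in zip(a_cost, b_cost))
    strictly_better = any(x < y for x, y in zip(a_cost, b_cost))
    return no_worse and strictly_better


candidate_1 = (4, 650, 3, 1)
candidate_2 = (5, 630, 3, 1)
candidate_3 = (5, 630, 3, 2)

print(dominates(candidate_2, candidate_1))  # True
print(dominates(candidate_3, candidate_1))  # False
```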
The NDR values are calculated as follows: first, identify all chromosomes in B that are not Pareto-dominated by any other chromosome, set their NDR value to 1, and remove them from B. Then identify all chromosomes remaining in B that are not Pareto-dominated by any other chromosome, set their NDR value to 2, and remove them from B. Repeat this process until B is empty.
Chromosomes are then selected into the new population by NDR value: first the chromosomes whose NDR value is 1, then the chromosomes whose NDR value is 2, and so on. The process ends when the number of selected chromosomes equals N.
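The ranking and selection steps can be sketched as follows. The names are hypothetical, and one detail is an assumption: the text does not say which chromosomes to keep when the last rank taken would exceed N, so this sketch simply keeps them in their original order until N is reached.

```python
def dominates(a, b):
    """Pareto dominance for (obj1 maximized, obj2..obj4 minimized) tuples."""
    a_cost = (-a[0], a[1], a[2], a[3])
    b_cost = (-b[0], b[1], b[2], b[3])
    return (all(x <= y for x, y in zip(a_cost, b_cost))
            and any(x < y for x, y in zip(a_cost, b_cost)))


def non_dominated_ranks(fitness):
    """Return the NDR value (1, 2, ...) of each fitness tuple in the list."""
    remaining = set(range(len(fitness)))
    ranks = [0] * len(fitness)
    rank = 1
    while remaining:
        # Chromosomes not dominated by any other chromosome still in B.
        front = [i for i in remaining
                 if not any(dominates(fitness[j], fitness[i])
                            for j in remaining if j != i)]
        for i in front:
            ranks[i] = rank
        remaining.difference_update(front)
        rank += 1
    return ranks


def select_next_population(fitness, pop_size):
    """Return indices of the pop_size chromosomes with the lowest NDR values."""
    ranks = non_dominated_ranks(fitness)
    order = sorted(range(len(fitness)), key=lambda i: ranks[i])  # stable sort
    return order[:pop_size]
```

For the three candidate solutions of the dominance example, `non_dominated_ranks` gives candidate 2 rank 1 and candidates 1 and 3 rank 2, since neither of the latter dominates the other.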
2-4) Repeat steps 2-2 and 2-3. After the specified number of population iterations is reached, return all chromosomes in the current population that are not Pareto-dominated by any other chromosome. Each such chromosome corresponds to one model.
(3) Step (2) yields multiple models that do not Pareto-dominate one another, so at prediction time a model can be selected flexibly from them according to the developer's preference among the objectives. For example, if the developer gives more weight to optimization objective 1, the model that performs better on optimization objective 1 is selected for prediction. If the developer gives more weight to optimization objective 2, the model that performs better on optimization objective 2 is selected for prediction.
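One simple way to realize this preference-based choice is to pick, among the returned models, the one with the best value on the objective the developer cares about most. The sketch below assumes that reading; the function name and the example fitness values are hypothetical, and other selection rules (for example weighted combinations of objectives) would be equally compatible with the text.

```python
def pick_model(models, fitness, objective):
    """Return the model with the best value on the preferred objective (1 to 4).

    fitness[i] is the (obj1, obj2, obj3, obj4) tuple of models[i];
    objective 1 is maximized, objectives 2 to 4 are minimized.
    """
    column = objective - 1
    indices = range(len(models))
    if objective == 1:
        best = max(indices, key=lambda i: fitness[i][column])
    else:
        best = min(indices, key=lambda i: fitness[i][column])
    return models[best]


# Hypothetical front of three mutually non-dominated models.
models = ["model_a", "model_b", "model_c"]
fitness = [(6, 900, 4, 2), (5, 630, 3, 1), (3, 400, 5, 0)]

print(pick_model(models, fitness, objective=1))  # model_a
print(pick_model(models, fitness, objective=2))  # model_c
```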
The present invention is an effective method for timely prediction on code changes. After a developer submits a code change to the version control system, fault prediction can be carried out promptly, helping to locate and remove the fault quickly and ultimately improving the quality of the software product. When building the model, four different optimization objectives are considered simultaneously: maximizing the number of faulty code changes the method identifies, minimizing the code inspection effort developers must spend, reducing developers' context switches, and reducing the method's false alarms on faulty code changes.
The above is only a preferred embodiment of the present invention and does not limit the present invention in any form. Although the present invention has been disclosed above by way of a preferred embodiment, that embodiment is not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the technical content disclosed above to make minor changes or modifications that amount to equivalent embodiments. Any simple modification, equivalent change, or refinement made to the above embodiment in accordance with the technical essence of the present invention, without departing from the content of its technical solution, still falls within the scope of the technical solution of the present invention.
Claims (4)
1. a kind of failure based on four objective optimizations changes code prediction method, it is characterised in that:Include the following steps:
(1) it by excavating the version control system and defect tracking system of software project trustship, collects for building change code
The data set of failure predication model extracts all history of project first by the version control system of analysis project trustship
Code is changed, secondly the change code extracted is measured;
(2) multiple moulds for having non-Pareto dominance relation are finally constructed by genetic algorithm based on four optimization aims
Type;
(3) after constructing multiple models for having non-Pareto dominance relation, according to developer to the preference of target, from these
It is flexibly selected in model.
2. a kind of failure based on four objective optimizations according to claim 1 changes code prediction method, it is characterised in that:
The index of step (1) vacuum metrics includes:(a) degree of scatter of code is changed;(b) the modification amount of code is changed;(c) code is changed
Modification purpose;(d) history of code is changed;(e) experience of code dependent developer is changed.
3. The faulty code change prediction method based on four-objective optimization according to claim 1, characterized in that constructing, in step (2), multiple mutually non-dominated models by a genetic algorithm based on four optimization objectives comprises the following steps:
2-1) Initialize the population: logistic regression is used to build the faulty code change prediction model. Assuming that n metrics are used to measure a code change, the coefficients of the logistic regression model can be represented by the vector $w = \{w_1, w_2, \ldots, w_n\}$, where each element is a real number. Given the coefficient vector $w$ of the model and a code change $m_i$ to be predicted, where the value of the j-th metric on that code change is $v_{i,j}$, the probability that the model predicts the code change to contain a fault can be calculated with the following formula:
The threshold is set to 0.5: if the predicted probability is greater than 0.5, the code change is considered likely to introduce a fault; if the predicted probability is less than 0.5, the code change is considered likely to be correct. The calculation formula is as follows:
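The probability calculation and the threshold rule described above can be sketched as follows. This is a minimal illustration that assumes the standard logistic (sigmoid) form applied to the weighted sum of the metric values, with no intercept term because $w$ has exactly n elements. The function names `fault_probability` and `predict_label` are hypothetical, and a probability exactly equal to the threshold is treated here as non-faulty, which the text does not specify.

```python
import math


def fault_probability(w, v):
    """Probability that one code change is faulty (assumed logistic form).

    w: coefficient vector of the model (one chromosome).
    v: metric values v_{i,1..n} of the code change m_i.
    """
    z = sum(wj * vj for wj, vj in zip(w, v))
    # Numerically stable sigmoid: avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_label(w, v, threshold=0.5):
    """Return 1 (likely faulty) if the probability exceeds the threshold, else 0."""
    return 1 if fault_probability(w, v) > threshold else 0
```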
The chromosomes in the population are encoded with this vector. At population initialization, N chromosomes are generated at random, with each vector element of each chromosome assigned a random value. Then, based on the four optimization objectives, the fitness value of each chromosome on each of the four objectives is calculated:
Optimization objective 1: maximize the number of faulty code changes identified on the data set; the larger the value, the better. Assume that all historical code changes in the data set form the set $M$ and that the candidate solution corresponding to the chromosome is $w$; the calculation formula is:
where $buggy(m_i)$ indicates whether the code change contains a fault: its value is 1 if it does and 0 otherwise;
Optimization objective 2: minimize the code inspection effort; the smaller the value, the better. The calculation formula is:
where $LOC(m_i)$ denotes the number of lines of code involved in the code change;
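The first two objectives can be sketched in a few lines. The sketch assumes, since the formulas themselves are not reproduced in this text, that objective 1 counts the faulty changes that the model also labels as faulty, and that objective 2 sums the lines of code of the changes labelled as faulty (the changes a developer would be asked to inspect). The function names and the parallel-list data layout are hypothetical.

```python
def objective_identified_faults(predicted, buggy):
    """Objective 1 (maximize): faulty changes that the model flags.

    predicted: 0/1 labels produced by the threshold rule, one per change in M.
    buggy: 0/1 ground truth, i.e. buggy(m_i), one per change in M.
    """
    return sum(p * b for p, b in zip(predicted, buggy))


def objective_inspection_effort(predicted, loc):
    """Objective 2 (minimize): lines of code in the flagged changes.

    loc: LOC(m_i), one per change in M.
    """
    return sum(p * n for p, n in zip(predicted, loc))
```

For example, with `predicted = [1, 0, 1, 1]`, `buggy = [1, 1, 0, 1]` and `loc = [10, 20, 30, 40]`, two faulty changes are identified at an inspection effort of 80 lines.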
All code changes in $M$ are sorted in descending order of the probability that the model predicts for them on the data set and are inspected in that order; the fitness values of the following two optimization objectives can then be calculated;
Optimization objective 3: since the distribution of defects in the project under test roughly follows the 80/20 principle, this objective is the number of code changes that need to be inspected once 20% of the code inspection effort has been spent; the smaller the value, the better. Here the total code inspection effort is $\sum_{m_i \in M} LOC(m_i)$. A higher value means that more code changes must be inspected for the same inspection effort, which means that developers need to perform more context switches, reducing their development efficiency;
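A minimal sketch of objective 3 follows. It assumes that the total inspection effort is the sum of the lines of code over all changes, that changes are inspected in descending order of predicted probability, and that a change is counted only if it fits entirely within the 20% budget; the last point is an assumption, since the text does not say how a change that straddles the budget is handled. The name `changes_inspected_at_effort` is hypothetical.

```python
def changes_inspected_at_effort(probs, loc, effort_fraction=0.2):
    """Objective 3 (minimize): changes inspected within the effort budget.

    probs: predicted fault probability of each change in M.
    loc: LOC(m_i) of each change in M.
    """
    # Inspect changes from the highest predicted probability downward.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    budget = effort_fraction * sum(loc)
    spent = 0
    count = 0
    for i in order:
        if spent + loc[i] > budget:
            break  # the next change would exceed the budget
        spent += loc[i]
        count += 1
    return count
```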
Optimization objective 4: this objective returns the number of code changes inspected before the first truly faulty code change is encountered when the developer inspects the code changes in order; the smaller the value, the better. A higher value indicates a more serious false-alarm problem in the model, which may undermine the developer's confidence and patience.
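Objective 4 can be sketched as a scan over the ranked changes. The sketch assumes the same descending-probability inspection order as objective 3 and returns the total number of changes when the data set contains no faulty change at all, a case the text does not address. The name `initial_false_alarms` is hypothetical.

```python
def initial_false_alarms(probs, buggy):
    """Objective 4 (minimize): changes inspected before the first real fault.

    probs: predicted fault probability of each change in M.
    buggy: 0/1 ground truth, i.e. buggy(m_i), of each change in M.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    for rank, i in enumerate(order):
        if buggy[i] == 1:
            return rank  # number of clean changes inspected before this one
    return len(order)  # assumption: no faulty change exists in M
```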
2-2) Based on the previous population, the crossover operator and the mutation operator are executed in turn to generate new chromosomes. According to the crossover probability, the crossover operator randomly selects two chromosomes from the previous population and crosses them to generate two new chromosomes; according to the mutation probability, the mutation operator randomly selects one chromosome and mutates it to generate one new chromosome;
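Step 2-2 can be sketched as below. The text does not name the concrete operators, so single-point crossover and a Gaussian perturbation of one randomly chosen coefficient are used here purely as illustrative assumptions, as is the choice of making one crossover attempt and one mutation attempt per population member. All function names and the `sigma` parameter are hypothetical; chromosomes need at least two elements for the crossover point to exist.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover (assumed operator): returns two children."""
    point = rng.randrange(1, len(parent_a))
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def mutate(parent, rng, sigma=0.1):
    """Gaussian mutation of one coefficient (assumed operator)."""
    child = list(parent)  # copy so the parent stays unchanged
    k = rng.randrange(len(child))
    child[k] += rng.gauss(0.0, sigma)
    return child


def make_offspring(population, p_cross, p_mut, rng):
    """Apply crossover, then mutation, to the previous population."""
    offspring = []
    for _ in range(len(population)):
        if rng.random() < p_cross:
            a, b = rng.sample(population, 2)
            offspring.extend(crossover(a, b, rng))
    for _ in range(len(population)):
        if rng.random() < p_mut:
            offspring.append(mutate(rng.choice(population), rng))
    return offspring
```

Passing a seeded `random.Random` instance as `rng` keeps a run reproducible.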
2-3) The chromosomes of the previous population and the new chromosomes are merged to form the set B, and an NDR value is then calculated for each chromosome in B based on the Pareto dominance relation. The Pareto dominance relation is first defined:
Assume there are two candidate solutions $w_i$ and $w_j$; then $w_i$ Pareto-dominates $w_j$ if and only if, on all four optimization objectives, $w_i$ is no worse than $w_j$, and on at least one optimization objective $w_i$ is better than $w_j$;
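The dominance definition translates directly into code. The sketch below assumes that each candidate solution is summarized by a 4-tuple of fitness values in the order of the four objectives, with objective 1 maximized and objectives 2 to 4 minimized; the names `MAXIMIZE` and `dominates` are hypothetical.

```python
# Direction of each objective: objective 1 is maximized, 2-4 are minimized.
MAXIMIZE = (True, False, False, False)


def dominates(fit_i, fit_j, maximize=MAXIMIZE):
    """Return True if solution i Pareto-dominates solution j."""
    strictly_better = False
    for a, b, is_max in zip(fit_i, fit_j, maximize):
        if not is_max:
            a, b = -a, -b  # turn minimization into maximization
        if a < b:
            return False  # i is worse on this objective
        if a > b:
            strictly_better = True
    return strictly_better
```

Two solutions with identical fitness values do not dominate each other, because neither is strictly better on any objective.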
Chromosomes are then selected into the new population based on their NDR values: chromosomes with an NDR value of 1 are selected first, then chromosomes with an NDR value of 2, and so on; the process ends when the number of selected chromosomes equals N (the population size);
2-4) Steps 2-2 and 2-3 are repeated; once the specified number of iterations is reached, all chromosomes in the current population that are not Pareto-dominated by any other chromosome are returned, each of which corresponds to one model.
4. The faulty code change prediction method based on four-objective optimization according to claim 3, characterized in that the NDR values are calculated as follows: first, all chromosomes in B that are not Pareto-dominated by any other chromosome are identified, their NDR values are set to 1, and they are removed from the set B. Next, all chromosomes in B that are not Pareto-dominated by any other remaining chromosome are identified, their NDR values are set to 2, and they are removed from the set B. The above process is repeated until the set B is empty.
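The NDR calculation of this claim, together with the NDR-based selection of step 2-3, can be sketched as follows. The sketch assumes 4-tuples of fitness values with objective 1 maximized and objectives 2 to 4 minimized, and it breaks ties within the last admitted NDR level by original index, which the text does not specify. The function names are hypothetical.

```python
MAXIMIZE = (True, False, False, False)  # assumed objective directions


def dominates(fit_i, fit_j, maximize=MAXIMIZE):
    """Return True if solution i Pareto-dominates solution j."""
    strictly_better = False
    for a, b, is_max in zip(fit_i, fit_j, maximize):
        if not is_max:
            a, b = -a, -b
        if a < b:
            return False
        if a > b:
            strictly_better = True
    return strictly_better


def ndr_values(fitnesses):
    """Peel off non-dominated chromosomes level by level until B is empty."""
    remaining = set(range(len(fitnesses)))
    ndr = [0] * len(fitnesses)
    level = 1
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(fitnesses[j], fitnesses[i])
                            for j in remaining if j != i)]
        for i in front:
            ndr[i] = level
        remaining -= set(front)
        level += 1
    return ndr


def select_next_population(fitnesses, n):
    """Return the indices of n chromosomes, lowest NDR value first."""
    ndr = ndr_values(fitnesses)
    order = sorted(range(len(fitnesses)), key=lambda i: ndr[i])
    return order[:n]
```

The loop always terminates because at least one remaining chromosome is non-dominated at every level.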
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810021354.2A CN108563555B (en) | 2018-01-10 | 2018-01-10 | Fault change code prediction method based on four-target optimization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108563555A (en) | 2018-09-21 |
CN108563555B (en) | 2020-03-31 |
Family
ID=63530900
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810021354.2A Expired - Fee Related CN108563555B (en) | 2018-01-10 | 2018-01-10 | Fault change code prediction method based on four-target optimization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108563555B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109710306A (en) * | 2017-10-25 | 2019-05-03 | 歌乐株式会社 | Source code resolver, source code analysis method, computer readable recording medium |
CN110083514A (en) * | 2019-03-19 | 2019-08-02 | 深圳壹账通智能科技有限公司 | Software test defect estimation method, apparatus, computer equipment and storage medium |
CN111597122A (en) * | 2020-07-24 | 2020-08-28 | 四川新网银行股份有限公司 | Software fault injection method based on historical defect data mining |
CN111858323A (en) * | 2020-07-11 | 2020-10-30 | 南京工业大学 | Code representation learning-based instant software defect prediction method |
WO2021104027A1 (en) * | 2019-11-28 | 2021-06-03 | 深圳前海微众银行股份有限公司 | Code performance testing method, apparatus and device, and storage medium |
CN113448821A (en) * | 2020-03-25 | 2021-09-28 | 北京京东振世信息技术有限公司 | Method and device for identifying engineering defects |
CN113656284A (en) * | 2021-07-26 | 2021-11-16 | 深圳技术大学 | Software defect prediction method and device, electronic equipment and storage medium |
CN114443476A (en) * | 2022-01-11 | 2022-05-06 | 阿里云计算有限公司 | Code review method and device |
CN114880206A (en) * | 2022-01-13 | 2022-08-09 | 南通大学 | Interpretability method of mobile application program code submission fault prediction model |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103207928A (en) * | 2012-01-13 | 2013-07-17 | 利弗莫尔软件技术公司 | Multi-objective engineering design optimization using sequential adaptive sampling in the pareto optimal regio |
CN104978612A (en) * | 2015-01-27 | 2015-10-14 | 厦门大学 | Distributed big data system risk predicating method based on AHP-RBF |
CN105893256A (en) * | 2016-03-30 | 2016-08-24 | 西北工业大学 | Software failure positioning method based on machine learning algorithm |
CN107066384A (en) * | 2017-03-28 | 2017-08-18 | 东南大学 | Software Evolution appraisal procedure based on Halstead complexity metrics |
CN107423219A (en) * | 2017-07-21 | 2017-12-01 | 北京航空航天大学 | A kind of construction method of the software fault prediction technology based on static analysis |
Non-Patent Citations (1)
Title |
---|
李丙栋: "超多目标演化算法及其应用研究", 《中国博士学位论文全文数据库信息科技辑》 * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109710306B (en) * | 2017-10-25 | 2022-06-07 | 株式会社日立制作所 | Source code analysis device, source code analysis method, and computer-readable recording medium |
CN109710306A (en) * | 2017-10-25 | 2019-05-03 | 歌乐株式会社 | Source code resolver, source code analysis method, computer readable recording medium |
CN110083514A (en) * | 2019-03-19 | 2019-08-02 | 深圳壹账通智能科技有限公司 | Software test defect estimation method, apparatus, computer equipment and storage medium |
CN110083514B (en) * | 2019-03-19 | 2023-03-10 | 深圳壹账通智能科技有限公司 | Software test defect evaluation method and device, computer equipment and storage medium |
WO2021104027A1 (en) * | 2019-11-28 | 2021-06-03 | 深圳前海微众银行股份有限公司 | Code performance testing method, apparatus and device, and storage medium |
CN113448821A (en) * | 2020-03-25 | 2021-09-28 | 北京京东振世信息技术有限公司 | Method and device for identifying engineering defects |
CN113448821B (en) * | 2020-03-25 | 2023-12-08 | 北京京东振世信息技术有限公司 | Method and device for identifying engineering defects |
CN111858323B (en) * | 2020-07-11 | 2021-06-01 | 南京工业大学 | Code representation learning-based instant software defect prediction method |
CN111858323A (en) * | 2020-07-11 | 2020-10-30 | 南京工业大学 | Code representation learning-based instant software defect prediction method |
CN111597122A (en) * | 2020-07-24 | 2020-08-28 | 四川新网银行股份有限公司 | Software fault injection method based on historical defect data mining |
CN113656284A (en) * | 2021-07-26 | 2021-11-16 | 深圳技术大学 | Software defect prediction method and device, electronic equipment and storage medium |
CN114443476A (en) * | 2022-01-11 | 2022-05-06 | 阿里云计算有限公司 | Code review method and device |
CN114880206A (en) * | 2022-01-13 | 2022-08-09 | 南通大学 | Interpretability method of mobile application program code submission fault prediction model |
CN114880206B (en) * | 2022-01-13 | 2024-06-11 | 南通大学 | Interpretability method for submitting fault prediction model by mobile application program code |
Also Published As
Publication number | Publication date |
---|---|
CN108563555B (en) | 2020-03-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108563555A (en) | Failure based on four objective optimizations changes code prediction method | |
CN106201871B (en) | Based on the Software Defects Predict Methods that cost-sensitive is semi-supervised | |
CN104798043B (en) | A kind of data processing method and computer system | |
Miller et al. | Automatic test data generation using genetic algorithm and program dependence graphs | |
CN109165159B (en) | Multi-defect positioning method based on program frequency spectrum | |
CN107544905B (en) | Regression test case set optimization method and system | |
CN109947652A (en) | A kind of improvement sequence learning method of software defect prediction | |
EP4075281A1 (en) | Ann-based program test method and test system, and application | |
CN105653450A (en) | Software defect data feature selection method based on combination of modified genetic algorithm and Adaboost | |
CN105808426A (en) | Path coverage test data generation method used for weak mutation test | |
CN107066389A (en) | The Forecasting Methodology that software defect based on integrated study is reopened | |
CN117236278B (en) | Chip production simulation method and system based on digital twin technology | |
CN116340726A (en) | Energy economy big data cleaning method, system, equipment and storage medium | |
Malhotra et al. | Mining the impact of object oriented metrics for change prediction using machine learning and search-based techniques | |
Clarke | Improving SLEUTH calibration with a genetic algorithm | |
CN104335161B (en) | Efficient evaluation of network robustness with a graph | |
Guo et al. | Automatic design for shop scheduling strategies based on hyper-heuristics: A systematic review | |
Quintana et al. | ALDI++: Automatic and parameter-less discord and outlier detection for building energy load profiles | |
Yu et al. | A multi-objective effort-aware defect prediction approach based on NSGA-II | |
CN117149615A (en) | Method and corresponding device for generating test case execution path | |
Groß | A prediction system for evolutionary testability applied to dynamic execution time analysis | |
CN116010291A (en) | Multipath coverage test method based on equalization optimization theory and gray prediction model | |
CN106096635A (en) | The warning sorting technique of cost-sensitive neutral net based on threshold operation | |
Alba et al. | Comparative analysis of modern optimization tools for the p-median problem | |
Ruiz-Andino et al. | A hybrid evolutionary approach for solving constrained optimization problems over finite domains |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20200331 ||