CN106528417A - Intelligent detection method and system of software defects - Google Patents
- Publication number
- CN106528417A, CN201610964353.2A
- Authority
- CN
- China
- Prior art keywords
- software
- sample
- module
- vector
- defect
- Legal status
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Stored Programmes (AREA)
Abstract
The invention relates to an intelligent detection method and system for software defects. The method comprises the following steps: obtaining and preprocessing sample software modules to obtain a software sample set; clustering the defect-free software sample set to obtain cluster subsets; randomly sampling the cluster subsets to obtain a balanced defect-free software sample set; obtaining an updated software sample set from the defective software sample set and the balanced defect-free software sample set; training on the updated software sample set to obtain a defect detection model; and using the defect detection model to predict defects in a software module to be tested and outputting the prediction result. The sample software modules are classified, and the defect-free software sample set is clustered and sampled to ensure that the samples are balanced. Because the defect detection model is trained with the balanced defect-free software sample set, the defect prediction model can better estimate and fit the defect data, which markedly improves the prediction of defect data and raises prediction accuracy.
Description
Technical field
The present invention relates to the technical field of software security, and in particular to an intelligent detection method and system for software defects.
Background
With the development of information technology, software complexity and software scale keep increasing. For complex software systems in particular, good software quality control and testing mechanisms not only help an enterprise develop high-quality software products and reduce production and maintenance costs, but are also important for improving customer satisfaction, building a good corporate image and strengthening the enterprise's competitiveness in the market.
Traditional intelligent detection methods for software defects use a software defect prediction model based on machine learning. Taking the metric data vector of a software module as input, they predict whether the module is defective through steps such as preprocessing, feature extraction, model training and prediction. Because of built-in issues such as its performance evaluation criterion and inductive bias, such a model treats defective and defect-free software modules equally and targets the maximum overall prediction accuracy, so its recall rate for software defects remains low. Traditional intelligent detection methods for software defects therefore suffer from low prediction accuracy.
Summary of the invention
In view of the above problem, it is necessary to provide an intelligent detection method and system for software defects that improve prediction accuracy.
An intelligent detection method for software defects comprises the following steps:
obtaining and preprocessing sample software modules to obtain a software sample set, the software sample set including a defective software sample set and a defect-free software sample set;
clustering the defect-free software sample set to obtain cluster subsets;
randomly sampling the cluster subsets to obtain a balanced defect-free software sample set;
obtaining an updated software sample set from the defective software sample set and the balanced defect-free software sample set;
training on the updated software sample set to obtain a defect detection model;
predicting defects in a software module to be tested according to the defect detection model, and outputting the prediction result.
An intelligent detection system for software defects includes:
a sample set establishing module, configured to obtain and preprocess sample software modules to obtain a software sample set, the software sample set including a defective software sample set and a defect-free software sample set;
a clustering processing module, configured to cluster the defect-free software sample set to obtain cluster subsets;
a sample processing module, configured to randomly sample the cluster subsets to obtain a balanced defect-free software sample set;
a sample set updating module, configured to obtain an updated software sample set from the defective software sample set and the balanced defect-free software sample set;
a model training module, configured to train on the updated software sample set to obtain a defect detection model;
a defect prediction module, configured to predict defects in a software module to be tested according to the defect detection model and to output the prediction result.
In the above intelligent detection method and system for software defects, sample software modules are obtained and preprocessed to obtain a software sample set. The defect-free software sample set is clustered to obtain cluster subsets. The cluster subsets are randomly sampled to obtain a balanced defect-free software sample set. An updated software sample set is obtained from the defective software sample set and the balanced defect-free software sample set. A defect detection model is obtained by training on the updated software sample set. Defects in the software module to be tested are predicted according to the defect detection model, and the prediction result is output. Classifying the sample software modules, and clustering and sampling the defect-free software sample set, ensures that the samples are balanced. Training the defect detection model with the balanced defect-free software sample set lets the defect prediction model better estimate and fit the defect data, which markedly improves the prediction of defect data and raises prediction accuracy.
Brief description of the drawings
Fig. 1 is a flow chart of the intelligent detection method for software defects in one embodiment;
Fig. 2 is a structural diagram of the intelligent detection system for software defects in one embodiment.
Detailed description
In one embodiment, an intelligent detection method for software defects, as shown in Fig. 1, comprises the following steps:
Step S110: Obtain and preprocess sample software modules to obtain a software sample set.
The software sample set includes a defective software sample set and a defect-free software sample set. A sample software module is a software module for which it is known whether a defect exists. For example, of Q known modules in a piece of software, M are defect-free and N are defective, with M + N = Q. After the sample software modules are obtained, they are classified according to whether they are defective, giving the defective software sample set and the defect-free software sample set.
In one embodiment, step S110 includes step 112 to step 116.
Step 112: Label each sample software module to obtain its defect label.
Each module $i$, $i = 1, 2, \ldots, Q$, is assigned a defect label $flag_i$ according to whether it has a defect: $flag_i = 1$ means a defect exists; $flag_i = 0$ means no defect exists. It will be appreciated that neither the labeling scheme nor the resulting label values are unique. In other embodiments, a sample software module with a defect may instead be labeled 0 and one without a defect labeled 1, and so on.
Step 114: Apply static metrics to each sample software module to obtain its sample vector.
For each module $i$, $i = 1, 2, \ldots, Q$, static metrics are computed on its source code. In this embodiment the static metrics may include Halstead metrics, McCabe metrics, Khoshgoftaar metrics, CK metrics and the like, giving $k$ metric values in total. These values are denoted $t_{i1}, t_{i2}, \ldots, t_{ik}$ and form the sample vector $T_i = \{t_{i1}, t_{i2}, \ldots, t_{ik}\}$.
Step 116: According to the defect label of each sample software module, classify the sample vector of the corresponding module to obtain the defective software sample set and the defect-free software sample set.
The sample vectors of defective software modules form one class, the defective software sample set $\{T_i \mid flag_i = 1\}$. The sample vectors of defect-free software modules form another class, the defect-free software sample set $\{T_i \mid flag_i = 0\}$. Together the two sets make up the software sample set $\{T_i \mid i = 1, 2, \ldots, Q\}$.
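Steps 112 to 116 can be sketched in Python as follows. This is a minimal illustration only: the class name `SampleModule`, the function `split_by_flag` and the metric values are hypothetical, and the sketch assumes the metric values have already been computed by some static analysis tool.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SampleModule:
    """One sample software module (hypothetical container)."""
    name: str
    metrics: List[float]  # sample vector T_i = (t_i1, ..., t_ik)
    flag: int             # defect label: 1 = defective, 0 = defect-free


def split_by_flag(samples: List[SampleModule]) -> Tuple[list, list]:
    """Split sample vectors into the defective and defect-free sets."""
    defective = [s.metrics for s in samples if s.flag == 1]
    defect_free = [s.metrics for s in samples if s.flag == 0]
    return defective, defect_free


# Made-up metric values for three modules (Q = 3, N = 1, M = 2).
samples = [
    SampleModule("a", [12.0, 3.0], 0),
    SampleModule("b", [40.0, 9.0], 1),
    SampleModule("c", [10.0, 2.0], 0),
]
defective, defect_free = split_by_flag(samples)
```

Every module lands in exactly one of the two sets, so the set sizes add up to Q as the text requires.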
Step S120: Cluster the defect-free software sample set to obtain cluster subsets.
The defect-free software sample set $\{T_i \mid flag_i = 0\}$ is clustered to obtain cluster subsets; the number of cluster subsets is not fixed. In this embodiment the MeanShift method is used for clustering, which requires little computation and speeds up the cluster analysis. Step S120 includes step 122 to step 128.
Step 122: Taking a sample vector in the defect-free software sample set as the starting point, compute the mean shift vector of the sample vector. Specifically:
$$M_h(T) = \frac{1}{k} \sum_{T_i \in S_h(T)} (T_i - T)$$
where $M_h$ denotes the mean shift vector of the sample vector $T$; $S_h(T)$ denotes the set of the $k$ sample vectors that lie within the high-dimensional ball of constant radius $h$, that is, that satisfy $(T - T_i)^{\mathsf{T}}(T - T_i) < h^2$; and $T_i$ is a sample vector in $S_h(T)$. Note that in $(T - T_i)^{\mathsf{T}}(T - T_i) < h^2$ the $T$ inside the parentheses denotes the sample vector, while the superscript $\mathsf{T}$ denotes transposition.
Step 124: Judge whether the mean shift vector of the sample vector is greater than a preset threshold. If so, take the sum of the sample vector and the mean shift vector as the new sample vector and return to step 122. If not, proceed to step 126. The preset threshold $\varepsilon$ is set in advance and can be adjusted to the practical situation. If the mean shift vector $M_h$ is greater than $\varepsilon$, $T_i + M_h$ is taken as the new starting point and a new mean shift vector $M_h$ is computed.
Step 126: Take the sum of the sample vector and the mean shift vector as the center point of the sample vector. If $M_h$ is less than or equal to $\varepsilon$, $T_i + M_h$ is confirmed as the center point. Steps 122 to 126 are repeated until all sample vectors have been traversed, generating $P$ center points.
Step 128: Cluster the sample vectors according to their center points to obtain cluster subsets. Sample vectors that converge to the same center point are grouped into one class, forming $P$ cluster subsets.
Computing multiple subsets of the software defect distribution by clustering characterizes that distribution better and lays the foundation for the subsequent balanced sampling.
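The clustering loop of steps 122 to 128 can be sketched in plain Python as below. This is a hedged illustration, not the embodiment's implementation: it assumes a flat (uniform) kernel, compares the Euclidean norm of the mean shift vector with the threshold, and introduces a hypothetical tolerance `merge_tol` to decide when two converged points count as "the same center point", a detail the text leaves open. The `max_iter` guard is likewise an added safety assumption.

```python
import math


def _dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_shift_vector(point, samples, h):
    """Average offset from `point` to the samples inside the ball of radius h."""
    neighbours = [s for s in samples if _dist(s, point) < h]
    if not neighbours:
        return [0.0] * len(point)
    k = len(neighbours)
    return [sum(s[d] - point[d] for s in neighbours) / k
            for d in range(len(point))]


def find_center(start, samples, h, eps, max_iter=100):
    """Steps 122-126: shift the point until the shift is at most eps."""
    point = list(start)
    for _ in range(max_iter):
        shift = mean_shift_vector(point, samples, h)
        point = [p + m for p, m in zip(point, shift)]
        if math.sqrt(sum(m * m for m in shift)) <= eps:
            break
    return point


def cluster(samples, h, eps, merge_tol):
    """Step 128: group samples whose centers coincide (within merge_tol)."""
    centers, subsets = [], []
    for s in samples:
        c = find_center(s, samples, h, eps)
        for idx, existing in enumerate(centers):
            if _dist(c, existing) <= merge_tol:
                subsets[idx].append(s)
                break
        else:
            centers.append(c)
            subsets.append([s])
    return centers, subsets
```

The number of subsets P is not chosen in advance; it falls out of how many distinct centers the samples converge to, which matches the remark that the number of cluster subsets is not fixed.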
Step S130: Randomly sample the cluster subsets to obtain a balanced defect-free software sample set.
The $P$ cluster subsets formed by clustering are sampled to ensure that the samples are balanced. Let $m_j$ be the number of samples in the $j$-th cluster subset; then $\sum_{j=1}^{P} m_j = M$. The number of samples drawn from each cluster subset is not unique. In this embodiment, the number of samples drawn at random from the $j$-th cluster subset, $m_j'$, is determined from $m_j$, the number of samples in the $j$-th cluster subset, $M$, the total number of defect-free software modules, and $N$, the total number of defective software modules.
Based on the distribution of the defect data samples, samples are selected from each subset by random sampling, achieving balance between the sample classes. The result is the balanced defect-free software set $\{T_i' \mid flag_i = 0\}$, whose number of samples is $M' = \sum_{j=1}^{P} m_j'$.
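The per-cluster random sampling can be sketched as follows. The quota rule used here, a proportional share `round(m_j * N / M)` clamped to at least one sample and at most the subset size, is an assumption made for illustration so that the balanced defect-free set ends up roughly the size of the defective set; the embodiment's own expression for the draw count is not shown in this text. The function name `balanced_sample` and the `seed` parameter are hypothetical.

```python
import random


def balanced_sample(subsets, n_defective, seed=None):
    """Draw a random quota from every cluster subset.

    subsets      -- list of P cluster subsets of defect-free samples
    n_defective  -- N, the number of defective samples
    """
    rng = random.Random(seed)
    total = sum(len(s) for s in subsets)  # M, all defect-free samples
    balanced = []
    for subset in subsets:
        # ASSUMPTION: proportional quota, at least 1, at most the subset size.
        quota = round(len(subset) * n_defective / total)
        quota = min(len(subset), max(1, quota))
        # Sampling without replacement inside the subset.
        balanced.extend(rng.sample(subset, quota))
    return balanced
```

Sampling inside each cluster rather than over the whole defect-free set keeps every region of the defect-free distribution represented after the set has been reduced.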
Step S140: Obtain an updated software sample set from the defective software sample set and the balanced defect-free software sample set.
The defective software sample set is merged with the balanced defect-free software sample set to form the updated software sample set $\{T_i' \mid i = 1, 2, \ldots, M'+N\}$.
Step S150: Train on the updated software sample set to obtain a defect detection model.
In the balanced high-dimensional sample space, a suitable data mining method is selected to train the defect detection model. Training the defect detection model with the balanced defect-free software sample set lets the defect prediction model better estimate and fit the defect data. In one embodiment, step S150 determines the model parameters by solving a constrained maximization problem in the following quantities:
$T_i$ and $T_j$ are the $i$-th and $j$-th sample vectors in the updated software sample set, and $k(T_i, T_j)$ is the kernel function between the sample vectors $T_i$ and $T_j$; $flag_i$ and $flag_j$ are the defect labels corresponding to the $i$-th and $j$-th sample vectors; $\lambda_i$ and $\lambda_j$ are the parameters of the defect detection model to be trained and represent the weights of the $i$-th and $j$-th sample vectors in the updated software sample set; s.t. denotes the constraints; $C$ is the penalty factor; and $M'+N$ is the number of sample vectors in the updated software sample set.
The parameter $\lambda$ of the defect detection model takes the value at which the objective attains its maximum. The sample vectors $T_i$ and $T_j$ of the updated software sample set are substituted into the objective, the weight $\lambda_i$ of each sample vector $T_i$ is determined where the maximum is attained, and the weights of all sample vectors in the updated software sample set are finally obtained.
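For orientation, the quantities named above (a kernel $k$, labels, per-sample weights $\lambda$, a constraint marker s.t. and a penalty factor $C$) are the ingredients of the dual problem of a soft-margin support vector machine. The block below writes out that textbook form as a hedged illustration. It assumes signed labels $y_i = 2\,flag_i - 1 \in \{-1, +1\}$, which is an assumption introduced here, and it is not claimed to be the exact expression used in this embodiment.

```latex
\max_{\lambda}\;
  \sum_{i=1}^{M'+N} \lambda_i
  - \frac{1}{2} \sum_{i=1}^{M'+N} \sum_{j=1}^{M'+N}
      \lambda_i \lambda_j \, y_i y_j \, k(T_i, T_j)
\quad \text{s.t.} \quad
  \sum_{i=1}^{M'+N} \lambda_i y_i = 0,
  \qquad 0 \le \lambda_i \le C
```

In this form the penalty factor $C$ caps each weight, and only samples with $\lambda_i > 0$ influence the trained model.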
Step S160: Predict defects in the software module to be tested according to the defect detection model, and output the prediction result.
The defect prediction model is used to predict defects in the unknown software module to be tested; the prediction result is obtained and output to inform staff, completing the defect prediction for the software module to be tested. In one embodiment, step S160 includes step 162 and step 164.
Step 162: Apply static metrics to the software module to be tested to obtain its sample vector. Specifically, static metrics are computed on the source code of the software module to be tested; as before, they may include Halstead metrics, McCabe metrics, Khoshgoftaar metrics, CK metrics and the like.
Step 164: Predict defects in the software module to be tested according to its sample vector and the defect prediction model. Specifically:
$$g(T) = \operatorname{sgn}\left( \sum_{i=1}^{M'+N} \lambda_i \, flag_i \, K(T_i, T) + b \right)$$
where $T$ is the sample vector of the software module to be tested and $g(T)$ is its defect label; sgn converts its argument to an integer, taking 1 when the argument is greater than 0 and 0 when it is less than or equal to 0; $T_i$ is the $i$-th sample vector in the updated software sample set and $flag_i$ is the corresponding defect label; $\lambda_i$ is the weight of the $i$-th sample vector in the updated software sample set, obtained from the defect prediction model; $M'+N$ is the number of sample vectors in the updated software sample set; and $b$ is a constant. As before, $K(T_i, T)$ in this embodiment denotes the kernel function between the sample vectors $T_i$ and $T$. The integer conversion corresponds to the definition of the defect label.
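The prediction step, a kernel-weighted sum over the training samples followed by a threshold at zero, can be sketched as below. The RBF kernel, its `gamma` value and the function names are illustrative assumptions, since the text does not name a specific kernel. The sketch also maps the 0/1 defect labels to signed values (-1 for defect-free, +1 for defective) inside the sum, which is an assumption made so that defect-free training samples can pull the score negative.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_flag(t, train_vectors, flags, weights, b, kernel=rbf_kernel):
    """Return the predicted defect label (1 or 0) for sample vector t."""
    score = b
    for t_i, flag_i, lam_i in zip(train_vectors, flags, weights):
        y_i = 2 * flag_i - 1  # ASSUMPTION: signed label derived from flag_i
        score += lam_i * y_i * kernel(t_i, t)
    # Threshold as described in the text: 1 if the score is > 0, else 0.
    return 1 if score > 0 else 0


# Made-up trained state: one defect-free and one defective training vector.
train_vectors = [[0.0, 0.0], [4.0, 4.0]]
flags = [0, 1]
weights = [1.0, 1.0]
near_defective = predict_flag([3.9, 4.0], train_vectors, flags, weights, b=0.0)
near_clean = predict_flag([0.1, 0.0], train_vectors, flags, weights, b=0.0)
```

With this toy state, a vector close to the defective training sample is labeled 1 and a vector close to the defect-free one is labeled 0.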
In practice, software fails with lower probability than it operates normally, so defective software modules make up a relatively small share of all software modules. However, if these few defective modules go undetected and are put into actual use, the resulting economic and social losses are immeasurable. Moreover, because defective modules are relatively few, the training sample data of the software defect detection model are highly imbalanced, which biases the intelligent detection method.
The above intelligent detection method for software defects addresses the problems that, as software complexity and scale keep increasing, defect detection for complex software systems in particular involves a huge workload and defects are hard to locate. Classifying the sample software modules, and clustering and sampling the defect-free software sample set, ensures that the samples are balanced. Training the defect detection model with the balanced defect-free software sample set lets the defect prediction model better estimate and fit the defect data, which markedly improves the prediction of defect data and raises prediction accuracy.
In one embodiment, an intelligent detection system for software defects, as shown in Fig. 2, includes a sample set establishing module 110, a clustering processing module 120, a sample processing module 130, a sample set updating module 140, a model training module 150 and a defect prediction module 160.
The sample set establishing module 110 is configured to obtain and preprocess sample software modules to obtain a software sample set.
The software sample set includes a defective software sample set and a defect-free software sample set. A sample software module is a software module for which it is known whether a defect exists. After the sample software modules are obtained, they are classified according to whether they are defective, giving the defective software sample set and the defect-free software sample set. In one embodiment, the sample set establishing module 110 includes a first sample set establishing unit, a second sample set establishing unit and a third sample set establishing unit.
The first sample set establishing unit is configured to label each sample software module to obtain its defect label.
Each module $i$, $i = 1, 2, \ldots, Q$, is assigned a defect label $flag_i$ according to whether it has a defect: $flag_i = 1$ means a defect exists; $flag_i = 0$ means no defect exists. It will be appreciated that neither the labeling scheme nor the resulting label values are unique. In other embodiments, a sample software module with a defect may instead be labeled 0 and one without a defect labeled 1, and so on.
The second sample set establishing unit is configured to apply static metrics to each sample software module to obtain its sample vector.
For each module $i$, $i = 1, 2, \ldots, Q$, static metrics are computed on its source code. In this embodiment the static metrics may include Halstead metrics, McCabe metrics, Khoshgoftaar metrics, CK metrics and the like, giving $k$ metric values in total. These values are denoted $t_{i1}, t_{i2}, \ldots, t_{ik}$ and form the sample vector $T_i = \{t_{i1}, t_{i2}, \ldots, t_{ik}\}$.
The third sample set establishing unit is configured to classify, according to the defect label of each sample software module, the sample vector of the corresponding module to obtain the defective software sample set and the defect-free software sample set.
The sample vectors of defective software modules form one class, the defective software sample set $\{T_i \mid flag_i = 1\}$. The sample vectors of defect-free software modules form another class, the defect-free software sample set $\{T_i \mid flag_i = 0\}$. Together the two sets make up the software sample set $\{T_i \mid i = 1, 2, \ldots, Q\}$.
The clustering processing module 120 is configured to cluster the defect-free software sample set to obtain cluster subsets.
The defect-free software sample set $\{T_i \mid flag_i = 0\}$ is clustered to obtain cluster subsets; the number of cluster subsets is not fixed. In this embodiment the MeanShift method is used for clustering, which requires little computation and speeds up the cluster analysis. The clustering processing module 120 includes a first processing unit, a second processing unit, a third processing unit and a fourth processing unit.
The first processing unit is configured to compute the mean shift vector of a sample vector, taking a sample vector in the defect-free software sample set as the starting point. Specifically:
$$M_h(T) = \frac{1}{k} \sum_{T_i \in S_h(T)} (T_i - T)$$
where $M_h$ denotes the mean shift vector of the sample vector $T$; $S_h(T)$ denotes the set of the $k$ sample vectors that lie within the high-dimensional ball of constant radius $h$, that is, that satisfy $(T - T_i)^{\mathsf{T}}(T - T_i) < h^2$; and $T_i$ is a sample vector in $S_h(T)$. Note that in $(T - T_i)^{\mathsf{T}}(T - T_i) < h^2$ the $T$ inside the parentheses denotes the sample vector, while the superscript $\mathsf{T}$ denotes transposition.
The second processing unit is configured to judge whether the mean shift vector of the sample vector is greater than a preset threshold. The preset threshold $\varepsilon$ is set in advance and can be adjusted to the practical situation.
The third processing unit is configured, when the mean shift vector of the sample vector is greater than the preset threshold, to take the sum of the sample vector and the mean shift vector as the new sample vector and to control the first processing unit to compute the mean shift vector again, taking a sample vector in the defect-free software sample set as the starting point; and, when the mean shift vector of the sample vector is less than or equal to the preset threshold, to take the sum of the sample vector and the mean shift vector as the center point of the sample vector.
If the mean shift vector $M_h$ is greater than $\varepsilon$, $T_i + M_h$ is taken as the new starting point and a new mean shift vector $M_h$ is computed. If $M_h$ is less than or equal to $\varepsilon$, $T_i + M_h$ is confirmed as the center point. The computation is repeated until all sample vectors have been traversed, generating $P$ center points.
The fourth processing unit is configured to cluster the sample vectors according to their center points to obtain cluster subsets.
Sample vectors that converge to the same center point are grouped into one class, forming $P$ cluster subsets.
Computing multiple subsets of the software defect distribution by clustering characterizes that distribution better and lays the foundation for the subsequent balanced sampling.
The sample processing module 130 is configured to randomly sample the cluster subsets to obtain a balanced defect-free software sample set.
The $P$ cluster subsets formed by clustering are sampled to ensure that the samples are balanced. Let $m_j$ be the number of samples in the $j$-th cluster subset; then $\sum_{j=1}^{P} m_j = M$. The number of samples drawn from each cluster subset is not unique. In this embodiment, the number of samples drawn at random from the $j$-th cluster subset, $m_j'$, is determined from $m_j$, the number of samples in the $j$-th cluster subset, $M$, the total number of defect-free software modules, and $N$, the total number of defective software modules.
Based on the distribution of the defect data samples, samples are selected from each subset by random sampling, achieving balance between the sample classes. The result is the balanced defect-free software set $\{T_i' \mid flag_i = 0\}$, whose number of samples is $M' = \sum_{j=1}^{P} m_j'$.
The sample set updating module 140 is configured to obtain an updated software sample set from the defective software sample set and the balanced defect-free software sample set.
The defective software sample set is merged with the balanced defect-free software sample set to form the updated software sample set $\{T_i' \mid i = 1, 2, \ldots, M'+N\}$.
The model training module 150 is configured to train on the updated software sample set to obtain a defect detection model. In the balanced high-dimensional sample space, a suitable data mining method is selected to train the defect detection model. Training the defect detection model with the balanced defect-free software sample set lets the defect prediction model better estimate and fit the defect data. In one embodiment, the model training module 150 determines the model parameters by solving a constrained maximization problem in the following quantities:
$T_i$ and $T_j$ are the $i$-th and $j$-th sample vectors in the updated software sample set, and $k(T_i, T_j)$ is the kernel function between the sample vectors $T_i$ and $T_j$; $flag_i$ and $flag_j$ are the defect labels corresponding to the $i$-th and $j$-th sample vectors; $\lambda_i$ and $\lambda_j$ are the parameters of the defect detection model to be trained and represent the weights of the $i$-th and $j$-th sample vectors in the updated software sample set; s.t. denotes the constraints; $C$ is the penalty factor; and $M'+N$ is the number of sample vectors in the updated software sample set.
The parameter $\lambda$ of the defect detection model takes the value at which the objective attains its maximum. The sample vectors $T_i$ and $T_j$ of the updated software sample set are substituted into the objective, the weight $\lambda_i$ of each sample vector $T_i$ is determined where the maximum is attained, and the weights of all sample vectors in the updated software sample set are finally obtained.
The defect prediction module 160 is configured to predict defects in the software module to be tested according to the defect detection model and to output the prediction result.
The defect prediction model is used to predict defects in the unknown software module to be tested; the prediction result is obtained and output to inform staff, completing the defect prediction for the software module to be tested. In one embodiment, the defect prediction module 160 includes a first prediction unit and a second prediction unit.
The first prediction unit is configured to apply static metrics to the software module to be tested to obtain its sample vector. Specifically, static metrics are computed on the source code of the software module to be tested; as before, they may include Halstead metrics, McCabe metrics, Khoshgoftaar metrics, CK metrics and the like.
The second prediction unit is configured to predict defects in the software module to be tested according to its sample vector and the defect prediction model. Specifically:
$$g(T) = \operatorname{sgn}\left( \sum_{i=1}^{M'+N} \lambda_i \, flag_i \, K(T_i, T) + b \right)$$
where $T$ is the sample vector of the software module to be tested and $g(T)$ is its defect label; sgn converts its argument to an integer, taking 1 when the argument is greater than 0 and 0 when it is less than or equal to 0; $T_i$ is the $i$-th sample vector in the updated software sample set and $flag_i$ is the corresponding defect label; $\lambda_i$ is the weight of the $i$-th sample vector in the updated software sample set, obtained from the defect prediction model; $M'+N$ is the number of sample vectors in the updated software sample set; and $b$ is a constant. As before, $K(T_i, T)$ in this embodiment denotes the kernel function between the sample vectors $T_i$ and $T$. The integer conversion corresponds to the definition of the defect label.
The above intelligent detection system for software defects addresses the problems that, as software complexity and scale keep increasing, defect detection for complex software systems in particular involves a huge workload and defects are hard to locate. Classifying the sample software modules, and clustering and sampling the defect-free software sample set, ensures that the samples are balanced. Training the defect detection model with the balanced defect-free software sample set lets the defect prediction model better estimate and fit the defect data, which markedly improves the prediction of defect data and raises prediction accuracy.
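The way modules 110 to 160 hand data to one another can be sketched as a small Python skeleton. This is an architectural illustration only: the class name `DefectDetectionSystem`, the field names and the callable signatures are hypothetical, and each module is left as a pluggable function rather than a concrete implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class DefectDetectionSystem:
    """Wires six pluggable callables in the order of modules 110-160."""
    build_sample_sets: Callable[..., Any]  # 110: modules -> (defective, defect_free)
    cluster: Callable[..., Any]            # 120: defect_free -> cluster subsets
    sample: Callable[..., Any]             # 130: (subsets, N) -> balanced set
    merge: Callable[..., Any]              # 140: (defective, balanced) -> updated set
    train: Callable[..., Any]              # 150: updated set -> model
    predict: Callable[..., Any]            # 160: (model, module) -> result

    def run(self, sample_modules, module_under_test):
        defective, defect_free = self.build_sample_sets(sample_modules)
        subsets = self.cluster(defect_free)
        balanced = self.sample(subsets, len(defective))
        updated = self.merge(defective, balanced)
        model = self.train(updated)
        return self.predict(model, module_under_test)
```

Keeping each module behind a plain callable mirrors the remark that clustering, sampling counts and the training method are each "not unique" and may be swapped per embodiment.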
The technical features of the embodiments described above may be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments has been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The embodiments described above express only several implementations of the present invention. Their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the inventive concept, and these all fall within the protection scope of the present invention. The protection scope of this patent shall therefore be defined by the claims.
Claims (10)
1. An intelligent detection method for software defects, characterized by comprising the following steps:
obtaining and preprocessing sample software modules to obtain a software sample set, the software sample set including a defective software sample set and a defect-free software sample set;
clustering the defect-free software sample set to obtain cluster subsets;
randomly sampling the cluster subsets to obtain a balanced defect-free software sample set;
obtaining an updated software sample set from the defective software sample set and the balanced defect-free software sample set;
training on the updated software sample set to obtain a defect detection model;
predicting defects in a software module to be tested according to the defect detection model, and outputting the prediction result.
2. The intelligent detection method for software defects according to claim 1, characterized in that the step of obtaining and preprocessing sample software modules to obtain a software sample set comprises the following steps:
labeling each sample software module to obtain the defect label of each sample software module;
applying static metrics to each sample software module to obtain the sample vector of each sample software module;
classifying, according to the defect label of each sample software module, the sample vector of the corresponding sample software module to obtain the defective software sample set and the defect-free software sample set.
3. The intelligent software defect detection method according to claim 2, characterized in that the step of clustering the defect-free software sample set to obtain cluster subsets comprises the following steps:
taking a sample vector in the defect-free software sample set as the starting point, computing the mean shift vector of the sample vector, specifically:
$M_h = \frac{1}{k} \sum_{T_i \in S_h(T)} (T_i - T)$
where $M_h$ denotes the mean shift vector of the sample vector $T$; $S_h(T)$ denotes the set of the $k$ sample vectors that lie in the high-dimensional ball of constant radius $h$, that is, that satisfy $(T - T_i)^{\mathsf{T}} (T - T_i) < h^2$; and $T_i$ is a sample vector in $S_h(T)$;
determining whether the mean shift vector of the sample vector is greater than a preset threshold;
if so, taking the sum of the sample vector and the mean shift vector as the new sample vector, and returning to the step of taking a sample vector in the defect-free software sample set as the starting point and computing the mean shift vector of the sample vector;
if not, taking the sum of the sample vector and the mean shift vector as the center point of the sample vector; and
clustering the sample vectors according to their center points to obtain the cluster subsets.
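The iteration described in claim 3 can be sketched in plain Python. This is a minimal illustration rather than the patent's implementation. The function names, the use of the Euclidean norm of the shift for the threshold test, the `max_iter` cap, and the rule that groups samples whose center points lie within `merge_radius` of each other are all assumptions; the mean shift vector is taken to be the average offset of the neighbours inside the ball of radius `h`.

```python
import math
from typing import List, Optional

Vector = List[float]


def mean_shift_vector(t: Vector, samples: List[Vector], h: float) -> Vector:
    """Average offset (T_i - T) over the samples strictly inside radius h of t."""
    neighbours = [s for s in samples
                  if sum((a - b) ** 2 for a, b in zip(s, t)) < h * h]
    if not neighbours:
        return [0.0] * len(t)
    k = len(neighbours)
    return [sum(s[d] - t[d] for s in neighbours) / k for d in range(len(t))]


def find_center(start: Vector, samples: List[Vector], h: float,
                threshold: float = 1e-3, max_iter: int = 100) -> Vector:
    """Shift the point until the shift is no larger than the threshold."""
    t = list(start)
    for _ in range(max_iter):
        shift = mean_shift_vector(t, samples, h)
        t = [a + b for a, b in zip(t, shift)]   # sample vector + mean shift vector
        if math.sqrt(sum(x * x for x in shift)) <= threshold:
            break                               # t is now the center point
    return t


def cluster(samples: List[Vector], h: float, threshold: float = 1e-3,
            merge_radius: Optional[float] = None) -> List[List[Vector]]:
    """Group samples whose center points (nearly) coincide."""
    merge_radius = h / 2 if merge_radius is None else merge_radius
    centers: List[Vector] = []
    subsets: List[List[Vector]] = []
    for s in samples:
        c = find_center(s, samples, h, threshold)
        for idx, existing in enumerate(centers):
            if math.dist(c, existing) < merge_radius:
                subsets[idx].append(s)
                break
        else:
            centers.append(c)
            subsets.append([s])
    return subsets


points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
subsets = cluster(points, h=1.0)
```

With two well-separated groups and `h=1.0`, each point converges to the centroid of its own group, so the samples fall into two cluster subsets.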
4. The intelligent software defect detection method according to claim 1, characterized in that the number of samples drawn when randomly sampling the cluster subsets is:
where $m_j'$ is the number of samples drawn by random sampling from the j-th cluster subset, $m_j$ is the number of samples in the j-th cluster subset, $M$ is the total number of defect-free software modules, and $N$ is the total number of defective software modules.
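The formula for the per-cluster draw count is not reproduced in this text, so the sketch below rests on an assumption: each cluster subset contributes in proportion to its size, roughly m_j · N / M samples, so that the balanced defect-free set ends up about as large as the defective set. The rounding rule, the function names, and the fixed random seed are likewise illustrative choices, not taken from the claim.

```python
import random
from typing import List

Vector = List[float]


def allocate_draws(cluster_sizes: List[int], n_defective: int) -> List[int]:
    """Assumed rule: draw about m_j * N / M samples from the j-th cluster subset."""
    total = sum(cluster_sizes)  # M, the number of defect-free samples
    return [min(m_j, round(m_j * n_defective / total)) for m_j in cluster_sizes]


def balance(cluster_subsets: List[List[Vector]], n_defective: int,
            seed: int = 0) -> List[Vector]:
    """Randomly sample each cluster subset (without replacement) and pool the draws."""
    rng = random.Random(seed)
    counts = allocate_draws([len(c) for c in cluster_subsets], n_defective)
    balanced: List[Vector] = []
    for subset, m in zip(cluster_subsets, counts):
        balanced.extend(rng.sample(subset, m))
    return balanced


# Three cluster subsets of 60, 30 and 10 defect-free samples; 20 defective samples.
subsets = [[[float(i)] for i in range(60)],
           [[float(i)] for i in range(100, 130)],
           [[float(i)] for i in range(200, 210)]]
draws = allocate_draws([len(s) for s in subsets], n_defective=20)
balanced = balance(subsets, n_defective=20)
```

Sampling within each cluster subset, instead of across the whole defect-free set, keeps every region of the defect-free data represented after undersampling.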
5. The intelligent software defect detection method according to claim 1, characterized in that training on the updated software sample set to obtain the defect detection model comprises:
$\arg\max_{\lambda} \left[ \sum_{i=1}^{M'+N} \lambda_i - \frac{1}{2} \sum_{i=1}^{M'+N} \sum_{j=1}^{M'+N} \lambda_i \lambda_j \, flag_i \, flag_j \, k(T_i, T_j) \right]$
$\text{s.t.} \quad \sum_{i=1}^{M'+N} \lambda_i \, flag_i = 0, \qquad 0 \le \lambda_i \le C, \quad i = 1, \dots, M'+N$
where $\arg\max_{\lambda}[\cdot]$ denotes the value of the defect detection model parameter $\lambda$ at which the bracketed expression attains its maximum;
$T_i$ and $T_j$ are the i-th and j-th sample vectors in the updated software sample set; $k(T_i, T_j)$ denotes the kernel function between the sample vectors $T_i$ and $T_j$; $flag_i$ and $flag_j$ are the defect labels corresponding to the i-th and j-th sample vectors in the updated software sample set; $\lambda_i$ and $\lambda_j$ are the defect detection model parameters to be trained and represent the weights of the i-th and j-th sample vectors in the updated software sample set; s.t. denotes the constraints; $C$ is the penalty factor; and $M'+N$ is the number of sample vectors in the updated software sample set.
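To make the training objective concrete, the sketch below evaluates a kernel-weighted objective of this kind for a given weight vector and checks the constraints. It assumes the standard soft-margin SVM dual form (sum of weights minus half the label-weighted kernel sum, with an equality constraint on the label-weighted weights and box bounds 0 ≤ λ_i ≤ C) and labels encoded as +1 and -1; both are assumptions, as are the function names and the choice of kernels. It does not solve the optimization itself.

```python
import math
from typing import Callable, List

Vector = List[float]
Kernel = Callable[[Vector, Vector], float]


def linear_kernel(a: Vector, b: Vector) -> float:
    """Plain dot product."""
    return sum(x * y for x, y in zip(a, b))


def rbf_kernel(a: Vector, b: Vector, gamma: float = 0.5) -> float:
    """Gaussian (RBF) kernel; gamma is an illustrative default."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def dual_objective(lam: List[float], flags: List[int], samples: List[Vector],
                   kernel: Kernel = rbf_kernel) -> float:
    """sum_i lam_i - 1/2 * sum_ij lam_i lam_j flag_i flag_j k(T_i, T_j)."""
    n = len(samples)
    quad = sum(lam[i] * lam[j] * flags[i] * flags[j] * kernel(samples[i], samples[j])
               for i in range(n) for j in range(n))
    return sum(lam) - 0.5 * quad


def is_feasible(lam: List[float], flags: List[int], C: float,
                tol: float = 1e-9) -> bool:
    """Check sum_i lam_i * flag_i == 0 and 0 <= lam_i <= C (within tolerance)."""
    balanced = abs(sum(l * f for l, f in zip(lam, flags))) <= tol
    boxed = all(-tol <= l <= C + tol for l in lam)
    return balanced and boxed


samples = [[1.0], [-1.0]]
flags = [1, -1]          # assumed encoding: +1 defective, -1 defect-free
lam = [0.5, 0.5]
value = dual_objective(lam, flags, samples, kernel=linear_kernel)
feasible = is_feasible(lam, flags, C=1.0)
```

A real trainer would search over the weights for the feasible vector that maximizes this value, for example with a quadratic programming or SMO-style solver.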
6. The intelligent software defect detection method according to claim 1, characterized in that performing defect prediction on the software module under test according to the defect detection model comprises the following steps:
performing static measurement on the software module under test to obtain the sample vector of the software module under test; and
performing defect prediction on the software module under test according to the sample vector of the software module under test and the defect prediction model, specifically:
$G(T) = \operatorname{sgn}\left( \sum_{i=1}^{M'+N} \lambda_i \, flag_i \, K(T_i, T) + b \right)$
where $T$ is the sample vector of the software module under test; $K(T_i, T)$ denotes the kernel function between the sample vectors $T_i$ and $T$; $G(T)$ denotes the defect label of the software module under test; sgn is the sign function, which maps $\sum_{i=1}^{M'+N} \lambda_i \, flag_i \, K(T_i, T) + b$ to an integer, taking the value 1 when that expression is greater than 0 and the value 0 when it is less than or equal to 0; $T_i$ is the i-th sample vector in the updated software sample set; $flag_i$ is the defect label corresponding to the i-th sample vector in the updated software sample set; $\lambda_i$ is the weight of the i-th sample vector in the updated software sample set, obtained from the defect prediction model; $M'+N$ is the number of sample vectors in the updated software sample set; and $b$ is a constant.
7. An intelligent software defect detection system, characterized by comprising:
a sample set building module, configured to obtain sample software modules and preprocess them to obtain a software sample set, the software sample set comprising a defective software sample set and a defect-free software sample set;
a clustering module, configured to cluster the defect-free software sample set to obtain cluster subsets;
a sample processing module, configured to randomly sample the cluster subsets to obtain a balanced defect-free software sample set;
a sample set update module, configured to obtain an updated software sample set from the defective software sample set and the balanced defect-free software sample set;
a model training module, configured to train on the updated software sample set to obtain a defect detection model; and
a defect prediction module, configured to perform defect prediction on a software module under test according to the defect detection model and to output the prediction result.
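One way to picture how the modules of claim 7 fit together is a small class that chains them. This is a structural sketch only: the class name, the callable signatures, and the 1/0 label encoding passed to the trainer are assumptions, and each callable stands in for a module whose internals are not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Vector = List[float]
Model = Callable[[Vector], int]


@dataclass
class DefectDetectionSystem:
    """Chains the modules: build sets, cluster, sample, update, train, predict."""
    build_sample_sets: Callable[[list], Tuple[List[Vector], List[Vector]]]
    cluster: Callable[[List[Vector]], List[List[Vector]]]
    sample: Callable[[List[List[Vector]], int], List[Vector]]
    train: Callable[[List[Vector], List[int]], Model]
    model: Optional[Model] = None

    def fit(self, modules: list) -> "DefectDetectionSystem":
        defective, defect_free = self.build_sample_sets(modules)   # sample set building
        subsets = self.cluster(defect_free)                        # clustering
        balanced = self.sample(subsets, len(defective))            # sample processing
        updated = defective + balanced                             # sample set update
        labels = [1] * len(defective) + [0] * len(balanced)
        self.model = self.train(updated, labels)                   # model training
        return self

    def predict(self, vector: Vector) -> int:
        """Defect prediction for the sample vector of a module under test."""
        if self.model is None:
            raise RuntimeError("call fit() before predict()")
        return self.model(vector)
```

Passing the modules in as callables keeps each stage replaceable, for example swapping the clustering or training routine without touching the rest of the pipeline.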
8. The intelligent software defect detection system according to claim 7, characterized in that the number of samples drawn when randomly sampling the cluster subsets is:
where $m_j'$ is the number of samples drawn by random sampling from the j-th cluster subset, $m_j$ is the number of samples in the j-th cluster subset, $M$ is the total number of defect-free software modules, and $N$ is the total number of defective software modules.
9. The intelligent software defect detection system according to claim 7, characterized in that the model training module training on the updated software sample set to obtain the defect detection model comprises:
$\arg\max_{\lambda} \left[ \sum_{i=1}^{M'+N} \lambda_i - \frac{1}{2} \sum_{i=1}^{M'+N} \sum_{j=1}^{M'+N} \lambda_i \lambda_j \, flag_i \, flag_j \, k(T_i, T_j) \right]$
$\text{s.t.} \quad \sum_{i=1}^{M'+N} \lambda_i \, flag_i = 0, \qquad 0 \le \lambda_i \le C, \quad i = 1, \dots, M'+N$
where $\arg\max_{\lambda}[\cdot]$ denotes the value of the defect detection model parameter $\lambda$ at which the bracketed expression attains its maximum;
$T_i$ and $T_j$ are the i-th and j-th sample vectors in the updated software sample set; $k(T_i, T_j)$ denotes the kernel function between the sample vectors $T_i$ and $T_j$; $flag_i$ and $flag_j$ are the defect labels corresponding to the i-th and j-th sample vectors in the updated software sample set; $\lambda_i$ and $\lambda_j$ are the defect detection model parameters to be trained and represent the weights of the i-th and j-th sample vectors in the updated software sample set; s.t. denotes the constraints; $C$ is the penalty factor; and $M'+N$ is the number of sample vectors in the updated software sample set.
10. The intelligent software defect detection system according to claim 7, characterized in that the defect prediction module comprises:
a first prediction unit, configured to perform static measurement on the software module under test to obtain the sample vector of the software module under test; and
a second prediction unit, configured to perform defect prediction on the software module under test according to the sample vector of the software module under test and the defect prediction model, specifically:
$G(T) = \operatorname{sgn}\left( \sum_{i=1}^{M'+N} \lambda_i \, flag_i \, K(T_i, T) + b \right)$
where $T$ is the sample vector of the software module under test; $K(T_i, T)$ denotes the kernel function between the sample vectors $T_i$ and $T$; $G(T)$ denotes the defect label of the software module under test; sgn maps $\sum_{i=1}^{M'+N} \lambda_i \, flag_i \, K(T_i, T) + b$ to an integer, taking the value 1 when that expression is greater than 0 and the value 0 when it is less than or equal to 0; $T_i$ is the i-th sample vector in the updated software sample set; $flag_i$ is the defect label corresponding to the i-th sample vector in the updated software sample set; $\lambda_i$ is the weight of the i-th sample vector in the updated software sample set, obtained from the defect prediction model; $M'+N$ is the number of sample vectors in the updated software sample set; and $b$ is a constant.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610964353.2A CN106528417A (en) | 2016-10-28 | 2016-10-28 | Intelligent detection method and system of software defects |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106528417A (en) | 2017-03-22 |
Family
ID=58326311
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201610964353.2A | Pending | CN106528417A (en) | 2016-10-28 | 2016-10-28 | Intelligent detection method and system of software defects |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106528417A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050071807A1 (en) * | 2003-09-29 | 2005-03-31 | Aura Yanavi | Methods and systems for predicting software defects in an upcoming software release |
CN103823753A (en) * | 2014-01-22 | 2014-05-28 | 浙江大学 | Webpage sampling method oriented at barrier-free webpage content detection |
CN104899135A (en) * | 2015-05-14 | 2015-09-09 | 工业和信息化部电子第五研究所 | Software defect prediction method and system |
Non-Patent Citations (1)
Title |
---|
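Dai Xiang et al.: "Research on software defect prediction based on ensemble hybrid sampling" (基于集成混合采样的软件缺陷预测研究), Computer Engineering & Science (《计算机工程与科学》) * |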
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107247662A (en) * | 2017-05-10 | 2017-10-13 | 中国电子产品可靠性与环境试验研究所 | Software defect detection method and device |
CN107247662B (en) * | 2017-05-10 | 2019-10-18 | 中国电子产品可靠性与环境试验研究所 | Software defect detection method and device |
CN107391452A (en) * | 2017-07-06 | 2017-11-24 | 武汉大学 | Software defect number prediction method based on data undersampling and ensemble learning |
CN107391452B (en) * | 2017-07-06 | 2020-01-07 | 武汉大学 | Software defect number prediction method based on data undersampling and ensemble learning |
CN107391369B (en) * | 2017-07-13 | 2020-03-24 | 武汉大学 | Cross-project defect prediction method based on data screening and data oversampling |
CN107391369A (en) * | 2017-07-13 | 2017-11-24 | 武汉大学 | Cross-project defect prediction method based on data screening and data oversampling |
CN107391370A (en) * | 2017-07-13 | 2017-11-24 | 武汉大学 | Software defect number prediction method based on data oversampling and ensemble learning |
CN107391370B (en) * | 2017-07-13 | 2020-05-12 | 武汉大学 | Software defect number prediction method based on data oversampling and ensemble learning |
CN109597748A (en) * | 2017-09-30 | 2019-04-09 | 北京国双科技有限公司 | Code defect early warning method and device |
CN109242106A (en) * | 2018-09-07 | 2019-01-18 | 百度在线网络技术(北京)有限公司 | Sample processing method, device, equipment and storage medium |
CN109242106B (en) * | 2018-09-07 | 2022-07-26 | 百度在线网络技术(北京)有限公司 | Sample processing method, device, equipment and storage medium |
CN109829483A (en) * | 2019-01-07 | 2019-05-31 | 鲁班嫡系机器人(深圳)有限公司 | Defect recognition model training method, device, computer equipment and storage medium |
CN109829483B (en) * | 2019-01-07 | 2021-05-18 | 鲁班嫡系机器人(深圳)有限公司 | Defect recognition model training method and device, computer equipment and storage medium |
CN111611177A (en) * | 2020-06-29 | 2020-09-01 | 中国人民解放军国防科技大学 | Software performance defect detection method based on configuration item performance expectation |
CN111611177B (en) * | 2020-06-29 | 2023-06-09 | 中国人民解放军国防科技大学 | Software performance defect detection method based on configuration item performance expectation |
CN116804668A (en) * | 2023-08-23 | 2023-09-26 | 国盐检测(天津)有限责任公司 | Salt iodine content detection data identification method and system |
CN116804668B (en) * | 2023-08-23 | 2023-11-21 | 国盐检测(天津)有限责任公司 | Salt iodine content detection data identification method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106528417A (en) | Intelligent detection method and system of software defects | |
CN109117883B (en) | SAR image sea ice classification method and system based on long-time memory network | |
CN106469560B (en) | Voice emotion recognition method based on unsupervised domain adaptation | |
WO2020073714A1 (en) | Training sample obtaining method, account prediction method, and corresponding devices | |
CN104899135B (en) | Software Defects Predict Methods and system | |
CN106572493A (en) | Abnormal value detection method and abnormal value detection system in LTE network | |
CN104408153A (en) | Short text hash learning method based on multi-granularity topic models | |
CN105069470A (en) | Classification model training method and device | |
CN105740404A (en) | Label association method and device | |
CN108549817A (en) | A kind of software security flaw prediction technique based on text deep learning | |
CN105740984A (en) | Product concept performance evaluation method based on performance prediction | |
CN111476307B (en) | Lithium battery surface defect detection method based on depth field adaptation | |
CN108628164A (en) | A kind of semi-supervised flexible measurement method of industrial process based on Recognition with Recurrent Neural Network model | |
CN112613375A (en) | Tire damage detection and identification method and device | |
CN108764295A (en) | A kind of soft-measuring modeling method based on semi-supervised integrated study | |
CN115049627B (en) | Steel surface defect detection method and system based on domain self-adaptive depth migration network | |
CN110263934A (en) | A kind of artificial intelligence data mask method and device | |
CN103617146B (en) | A kind of machine learning method and device based on hardware resource consumption | |
CN113516228A (en) | Network anomaly detection method based on deep neural network | |
Zhang et al. | Research on surface defect detection algorithm of strip steel based on improved YOLOV3 | |
CN117152503A (en) | Remote sensing image cross-domain small sample classification method based on false tag uncertainty perception | |
CN111144462A (en) | Unknown individual identification method and device for radar signals | |
CN116152674A (en) | Dam unmanned aerial vehicle image crack intelligent recognition method based on improved U-Net model | |
CN113283467A (en) | Weak supervision picture classification method based on average loss and category-by-category selection | |
Wang et al. | Temperature forecast based on SVM optimized by PSO algorithm |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 2017-03-22 |