CN106528417A - Intelligent detection method and system of software defects - Google Patents

Intelligent detection method and system of software defects

Publication number
CN106528417A
Authority
CN
China
Prior art keywords: software, sample, module, vector, defect
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610964353.2A
Other languages
Chinese (zh)
Inventor
高岩
杨春晖
李冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Electronic Product Reliability and Environmental Testing Research Institute
Original Assignee
China Electronic Product Reliability and Environmental Testing Research Institute
Application filed by China Electronic Product Reliability and Environmental Testing Research Institute
Priority to CN201610964353.2A
Publication of CN106528417A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/36: Preventing errors by testing or debugging software
    • G06F11/3668: Software testing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The invention relates to an intelligent detection method and system for software defects. The method comprises the following steps: obtaining and preprocessing sample software modules to obtain a software sample set; clustering the defect-free software sample set to obtain cluster subsets; randomly sampling the cluster subsets to obtain a balanced defect-free software sample set; obtaining an updated software sample set from the defective software sample set and the balanced defect-free software sample set; training on the updated software sample set to obtain a defect detection model; and predicting defects in a software module under test according to the defect detection model and outputting the prediction result. The sample software modules are classified, and the defect-free software sample set is clustered and sampled to keep the samples balanced. Because the defect detection model is trained on the balanced defect-free software sample set, the defect prediction model can estimate and fit the defect data better, which significantly improves the prediction of defect data and raises prediction accuracy.

Description

Intelligent detection method and system for software defects
Technical field
The present invention relates to the technical field of software security, and in particular to an intelligent detection method and system for software defects.
Background
With the development of information technology, software complexity keeps rising and software size keeps growing. For complex software systems in particular, good software quality control and testing mechanisms not only help enterprises develop high-quality software products and reduce production and maintenance costs, but also matter for increasing customer satisfaction, building a good corporate image and strengthening the enterprise's competitiveness in the market.
Traditional intelligent detection methods for software defects use a machine-learning-based software defect prediction model. The model takes the metric data vector of a software module as input and predicts whether the module is defective through steps such as preprocessing, feature extraction, model training and prediction. Because of inherent issues such as its performance evaluation criterion and inductive bias, such a model treats defective and defect-free software modules equally and targets maximum overall prediction accuracy, yet its recall for software defects remains low. Traditional intelligent detection methods for software defects therefore suffer from low prediction accuracy.
Summary of the invention
In view of the above problem, it is necessary to provide an intelligent detection method and system for software defects that improves prediction accuracy.
An intelligent detection method for software defects comprises the following steps:
obtaining and preprocessing sample software modules to obtain a software sample set, the software sample set including a defective software sample set and a defect-free software sample set;
clustering the defect-free software sample set to obtain cluster subsets;
randomly sampling the cluster subsets to obtain a balanced defect-free software sample set;
obtaining an updated software sample set from the defective software sample set and the balanced defect-free software sample set;
training on the updated software sample set to obtain a defect detection model;
predicting defects in a software module under test according to the defect detection model, and outputting the prediction result.
An intelligent detection system for software defects comprises:
a sample set establishing module, configured to obtain and preprocess sample software modules to obtain a software sample set, the software sample set including a defective software sample set and a defect-free software sample set;
a clustering processing module, configured to cluster the defect-free software sample set to obtain cluster subsets;
a sample processing module, configured to randomly sample the cluster subsets to obtain a balanced defect-free software sample set;
a sample set update module, configured to obtain an updated software sample set from the defective software sample set and the balanced defect-free software sample set;
a model training module, configured to train on the updated software sample set to obtain a defect detection model;
a defect prediction module, configured to predict defects in a software module under test according to the defect detection model and to output the prediction result.
In the above intelligent detection method and system for software defects, sample software modules are obtained and preprocessed to obtain a software sample set. The defect-free software sample set is clustered to obtain cluster subsets. The cluster subsets are randomly sampled to obtain a balanced defect-free software sample set. An updated software sample set is obtained from the defective software sample set and the balanced defect-free software sample set. A defect detection model is obtained by training on the updated software sample set. Defects in the software module under test are predicted according to the defect detection model, and the prediction result is output. Classifying the sample software modules, and clustering and sampling the defect-free software sample set, keeps the samples balanced. Training the defect detection model on the balanced defect-free software sample set lets the defect prediction model estimate and fit the defect data better, so the prediction of defect data improves significantly and prediction accuracy rises.
Description of the drawings
Fig. 1 is a flow chart of the intelligent detection method for software defects in one embodiment;
Fig. 2 is a structural diagram of the intelligent detection system for software defects in one embodiment.
Detailed description
In one embodiment, an intelligent detection method for software defects, as shown in Fig. 1, comprises the following steps:
Step S110: Obtain and preprocess sample software modules to obtain a software sample set.
The software sample set includes a defective software sample set and a defect-free software sample set. A sample software module is a software module for which it is already known whether defects exist. For example, of Q known modules in a piece of software, M are defect-free and N are defective, with M+N=Q. After the sample software modules are obtained, they are classified according to whether they contain defects, giving the defective software sample set and the defect-free software sample set.
In one embodiment, step S110 includes steps 112 to 116.
Step 112: Label each sample software module to obtain its defect label.
Each module i, i=1,2,...,Q, is assigned a defect label flag_i indicating whether it contains a defect: flag_i=1 means a defect exists; flag_i=0 means no defect. It will be appreciated that the labeling scheme and the resulting label values are not unique. In other embodiments, the defect label of a defective sample software module may be 0 and that of a defect-free sample software module may be 1, and so on.
Step 114: Perform static measurement on each sample software module to obtain its sample vector.
For each module i, i=1,2,...,Q, static measurement is performed on its source code. In this embodiment, the static metrics may include Halstead metrics, McCabe metrics, Khoshgoftaar metrics, CK metrics and the like, giving k metric values in total. These metric values are denoted t_{i1}, t_{i2}, ..., t_{ik} and form the sample vector T_i = {t_{i1}, t_{i2}, ..., t_{ik}}.
Step 116: According to the defect label of each sample software module, classify the sample vector of the corresponding sample software module to obtain the defective software sample set and the defect-free software sample set.
The sample vectors of defective software modules are grouped into one class, giving the defective software sample set {T_i | flag_i=1}. The sample vectors of defect-free software modules are grouped into another class, giving the defect-free software sample set {T_i | flag_i=0}. Together, the defective software sample set and the defect-free software sample set make up the software sample set {T_i | i=1,2,...,Q}.
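Steps 112 to 116 amount to partitioning labeled metric vectors by their defect label. A minimal Python sketch of that partition follows; the function name `split_by_flag` and the toy metric values are hypothetical, and the labeling convention assumed is flag 1 for defective and flag 0 for defect-free.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def split_by_flag(
    vectors: Sequence[Vector], flags: Sequence[int]
) -> Tuple[List[Vector], List[Vector]]:
    """Return (defective, defect_free) sample sets, assuming flag 1 = defective."""
    if len(vectors) != len(flags):
        raise ValueError("each sample vector needs exactly one defect label")
    defective = [t for t, f in zip(vectors, flags) if f == 1]
    defect_free = [t for t, f in zip(vectors, flags) if f == 0]
    return defective, defect_free


# Hypothetical data: Q = 5 modules with k = 3 metric values each.
vectors = [
    (10.0, 2.0, 1.0),
    (55.0, 9.0, 4.0),
    (12.0, 3.0, 1.0),
    (11.0, 2.0, 2.0),
    (60.0, 8.0, 5.0),
]
flags = [0, 1, 0, 0, 1]

defective, defect_free = split_by_flag(vectors, flags)
print(len(defective), len(defect_free))  # N and M, with M + N = Q
```

Keeping the two sets separate from the start makes the later steps simpler, since only the defect-free set is clustered and resampled.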
Step S120: Cluster the defect-free software sample set to obtain cluster subsets.
The defect-free software sample set {T_i | flag_i=0} is clustered to obtain cluster subsets; the number of cluster subsets is not fixed. In this embodiment, clustering uses the mean shift (MeanShift) method, which has a low computational cost and speeds up cluster analysis. Step S120 includes steps 122 to 128.
Step 122: Taking a sample vector in the defect-free software sample set as the starting point, compute the mean shift vector of that sample vector. Specifically:
M_h = \frac{1}{K} \sum_{T_i \in S_h(T)} (T_i - T)
where M_h is the mean shift vector of sample vector T; S_h(T) is the set of the K sample vectors inside the high-dimensional ball of constant radius h, that is, those satisfying (T - T_i)^T (T - T_i) < h^2; and T_i is a sample vector in S_h(T). Note that in (T - T_i)^T (T - T_i) < h^2, the T inside the parentheses denotes the sample vector, while the superscript T outside the parentheses denotes transposition.
Step 124: Determine whether the mean shift vector of the sample vector (that is, its magnitude) is greater than a preset threshold. If so, take the sum of the sample vector and the mean shift vector as the new sample vector and return to step 122. If not, proceed to step 126. The preset threshold ε is set in advance and can be adjusted to the actual situation. If the mean shift vector M_h is greater than ε, T_i+M_h is taken as the new starting point and a new mean shift vector M_h is computed.
Step 126: Take the sum of the sample vector and the mean shift vector as the center point of the sample vector. If M_h is less than or equal to ε, T_i+M_h is confirmed as the center point. Steps 122 to 126 are repeated until all sample vectors have been traversed, generating P center points.
Step 128: Cluster the sample vectors according to their center points to obtain cluster subsets. Sample vectors that converge to the same center point are grouped into one class, forming P cluster subsets.
Computing multiple subsets of the software defect distribution by clustering characterizes that distribution better and lays the foundation for the subsequent balanced sampling.
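Steps 122 to 128 can be sketched in Python as a flat-kernel mean shift. This is an illustrative sketch under stated assumptions, not the patent's implementation: the names `mean_shift_vector`, `find_centre` and `cluster`, the iteration cap `max_iter`, and the tolerance `merge_tol` used to decide that two center points are "the same" are all hypothetical additions.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def mean_shift_vector(t: Vector, samples: Sequence[Vector], h: float) -> Vector:
    """M_h = (1/K) * sum of (T_i - T) over the K samples inside the ball of radius h."""
    neighbours = [
        s for s in samples if sum((a - b) ** 2 for a, b in zip(s, t)) < h * h
    ]
    if not neighbours:
        return tuple(0.0 for _ in t)
    k = len(neighbours)
    return tuple(sum(s[d] - t[d] for s in neighbours) / k for d in range(len(t)))


def find_centre(
    start: Vector, samples: Sequence[Vector], h: float, eps: float, max_iter: int = 100
) -> Vector:
    """Shift the point until the magnitude of M_h is at most eps (steps 122 to 126)."""
    t = start
    for _ in range(max_iter):  # max_iter is an assumed safety cap
        m = mean_shift_vector(t, samples, h)
        t = tuple(a + b for a, b in zip(t, m))  # T + M_h in both branches
        if math.sqrt(sum(x * x for x in m)) <= eps:
            break
    return t


def cluster(
    samples: Sequence[Vector], h: float, eps: float, merge_tol: float
) -> Tuple[List[Vector], List[List[Vector]]]:
    """Group samples whose center points coincide within merge_tol (step 128)."""
    centres: List[Vector] = []
    subsets: List[List[Vector]] = []
    for s in samples:
        c = find_centre(s, samples, h, eps)
        for idx, existing in enumerate(centres):
            if math.dist(c, existing) <= merge_tol:
                subsets[idx].append(s)
                break
        else:
            centres.append(c)
            subsets.append([s])
    return centres, subsets


# Hypothetical defect-free samples forming two well-separated groups.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centres, subsets = cluster(samples, h=1.0, eps=1e-6, merge_tol=0.5)
print(len(centres), [len(s) for s in subsets])
```

The number of subsets P is not chosen in advance; it falls out of the radius h and the data, which matches the statement that the number of cluster subsets is not fixed.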
Step S130: Randomly sample the cluster subsets to obtain a balanced defect-free software sample set.
The P cluster subsets formed by clustering are sampled to keep the samples balanced. Let m_j be the number of samples in the j-th cluster subset; then \sum_{j=1}^{P} m_j = M. The number of samples drawn from each cluster subset is not unique. In this embodiment, the number of samples m_j' drawn from the j-th cluster subset by random sampling is computed from m_j, M and N, with the following notation:
m_j' is the number of samples drawn from the j-th cluster subset by random sampling, m_j is the number of samples in the j-th cluster subset, M is the total number of defect-free software modules, and N is the total number of defective software modules.
Based on the sample distribution of the defect data, samples are chosen within each subset by random sampling, which balances the samples. This gives the balanced defect-free software sample set {T_i' | flag_i=0}, whose number of samples is M' = \sum_{j=1}^{P} m_j'.
Step S140: Obtain the updated software sample set from the defective software sample set and the balanced defect-free software sample set.
The defective software sample set is merged with the balanced defect-free software sample set to form the updated software sample set {T_i' | i=1,2,...,M'+N}.
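Steps S130 and S140 can be sketched as per-cluster random sampling followed by a merge. The exact expression for m_j' is not reproduced in this text, so the sketch below assumes a proportional allocation, m_j' = round(m_j * N / M) clamped to the range 1 to m_j; that allocation rule, the function name `balance` and the seed are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def balance(
    subsets: Sequence[Sequence[Vector]], n_defective: int, seed: int = 0
) -> List[Vector]:
    """Draw samples from each cluster subset so the total is close to n_defective."""
    rng = random.Random(seed)
    m_total = sum(len(s) for s in subsets)  # M, the number of defect-free samples
    balanced: List[Vector] = []
    for subset in subsets:
        if not subset:
            continue
        # ASSUMPTION: proportional quota; the patent's own formula is not shown here.
        quota = round(len(subset) * n_defective / m_total)
        quota = max(1, min(len(subset), quota))
        balanced.extend(rng.sample(list(subset), quota))
    return balanced


# Hypothetical cluster subsets with m_1 = 6 and m_2 = 4 (M = 10), and N = 5.
subsets = [
    [(float(i),) for i in range(6)],
    [(float(i),) for i in range(100, 104)],
]
defective = [(200.0 + i,) for i in range(5)]

balanced = balance(subsets, n_defective=len(defective))

# Step S140: merge the two sets and keep the labels aligned with the vectors.
updated = defective + balanced
updated_flags = [1] * len(defective) + [0] * len(balanced)
print(len(balanced), len(updated))
```

Sampling inside each cluster rather than over the whole defect-free set keeps every region of the defect-free distribution represented after downsampling.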
Step S150: Train on the updated software sample set to obtain the defect detection model.
In the balanced high-dimensional sample space, a suitable data mining method is selected to train the defect detection model. Training the defect detection model on the balanced defect-free software sample set lets the defect prediction model estimate and fit the defect data better. In one embodiment, step S150 consists of solving a constrained maximization problem over the sample weights λ, with the following notation:
T_i and T_j are the i-th and j-th sample vectors in the updated software sample set; k(T_i,T_j) is the kernel function between sample vectors T_i and T_j; flag_i and flag_j are the defect labels of the i-th and j-th sample vectors in the updated software sample set; λ_i and λ_j are the parameters of the defect detection model to be trained, namely the weights of the i-th and j-th sample vectors in the updated software sample set; s.t. denotes the constraints; C is the penalty factor; and M'+N is the number of sample vectors in the updated software sample set.
The solution is the value of the defect detection model parameter λ at which the objective function takes its maximum. Substituting the sample vectors T_i and T_j of the updated software sample set into the objective function determines the value of the weight λ_i of sample vector T_i at the maximum, which finally yields the weights of all sample vectors in the updated software sample set.
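The symbols defined for this step (per-sample weights λ, a kernel k, a penalty factor C, and a maximization under constraints) match the dual problem of a soft-margin kernel support vector machine. The following is a sketch of that standard dual, offered as an assumption because the text does not spell the formula out:

```latex
\begin{aligned}
\hat{\lambda} = \arg\max_{\lambda}\; & \sum_{i=1}^{M'+N} \lambda_i
  - \frac{1}{2} \sum_{i=1}^{M'+N} \sum_{j=1}^{M'+N}
    \lambda_i \lambda_j \, flag_i \, flag_j \, k(T_i, T_j) \\
\text{s.t.}\; & \sum_{i=1}^{M'+N} \lambda_i \, flag_i = 0, \qquad
  0 \le \lambda_i \le C, \quad i = 1, \dots, M'+N
\end{aligned}
```

In the textbook form of this problem the class labels are -1 and +1, so 0/1 defect labels would first be mapped to that range; whether the embodiment applies such a mapping is not stated here.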
Step S160: Predict defects in the software module under test according to the defect detection model, and output the prediction result.
The defect prediction model is used to predict defects in the unknown software module under test. The prediction result is obtained and output to inform staff that defect prediction for the software module under test is complete. In one embodiment, step S160 includes steps 162 and 164.
Step 162: Perform static measurement on the software module under test to obtain its sample vector. Specifically, static measurement is performed on the source code of the software module under test; the static metrics may likewise include Halstead metrics, McCabe metrics, Khoshgoftaar metrics, CK metrics and the like.
Step 164: Predict defects in the software module under test according to its sample vector and the defect prediction model. Specifically, the defect label of the module is computed as g(T), with the following notation:
T is the sample vector of the software module under test; g(T) is the defect label of the software module under test; sgn converts its argument to an integer, taking 1 when the argument is greater than 0 and 0 when it is less than or equal to 0; T_i is the i-th sample vector in the updated software sample set; flag_i is the defect label of the i-th sample vector in the updated software sample set; λ_i is the weight of the i-th sample vector in the updated software sample set, as obtained by the defect prediction model; M'+N is the number of sample vectors in the updated software sample set; and b is a constant. As before, K(T_i,T) in this embodiment is the kernel function between sample vectors T_i and T. The way the value is converted to an integer corresponds to the definition of the defect label.
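The notation for step 164 suggests a kernel decision function of the form g(T) = sgn(sum over i of λ_i * flag_i * K(T_i, T) + b), with sgn returning 1 for a positive argument and 0 otherwise. The Python sketch below evaluates that assumed form; the RBF kernel, the `gamma` value, the function names and the toy weights are hypothetical and not taken from the patent.

```python
import math
from typing import Callable, Sequence, Tuple

Vector = Tuple[float, ...]


def rbf_kernel(a: Vector, b: Vector, gamma: float = 0.5) -> float:
    """ASSUMPTION: RBF kernel; the text only says K is a kernel function."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def predict(
    t: Vector,
    train_vectors: Sequence[Vector],
    flags: Sequence[int],
    lambdas: Sequence[float],
    b: float,
    kernel: Callable[[Vector, Vector], float] = rbf_kernel,
) -> int:
    """Return 1 (defective) if the kernel score is greater than 0, else 0."""
    score = b + sum(
        lam * flag * kernel(ti, t)
        for ti, flag, lam in zip(train_vectors, flags, lambdas)
    )
    return 1 if score > 0 else 0


# Hypothetical trained model: two training vectors, their labels, weights and offset.
train_vectors = [(0.0, 0.0), (5.0, 5.0)]
flags = [0, 1]
lambdas = [1.0, 1.0]
b = -0.5

print(predict((5.0, 5.0), train_vectors, flags, lambdas, b))
print(predict((0.0, 0.0), train_vectors, flags, lambdas, b))
```

With 0/1 labels, training vectors labeled 0 contribute nothing to the sum, so in this sketch the constant b acts as the decision threshold.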
In practice, software failures are less probable than normal operation, so failing software modules make up a relatively small share of all software modules. However, if these few failing modules go undetected and are put into actual use, the resulting economic and social losses are immeasurable. Moreover, because failing modules are a relatively small share of all modules, the training sample data for the software defect detection model are highly imbalanced, which biases the intelligent detection method.
The above intelligent detection method for software defects addresses the problems that software complexity keeps rising and software size keeps growing and that, for complex software systems in particular, the defect detection workload is huge and defect localization is difficult. Classifying the sample software modules, and clustering and sampling the defect-free software sample set, keeps the samples balanced. Training the defect detection model on the balanced defect-free software sample set lets the defect prediction model estimate and fit the defect data better, so the prediction of defect data improves significantly and prediction accuracy rises.
In one embodiment, an intelligent detection system for software defects, as shown in Fig. 2, includes a sample set establishing module 110, a clustering processing module 120, a sample processing module 130, a sample set update module 140, a model training module 150 and a defect prediction module 160.
The sample set establishing module 110 is configured to obtain and preprocess sample software modules to obtain a software sample set.
The software sample set includes a defective software sample set and a defect-free software sample set. A sample software module is a software module for which it is already known whether defects exist. After the sample software modules are obtained, they are classified according to whether they contain defects, giving the defective software sample set and the defect-free software sample set. In one embodiment, the sample set establishing module 110 includes a first sample set establishing unit, a second sample set establishing unit and a third sample set establishing unit.
The first sample set establishing unit is configured to label each sample software module to obtain the defect label of each sample software module.
Each module i, i=1,2,...,Q, is assigned a defect label flag_i indicating whether it contains a defect: flag_i=1 means a defect exists; flag_i=0 means no defect. It will be appreciated that the labeling scheme and the resulting label values are not unique. In other embodiments, the defect label of a defective sample software module may be 0 and that of a defect-free sample software module may be 1, and so on.
The second sample set establishing unit is configured to perform static measurement on each sample software module to obtain the sample vector of each sample software module.
For each module i, i=1,2,...,Q, static measurement is performed on its source code. In this embodiment, the static metrics may include Halstead metrics, McCabe metrics, Khoshgoftaar metrics, CK metrics and the like, giving k metric values in total. These metric values are denoted t_{i1}, t_{i2}, ..., t_{ik} and form the sample vector T_i = {t_{i1}, t_{i2}, ..., t_{ik}}.
The third sample set establishing unit is configured to classify, according to the defect label of each sample software module, the sample vector of the corresponding sample software module to obtain the defective software sample set and the defect-free software sample set.
The sample vectors of defective software modules are grouped into one class, giving the defective software sample set {T_i | flag_i=1}. The sample vectors of defect-free software modules are grouped into another class, giving the defect-free software sample set {T_i | flag_i=0}. Together, the defective software sample set and the defect-free software sample set make up the software sample set {T_i | i=1,2,...,Q}.
The clustering processing module 120 is configured to cluster the defect-free software sample set to obtain cluster subsets.
The defect-free software sample set {T_i | flag_i=0} is clustered to obtain cluster subsets; the number of cluster subsets is not fixed. In this embodiment, clustering uses the mean shift (MeanShift) method, which has a low computational cost and speeds up cluster analysis. The clustering processing module 120 includes a first processing unit, a second processing unit, a third processing unit and a fourth processing unit.
The first processing unit is configured to take a sample vector in the defect-free software sample set as the starting point and compute the mean shift vector of that sample vector. Specifically:
M_h = \frac{1}{K} \sum_{T_i \in S_h(T)} (T_i - T)
where M_h is the mean shift vector of sample vector T; S_h(T) is the set of the K sample vectors inside the high-dimensional ball of constant radius h, that is, those satisfying (T - T_i)^T (T - T_i) < h^2; and T_i is a sample vector in S_h(T). Note that in (T - T_i)^T (T - T_i) < h^2, the T inside the parentheses denotes the sample vector, while the superscript T outside the parentheses denotes transposition.
The second processing unit is configured to determine whether the mean shift vector of the sample vector (that is, its magnitude) is greater than a preset threshold. The preset threshold ε is set in advance and can be adjusted to the actual situation.
The third processing unit is configured, when the mean shift vector of the sample vector is greater than the preset threshold, to take the sum of the sample vector and the mean shift vector as the new sample vector and to control the first processing unit to again take a sample vector in the defect-free software sample set as the starting point and compute the mean shift vector of the sample vector; and, when the mean shift vector of the sample vector is less than or equal to the preset threshold, to take the sum of the sample vector and the mean shift vector as the center point of the sample vector.
If the mean shift vector M_h is greater than ε, T_i+M_h is taken as the new starting point and a new mean shift vector M_h is computed. If M_h is less than or equal to ε, T_i+M_h is confirmed as the center point. The computation is repeated until all sample vectors have been traversed, generating P center points.
The fourth processing unit is configured to cluster the sample vectors according to their center points to obtain cluster subsets. Sample vectors that converge to the same center point are grouped into one class, forming P cluster subsets.
Computing multiple subsets of the software defect distribution by clustering characterizes that distribution better and lays the foundation for the subsequent balanced sampling.
The sample processing module 130 is configured to randomly sample the cluster subsets to obtain a balanced defect-free software sample set.
The P cluster subsets formed by clustering are sampled to keep the samples balanced. Let m_j be the number of samples in the j-th cluster subset; then \sum_{j=1}^{P} m_j = M. The number of samples drawn from each cluster subset is not unique. In this embodiment, the number of samples m_j' drawn from the j-th cluster subset by random sampling is computed from m_j, M and N, with the following notation:
m_j' is the number of samples drawn from the j-th cluster subset by random sampling, m_j is the number of samples in the j-th cluster subset, M is the total number of defect-free software modules, and N is the total number of defective software modules.
Based on the sample distribution of the defect data, samples are chosen within each subset by random sampling, which balances the samples. This gives the balanced defect-free software sample set {T_i' | flag_i=0}, whose number of samples is M' = \sum_{j=1}^{P} m_j'.
The sample set update module 140 is configured to obtain the updated software sample set from the defective software sample set and the balanced defect-free software sample set.
The defective software sample set is merged with the balanced defect-free software sample set to form the updated software sample set {T_i' | i=1,2,...,M'+N}.
The model training module 150 is configured to train on the updated software sample set to obtain the defect detection model. In the balanced high-dimensional sample space, a suitable data mining method is selected to train the defect detection model. Training the defect detection model on the balanced defect-free software sample set lets the defect prediction model estimate and fit the defect data better. In one embodiment, the model training module 150 obtains the defect detection model by solving a constrained maximization problem over the sample weights λ, with the following notation:
T_i and T_j are the i-th and j-th sample vectors in the updated software sample set; k(T_i,T_j) is the kernel function between sample vectors T_i and T_j; flag_i and flag_j are the defect labels of the i-th and j-th sample vectors in the updated software sample set; λ_i and λ_j are the parameters of the defect detection model to be trained, namely the weights of the i-th and j-th sample vectors in the updated software sample set; s.t. denotes the constraints; C is the penalty factor; and M'+N is the number of sample vectors in the updated software sample set.
The solution is the value of the defect detection model parameter λ at which the objective function takes its maximum. Substituting the sample vectors T_i and T_j of the updated software sample set into the objective function determines the value of the weight λ_i of sample vector T_i at the maximum, which finally yields the weights of all sample vectors in the updated software sample set.
The defect prediction module 160 is configured to predict defects in the software module under test according to the defect detection model and to output the prediction result.
The defect prediction model is used to predict defects in the unknown software module under test. The prediction result is obtained and output to inform staff that defect prediction for the software module under test is complete. In one embodiment, the defect prediction module 160 includes a first prediction unit and a second prediction unit.
The first prediction unit is configured to perform static measurement on the software module under test to obtain its sample vector. Specifically, static measurement is performed on the source code of the software module under test; the static metrics may likewise include Halstead metrics, McCabe metrics, Khoshgoftaar metrics, CK metrics and the like.
The second prediction unit is configured to predict defects in the software module under test according to its sample vector and the defect prediction model. Specifically, the defect label of the module is computed as g(T), with the following notation:
T is the sample vector of the software module under test; g(T) is the defect label of the software module under test; sgn converts its argument to an integer, taking 1 when the argument is greater than 0 and 0 when it is less than or equal to 0; T_i is the i-th sample vector in the updated software sample set; flag_i is the defect label of the i-th sample vector in the updated software sample set; λ_i is the weight of the i-th sample vector in the updated software sample set, as obtained by the defect prediction model; M'+N is the number of sample vectors in the updated software sample set; and b is a constant. As before, K(T_i,T) in this embodiment is the kernel function between sample vectors T_i and T. The way the value is converted to an integer corresponds to the definition of the defect label.
The above intelligent detection system for software defects addresses the problems that software complexity keeps rising and software size keeps growing and that, for complex software systems in particular, the defect detection workload is huge and defect localization is difficult. Classifying the sample software modules, and clustering and sampling the defect-free software sample set, keeps the samples balanced. Training the defect detection model on the balanced defect-free software sample set lets the defect prediction model estimate and fit the defect data better, so the prediction of defect data improves significantly and prediction accuracy rises.
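One way to picture how modules 110 to 160 fit together is as a small pipeline of pluggable callables. The Python sketch below is illustrative only: the class name `DefectDetectionSystem`, its field names, and every stand-in function (in particular the threshold trainer, which is not the kernel method of the embodiment) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vector = Tuple[float, ...]
Model = Callable[[Vector], int]


@dataclass
class DefectDetectionSystem:
    build_sample_sets: Callable[[], Tuple[List[Vector], List[Vector]]]  # module 110
    cluster: Callable[[List[Vector]], List[List[Vector]]]               # module 120
    sample: Callable[[List[List[Vector]], int], List[Vector]]           # module 130
    train: Callable[[List[Vector], List[int]], Model]                   # module 150

    def fit(self) -> Model:
        defective, defect_free = self.build_sample_sets()
        subsets = self.cluster(defect_free)
        balanced = self.sample(subsets, len(defective))
        updated = defective + balanced                                   # module 140
        flags = [1] * len(defective) + [0] * len(balanced)
        return self.train(updated, flags)  # the returned model plays the role of module 160


def train_threshold(samples: List[Vector], flags: List[int]) -> Model:
    """Stand-in trainer: split on the midpoint between the class means of metric 0."""
    pos = [s[0] for s, f in zip(samples, flags) if f == 1]
    neg = [s[0] for s, f in zip(samples, flags) if f == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda t: 1 if t[0] > threshold else 0


# Hypothetical stand-ins for each module, using one metric per sample vector.
system = DefectDetectionSystem(
    build_sample_sets=lambda: (
        [(50.0,), (60.0,)],
        [(10.0,), (12.0,), (11.0,), (9.0,)],
    ),
    cluster=lambda free: [free],                             # a single cluster subset
    sample=lambda subsets, n: [v for s in subsets for v in s[:n]],
    train=train_threshold,
)

model = system.fit()
print(model((58.0,)), model((8.0,)))
```

Passing each stage in as a callable keeps the stages independently replaceable, which mirrors the statement that the technical features of the embodiments can be combined freely.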
The technical features of the embodiments described above can be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments has been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The embodiments described above represent only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make several variations and improvements without departing from the inventive concept, and these all fall within the protection scope of the present invention. The protection scope of this patent shall therefore be defined by the appended claims.

Claims (10)

1. An intelligent detection method for software defects, characterized in that it comprises the following steps:
obtaining and preprocessing sample software modules to obtain a software sample set, the software sample set including a defective software sample set and a defect-free software sample set;
clustering the defect-free software sample set to obtain cluster subsets;
randomly sampling the cluster subsets to obtain a balanced defect-free software sample set;
obtaining an updated software sample set from the defective software sample set and the balanced defect-free software sample set;
training on the updated software sample set to obtain a defect detection model;
predicting defects in a software module under test according to the defect detection model, and outputting the prediction result.
2. The software defect intelligent detection method according to claim 1, characterized in that the step of obtaining sample software modules and preprocessing them to obtain a software sample set comprises the following steps:
labeling each sample software module to obtain the defect flag of each sample software module;
performing static measurement on each sample software module to obtain the sample vector of each sample software module; and
classifying the sample vector of each sample software module according to the defect flag of that module, to obtain the defective software sample set and the defect-free software sample set.
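The classification step of claim 2 can be illustrated with a minimal sketch. It assumes that a defect flag of 1 marks a defective module and that any other value marks a defect-free one; the function name `split_by_flag` and the tuple representation of a sample vector are hypothetical and do not come from the patent.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def split_by_flag(
    vectors: Sequence[Vector], flags: Sequence[int]
) -> Tuple[List[Vector], List[Vector]]:
    """Partition static-metric vectors into (defective, defect_free).

    Assumption: flag == 1 means "defective"; every other value
    (for example 0 or -1) means "defect-free".
    """
    if len(vectors) != len(flags):
        raise ValueError("each sample vector needs exactly one defect flag")
    defective = [v for v, f in zip(vectors, flags) if f == 1]
    defect_free = [v for v, f in zip(vectors, flags) if f != 1]
    return defective, defect_free
```

In defect data sets the defect-free list is usually much longer than the defective one, which is the imbalance that the clustering and sampling steps of the later claims are meant to reduce.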
3. The software defect intelligent detection method according to claim 2, characterized in that the step of performing clustering on the defect-free software sample set to obtain cluster subsets comprises the following steps:
taking a sample vector in the defect-free software sample set as the starting point, computing the mean shift vector of the sample vector, specifically:
$$M_h = \frac{1}{K} \sum_{T_i \in S_h} (T_i - T)$$
wherein $M_h$ denotes the mean shift vector of the sample vector $T$; $S_h(T)$ denotes the set of $K$ sample vectors that lie in the high-dimensional ball region whose radius is the constant $h$ and that satisfy $(T - T_i)^{\mathsf{T}} (T - T_i) < h^2$; and $T_i$ is a sample vector in $S_h(T)$;
determining whether the mean shift vector of the sample vector is greater than a preset threshold;
if so, taking the sum of the sample vector and the mean shift vector as the new sample vector, and returning to the step of computing the mean shift vector of the sample vector with a sample vector in the defect-free software sample set as the starting point;
if not, taking the sum of the sample vector and the mean shift vector as the centre point of the sample vector; and
clustering the sample vectors according to their centre points to obtain the cluster subsets.
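The mean shift iteration of claim 3 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the magnitude of the mean shift vector is taken to be its Euclidean norm, two samples are placed in the same cluster subset when their centre points lie within a small tolerance `merge_tol`, and the names `mean_shift_vector`, `find_centre`, `cluster`, `merge_tol` and `max_iter` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_shift_vector(
    t: Sequence[float], samples: Sequence[Sequence[float]], h: float
) -> Vector:
    """M_h: mean offset from t to the samples strictly inside radius h."""
    inside = [s for s in samples
              if sum((a - b) ** 2 for a, b in zip(t, s)) < h * h]
    if not inside:
        return [0.0] * len(t)  # no neighbours: nothing to shift towards
    k = len(inside)
    return [sum(s[d] - t[d] for s in inside) / k for d in range(len(t))]


def find_centre(start, samples, h, threshold=1e-6, max_iter=1000) -> Vector:
    """Shift the point until the shift is no larger than the threshold."""
    t = list(start)
    for _ in range(max_iter):  # max_iter is a safety bound (assumption)
        shift = mean_shift_vector(t, samples, h)
        t = [a + b for a, b in zip(t, shift)]  # sample vector + shift
        if math.sqrt(sum(c * c for c in shift)) <= threshold:
            break  # the sum just computed is the centre point
    return t


def cluster(samples, h, merge_tol=1e-3) -> List[List[Vector]]:
    """Group samples whose centre points coincide within merge_tol."""
    centres: List[Vector] = []
    subsets: List[List[Vector]] = []
    for s in samples:
        c = find_centre(s, samples, h)
        for idx, known in enumerate(centres):
            gap = math.sqrt(sum((a - b) ** 2 for a, b in zip(c, known)))
            if gap <= merge_tol:
                subsets[idx].append(list(s))
                break
        else:  # no existing centre is close enough: open a new subset
            centres.append(c)
            subsets.append([list(s)])
    return subsets
```

The radius `h` controls how many cluster subsets appear: a larger radius pulls more samples to the same centre point and so yields fewer, larger subsets.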
4. The software defect intelligent detection method according to claim 1, characterized in that the number of samples drawn when performing random sampling on the cluster subsets is:
$$m_j' = \left[ m_j * \frac{M}{N} \right]$$
wherein $m_j'$ is the number of samples drawn when performing random sampling on the $j$-th cluster subset, $m_j$ is the number of samples in the $j$-th cluster subset, $M$ is the total number of defect-free software modules, and $N$ is the total number of defective software modules.
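Per-cluster random sampling of this kind can be sketched as below. The sketch is hedged in several ways: the `ratio` argument stands in for the factor that multiplies $m_j$ inside the brackets and is supplied by the caller rather than derived here; the brackets are read as rounding to the nearest integer; and the draw is capped at the cluster size because sampling is done without replacement. The names `draw_count`, `balance` and `seed` are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def draw_count(m_j: int, ratio: float) -> int:
    """Number of samples to draw from a cluster subset of size m_j.

    Assumptions: [.] means round-half-up, and the result can never
    exceed m_j because samples are drawn without replacement.
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return min(m_j, math.floor(m_j * ratio + 0.5))


def balance(
    subsets: Sequence[Sequence[T]], ratio: float, seed: Optional[int] = None
) -> List[T]:
    """Randomly draw draw_count(len(subset), ratio) items per subset."""
    rng = random.Random(seed)  # seeded only to make runs repeatable
    balanced: List[T] = []
    for subset in subsets:
        k = draw_count(len(subset), ratio)
        balanced.extend(rng.sample(list(subset), k))
    return balanced
```

Because every cluster subset contributes in proportion to its size, the reduced defect-free set keeps roughly the same distribution over clusters as the original one, which is the point of clustering before sampling.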
5. The software defect intelligent detection method according to claim 1, characterized in that training on the updated software sample set to obtain the defect detection model comprises:
$$\max_{\lambda} \left( \sum_{i=1}^{M'+N} \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j \, \mathrm{flag}_i \, \mathrm{flag}_j \, k(T_i, T_j) \right) \quad \text{s.t.} \quad 0 \le \lambda_i \le C, \; i = 1, 2, \ldots, M'+N; \quad \sum_{i=1}^{M'+N} \lambda_i y_i = 0$$
wherein $\max_{\lambda}(\cdot)$ denotes the value of the parameter $\lambda$ of the defect detection model at which the expression in parentheses takes its maximum;
$T_i$ and $T_j$ are respectively the $i$-th and $j$-th sample vectors in the updated software sample set; $k(T_i, T_j)$ denotes the kernel function between the sample vectors $T_i$ and $T_j$; $\mathrm{flag}_i$ and $\mathrm{flag}_j$ are respectively the defect flags corresponding to the $i$-th and $j$-th sample vectors in the updated software sample set; $\lambda_i$ and $\lambda_j$ are the parameters of the defect detection model to be trained and denote the weights of the $i$-th and $j$-th sample vectors in the updated software sample set; s.t. denotes the constraints; $C$ is the penalty factor; and $M'+N$ denotes the number of sample vectors in the updated software sample set.
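The optimisation problem of claim 5 has the form of a kernel support vector machine dual. The sketch below does not solve it; it only evaluates the objective and checks the constraints for a candidate weight vector, which is enough to see how the terms fit together. It assumes that the defect flags are encoded as +1 and -1 and that the $y_i$ in the equality constraint is the same quantity as $\mathrm{flag}_i$. The kernels and the names `dual_objective` and `is_feasible` are hypothetical choices for illustration.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def linear_kernel(a: Vector, b: Vector) -> float:
    """Plain dot product, one possible choice of k(T_i, T_j)."""
    return sum(x * y for x, y in zip(a, b))


def rbf_kernel(a: Vector, b: Vector, gamma: float = 1.0) -> float:
    """Gaussian kernel, another common choice (gamma is an assumption)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def dual_objective(lam, flags, vectors, kernel: Kernel) -> float:
    """sum_i lam_i - 1/2 * sum_{i,j} lam_i lam_j flag_i flag_j k(T_i, T_j)."""
    n = len(lam)
    quad = 0.0
    for i in range(n):
        for j in range(n):
            quad += (lam[i] * lam[j] * flags[i] * flags[j]
                     * kernel(vectors[i], vectors[j]))
    return sum(lam) - 0.5 * quad


def is_feasible(lam, flags, c: float, tol: float = 1e-9) -> bool:
    """Check 0 <= lam_i <= C and sum_i lam_i * flag_i == 0 (within tol)."""
    in_box = all(-tol <= l <= c + tol for l in lam)
    balanced = abs(sum(l * f for l, f in zip(lam, flags))) <= tol
    return in_box and balanced
```

For two one-dimensional samples at 1 and -1 with flags +1 and -1 and the linear kernel, equal weights $a$ give an objective of $2a - 2a^2$, which peaks at $a = 0.5$; a quadratic programming solver would search for such a maximiser over all feasible weights.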
6. The software defect intelligent detection method according to claim 1, characterized in that performing defect prediction on the software module under test according to the defect detection model comprises the following steps:
performing static measurement on the software module under test to obtain the sample vector of the software module under test; and
performing defect prediction on the software module under test according to the sample vector of the software module under test and the defect prediction model, specifically:
$$g(T) = \operatorname{sgn}\left( \sum_{i=1}^{M'+N} \lambda_i \, \mathrm{flag}_i \, K(T_i, T) + b \right)$$
wherein $T$ is the sample vector of the software module under test; $K(T_i, T)$ denotes the kernel function between the sample vectors $T_i$ and $T$; $g(T)$ denotes the defect flag of the software module under test; sgn is the sign function, which maps $\sum_{i=1}^{M'+N} \lambda_i \, \mathrm{flag}_i \, K(T_i, T) + b$ to an integer, taking 1 when this expression is greater than 0 and 0 when it is less than or equal to 0; $T_i$ is the $i$-th sample vector in the updated software sample set; $\mathrm{flag}_i$ is the defect flag corresponding to the $i$-th sample vector in the updated software sample set; $\lambda_i$ denotes the weight of the $i$-th sample vector in the updated software sample set, obtained from the defect prediction model; $M'+N$ denotes the number of sample vectors in the updated software sample set; and $b$ is a constant.
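The decision function of claim 6 can be sketched directly from the formula. The sketch assumes that the training flags are encoded as +1 and -1, that the weights and the constant `b` have already been obtained by training, and that the output follows the claim's convention of 1 for a positive score and 0 otherwise. The function names and the linear kernel are hypothetical.

```python
from typing import Callable, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def linear_kernel(a: Vector, b: Vector) -> float:
    """Dot product, used here only as an example kernel."""
    return sum(x * y for x, y in zip(a, b))


def decision_score(t, train_vectors, flags, lam, b, kernel: Kernel) -> float:
    """sum_i lam_i * flag_i * K(T_i, t) + b for the module under test."""
    return sum(l * f * kernel(ti, t)
               for l, f, ti in zip(lam, flags, train_vectors)) + b


def predict_defect(t, train_vectors, flags, lam, b, kernel: Kernel) -> int:
    """g(T): 1 when the score is greater than 0, otherwise 0."""
    return 1 if decision_score(t, train_vectors, flags, lam, b, kernel) > 0 else 0
```

With training vectors 1 and -1, flags +1 and -1, weights 0.5 each, `b = 0` and the linear kernel, the score of a one-dimensional input equals the input itself, so positive inputs are flagged 1 and non-positive inputs 0.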
7. A software defect intelligent detection system, characterized by comprising:
a sample set building module, configured to obtain sample software modules and preprocess them to obtain a software sample set, the software sample set comprising a defective software sample set and a defect-free software sample set;
a clustering module, configured to perform clustering on the defect-free software sample set to obtain cluster subsets;
a sample processing module, configured to perform random sampling on the cluster subsets to obtain a balanced defect-free software sample set;
a sample set update module, configured to obtain an updated software sample set from the defective software sample set and the balanced defect-free software sample set;
a model training module, configured to train on the updated software sample set to obtain a defect detection model; and
a defect prediction module, configured to perform defect prediction on a software module under test according to the defect detection model, and to output a prediction result.
8. The software defect intelligent detection system according to claim 7, characterized in that the number of samples drawn when performing random sampling on the cluster subsets is:
$$m_j' = \left[ m_j * \frac{M}{N} \right]$$
wherein $m_j'$ is the number of samples drawn when performing random sampling on the $j$-th cluster subset, $m_j$ is the number of samples in the $j$-th cluster subset, $M$ is the total number of defect-free software modules, and $N$ is the total number of defective software modules.
9. The software defect intelligent detection system according to claim 7, characterized in that the model training module training on the updated software sample set to obtain the defect detection model comprises:
$$\max_{\lambda} \left( \sum_{i=1}^{M'+N} \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j \, \mathrm{flag}_i \, \mathrm{flag}_j \, k(T_i, T_j) \right) \quad \text{s.t.} \quad 0 \le \lambda_i \le C, \; i = 1, 2, \ldots, M'+N; \quad \sum_{i=1}^{M'+N} \lambda_i y_i = 0$$
wherein $\max_{\lambda}(\cdot)$ denotes the value of the parameter $\lambda$ of the defect detection model at which the expression in parentheses takes its maximum;
$T_i$ and $T_j$ are respectively the $i$-th and $j$-th sample vectors in the updated software sample set; $k(T_i, T_j)$ denotes the kernel function between the sample vectors $T_i$ and $T_j$; $\mathrm{flag}_i$ and $\mathrm{flag}_j$ are respectively the defect flags corresponding to the $i$-th and $j$-th sample vectors in the updated software sample set; $\lambda_i$ and $\lambda_j$ are the parameters of the defect detection model to be trained and denote the weights of the $i$-th and $j$-th sample vectors in the updated software sample set; s.t. denotes the constraints; $C$ is the penalty factor; and $M'+N$ denotes the number of sample vectors in the updated software sample set.
10. The software defect intelligent detection system according to claim 7, characterized in that the defect prediction module comprises:
a first prediction unit, configured to perform static measurement on the software module under test to obtain the sample vector of the software module under test; and
a second prediction unit, configured to perform defect prediction on the software module under test according to the sample vector of the software module under test and the defect prediction model, specifically:
$$g(T) = \operatorname{sgn}\left( \sum_{i=1}^{M'+N} \lambda_i \, \mathrm{flag}_i \, K(T_i, T) + b \right)$$
wherein $T$ is the sample vector of the software module under test; $K(T_i, T)$ denotes the kernel function between the sample vectors $T_i$ and $T$; $g(T)$ denotes the defect flag of the software module under test; sgn maps $\sum_{i=1}^{M'+N} \lambda_i \, \mathrm{flag}_i \, K(T_i, T) + b$ to an integer, taking 1 when this expression is greater than 0 and 0 when it is less than or equal to 0; $T_i$ is the $i$-th sample vector in the updated software sample set; $\mathrm{flag}_i$ is the defect flag corresponding to the $i$-th sample vector in the updated software sample set; $\lambda_i$ denotes the weight of the $i$-th sample vector in the updated software sample set, obtained from the defect prediction model; $M'+N$ denotes the number of sample vectors in the updated software sample set; and $b$ is a constant.
CN201610964353.2A 2016-10-28 2016-10-28 Intelligent detection method and system of software defects Pending CN106528417A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610964353.2A CN106528417A (en) 2016-10-28 2016-10-28 Intelligent detection method and system of software defects

Publications (1)

Publication Number Publication Date
CN106528417A true CN106528417A (en) 2017-03-22

Family

ID=58326311

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610964353.2A Pending CN106528417A (en) 2016-10-28 2016-10-28 Intelligent detection method and system of software defects

Country Status (1)

Country Link
CN (1) CN106528417A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050071807A1 (en) * 2003-09-29 2005-03-31 Aura Yanavi Methods and systems for predicting software defects in an upcoming software release
CN103823753A (en) * 2014-01-22 2014-05-28 浙江大学 Webpage sampling method oriented at barrier-free webpage content detection
CN104899135A (en) * 2015-05-14 2015-09-09 工业和信息化部电子第五研究所 Software defect prediction method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
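Dai Xiang et al.: "Research on software defect prediction based on ensemble hybrid sampling", Computer Engineering & Science (《计算机工程与科学》) *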

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247662A (en) * 2017-05-10 2017-10-13 中国电子产品可靠性与环境试验研究所 software defect detection method and device
CN107247662B (en) * 2017-05-10 2019-10-18 中国电子产品可靠性与环境试验研究所 Software defect detection method and device
CN107391452A (en) * 2017-07-06 2017-11-24 武汉大学 A kind of software defect estimated number method based on data lack sampling and integrated study
CN107391452B (en) * 2017-07-06 2020-01-07 武汉大学 Software defect number prediction method based on data undersampling and ensemble learning
CN107391369B (en) * 2017-07-13 2020-03-24 武汉大学 Cross-project defect prediction method based on data screening and data oversampling
CN107391369A (en) * 2017-07-13 2017-11-24 武汉大学 A kind of spanned item mesh failure prediction method based on data screening and data oversampling
CN107391370A (en) * 2017-07-13 2017-11-24 武汉大学 A kind of software defect estimated number method based on data oversampling and integrated study
CN107391370B (en) * 2017-07-13 2020-05-12 武汉大学 Software defect number prediction method based on data oversampling and integrated learning
CN109597748A (en) * 2017-09-30 2019-04-09 北京国双科技有限公司 Aacode defect method for early warning and device
CN109242106A (en) * 2018-09-07 2019-01-18 百度在线网络技术(北京)有限公司 sample processing method, device, equipment and storage medium
CN109242106B (en) * 2018-09-07 2022-07-26 百度在线网络技术(北京)有限公司 Sample processing method, device, equipment and storage medium
CN109829483A (en) * 2019-01-07 2019-05-31 鲁班嫡系机器人(深圳)有限公司 Defect recognition model training method, device, computer equipment and storage medium
CN109829483B (en) * 2019-01-07 2021-05-18 鲁班嫡系机器人(深圳)有限公司 Defect recognition model training method and device, computer equipment and storage medium
CN111611177A (en) * 2020-06-29 2020-09-01 中国人民解放军国防科技大学 Software performance defect detection method based on configuration item performance expectation
CN111611177B (en) * 2020-06-29 2023-06-09 中国人民解放军国防科技大学 Software performance defect detection method based on configuration item performance expectation
CN116804668A (en) * 2023-08-23 2023-09-26 国盐检测(天津)有限责任公司 Salt iodine content detection data identification method and system
CN116804668B (en) * 2023-08-23 2023-11-21 国盐检测(天津)有限责任公司 Salt iodine content detection data identification method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170322

RJ01 Rejection of invention patent application after publication