CN104899135B - Software Defect Prediction Method and System - Google Patents
- Publication number
- CN104899135B (application CN201510247157.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The present invention relates to a software defect prediction method and system. Sample software modules are acquired and clustered to obtain cluster subsets. The Gaussian parameters of each cluster subset are calculated, pseudo-defect samples are generated according to the Gaussian parameters, and an updated defect sample set is obtained from the software defect sample set and the pseudo-defect samples. A defect prediction model is trained on the updated defect sample set, defect prediction is performed on the software module under test according to the model, and the prediction result is output. Because the updated defect sample set used for training contains additional defect data, the defect prediction model can estimate and fit defect data better, which improves the accuracy of the model and therefore the accuracy of software defect prediction.
Description
Technical Field
The invention relates to the technical field of software security, in particular to a software defect prediction method and a software defect prediction system.
Background
With the development of information technology, software keeps growing in complexity and scale. A good mechanism for controlling and predicting software quality helps enterprises develop high-quality software products and reduce production and maintenance costs, and it also matters for improving customer satisfaction, building a good corporate image, and strengthening market competitiveness. Software quality is therefore receiving more and more attention, and how to predict and improve it has become one of the current research hotspots.
The traditional software defect prediction method uses a prediction model based on machine learning. The model takes the measurement data vector of a software module as input and predicts whether the module has defects through preprocessing, feature extraction, model training, prediction, and similar steps. Because of inherent problems such as the model's performance evaluation criteria and inductive bias, defective and non-defective software modules are treated equally and the model aims at the highest overall prediction accuracy, yet the detection rate for software defects remains low. The traditional software defect prediction method therefore suffers from low prediction accuracy.
Disclosure of Invention
In view of the above, it is necessary to provide a software defect prediction method and system with high prediction accuracy.
A software defect prediction method comprises the following steps:
acquiring a sample software module and carrying out clustering processing to obtain a clustering subset;
calculating Gaussian parameters of the clustering subsets, and generating pseudo-defect samples according to the Gaussian parameters;
obtaining an updated defect sample set according to the software defect sample set and the pseudo defect sample;
training according to the updated defect sample set to obtain a defect prediction model;
and performing defect prediction on the software module to be tested according to the defect prediction model, and outputting a prediction result.
A software defect prediction system comprises:
the clustering module is used for acquiring the sample software module and carrying out clustering processing to obtain a clustering subset;
the calculation module is used for calculating the Gaussian parameters of the clustering subsets and generating pseudo-defect samples according to the Gaussian parameters;
the updating module is used for obtaining an updated defect sample set according to the software defect sample set and the pseudo defect sample;
the training module is used for training according to the updated defect sample set to obtain a defect prediction model;
and the prediction module is used for predicting the defects of the software module to be tested according to the defect prediction model and outputting a prediction result.
In the software defect prediction method and system, sample software modules are acquired and clustered to obtain cluster subsets. The Gaussian parameters of each cluster subset are obtained by Gaussian analysis, pseudo-defect samples are generated according to the Gaussian parameters, and an updated defect sample set is obtained from the software defect sample set and the pseudo-defect samples. A defect prediction model is trained on the updated defect sample set, defect prediction is performed on the software module under test according to the model, and the prediction result is output. Because the updated defect sample set contains more defect data, the defect prediction model can estimate and fit defect data better, which improves the accuracy of the model and the prediction accuracy for software defects.
Drawings
FIG. 1 is a flowchart of a software defect prediction method in one embodiment;
FIG. 2 is a flowchart of acquiring sample software modules and clustering them to obtain cluster subsets in one embodiment;
FIG. 3 is a flowchart of obtaining the center point of each sample vector, taking the sample vector as a starting point, in one embodiment;
FIG. 4 is a flowchart of calculating the Gaussian parameters of a cluster subset and generating pseudo-defect samples according to the Gaussian parameters in one embodiment;
FIG. 5 is a flowchart of performing defect prediction on a software module under test according to the defect prediction model in one embodiment;
FIG. 6 is a flowchart of a software defect prediction method in another embodiment;
FIG. 7 is a block diagram of a software defect prediction system in one embodiment;
FIG. 8 is a block diagram of a clustering module in one embodiment;
FIG. 9 is a block diagram of a center point calculation unit in one embodiment;
FIG. 10 is a block diagram of a calculation module in one embodiment;
FIG. 11 is a block diagram of a prediction module in one embodiment;
FIG. 12 is a block diagram of a software defect prediction system in another embodiment.
Detailed Description
A software defect prediction method, as shown in fig. 1, includes the following steps:
Step S110: acquire sample software modules and perform clustering to obtain cluster subsets. A sample software module is a software module for which it is already known whether a defect exists. Clustering classifies the sample software modules into cluster subsets. In one embodiment, as shown in fig. 2, step S110 includes steps S112 to S118.
Step S112: mark each sample software module to obtain its defect mark. For example, for the i-th sample software module, $i = 1, 2, \ldots, M$, the defect mark is $y_i = 1$ if the module has a defect and $y_i = 0$ if it does not. The marking scheme and the values of the defect marks are not unique; in other embodiments a defective sample software module may be marked 0 and a non-defective one marked 1, and so on.
Step S114: perform static measurement on each sample software module to obtain its sample vector. For the i-th sample software module, $i = 1, 2, \ldots, M$, the static measurement in this embodiment may include the Halstead metrics, the McCabe metrics, the Khoshgoftaar metrics, the CK metrics, and the like. It yields n metric values, denoted $x_{i1}, x_{i2}, \ldots, x_{in}$, which form the sample vector $x_i = \{x_{i1}, x_{i2}, \ldots, x_{in}\}$ of the sample software module. All sample vectors together constitute the software defect sample set $\{x_i \mid i = 1, 2, \ldots, M\}$.
Step S116: taking each sample vector as a starting point, obtain the center point of that sample vector. The center point calculated for each sample vector serves as the basis for clustering. In this embodiment the mean shift method is used for clustering, which requires little computation and speeds up the cluster analysis. As shown in fig. 3, step S116 includes steps S1162 to S1166.
Step S1162: taking the sample vector as a starting point, calculate the mean shift vector of the sample vector. Specifically:
$$M_h(x) = \frac{1}{K} \sum_{x_i \in S_h(x)} (x_i - x)$$
where $M_h$ denotes the mean shift vector of the sample vector $x$; $S_h(x)$ is the set of the $K$ sample vectors that lie in the high-dimensional spherical region of constant radius $h$, that is, that satisfy $(x - x_i)^T (x - x_i) < h^2$; $x_i$ is a sample vector in $S_h(x)$; and $T$ denotes transpose.
Step S1164: judge whether the magnitude of the mean shift vector of the sample vector is greater than a preset threshold. If so, take the sum of the sample vector and the mean shift vector as the new sample vector and return to step S1162; if not, go to step S1166. The threshold is set in advance and can be adjusted to the actual situation. That is, if the mean shift vector $M_h$ is greater than the preset threshold, $x_i + M_h$ is taken as the new starting point and a new mean shift vector $M_h$ is calculated.
Step S1166: take the sum of the sample vector and the mean shift vector as the center point of the sample vector. That is, if $M_h$ is less than or equal to the preset threshold, $x_i + M_h$ is confirmed as the center point.
Steps S1162 to S1166 are repeated for each sample vector until all sample vectors have been traversed, generating M center points.
Step S118: cluster the sample vectors according to their center points to obtain the cluster subsets. Sample vectors that tend to the same center point are grouped into one class, forming L cluster subsets.
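Steps S1162 to S118 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: it assumes the flat-kernel mean shift (the average of $x_i - x$ over the samples inside the sphere), compares the Euclidean norm of the shift with the threshold, and groups center points that coincide within a tolerance. The names `mean_shift_vector`, `find_center`, `cluster`, `threshold`, `merge_tol`, and `max_iter` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def _norm(v: Sequence[float]) -> float:
    return math.sqrt(sum(c * c for c in v))


def mean_shift_vector(x: Sequence[float], samples: Sequence[Sequence[float]],
                      h: float) -> Vector:
    """Average of (x_i - x) over the samples inside the sphere of radius h around x."""
    inside = [s for s in samples
              if sum((a - b) ** 2 for a, b in zip(x, s)) < h * h]
    if not inside:  # empty sphere: no shift
        return [0.0] * len(x)
    return [sum(s[d] - x[d] for s in inside) / len(inside) for d in range(len(x))]


def find_center(x: Sequence[float], samples: Sequence[Sequence[float]], h: float,
                threshold: float = 1e-6, max_iter: int = 1000) -> Vector:
    """Steps S1162 to S1166: keep shifting until the shift is within the threshold."""
    point = list(x)
    for _ in range(max_iter):  # max_iter is a safety cap, an assumption
        shift = mean_shift_vector(point, samples, h)
        point = [p + m for p, m in zip(point, shift)]
        if _norm(shift) <= threshold:
            break
    return point


def cluster(samples: Sequence[Sequence[float]], h: float,
            merge_tol: float = 1e-3) -> Tuple[List[Vector], List[List[int]]]:
    """Step S118: group sample indices whose center points coincide within merge_tol."""
    centers: List[Vector] = []
    subsets: List[List[int]] = []
    for idx, x in enumerate(samples):
        c = find_center(x, samples, h)
        for k, rep in enumerate(centers):
            if _norm([a - b for a, b in zip(c, rep)]) <= merge_tol:
                subsets[k].append(idx)
                break
        else:
            centers.append(c)
            subsets.append([idx])
    return centers, subsets
```

Calling `cluster(samples, h)` returns one representative center per cluster subset together with the indices of the sample vectors assigned to it; the radius `h` controls how many subsets result.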
Step S120: calculate the Gaussian parameters of the cluster subsets and generate pseudo-defect samples according to the Gaussian parameters. Constructing a Gaussian-mixture software defect distribution function describes the software defect distribution better. The Gaussian parameters are then calculated from the relation between the defect distribution and the sample vectors, which lays the foundation for generating pseudo-defects.
Gaussian parameter estimation is performed on each cluster subset. Assume the k-th cluster subset is $\{x_i^k \mid i = 1, 2, \ldots, M_k\}$, with $M_k$ samples. In one embodiment, the Gaussian parameters include the mean and the variance. As shown in fig. 4, step S120 includes steps S121 to S125.
Step S121: calculate the mean of the cluster subset. Specifically:
$$\mu_k = \frac{1}{M_k} \sum_{i=1}^{M_k} x_i^k = \{\mu_{k1}, \mu_{k2}, \ldots, \mu_{kn}\}$$
where $\mu_k$ denotes the mean, $M_k$ is the number of sample vectors $x_i^k$ in the cluster subset, and $\mu_{kn}$ denotes the metric value of the mean $\mu_k$ in the n-th dimension.
Step S122: calculate the variance of the cluster subset in each dimension. Specifically:
$$\sigma_{kj}^2 = \frac{1}{M_k} \sum_{i=1}^{M_k} \left(x_{ij}^k - \mu_{kj}\right)^2, \quad j = 1, 2, \ldots, n$$
where $\sigma_{kj}^2$ denotes the variance of the cluster subset in the j-th dimension, n is the number of dimensions, $x_{ij}^k$ denotes the metric value of the sample vector $x_i^k$ in the j-th dimension, and $\mu_{kj}$ denotes the metric value of the mean $\mu_k$ in the j-th dimension.
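The mean and per-dimension variance of steps S121 and S122 can be computed as in the sketch below. It assumes the variance is normalized by the number of samples $M_k$ (the maximum-likelihood estimate); the source formula is not legible enough to rule out the $M_k - 1$ variant. The function name `gaussian_parameters` is hypothetical.

```python
from typing import List, Sequence, Tuple


def gaussian_parameters(subset: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Return (mean, variance) of one cluster subset, one entry per dimension."""
    m_k = len(subset)        # number of sample vectors in the subset
    n = len(subset[0])       # number of metric dimensions
    mean = [sum(x[j] for x in subset) / m_k for j in range(n)]
    # Assumption: divide by m_k, not m_k - 1.
    variance = [sum((x[j] - mean[j]) ** 2 for x in subset) / m_k for j in range(n)]
    return mean, variance
```

For the two-sample subset `[[1.0, 2.0], [3.0, 6.0]]` the function returns the mean `[2.0, 4.0]` and the variance `[1.0, 4.0]`.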
Step S123: generate a random number for the cluster subset in each dimension according to the variance of the cluster subset in that dimension.
Specifically, for the j-th dimension, a random number $\Delta\lambda_{kj}$ is generated according to the Gaussian distribution $N(0, \sigma_{kj}^2)$. Concretely, 12 random variables $r_1, r_2, \ldots, r_{12}$ uniformly distributed on $[0, 1]$ are drawn, and then
$$\Delta\lambda_{kj} = \sigma_{kj} \left( \sum_{i=1}^{12} r_i - 6 \right)$$
The constants used in calculating the random number are not unique and can be adjusted to the actual situation.
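The twelve-uniform construction is a classic way to approximate a Gaussian draw: the sum of 12 independent uniform variables on [0, 1] has mean 6 and variance 1, so subtracting 6 and scaling by the standard deviation gives an approximately $N(0, \sigma^2)$ value. The sketch below assumes exactly that centering and scaling; the function name and the explicit `rng` argument are hypothetical.

```python
import random


def gaussian_random_number(sigma: float, rng: random.Random) -> float:
    """Approximate one draw from N(0, sigma^2) using 12 uniform draws on [0, 1)."""
    uniform_sum = sum(rng.random() for _ in range(12))  # mean 6, variance 1
    return sigma * (uniform_sum - 6.0)
```

One consequence of this construction is that the result is bounded: it can never lie more than six standard deviations from zero, which keeps generated values from straying arbitrarily far from the cluster mean.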
Step S124: obtain the random vector of the cluster subset from its random numbers in each dimension. After a random number has been calculated for every dimension according to step S123, the random vector is obtained:
$$\Delta\Lambda_k = \{\Delta\lambda_{k1}, \Delta\lambda_{k2}, \ldots, \Delta\lambda_{kn}\}$$
where $\Delta\Lambda_k$ is the random vector and $\Delta\lambda_{kn}$ is the random number of the cluster subset in the n-th dimension.
Step S125: obtain a pseudo-defect sample from the mean and the random vector of the cluster subset. The pseudo-defect sample is calculated as:
$$t = \mu_k + \Delta\Lambda_k$$
where $t$ is the pseudo-defect sample, $\mu_k$ is the mean of the cluster subset, and $\Delta\Lambda_k$ is the random vector.
Steps S121 to S125 are repeated for each cluster subset to obtain L pseudo-defect samples. A pseudo-defect sample represents the sample vector of a virtually obtained defective software module, and the defect mark corresponding to each pseudo-defect sample may be set to 0.
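Steps S123 to S125 combine into a single routine that perturbs the cluster mean dimension by dimension. The sketch below is an illustration under the same assumption as before (12 uniform draws, centered by subtracting 6, scaled by the standard deviation); `pseudo_defect_sample` is a hypothetical name.

```python
import random
from typing import List, Sequence


def pseudo_defect_sample(mean: Sequence[float], variance: Sequence[float],
                         rng: random.Random) -> List[float]:
    """Return one pseudo-defect sample: cluster mean plus a random vector."""
    sample = []
    for mu_j, var_j in zip(mean, variance):
        sigma_j = var_j ** 0.5
        # Random number for dimension j (assumed zero-mean 12-uniform approximation).
        delta_j = sigma_j * (sum(rng.random() for _ in range(12)) - 6.0)
        sample.append(mu_j + delta_j)
    return sample


# One pseudo-defect sample per cluster subset, as in the repeated steps S121 to S125.
rng = random.Random(42)
parameters = [([2.0, 4.0], [1.0, 4.0]), ([10.0, 0.5], [0.25, 0.0])]
pseudo_samples = [pseudo_defect_sample(mu, var, rng) for mu, var in parameters]
```

A dimension with zero variance is reproduced unchanged, so the second pseudo-sample above keeps the value 0.5 in its second dimension.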
Step S130: obtain an updated defect sample set from the software defect sample set and the pseudo-defect samples. Assume the set of pseudo-defect samples is $\{t_i \mid i = 1, 2, \ldots, L\}$, L pseudo-defect samples in total. The original software defect sample set $\{x_i \mid i = 1, 2, \ldots, M\}$ and the set of pseudo-defect samples $\{t_i \mid i = 1, 2, \ldots, L\}$ are merged to obtain the updated defect sample set $\{x'_i \mid i = 1, 2, \ldots, M + L\}$.
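The merge in step S130 amounts to appending the L pseudo-defect samples and their defect marks to the M original samples. In the sketch below the mark given to pseudo-defect samples is left as a parameter, because the marking convention is described as adjustable; `update_defect_sample_set` and `pseudo_label` are hypothetical names.

```python
from typing import List, Sequence, Tuple


def update_defect_sample_set(samples: Sequence[Sequence[float]], labels: Sequence[int],
                             pseudo_samples: Sequence[Sequence[float]],
                             pseudo_label: int) -> Tuple[List[List[float]], List[int]]:
    """Merge M original samples with L pseudo-defect samples into M + L samples."""
    updated_samples = [list(x) for x in samples] + [list(t) for t in pseudo_samples]
    updated_labels = list(labels) + [pseudo_label] * len(pseudo_samples)
    return updated_samples, updated_labels
```

The inputs are copied rather than modified, so the original software defect sample set stays available for comparison after the update.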
Step S140: train on the updated defect sample set to obtain a defect prediction model. Training on an updated defect sample set that contains more defect data improves the accuracy of the defect prediction model and lets the model estimate and fit defect data better.
In one embodiment, step S140 trains a defect prediction model based on a support vector machine on the updated defect sample set. Specifically:
$$\arg\max_{\lambda} \left[ \sum_{i=1}^{M+L} \lambda_i - \frac{1}{2} \sum_{i=1}^{M+L} \sum_{j=1}^{M+L} \lambda_i \lambda_j y_i y_j k(x_i, x_j) \right]$$
$$\text{S.T.} \quad \sum_{i=1}^{M+L} \lambda_i y_i = 0, \qquad 0 \le \lambda_i \le C, \quad i = 1, 2, \ldots, M+L$$
where $x_i$ and $x_j$ are the i-th and j-th sample vectors in the updated defect sample set, $y_i$ and $y_j$ are the defect marks corresponding to the i-th and j-th sample vectors, and $\lambda_i$ and $\lambda_j$ are the weights of the i-th and j-th sample vectors; S.T. denotes the constraint conditions, C is a constant, and M + L is the number of sample vectors in the updated defect sample set. In this embodiment, $k(x_i, x_j)$ denotes the dot product of $x_i$ and $x_j$; other embodiments may use other calculation methods.
$\arg\max_{\lambda}$ denotes taking the value of $\lambda$ at which the bracketed expression attains its maximum. The sample vectors $x_i$ and $x_j$ of the updated defect sample set are substituted into the expression, and the weight $\lambda_i$ of each sample vector $x_i$ at the maximum is determined, which finally gives the weights of all sample vectors in the updated defect sample set.
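To make the training objective concrete, the sketch below evaluates a candidate weight vector against the standard soft-margin support vector machine dual, which is assumed here to be the objective intended by the text, and checks its constraints. It does not solve the optimization; a real system would hand the problem to a quadratic-programming or SVM solver. The example uses labels +1 and -1, following common SVM practice, which is an assumption rather than the text's 1 and 0 marks. All function names are hypothetical.

```python
from typing import Callable, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


def dual_objective(lam: Sequence[float], y: Sequence[int],
                   x: Sequence[Sequence[float]],
                   kernel: Callable[[Sequence[float], Sequence[float]], float] = dot) -> float:
    """Sum of weights minus half the kernel-weighted quadratic term."""
    n = len(lam)
    quad = sum(lam[i] * lam[j] * y[i] * y[j] * kernel(x[i], x[j])
               for i in range(n) for j in range(n))
    return sum(lam) - 0.5 * quad


def is_feasible(lam: Sequence[float], y: Sequence[int], c: float,
                tol: float = 1e-9) -> bool:
    """Check sum(lam_i * y_i) == 0 and 0 <= lam_i <= C."""
    balanced = abs(sum(l * yi for l, yi in zip(lam, y))) <= tol
    boxed = all(-tol <= l <= c + tol for l in lam)
    return balanced and boxed


value = dual_objective([0.5, 0.5], [1, -1], [[1.0, 0.0], [0.0, 1.0]])
```

With the two orthogonal unit vectors above, `value` is 0.75 and the weights are feasible for `C = 1`.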
Step S150: perform defect prediction on the software module under test according to the defect prediction model and output the prediction result. The defect prediction model is applied to the unknown software module under test, and the prediction result is obtained and output to inform the staff that defect prediction for the module is complete.
In one embodiment, as shown in fig. 5, step S150 includes step S152 and step S154.
Step S152: perform static measurement on the software module under test to obtain its sample vector. The static measurement procedure is similar to step S114 and is not repeated here.
Step S154: perform defect prediction on the software module under test according to its sample vector and the defect prediction model. Specifically:
$$g(x) = \operatorname{sgn}\left( \sum_{i=1}^{M+L} \lambda_i y_i K(x_i, x) + b \right)$$
where $g(x)$ denotes the defect mark of the software module under test, and sgn denotes conversion of its argument to an integer variable: the result is 1 when the argument is greater than 0 and 0 when the argument is less than or equal to 0. $x_i$ is the i-th sample vector in the updated defect sample set, $y_i$ is the defect mark corresponding to the i-th sample vector, $\lambda_i$ is the weight of the i-th sample vector obtained by the defect prediction model, M + L is the number of sample vectors in the updated defect sample set, $x$ is the sample vector of the software module under test, and $b$ is a constant. As before, in this embodiment $K(x_i, x)$ denotes the dot product of $x_i$ and $x$. The integer conversion in this embodiment corresponds to the convention of step S112, in which the defect mark is $y_i = 1$ if a defect exists and $y_i = 0$ if no defect exists, and it may be adjusted according to the definition of the defect mark.
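The decision rule of step S154 can be sketched as follows, using the dot product as the kernel and mapping a positive score to 1 and anything else to 0, as the text describes. The weights, labels, and bias in the usage lines are made-up illustration values (with +1 and -1 labels assumed for training), not results from the patent; `predict_defect` is a hypothetical name.

```python
from typing import Sequence


def predict_defect(x: Sequence[float], samples: Sequence[Sequence[float]],
                   labels: Sequence[int], weights: Sequence[float], b: float) -> int:
    """Return 1 if the weighted kernel sum plus b is greater than 0, else 0."""
    score = b
    for x_i, y_i, lam_i in zip(samples, labels, weights):
        kernel_value = sum(p * q for p, q in zip(x_i, x))  # dot-product kernel
        score += lam_i * y_i * kernel_value
    return 1 if score > 0 else 0


# Illustration values only.
train_x = [[1.0, 0.0], [0.0, 1.0]]
train_y = [1, -1]
lam = [0.5, 0.5]
flag_a = predict_defect([2.0, 0.0], train_x, train_y, lam, b=0.0)
flag_b = predict_defect([0.0, 2.0], train_x, train_y, lam, b=0.0)
```

Here `flag_a` is 1 and `flag_b` is 0. A score of exactly 0 also maps to 0, matching the "less than or equal to 0" branch of the rule.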
In practical applications, failing software modules account for a relatively small share of all software modules, because software is less likely to fail than to work normally. However, if these few failing modules are wrongly predicted to be defect-free and are put into practical use, the resulting economic and social losses can be immeasurable.
In the software defect prediction method, the sample software modules are clustered into cluster subsets, Gaussian analysis of each cluster subset yields its Gaussian parameters, and pseudo-defect samples are generated according to the Gaussian parameters. Because the updated defect sample set used for training contains more defect data, the defect prediction model can estimate and fit defect data better, which improves the accuracy of the model and the prediction accuracy for software defects.
In one embodiment, as shown in fig. 6, after step S150, the software defect prediction method further includes step S160.
Step S160: output alarm information when the software module under test has a defect. The alarm reminds the staff promptly and lets them identify the defective software module for subsequent repair, which makes operation more convenient. The alarm may be a sound-and-light alarm, a voice alarm, or a preset picture or text shown on a display screen.
The invention also provides a software defect prediction system, as shown in fig. 7, which comprises a clustering module 110, a calculating module 120, an updating module 130, a training module 140 and a prediction module 150.
The clustering module 110 is configured to acquire sample software modules and perform clustering to obtain cluster subsets. A sample software module is a software module for which it is already known whether a defect exists. Clustering classifies the sample software modules into cluster subsets. In one embodiment, as shown in FIG. 8, the clustering module 110 includes a marking unit 112, a measurement unit 114, a center point calculation unit 116, and a clustering unit 118.
The marking unit 112 is configured to mark each sample software module to obtain its defect mark. For example, for the i-th sample software module, $i = 1, 2, \ldots, M$, the defect mark is $y_i = 1$ if the module has a defect and $y_i = 0$ if it does not. The marking scheme and the values of the defect marks are not unique; in other embodiments a defective sample software module may be marked 0 and a non-defective one marked 1, and so on.
The measurement unit 114 is configured to perform static measurement on each sample software module to obtain its sample vector. For the i-th sample software module, $i = 1, 2, \ldots, M$, the static measurement in this embodiment may include the Halstead metrics, the McCabe metrics, the Khoshgoftaar metrics, the CK metrics, and the like. It yields n metric values, denoted $x_{i1}, x_{i2}, \ldots, x_{in}$, which form the sample vector $x_i = \{x_{i1}, x_{i2}, \ldots, x_{in}\}$ of the sample software module. All sample vectors together constitute the software defect sample set $\{x_i \mid i = 1, 2, \ldots, M\}$.
The center point calculation unit 116 is configured to obtain the center point of each sample vector, taking that sample vector as a starting point. The center point calculated for each sample vector serves as the basis for clustering. In this embodiment the mean shift method is used for clustering, which requires little computation and speeds up the cluster analysis. As shown in fig. 9, the center point calculation unit 116 includes a first unit 1162, a second unit 1164, and a third unit 1166.
The first unit 1162 is configured to calculate the mean shift vector of the sample vector, taking the sample vector as a starting point. Specifically:
$$M_h(x) = \frac{1}{K} \sum_{x_i \in S_h(x)} (x_i - x)$$
where $M_h$ denotes the mean shift vector of the sample vector $x$; $S_h(x)$ is the set of the $K$ sample vectors that lie in the high-dimensional spherical region of constant radius $h$, that is, that satisfy $(x - x_i)^T (x - x_i) < h^2$; $x_i$ is a sample vector in $S_h(x)$; and $T$ denotes transpose.
The second unit 1164 is configured to judge whether the magnitude of the mean shift vector of the sample vector is greater than a preset threshold. The threshold is set in advance and can be adjusted to the actual situation.
The third unit 1166 is configured to, when the mean shift vector of the sample vector is greater than the preset threshold, take the sum of the sample vector and the mean shift vector as the new sample vector and control the first unit 1162 to calculate the mean shift vector again; and, when the mean shift vector of the sample vector is less than or equal to the preset threshold, take the sum of the sample vector and the mean shift vector as the center point of the sample vector.
That is, if the mean shift vector $M_h$ is greater than the preset threshold, $x_i + M_h$ is taken as the new starting point and the first unit 1162 is controlled to calculate a new mean shift vector $M_h$. If $M_h$ is less than or equal to the preset threshold, $x_i + M_h$ is confirmed as the center point.
The calculation is repeated for each sample vector until all sample vectors have been traversed, generating M center points.
The clustering unit 118 is configured to cluster the sample vectors according to their center points to obtain the cluster subsets. Sample vectors that tend to the same center point are grouped into one class, forming L cluster subsets.
The calculation module 120 is configured to calculate the Gaussian parameters of the cluster subsets and generate pseudo-defect samples according to the Gaussian parameters. Constructing a Gaussian-mixture software defect distribution function describes the software defect distribution better. The Gaussian parameters are then calculated from the relation between the defect distribution and the sample vectors, which lays the foundation for generating pseudo-defects.
Gaussian parameter estimation is performed on each cluster subset. Assume the k-th cluster subset is $\{x_i^k \mid i = 1, 2, \ldots, M_k\}$, with $M_k$ samples. In one embodiment, the Gaussian parameters include the mean and the variance. As shown in fig. 10, the calculation module 120 includes a mean calculation unit 121, a variance calculation unit 122, a random number generation unit 123, a random vector generation unit 124, and a pseudo-defect sample generation unit 125.
The mean calculation unit 121 is configured to calculate the mean of the cluster subset. Specifically:
$$\mu_k = \frac{1}{M_k} \sum_{i=1}^{M_k} x_i^k = \{\mu_{k1}, \mu_{k2}, \ldots, \mu_{kn}\}$$
where $\mu_k$ denotes the mean, $M_k$ is the number of sample vectors $x_i^k$ in the cluster subset, and $\mu_{kn}$ denotes the metric value of the mean $\mu_k$ in the n-th dimension.
The variance calculation unit 122 is configured to calculate the variance of the cluster subset in each dimension. Specifically:
$$\sigma_{kj}^2 = \frac{1}{M_k} \sum_{i=1}^{M_k} \left(x_{ij}^k - \mu_{kj}\right)^2, \quad j = 1, 2, \ldots, n$$
where $\sigma_{kj}^2$ denotes the variance of the cluster subset in the j-th dimension, n is the number of dimensions, $x_{ij}^k$ denotes the metric value of the sample vector $x_i^k$ in the j-th dimension, and $\mu_{kj}$ denotes the metric value of the mean $\mu_k$ in the j-th dimension.
The random number generation unit 123 is configured to generate a random number for the cluster subset in each dimension according to the variance of the cluster subset in that dimension.
Specifically, for the j-th dimension, a random number $\Delta\lambda_{kj}$ is generated according to the Gaussian distribution $N(0, \sigma_{kj}^2)$. Concretely, 12 random variables $r_1, r_2, \ldots, r_{12}$ uniformly distributed on $[0, 1]$ are drawn, and then
$$\Delta\lambda_{kj} = \sigma_{kj} \left( \sum_{i=1}^{12} r_i - 6 \right)$$
The constants used in calculating the random number are not unique and can be adjusted to the actual situation.
The random vector generation unit 124 is configured to obtain the random vector of the cluster subset from its random numbers in each dimension. After a random number has been calculated for every dimension, the random vector is obtained:
$$\Delta\Lambda_k = \{\Delta\lambda_{k1}, \Delta\lambda_{k2}, \ldots, \Delta\lambda_{kn}\}$$
where $\Delta\Lambda_k$ is the random vector and $\Delta\lambda_{kn}$ is the random number of the cluster subset in the n-th dimension.
The pseudo-defect sample generation unit 125 is configured to obtain a pseudo-defect sample from the mean and the random vector of the cluster subset. Specifically:
$$t = \mu_k + \Delta\Lambda_k$$
where $t$ is the pseudo-defect sample, $\mu_k$ is the mean of the cluster subset, and $\Delta\Lambda_k$ is the random vector.
The calculation is repeated for each cluster subset to obtain L pseudo-defect samples. A pseudo-defect sample represents the sample vector of a virtually obtained defective software module, and the defect mark corresponding to each pseudo-defect sample may be set to 0.
The updating module 130 is configured to obtain an updated defect sample set from the software defect sample set and the pseudo-defect samples. Assume the set of pseudo-defect samples is $\{t_i \mid i = 1, 2, \ldots, L\}$, L pseudo-defect samples in total. The original software defect sample set $\{x_i \mid i = 1, 2, \ldots, M\}$ and the set of pseudo-defect samples $\{t_i \mid i = 1, 2, \ldots, L\}$ are merged to obtain the updated defect sample set $\{x'_i \mid i = 1, 2, \ldots, M + L\}$.
The training module 140 is configured to train on the updated defect sample set to obtain a defect prediction model. Training on an updated defect sample set that contains more defect data improves the accuracy of the defect prediction model and lets the model estimate and fit defect data better.
In one embodiment, the training module 140 trains a defect prediction model based on a support vector machine on the updated defect sample set. Specifically:
$$\arg\max_{\lambda} \left[ \sum_{i=1}^{M+L} \lambda_i - \frac{1}{2} \sum_{i=1}^{M+L} \sum_{j=1}^{M+L} \lambda_i \lambda_j y_i y_j k(x_i, x_j) \right]$$
$$\text{S.T.} \quad \sum_{i=1}^{M+L} \lambda_i y_i = 0, \qquad 0 \le \lambda_i \le C, \quad i = 1, 2, \ldots, M+L$$
where $x_i$ and $x_j$ are the i-th and j-th sample vectors in the updated defect sample set, $y_i$ and $y_j$ are the defect marks corresponding to the i-th and j-th sample vectors, and $\lambda_i$ and $\lambda_j$ are the weights of the i-th and j-th sample vectors; S.T. denotes the constraint conditions, C is a constant, and M + L is the number of sample vectors in the updated defect sample set. In this embodiment, $k(x_i, x_j)$ denotes the dot product of $x_i$ and $x_j$; other embodiments may use other calculation methods.
$\arg\max_{\lambda}$ denotes taking the value of $\lambda$ at which the bracketed expression attains its maximum. The sample vectors $x_i$ and $x_j$ of the updated defect sample set are substituted into the expression, and the weight $\lambda_i$ of each sample vector $x_i$ at the maximum is determined, which finally gives the weights of all sample vectors in the updated defect sample set.
The prediction module 150 is configured to perform defect prediction on the software module under test according to the defect prediction model and output the prediction result. The defect prediction model is applied to the unknown software module under test, and the prediction result is obtained and output to inform the staff that defect prediction for the module is complete.
In one embodiment, as shown in FIG. 11, prediction module 150 includes a processing unit 152 and a prediction unit 154.
The processing unit 152 is configured to perform static measurement on the software module to be tested, so as to obtain a sample vector of the software module to be tested. The specific process of performing static measurement on the software module to be tested is similar to the operation principle of the measurement unit 114, and is not described herein again.
The prediction unit 154 is configured to perform defect prediction on the software module to be tested according to the sample vector of the software module to be tested and the defect prediction model, specifically:
$$g(x)=\operatorname{sgn}\left(\sum_{i=1}^{M+L}\lambda_i y_i\,K(x_i,x)+b\right)$$
where g(x) represents the defect mark of the software module to be tested; sgn denotes converting its argument $\sum_{i=1}^{M+L}\lambda_i y_i K(x_i,x)+b$ to an integer value, taking 1 when the argument is greater than 0 and 0 when it is less than or equal to 0; $x_i$ is the i-th sample vector in the updated defect sample set; $y_i$ is the defect mark corresponding to the i-th sample vector in the updated defect sample set; $\lambda_i$ is the weight of the i-th sample vector in the updated defect sample set, obtained from the defect prediction model; M + L is the number of sample vectors in the updated defect sample set; x is the sample vector of the software module to be tested; and b is a constant. As before, in this embodiment $K(x_i,x)$ denotes the dot product of $x_i$ and x. The integer conversion in this embodiment matches the marking scheme of the marking unit 112, in which the defect mark is $y_i=1$ if a defect is present and $y_i=0$ if no defect is present; it may be adjusted according to how the defect mark is defined.
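The decision rule above can be sketched as follows: the weighted kernel sum plus the constant b is mapped to 1 when positive and to 0 otherwise. The names `dot` and `predict_defect` are hypothetical, and how the weights and b are obtained is outside this sketch.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Kernel K(x_i, x): the plain dot product used in this embodiment."""
    return sum(p * q for p, q in zip(a, b))


def predict_defect(
    x: Vector,
    train_xs: Sequence[Vector],
    train_ys: Sequence[float],
    lambdas: Sequence[float],
    b: float,
    kernel: Callable[[Vector, Vector], float] = dot,
) -> int:
    """Return 1 (defect) if sum_i lambda_i y_i K(x_i, x) + b > 0, else 0."""
    score = sum(
        lam * y * kernel(xi, x)
        for lam, y, xi in zip(lambdas, train_ys, train_xs)
    ) + b
    # Integer conversion as described in the text: 1 when > 0, 0 when <= 0.
    return 1 if score > 0 else 0
```

As a usage illustration with assumed values, training vectors (2, 0) and (0, 0), training marks +1 and -1 (the usual support vector machine convention, assumed here), weights (0.5, 0.5) and b = -1 classify (3, 0) as defective and (0.5, 0) as defect-free; a score of exactly 0 maps to 0.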
In practical applications, defective software modules make up a relatively small share of all software modules, because the probability that software fails is lower than the probability that it runs normally. However, if these few defective modules are wrongly predicted to be defect-free and are put into practical use, the resulting economic and social losses can be immeasurable.
In the software defect prediction system described above, the sample software modules are clustered into clustering subsets, Gaussian analysis is performed on each clustering subset to obtain its Gaussian parameters, and pseudo-defect samples are then generated from those parameters. Adding this extra defect data produces an updated defect sample set for training, so the defect prediction model can better estimate and fit the defect data, which improves the accuracy of the model and therefore of software defect prediction.
In one embodiment, as shown in FIG. 12, the software defect prediction system may further include an alarm module 160.
The alarm module 160 is configured to output alarm information when the software module to be tested has a defect. The alarm information alerts the staff so that they can promptly identify the defective software module and carry out subsequent repair, which improves operating convenience. Specifically, the alarm may be given by sound and light, by voice, or by displaying preset pictures or text on a display screen.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (8)
1. A software defect prediction method is characterized by comprising the following steps:
acquiring a sample software module and carrying out clustering processing to obtain a clustering subset;
calculating Gaussian parameters of the clustering subsets, and generating pseudo-defect samples according to the Gaussian parameters;
obtaining an updated defect sample set according to the software defect sample set and the pseudo defect sample; the software defect sample set is obtained by carrying out static measurement on a sample software module;
training according to the updated defect sample set to obtain a defect prediction model;
performing defect prediction on the software module to be tested according to the defect prediction model, and outputting a prediction result;
the Gaussian parameters comprise a mean and a variance; the step of calculating Gaussian parameters of the clustering subsets and generating pseudo-defect samples according to the Gaussian parameters comprises the following steps:
calculating the mean value of the clustering subset, specifically:
<mrow> <msup> <mi>&mu;</mi> <mi>k</mi> </msup> <mo>=</mo> <mfrac> <mn>1</mn> <msub> <mi>M</mi> <mi>k</mi> </msub> </mfrac> <munderover> <mo>&Sigma;</mo> <mrow> <mi>i</mi> <mo>=</mo> <mn>1</mn> </mrow> <msub> <mi>M</mi> <mi>k</mi> </msub> </munderover> <msubsup> <mi>x</mi> <mi>i</mi> <mi>k</mi> </msubsup> <mo>=</mo> <mo>{</mo> <msubsup> <mi>&mu;</mi> <mn>1</mn> <mi>k</mi> </msubsup> <mo>,</mo> <msubsup> <mi>&mu;</mi> <mn>2</mn> <mi>k</mi> </msubsup> <mo>,</mo> <mo>...</mo> <mo>,</mo> <msubsup> <mi>&mu;</mi> <mi>n</mi> <mi>k</mi> </msubsup> <mo>}</mo> </mrow>
wherein $\mu^k$ denotes the mean, $M_k$ is the number of sample vectors $x_i^k$ in the clustering subset, and $\mu_n^k$ represents the metric value of the mean $\mu^k$ in the n-th dimension;
calculating the variance of the clustering subset in each dimension, specifically:
<mrow> <msubsup> <mi>&sigma;</mi> <mi>j</mi> <mi>k</mi> </msubsup> <mo>=</mo> <mfrac> <mn>1</mn> <msub> <mi>M</mi> <mi>k</mi> </msub> </mfrac> <munderover> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>=</mo> <mn>1</mn> </mrow> <msub> <mi>M</mi> <mi>k</mi> </msub> </munderover> <msup> <mrow> <mo>(</mo> <msubsup> <mi>x</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> <mi>k</mi> </msubsup> <mo>-</mo> <msubsup> <mi>&mu;</mi> <mi>j</mi> <mi>k</mi> </msubsup> <mo>)</mo> </mrow> <mn>2</mn> </msup> <mo>,</mo> <mi>j</mi> <mo>=</mo> <mn>1</mn> <mo>,</mo> <mn>2</mn> <mo>,</mo> <mn>...</mn> <mo>,</mo> <mi>n</mi> </mrow>
wherein $\sigma_j^k$ represents the variance of the clustering subset in the j-th dimension, n is the number of dimensions, $x_{ij}^k$ represents the metric value of the sample vector $x_i^k$ in the j-th dimension, and $\mu_j^k$ represents the metric value of the mean $\mu^k$ in the j-th dimension;
according to the variance of the clustering subset in each dimension, correspondingly generating a random number of the clustering subset in each dimension, specifically: for dimension j, generating the random number $\Delta\lambda_j^k$ according to a Gaussian distribution determined by the variance $\sigma_j^k$, the random number being computed from 12 random variables uniformly distributed on $[0,1]$;
obtaining a random vector of the clustering subset according to the random numbers of the clustering subset in each dimension, specifically:
<mrow> <msup> <mi>&Delta;&Lambda;</mi> <mi>k</mi> </msup> <mo>=</mo> <mo>{</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mn>1</mn> <mi>k</mi> </msubsup> <mo>,</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mn>2</mn> <mi>k</mi> </msubsup> <mo>,</mo> <mo>...</mo> <mo>,</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mi>n</mi> <mi>k</mi> </msubsup> <mo>}</mo> </mrow>
wherein $\Delta\Lambda^k$ is the random vector, and $\Delta\lambda_n^k$ is the random number of the clustering subset in the n-th dimension;
obtaining a pseudo-defect sample according to the mean value and the random vector of the clustering subset, specifically:
<mrow> <mi>t</mi> <mo>=</mo> <msup> <mi>&mu;</mi> <mi>k</mi> </msup> <mo>+</mo> <msup> <mi>&Delta;&Lambda;</mi> <mi>k</mi> </msup> <mo>=</mo> <mo>{</mo> <msubsup> <mi>&mu;</mi> <mn>1</mn> <mi>k</mi> </msubsup> <mo>+</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mn>1</mn> <mi>k</mi> </msubsup> <mo>,</mo> <msubsup> <mi>&mu;</mi> <mn>2</mn> <mi>k</mi> </msubsup> <mo>+</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mn>2</mn> <mi>k</mi> </msubsup> <mo>,</mo> <mo>...</mo> <mo>,</mo> <msubsup> <mi>&mu;</mi> <mi>n</mi> <mi>k</mi> </msubsup> <mo>+</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mi>n</mi> <mi>k</mi> </msubsup> <mo>}</mo> </mrow>
where t is a pseudo-defect sample, $\mu^k$ is the mean of the clustering subset, and $\Delta\Lambda^k$ is the random vector.
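The pseudo-defect sample generation of claim 1 can be sketched as follows: per-dimension mean, per-dimension variance with the 1/M_k normalization, one Gaussian random number per dimension, and their sum with the mean. How the 12 uniform variables are combined is an assumption: the sketch uses the classical approximation in which the sum of twelve uniform [0,1] variables minus 6 is roughly standard normal, scaled by the square root of the variance. All function names are hypothetical.

```python
import math
import random
from typing import List, Sequence

Vector = Sequence[float]


def cluster_mean(cluster: Sequence[Vector]) -> List[float]:
    """Mean of the clustering subset in each dimension."""
    m = len(cluster)
    n = len(cluster[0])
    return [sum(v[j] for v in cluster) / m for j in range(n)]


def cluster_variance(cluster: Sequence[Vector], mean: Vector) -> List[float]:
    """Variance of the clustering subset in each dimension (divides by M_k)."""
    m = len(cluster)
    return [
        sum((v[j] - mean[j]) ** 2 for v in cluster) / m
        for j in range(len(mean))
    ]


def gaussian_from_uniforms(variance: float, rng: random.Random) -> float:
    """Zero-mean random number with the given variance.

    Assumption: the sum of 12 U(0,1) draws has mean 6 and variance 1,
    so (sum - 6) approximates a standard normal variable.
    """
    z = sum(rng.random() for _ in range(12)) - 6.0
    return math.sqrt(variance) * z


def pseudo_defect_sample(cluster: Sequence[Vector], rng: random.Random) -> List[float]:
    """t = mean + random vector, computed dimension by dimension."""
    mean = cluster_mean(cluster)
    variance = cluster_variance(cluster, mean)
    delta = [gaussian_from_uniforms(v, rng) for v in variance]
    return [m + d for m, d in zip(mean, delta)]
```

Calling `pseudo_defect_sample(cluster, random.Random(seed))` repeatedly yields as many pseudo-defect samples as needed for one clustering subset. A dimension whose variance is zero stays exactly at the cluster mean, and the twelve-uniform approximation bounds every offset to six standard deviations.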
2. The software defect prediction method of claim 1, wherein the step of obtaining sample software modules and performing clustering to obtain a cluster subset comprises the steps of:
respectively marking the sample software modules to obtain defect marks of the sample software modules;
respectively carrying out static measurement on the sample software modules to obtain sample vectors of all the sample software modules;
respectively taking each sample vector as a starting point, and acquiring a central point of the sample vector;
and clustering the sample vectors according to the central points of the sample vectors to obtain a clustering subset.
3. The software defect prediction method of claim 2, wherein the step of calculating the center point of the sample vector using each sample vector as a starting point comprises the steps of:
taking the sample vector as a starting point, calculating a meanshift vector of the sample vector, specifically:
<mrow> <msub> <mi>M</mi> <mi>h</mi> </msub> <mo>=</mo> <mfrac> <mn>1</mn> <mi>K</mi> </mfrac> <munder> <mo>&Sigma;</mo> <mrow> <msub> <mi>x</mi> <mi>i</mi> </msub> <mo>&Element;</mo> <msub> <mi>S</mi> <mi>h</mi> </msub> <mrow> <mo>(</mo> <mi>x</mi> <mo>)</mo> </mrow> </mrow> </munder> <mrow> <mo>(</mo> <msub> <mi>x</mi> <mi>i</mi> </msub> <mo>-</mo> <mi>x</mi> <mo>)</mo> </mrow> </mrow>
wherein $M_h$ represents the mean shift vector of the sample vector x; $S_h(x)$ is the set of K sample vectors that lie in the high-dimensional spherical region of radius h, h being a constant, and satisfy the relation $(x-x_i)^T(x-x_i)<h^2$; $x_i$ is a sample vector in $S_h(x)$; and T denotes transpose;
judging whether the meanshift vector of the sample vector is larger than a preset threshold value or not;
if not, taking the sum of the sample vector and the meanshift vector as the central point of the sample vector;
and if so, taking the sum of the sample vector and the meanshift vector as a new sample vector, and returning to the step of calculating the meanshift vector of the sample vector by taking the sample vector as a starting point.
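The iteration in claim 3 can be sketched as below. Two points are assumptions made for illustration: the comparison with the preset threshold is applied to the Euclidean length of the mean shift vector, and an iteration cap `max_iter` is added as a safeguard. The function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_shift_vector(x: Vector, samples: Sequence[Vector], h: float) -> List[float]:
    """M_h = (1/K) * sum of (x_i - x) over samples with (x - x_i)^T (x - x_i) < h^2."""
    neighbours = [
        s for s in samples
        if sum((a - b) ** 2 for a, b in zip(x, s)) < h * h
    ]
    k = len(neighbours)
    if k == 0:
        # No sample inside the sphere: nothing to shift towards.
        return [0.0] * len(x)
    return [sum(s[j] - x[j] for s in neighbours) / k for j in range(len(x))]


def find_center(
    start: Vector,
    samples: Sequence[Vector],
    h: float,
    threshold: float = 1e-6,
    max_iter: int = 1000,
) -> List[float]:
    """Shift the starting vector until the mean shift vector is small enough."""
    x = list(start)
    for _ in range(max_iter):
        shift = mean_shift_vector(x, samples, h)
        # In both branches of the claim the next point is x + M_h.
        x = [a + d for a, d in zip(x, shift)]
        if math.sqrt(sum(d * d for d in shift)) <= threshold:
            break  # not larger than the threshold: x is the center point
    return x
```

Running `find_center` once per sample vector gives the center point of each sample vector; sample vectors whose center points coincide, or nearly coincide, can then be grouped into one clustering subset.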
4. The software defect prediction method of claim 1, wherein the step of training to obtain a defect prediction model according to the updated defect sample set specifically comprises:
<mfenced open = "" close = ""> <mtable> <mtr> <mtd> <mrow> <munder> <mi>max</mi> <mi>&lambda;</mi> </munder> <mrow> <mo>(</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>=</mo> <mn>1</mn> </mrow> <mrow> <mi>M</mi> <mo>+</mo> <mi>L</mi> </mrow> </munderover> <msub> <mi>&lambda;</mi> <mi>i</mi> </msub> <mo>-</mo> <mfrac> <mn>1</mn> <mn>2</mn> </mfrac> <munder> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>,</mo> <mi>j</mi> </mrow> </munder> <msub> <mi>&lambda;</mi> <mi>i</mi> </msub> <msub> <mi>&lambda;</mi> <mi>j</mi> </msub> <msub> <mi>y</mi> <mi>i</mi> </msub> <msub> <mi>y</mi> <mi>j</mi> </msub> <mi>k</mi> <mo>(</mo> <mrow> <msub> <mi>x</mi> <mi>i</mi> </msub> <mo>,</mo> <msub> <mi>x</mi> <mi>j</mi> </msub> </mrow> <mo>)</mo> <mo>)</mo> </mrow> </mrow> </mtd> <mtd> <mrow> <mi>s</mi> <mo>.</mo> <mi>t</mi> <mo>.</mo> <mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <mrow> <mn>0</mn> <mo>&le;</mo> <msub> <mi>&lambda;</mi> <mi>i</mi> </msub> <mo>&le;</mo> <mi>C</mi> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <mi>i</mi> <mo>=</mo> <mn>1</mn> <mo>,</mo> <mn>2......</mn> <mi>M</mi> <mo>+</mo> <mi>L</mi> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <munderover> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>=</mo> <mn>1</mn> </mrow> <mrow> <mi>M</mi> <mo>+</mo> <mi>L</mi> </mrow> </munderover> <msub> <mi>&lambda;</mi> <mi>i</mi> </msub> <msub> <mi>y</mi> <mi>i</mi> </msub> <mo>=</mo> <mn>0</mn> </mrow> </mtd> </mtr> </mtable> </mfenced> </mrow> </mtd> </mtr> </mtable> </mfenced>
wherein $\max_{\lambda}(\cdot)$ denotes taking the value of $\lambda$ at which the bracketed expression attains its maximum;
$x_i$, $x_j$ are respectively the i-th and j-th sample vectors in the updated defect sample set; $y_i$, $y_j$ are respectively the defect marks corresponding to the i-th and j-th sample vectors in the updated defect sample set; $\lambda_i$, $\lambda_j$ respectively represent the weights of the i-th and j-th sample vectors in the updated defect sample set; s.t. represents the constraint conditions, C is a constant, and M + L represents the number of sample vectors in the updated defect sample set; M is the number of software defect samples, L is the number of pseudo-defect samples, and $k(x_i,x_j)$ represents the dot product of samples $x_i$ and $x_j$.
5. The software defect prediction method of claim 1, wherein the step of performing defect prediction on the software module to be tested according to the defect prediction model comprises:
performing static measurement on the software module to be measured to obtain a sample vector of the software module to be measured;
performing defect prediction on the software module to be tested according to the sample vector of the software module to be tested and a defect prediction model, specifically:
<mrow> <mi>g</mi> <mrow> <mo>(</mo> <mi>x</mi> <mo>)</mo> </mrow> <mo>=</mo> <mi>sgn</mi> <mrow> <mo>(</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>=</mo> <mn>1</mn> </mrow> <mrow> <mi>M</mi> <mo>+</mo> <mi>L</mi> </mrow> </munderover> <msub> <mi>&lambda;</mi> <mi>i</mi> </msub> <msub> <mi>y</mi> <mi>i</mi> </msub> <mi>K</mi> <mo>(</mo> <mrow> <msub> <mi>x</mi> <mi>i</mi> </msub> <mo>,</mo> <mi>x</mi> </mrow> <mo>)</mo> <mo>+</mo> <mi>b</mi> <mo>)</mo> </mrow> </mrow>
wherein g(x) represents the defect mark of the software module to be tested; sgn denotes converting its argument $\sum_{i=1}^{M+L}\lambda_i y_i K(x_i,x)+b$ to an integer value, taking 1 when the argument is greater than 0 and 0 when it is less than or equal to 0; $x_i$ is the i-th sample vector in the updated defect sample set; $y_i$ is the defect mark corresponding to the i-th sample vector in the updated defect sample set; $\lambda_i$ represents the weight of the i-th sample vector in the updated defect sample set, obtained from the defect prediction model; M + L represents the number of sample vectors in the updated defect sample set; x is the sample vector of the software module to be tested; b is a constant; M is the number of software defect samples, L is the number of pseudo-defect samples, and $K(x_i,x)$ represents the dot product of samples $x_i$ and x.
6. The software defect prediction method of claim 1, further comprising a step of outputting alarm information when the software module to be tested has a defect after the step of performing defect prediction on the software module to be tested according to the defect prediction model.
7. A software defect prediction system, comprising:
the clustering module is used for acquiring the sample software module and carrying out clustering processing to obtain a clustering subset;
the calculation module is used for calculating the Gaussian parameters of the clustering subsets and generating pseudo-defect samples according to the Gaussian parameters;
the updating module is used for obtaining an updated defect sample set according to the software defect sample set and the pseudo defect sample; the software defect sample set is obtained by carrying out static measurement on a sample software module;
the training module is used for training according to the updated defect sample set to obtain a defect prediction model;
the prediction module is used for predicting the defects of the software module to be tested according to the defect prediction model and outputting a prediction result;
the Gaussian parameters comprise a mean and a variance; the calculation module comprises:
the mean value calculating unit is used for calculating a mean value of the clustering subset, and specifically comprises the following steps:
<mrow> <msup> <mi>&mu;</mi> <mi>k</mi> </msup> <mo>=</mo> <mfrac> <mn>1</mn> <msub> <mi>M</mi> <mi>k</mi> </msub> </mfrac> <munderover> <mo>&Sigma;</mo> <mrow> <mi>i</mi> <mo>=</mo> <mn>1</mn> </mrow> <msub> <mi>M</mi> <mi>k</mi> </msub> </munderover> <msubsup> <mi>x</mi> <mi>i</mi> <mi>k</mi> </msubsup> <mo>=</mo> <mo>{</mo> <msubsup> <mi>&mu;</mi> <mn>1</mn> <mi>k</mi> </msubsup> <mo>,</mo> <msubsup> <mi>&mu;</mi> <mn>2</mn> <mi>k</mi> </msubsup> <mo>,</mo> <mo>...</mo> <mo>,</mo> <msubsup> <mi>&mu;</mi> <mi>n</mi> <mi>k</mi> </msubsup> <mo>}</mo> </mrow>
wherein $\mu^k$ denotes the mean, $M_k$ is the number of sample vectors $x_i^k$ in the clustering subset, and $\mu_n^k$ represents the metric value of the mean $\mu^k$ in the n-th dimension;
the variance calculating unit is used for calculating the variance of the clustering subset in each dimension, and specifically comprises the following steps:
<mrow> <msubsup> <mi>&sigma;</mi> <mi>j</mi> <mi>k</mi> </msubsup> <mo>=</mo> <mfrac> <mn>1</mn> <msub> <mi>M</mi> <mi>k</mi> </msub> </mfrac> <munderover> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>=</mo> <mn>1</mn> </mrow> <msub> <mi>M</mi> <mi>k</mi> </msub> </munderover> <msup> <mrow> <mo>(</mo> <msubsup> <mi>x</mi> <mrow> <mi>i</mi> <mi>j</mi> </mrow> <mi>k</mi> </msubsup> <mo>-</mo> <msubsup> <mi>&mu;</mi> <mi>j</mi> <mi>k</mi> </msubsup> <mo>)</mo> </mrow> <mn>2</mn> </msup> <mo>,</mo> <mi>j</mi> <mo>=</mo> <mn>1</mn> <mo>,</mo> <mn>2</mn> <mo>,</mo> <mn>...</mn> <mo>,</mo> <mi>n</mi> </mrow>
wherein $\sigma_j^k$ represents the variance of the clustering subset in the j-th dimension, n is the number of dimensions, $x_{ij}^k$ represents the metric value of the sample vector $x_i^k$ in the j-th dimension, and $\mu_j^k$ represents the metric value of the mean $\mu^k$ in the j-th dimension;
a random number generating unit, configured to correspondingly generate a random number of the clustering subset in each dimension according to the variance of the clustering subset in each dimension, specifically: for dimension j, generating the random number $\Delta\lambda_j^k$ according to a Gaussian distribution determined by the variance $\sigma_j^k$, the random number being computed from 12 random variables uniformly distributed on $[0,1]$;
a random vector generating unit, configured to obtain a random vector of the clustering subset according to the random numbers of the clustering subset in each dimension, specifically:
<mrow> <msup> <mi>&Delta;&Lambda;</mi> <mi>k</mi> </msup> <mo>=</mo> <mo>{</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mn>1</mn> <mi>k</mi> </msubsup> <mo>,</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mn>2</mn> <mi>k</mi> </msubsup> <mo>,</mo> <mo>...</mo> <mo>,</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mi>n</mi> <mi>k</mi> </msubsup> <mo>}</mo> </mrow>
wherein $\Delta\Lambda^k$ is the random vector, and $\Delta\lambda_n^k$ is the random number of the clustering subset in the n-th dimension;
a pseudo-defect sample generating unit, configured to obtain a pseudo-defect sample according to the mean value and the random vector of the clustering subset, specifically:
<mrow> <mi>t</mi> <mo>=</mo> <msup> <mi>&mu;</mi> <mi>k</mi> </msup> <mo>+</mo> <msup> <mi>&Delta;&Lambda;</mi> <mi>k</mi> </msup> <mo>=</mo> <mo>{</mo> <msubsup> <mi>&mu;</mi> <mn>1</mn> <mi>k</mi> </msubsup> <mo>+</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mn>1</mn> <mi>k</mi> </msubsup> <mo>,</mo> <msubsup> <mi>&mu;</mi> <mn>2</mn> <mi>k</mi> </msubsup> <mo>+</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mn>2</mn> <mi>k</mi> </msubsup> <mo>,</mo> <mo>...</mo> <mo>,</mo> <msubsup> <mi>&mu;</mi> <mi>n</mi> <mi>k</mi> </msubsup> <mo>+</mo> <msubsup> <mi>&Delta;&lambda;</mi> <mi>n</mi> <mi>k</mi> </msubsup> <mo>}</mo> </mrow>
where t is a pseudo-defect sample, $\mu^k$ is the mean of the clustering subset, and $\Delta\Lambda^k$ is the random vector.
8. The software defect prediction system of claim 7, wherein the clustering module comprises:
the marking unit is used for respectively marking the sample software modules to obtain defect marks of the sample software modules;
the measurement unit is used for respectively carrying out static measurement on the sample software modules to obtain sample vectors of all the sample software modules;
the central point calculating unit is used for respectively taking each sample vector as a starting point and acquiring the central point of the sample vector;
and the clustering unit is used for clustering the sample vectors according to the central points of the sample vectors to obtain a clustering subset.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510247157.9A CN104899135B (en) | 2015-05-14 | 2015-05-14 | Software Defects Predict Methods and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104899135A CN104899135A (en) | 2015-09-09 |
CN104899135B CN104899135B (en) | 2017-10-20 |
Family
ID=54031810
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510247157.9A Active CN104899135B (en) | 2015-05-14 | 2015-05-14 | Software Defects Predict Methods and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104899135B (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106021115B (en) * | 2016-06-06 | 2018-07-10 | 重庆大学 | Unsupervised failure prediction method based on probability |
CN106528417A (en) * | 2016-10-28 | 2017-03-22 | 中国电子产品可靠性与环境试验研究所 | Intelligent detection method and system of software defects |
CN108182141B (en) * | 2016-12-08 | 2020-12-08 | 中国电子产品可靠性与环境试验研究所 | Software quality evaluation method and system |
CN106708738B (en) * | 2016-12-23 | 2020-02-11 | 上海斐讯数据通信技术有限公司 | Software test defect prediction method and system |
CN106919505B (en) * | 2017-02-20 | 2019-07-05 | 中国电子产品可靠性与环境试验研究所 | Software Defects Predict Methods and device |
CN107239798B (en) * | 2017-05-24 | 2020-06-09 | 武汉大学 | Feature selection method for predicting number of software defects |
CN107577605A (en) * | 2017-09-04 | 2018-01-12 | 南京航空航天大学 | A kind of feature clustering system of selection of software-oriented failure prediction |
CN109597748A (en) * | 2017-09-30 | 2019-04-09 | 北京国双科技有限公司 | Aacode defect method for early warning and device |
CN109656808B (en) * | 2018-11-07 | 2022-03-11 | 江苏工程职业技术学院 | Software defect prediction method based on hybrid active learning strategy |
CN111782548B (en) * | 2020-07-28 | 2022-04-05 | 南京航空航天大学 | Software defect prediction data processing method and device and storage medium |
CN113204482B (en) * | 2021-04-21 | 2022-09-13 | 武汉大学 | Heterogeneous defect prediction method and system based on semantic attribute subset division and metric matching |
US11645188B1 (en) | 2021-11-16 | 2023-05-09 | International Business Machines Corporation | Pull request risk prediction for bug-introducing changes |
CN114791886B (en) * | 2022-06-21 | 2022-09-23 | 纬创软件(武汉)有限公司 | Software problem tracking method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101556553A (en) * | 2009-03-27 | 2009-10-14 | 中国科学院软件研究所 | Defect prediction method and system based on requirement change |
CN103810101A (en) * | 2014-02-19 | 2014-05-21 | 北京理工大学 | Software defect prediction method and system |
CN103810102A (en) * | 2014-02-19 | 2014-05-21 | 北京理工大学 | Method and system for predicting software defects |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050071807A1 (en) * | 2003-09-29 | 2005-03-31 | Aura Yanavi | Methods and systems for predicting software defects in an upcoming software release |
US8296724B2 (en) * | 2009-01-15 | 2012-10-23 | Raytheon Company | Software defect forecasting system |
Non-Patent Citations (3)
Title |
---|
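Small-sample clustering algorithm based on random sample generation from a Gaussian distribution; Ding Zhi; Computer Knowledge and Technology; 2013-10-31; Vol. 9 (No. 29); p. 6609, Section 1 * |
Research on the application of cluster analysis in software defect measurement; Yang Lin et al.; Computer Engineering and Applications; 2009-12-31 (No. 45); p. 103, Section 3 * |
Research on software defect prediction technology; Qiao Hui; China Master's Theses Full-text Database, Information Science and Technology; 2014-03-15 (No. 03); pp. 15-16, Fig. 2.3 * |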
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||