CN111198820B - Cross-project software defect prediction method based on shared hidden layer self-encoder - Google Patents
- Publication number
- CN111198820B CN202010001850.9A
- Authority
- CN
- China
- Prior art keywords
- class
- samples
- encoder
- theta
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
- G06F11/3608—Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention discloses a cross-project software defect prediction method based on a shared hidden layer self-encoder, comprising the following steps: first, preprocessing the data set and dividing it into a training set and a test set; second, extracting the depth features of the training set and the test set respectively with a self-encoder equipped with a sharing mechanism; and finally, introducing a focus loss function and training a classifier. The invention addresses the feature distribution difference in cross-project software defect prediction and proposes, for the first time, a shared hidden layer self-encoder technique based on focus loss, so that different data distributions become more similar. The focus loss learning technique assigns different weights to samples of different classes to counter class imbalance, and gives different weights to easy-to-classify and hard-to-classify samples so that the classifier learns the hard-to-classify samples better.
Description
Technical Field
The invention belongs to the field of software engineering, and particularly relates to a cross-project software defect prediction method based on a shared hidden layer self-encoder.
Background
Software defect prediction is a research hotspot in the field of software engineering. Its main aim is to discover defects in software early in the development process and thereby improve the quality of software products. Most previous studies focus on within-project defect prediction: a prediction model is trained on part of a project's historical data, and its ability to predict defects is then tested on the remaining data of the same project. For a newly launched project, however, there is not enough historical data to train the model, and within-project defect prediction performs poorly. Cross-project defect prediction is therefore a viable approach when there is not enough historical defect data to build an accurate prediction model: a prediction model is trained on the historical data of other projects and used to predict defects in the new project. Its prediction performance is nevertheless still poor, mainly because of the data distribution difference between the source project and the target project; the smaller this difference, the better the cross-project prediction. In addition, data sets themselves suffer from class imbalance: the number of non-defective instances is much larger than the number of defective ones, so the model identifies non-defective samples more easily and predicts defective samples poorly, which reduces overall prediction performance. The invention is therefore mainly proposed to solve the problems of data distribution difference and class imbalance in software defect prediction.
Disclosure of Invention
Purpose of the invention: aiming at the defects in the prior art, a cross-project software defect prediction method based on a shared hidden layer self-encoder is provided, introducing the shared hidden layer self-encoder to solve the problem of data distribution difference.
Content of the invention: the invention relates to a cross-project software defect prediction method based on a shared hidden layer self-encoder, comprising the following steps:
(1) dividing a pre-acquired data set into a training set and a testing set, and performing data preprocessing;
(2) extracting features by adopting a self-encoder with a sharing mechanism, and respectively extracting depth features of a training set and a test set;
(3) introducing a focus loss function and training a classifier.
Further, the preprocessing of step (1) is realized by the following formula (min-max normalization):

P_i = (x_i − min(x)) / (max(x) − min(x))  (1)

wherein P_i is the feature value after normalization preprocessing of a given feature x, max(x) and min(x) are respectively the maximum and minimum values of x, and x_i is each value of feature x.
Further, the self-encoder with sharing mechanism in step (2) is a self-encoder to which a shared-parameter mechanism is added to obtain a shared hidden layer, realized as follows: the depth representation of the hidden layer is obtained by minimizing the reconstruction error L(θ_all), which comprises two parts, L(θ_tr) and L(θ_te), defined in the form:

L(θ_tr) = ‖X_tr − X̂_tr‖² + L_intra + L_inter,  L_inter = L_inter^g + L_inter^l
L(θ_te) = ‖X_te − X̂_te‖²

wherein L(θ_tr) uses the Euclidean distance between the input and output of the training data to represent the reconstruction error and consists of three parts: the reconstruction error loss term, the intra-class loss term L_intra and the inter-class loss term L_inter; L_inter^g is the global inter-class loss term and L_inter^l the local inter-class loss term. L(θ_te) is the Euclidean distance between the input and output of the test data. The local term selects, for each sample of class 0, the mean of its k nearest neighbor samples among the samples of class 1, and for each sample of class 1, the mean of its k nearest neighbor samples among the samples of class 0. X̂_tr denotes the decoded features of the training data set and X̂_te the decoded features of the test data set; X_tr^0 and X_tr^1 are the samples of class 0 and class 1 in the training data, X̂_tr^0 and X̂_tr^1 the decoded training data of class 0 and class 1, and μ̂^0 and μ̂^1 the decoded sample means of class 0 and class 1. Optimizing L(θ_tr) and L(θ_te) simultaneously, the final objective function is expressed as follows:

L(θ_all) = L(θ_tr) + r·L(θ_te)  (9)

wherein L(θ_all) is the objective whose minimization yields the depth features of the hidden layer; θ_all denotes all parameters of the network that need to be optimized; r is a regularization parameter.
Further, the focus loss function in step (3) is implemented by the following formula (written here in the standard focal-loss form, with modulating exponent γ):

FL = −(1/N_tr) Σ_{i=1..N_tr} Σ_{c=1..k} u_c · (1 − ŷ_i^c)^γ · y_i^c · log(ŷ_i^c)

wherein N_tr represents the number of training samples, c the label class, k the number of label classes, y_i^c the true label, ŷ_i^c the predicted label probability, and g(·) the activation function; u is the sample class weight: non-defective samples are given the small weight u (0 < u < 1) and defective samples the large weight 1 − u, the two weights summing to 1. For samples that are easier to classify in defect prediction the factor (1 − ŷ_i^c)^γ is smaller; for samples that are harder to classify it is larger.
Beneficial effects: compared with the prior art, the invention solves the problem of feature distribution difference in cross-project software defect prediction and proposes, for the first time, a shared hidden layer self-encoder technique based on focus loss, so that different data distributions become more similar. The focus loss learning technique assigns different weights to samples of different classes to counter class imbalance, and gives different weights to easy-to-classify and hard-to-classify samples so that the classifier learns the hard-to-classify samples better, thereby solving both the data distribution difference in software defect prediction and the class imbalance of the data set. Experimental results on 10 projects of the PROMISE data set show that the proposed method achieves ideal defect prediction.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings: as shown in fig. 1, a cross-project software defect prediction method based on a shared hidden layer self-encoder includes the following steps:
step 1, dividing a training data set and a testing data set, and performing data preprocessing on the data set, wherein the specific method comprises the following steps: first, a PROMISE data set is selected, which has 20 basic metrics, and these 20 basic metrics are not in the same order of magnitude, so we should use the min-max data normalization method to convert all the metrics to the interval of 0 to 1. Given a feature x, its maximum and minimum values are represented as: max (x) and min (x). For each eigenvalue x of the eigenvalue x i The data preprocessing can be expressed as follows:
Step 2: extract features with the improved self-encoder. A shared hidden layer self-encoder is adopted to extract features, adding a sharing mechanism to the original self-encoder to solve the data distribution difference in cross-project defect prediction. Suppose X_tr and X_te are respectively the training and test data sets, and X ∈ {X_tr ∪ X_te} is the set of shuffled training and test data, where n is the number of features and N_tr and N_te are respectively the numbers of instances in the training and test sets. A conventional self-encoder attempts to find a common depth feature representation of the input data, making the output as equal as possible to the input. It usually comprises two stages, encoding and decoding; given input data x_i ∈ X_tr, the encoding and decoding stages are represented respectively as follows:
and (3) an encoding stage: y (x) i )=f(w 1 x i +b 1 ) (2)
wherein x is i Is an input to the computer system that is,is the output, y (x) i ) Is the output of the hidden layer, f (-) is a non-linear activation function, usually a sigmoid function, w 1 ∈R m×n And w 2 ∈R n×m Is a weight matrix, b 1 ∈R m And b 2 ∈R n If it is a deviation, the network parameters from the encoder can be expressed as: θ ═ w 1 ,b 1 ,w 2 ,b 2 The updated optimization of the parameters can be achieved by minimizing the reconstruction error function L (θ), which is minimized by Adam optimizer during the training of the self-encoder, and is expressed as follows:
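Equations (2) and (3) can be sketched with NumPy; the layer sizes, random weights, and function names below are illustrative assumptions, and the reconstruction error is written as a mean squared Euclidean distance rather than the patent's exact expression.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n, m = 20, 15                      # input dim n (20 PROMISE metrics), hidden dim m (illustrative)
w1 = rng.normal(scale=0.1, size=(m, n)); b1 = np.zeros(m)  # encoder parameters
w2 = rng.normal(scale=0.1, size=(n, m)); b2 = np.zeros(n)  # decoder parameters

def encode(x):
    # encoding stage (2): y(x_i) = f(w1 x_i + b1)
    return sigmoid(w1 @ x + b1)

def decode(y):
    # decoding stage (3): x_hat_i = f(w2 y(x_i) + b2)
    return sigmoid(w2 @ y + b2)

def reconstruction_error(X):
    # L(theta): squared Euclidean distance between inputs and reconstructions, averaged
    return float(np.mean([np.sum((x - decode(encode(x))) ** 2) for x in X]))

X_tr = rng.random((5, n))
err = reconstruction_error(X_tr)
```

In training, an Adam optimizer would update θ = {w1, b1, w2, b2} to drive this error down.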
in order to solve the problem of data distribution difference in cross-project defect prediction, the invention improves the original self-encoder, and adds a shared parameter mechanism to obtain the self-encoder of a shared hidden layer. By minimizing the reconstruction error L (theta) all ) To obtain a depth characterization L (theta) of the hidden layer all ),L(θ all ) Comprises 2 parts: l (theta) tr ) And L (theta) te )。L(θ tr ) By calculating Euclidean distance of training data input and outputTo express their reconstruction errors, and at the same time, in order to fully utilize the label information in the source data, add intra-class loss, global inter-class loss and local inter-class loss, in order to maximize the inter-class distance and minimize the intra-class distance of the data in the source domain during the feature learning process. L (theta) tr ) Is composed of three parts including reconstruction error loss term and intra-class loss termAnd inter-class loss termWherein the reconstruction error loss term is to reconstruct the input with better output; the intra-class loss term is to keep samples of the same class in the source data sufficiently close to the class center to achieve intra-class minimization; fully considering global inter-class loss termsAnd local inter-class loss terms In order to have the class centers of the two classes sufficiently distant,in order to make each sample with the class of 0(1) as far as the center of the nearest k adjacent samples with the class of 1(0), the distance between the samples is as far as possible, so that the purpose of maximizing the inter-class distance is achieved. L (theta) te ) The reconstruction error between the input and output of the test data is represented by calculating the euclidean distance between them. L (theta) tr ) And L (theta) te ) Is defined as follows:
whereinMeans that for each sample with class 0, the mean of the k nearest neighbor samples from the sample with class 1 is selected; the same reason is thatSimilar to the previous meaning.Refers to the features of the training data set after decoding,is a feature of the test data set after decoding.Are samples of all classes 0 in the training data,are samples of all classes 1 in the training data.Andthe decoded training data have a class of 0 and a class of 1, respectively.Andrespectively, a sample mean value of class 0 and a sample mean value of class 1 after decoding the training data. Combining the above two formulas, optimizing L (theta) simultaneously tr ) And L (theta) te ) The final objective function is expressed as follows:
L(θ_all) = L(θ_tr) + r·L(θ_te)  (9)
all parameters theta of the network that need to be optimized all Comprises the following steps:r is a regularization parameter that facilitates regularization of the behavior of the self-encoder. The purpose of adding the regularization term is to make the feature distributions of the training data and the test data more and more similar by changing the value of r.
Step 3: introduce the focus loss technique and train the improved classifier. Because of the class imbalance of the data set itself, the number of defective modules is small; the network must nevertheless discover and learn the characteristics of defective modules so that it can distinguish defective from non-defective modules. The invention therefore introduces the focus loss technique: during training, samples of different classes are balanced by assigning them different weights, and whether a sample is easy to classify is taken into account, giving easy samples smaller weights and easily misclassified samples larger weights, thereby alleviating class imbalance. Finally, the classifier is trained on the deep feature representation of the training data obtained in step 2. The classifier loss C may use the cross-entropy loss function to compute the similarity between the true label and the predicted label, defined as follows:

C = −(1/N_tr) Σ_{i=1..N_tr} Σ_{c=1..k} y_i^c · log(ŷ_i^c)

wherein N_tr represents the number of training samples, c the label class, k the number of label classes (here k = 2), y_i^c the true label, ŷ_i^c the predicted label probability, and g(·) the activation function.
Based on this classifier, two weights are added, u and the modulating factor (1 − ŷ_i^c)^γ, giving the focus loss function. u mainly solves the class imbalance problem: the class with many samples (non-defective) is given the small weight u (0 < u < 1) and the class with few samples (defective) the large weight 1 − u, the two weights summing to 1, so that the sample numbers of the two classes are balanced. The modulating factor mainly addresses samples that are hard to classify during defect prediction learning: for the class with many samples, the non-defective class, the classifier more easily learns to judge the class and outputs a larger probability value ŷ_i^c, so the easier a sample is to classify, the smaller the factor (1 − ŷ_i^c)^γ; vice versa, the harder a sample is to classify, the larger the factor, so the classifier pays more attention to hard samples and learns their characteristics better. The final focus loss function can thus be expressed in the form:

FL = −(1/N_tr) Σ_{i=1..N_tr} Σ_{c=1..k} u_c · (1 − ŷ_i^c)^γ · y_i^c · log(ŷ_i^c)

where u_c is u for the non-defective class and 1 − u for the defective class (the modulating exponent γ is written here in the standard focal-loss form).
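A binary version of the focus loss can be sketched as follows; the values u = 0.25 and γ = 2 are illustrative choices (the patent does not fix them), and the function name is ours.

```python
import numpy as np

def focal_loss(y_true, p_pred, u=0.25, gamma=2.0, eps=1e-12):
    """Binary focus loss: non-defective (label 0) samples get weight u,
    defective (label 1) samples get weight 1 - u, and (1 - p_t)^gamma
    down-weights easily classified samples."""
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)          # probability assigned to the true class
    w = np.where(y == 1, 1.0 - u, u)            # class weights, summing to 1
    return float(np.mean(-w * (1.0 - p_t) ** gamma * np.log(p_t)))

# A confidently classified defective sample contributes far less than a hard one
hard = focal_loss([1], [0.6])
easy = focal_loss([1], [0.99])
```

The modulating factor shrinks the contribution of samples the classifier already handles well, concentrating the gradient on the hard, typically defective, samples.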
in order to verify whether the algorithm has good superiority or not, the cross-project software defect prediction algorithm based on the focus loss shared hidden layer self-encoder is compared with other 5 cross-project defect prediction methods such as TCA +, TDS, Dycom, LT and SHLA (cross-project defect prediction algorithm of the focus loss-free shared hidden layer self-encoder). The 10 items of the project were compared and verified as experimental data, respectively, as shown in table 1: where # instance represents the number of instances, # defect represents the number of defective instances, and% defect represents the proportion of defective instances to all instances.
Table 1 10 entries in the project data set used in the experiment
Datasets | #instance | #defect | %defect
---|---|---|---
ant-1.7 | 745 | 166 | 22.28
camel-1.6 | 965 | 188 | 19.48
jedit-3.2 | 272 | 90 | 33.09
log4j-1.0 | 135 | 34 | 25.19
lucene-2.0 | 195 | 91 | 46.67
poi-1.5 | 237 | 141 | 59.49
redaktor | 176 | 27 | 15.34
synapse-1.0 | 157 | 16 | 10.19
xalan-2.6 | 885 | 411 | 46.44
xerces-1.3 | 453 | 69 | 15.23
The evaluation indexes of the prediction model are mainly F-measure and Accuracy, which can be expressed with the TP, FN, FP and TN defined in the confusion matrix of Table 2:
Table 2  Confusion matrix

 | Predicted defective | Predicted non-defective
---|---|---
Actually defective | TP | FN
Actually non-defective | FP | TN

Recall: the proportion of defective samples the classifier predicts as defective among all defective samples, i.e. Recall = TP / (TP + FN). Precision: the proportion of correct predictions among the samples predicted as defective, i.e. Precision = TP / (TP + FP); this ratio evaluates how correctly the model predicts defective modules. The F-measure index is the harmonic mean of recall and precision, i.e. F-measure = (2 × Precision × Recall) / (Precision + Recall). The Accuracy index evaluates the degree to which both defective and non-defective modules are correctly classified, i.e. Accuracy = (TP + TN) / (TP + TN + FP + FN). Larger F-measure and Accuracy values indicate better prediction performance of the software defect prediction model.
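The four definitions above can be computed directly from the confusion-matrix counts; the counts below are made up for illustration.

```python
def defect_metrics(tp, fn, fp, tn):
    """Recall, Precision, F-measure and Accuracy as defined in the text."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return recall, precision, f_measure, accuracy

recall, precision, f_measure, accuracy = defect_metrics(tp=30, fn=10, fp=20, tn=140)
```

Note that on imbalanced data a high Accuracy can coexist with a low F-measure, which is why both indexes are reported.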
The experimental setup selects 1 of the 10 PROMISE projects as the test data (target project) and uses each of the remaining 9 in turn as the source (training) project. There are thus 9 cross-project combinations for each target project and 90 possible cross-project combinations over the 10 projects. In training the self-encoder, the model has 4 hidden layers with node counts set as 20-15-10-10-2, where 20 is the feature dimension of the input data and 2 the feature dimension fed into the softmax classifier. Each layer uses the ReLU activation function, the number of layers is set empirically, and the Adam optimizer is used for parameter optimization. Each mini-batch in the experiment is set to 64, and the hyperparameter r ranges over r ∈ {0.1, 0.5, 1, 5, 10, 15}; good effects were obtained when r = 10.
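The cross-project protocol above (each project as target, the other 9 as sources) can be enumerated directly, confirming the 90 combinations; the project names come from Table 1.

```python
from itertools import permutations

projects = ["ant-1.7", "camel-1.6", "jedit-3.2", "log4j-1.0", "lucene-2.0",
            "poi-1.5", "redaktor", "synapse-1.0", "xalan-2.6", "xerces-1.3"]

# Every ordered (source, target) pair with source != target
combos = [(src, tgt) for src, tgt in permutations(projects, 2)]
per_target = {tgt: sum(1 for _, t in combos if t == tgt) for tgt in projects}
```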
To verify whether the proposed algorithm performs well against the comparison algorithms, experiments were performed on the 10 PROMISE projects; the F-measure results are shown in Table 3 and the Accuracy results in Table 4:
Table 3  Experimental results of our model and the 5 comparison algorithms on F-measure
As can be seen from the experimental results in Table 3, the F-measure values of our model exceed those of the other 5 comparison algorithms; the F-measure values range from 0.257 to 0.649, and our model improves the F-measure results by at least 0.019 and by up to 0.418.
Table 4  Experimental results of our model and the 5 comparison algorithms on Accuracy
target | TDS | TCA+ | Dycom | LT | SHLA | Ours
---|---|---|---|---|---|---
ant-1.7 | 0.680 | 0.684 | 0.674 | 0.675 | 0.631 | 0.721
camel-1.6 | 0.742 | 0.618 | 0.769 | 0.722 | 0.731 | 0.639
jedit-3.2 | 0.593 | 0.663 | 0.710 | 0.599 | 0.702 | 0.722
log4j-1.0 | 0.715 | 0.657 | 0.763 | 0.726 | 0.711 | 0.716
lucene-2.0 | 0.538 | 0.621 | 0.600 | 0.533 | 0.621 | 0.637
poi-1.5 | 0.559 | 0.576 | 0.435 | 0.527 | 0.611 | 0.618
redaktor | 0.579 | 0.556 | 0.386 | 0.648 | 0.361 | 0.495
synapse-1.0 | 0.761 | 0.641 | 0.796 | 0.643 | 0.592 | 0.613
xalan-2.6 | 0.417 | 0.591 | 0.603 | 0.531 | 0.582 | 0.611
xerces-1.3 | 0.714 | 0.627 | 0.764 | 0.757 | 0.810 | 0.814
average | 0.630 | 0.623 | 0.650 | 0.636 | 0.635 | 0.659
improved | 0.029 | 0.036 | 0.009 | 0.023 | 0.024 | -
As can be seen from the experimental results in Table 4, the Accuracy values of our model improve somewhat over the other 5 comparison algorithms; the Accuracy mean of our model is 0.659, an improvement of at least 0.009 (0.659 − 0.650).
The above experiments show that while the TCA+, TDS, Dycom, LT and SHLA algorithms can achieve better F-measure and Accuracy values on certain individual projects, the model proposed by the invention has better average F-measure and Accuracy overall, outperforming the 5 algorithms and demonstrating the superiority of the proposed method.
In addition to the above embodiments, the present invention may have other embodiments. All technical solutions formed by adopting equivalent substitutions or equivalent transformations fall within the protection scope of the claims of the present invention.
Claims (2)
1. A cross-project software defect prediction method based on a shared hidden layer self-encoder is characterized by comprising the following steps:
(1) dividing a pre-acquired data set into a training set and a testing set, and performing data preprocessing;
(2) extracting features by adopting a self-encoder with a sharing mechanism, and respectively extracting depth features of a training set and a test set;
(3) introducing a focus loss function, and training a classifier;
the shared mechanism self-encoder in step (2) is a self-encoder for adding a shared parameter mechanism to obtain a shared hidden layer, and the implementation process is as follows: by minimizing the reconstruction error L (theta) all ) To obtain a depth characterization L (theta) of the hidden layer all ),L(θ all ) Comprises two parts: l (theta) tr ) And L (theta) te ),L(θ tr ) And L (theta) te ) Is defined as follows:
wherein, L (theta) tr ) Euclidean distance for input and output of training data to represent reconstruction error between input and output of training data, L (θ) tr ) Is composed of three parts including reconstruction error loss term and intra-class loss termAnd inter-class loss termIn order to be a global inter-class loss term,is a local inter-class loss term, L (θ) te ) The euclidean distance between the input and output of the test data, to represent the reconstruction error between the input and output of the test data,means that for each sample with class 0, the mean of the k nearest neighbor samples from the sample with class 1 is selected;for each sample with class 1, selecting the mean of the k nearest neighbor samples from the samples with class 0,refers to the features of the training data set after decoding,is a feature of the test data set after decoding,are samples of all classes 0 in the training data,are samples of all classes 1 in the training data,andthe decoded training data with class 0 and class 1 respectively,andrespectively, a sample mean value of class 0 and a sample mean value of class 1 after decoding the training data, while optimizing L (theta) tr ) And L (theta) te ) The final objective function is expressed as follows:
L(θ_all) = L(θ_tr) + r·L(θ_te)  (9)
wherein L(θ_all) is the objective whose minimization yields the depth features of the hidden layer; θ_all denotes all parameters of the network that need to be optimized; r is a regularization parameter;
the focus loss function in step (3) is realized by the following formula:

FL = −(1/N_tr) Σ_{i=1..N_tr} Σ_{c=1..k} u_c · (1 − ŷ_i^c)^γ · y_i^c · log(ŷ_i^c)

wherein N_tr represents the number of training samples, c the label class, k the number of label classes, y_i^c the true label, ŷ_i^c the predicted label probability, g(·) the activation function, and u the sample class weight: non-defective samples are given the small weight u (0 < u < 1) and defective samples the large weight 1 − u, the two weights summing to 1; for samples that are easier to classify in defect prediction the factor (1 − ŷ_i^c)^γ is smaller, and for samples that are harder to classify it is larger.
2. The cross-project software defect prediction method based on a shared hidden layer self-encoder as claimed in claim 1, wherein the preprocessing of step (1) is implemented by the following formula:

P_i = (x_i − min(x)) / (max(x) − min(x))

wherein P_i is the feature value after normalization preprocessing of a given feature x, max(x) and min(x) are respectively the maximum and minimum values of x, and x_i is each value of feature x.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010001850.9A CN111198820B (en) | 2020-01-02 | 2020-01-02 | Cross-project software defect prediction method based on shared hidden layer self-encoder |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010001850.9A CN111198820B (en) | 2020-01-02 | 2020-01-02 | Cross-project software defect prediction method based on shared hidden layer self-encoder |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111198820A CN111198820A (en) | 2020-05-26 |
CN111198820B true CN111198820B (en) | 2022-08-26 |
Family
ID=70746714
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010001850.9A Active CN111198820B (en) | 2020-01-02 | 2020-01-02 | Cross-project software defect prediction method based on shared hidden layer self-encoder |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111198820B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112015659A (en) * | 2020-09-02 | 2020-12-01 | 三维通信股份有限公司 | Prediction method and device based on network model |
CN112199280B (en) * | 2020-09-30 | 2022-05-20 | 三维通信股份有限公司 | Method and apparatus for predicting software defects, storage medium, and electronic apparatus |
CN117421244B (en) * | 2023-11-17 | 2024-05-24 | 北京邮电大学 | Multi-source cross-project software defect prediction method, device and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108446711A (en) * | 2018-02-01 | 2018-08-24 | 南京邮电大学 | A kind of Software Defects Predict Methods based on transfer learning |
CN109710512A (en) * | 2018-12-06 | 2019-05-03 | 南京邮电大学 | Neural network software failure prediction method based on geodesic curve stream core |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108446711A (en) * | 2018-02-01 | 2018-08-24 | 南京邮电大学 | A kind of Software Defects Predict Methods based on transfer learning |
CN109710512A (en) * | 2018-12-06 | 2019-05-03 | 南京邮电大学 | Neural network software failure prediction method based on geodesic curve stream core |
Also Published As
Publication number | Publication date |
---|---|
CN111198820A (en) | 2020-05-26 |
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | CB02 | Change of applicant information | Address after: 210003 Gulou District, Jiangsu, Nanjing new model road, No. 66; Applicant after: NANJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS. Address before: Yuen Road Qixia District of Nanjing City, Jiangsu Province, No. 9 210046; Applicant before: NANJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS
 | GR01 | Patent grant |