CN110751186B - Cross-project software defect prediction method based on supervised expression learning - Google Patents
Cross-project software defect prediction method based on supervised expression learning
- Publication number
- CN110751186B CN201910915935.5A
- Authority
- CN
- China
- Prior art keywords
- training
- project
- encoder
- migration
- samples
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3447—Performance evaluation by modeling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Abstract
The invention discloses a cross-project software defect prediction method based on supervised representation learning, which comprises the following steps: (1) selecting a defect data set and preprocessing the defect data; (2) training a migration self-encoder in an unsupervised pre-training mode, the migration self-encoder comprising a feature coding layer and a label coding layer; (3) using a migration cross-validation method, selecting from the hidden-layer feature representations of the source-project samples those closest in distribution to the hidden-layer features of the target-project samples as a validation set, with the remaining samples as a training set; (4) performing oversampling on the training-set samples; (5) fine-tuning the migration self-encoder, selecting the model hyper-parameters and an early-stopping strategy; (6) inputting the preprocessed data of the target project into the migration self-encoder and obtaining the final prediction result from the output of the label coding layer. The method introduces the label information of the source-project samples into the feature representation learning process and improves the prediction performance of the cross-project software defect prediction model.
Description
Technical Field
The invention belongs to the technical field of software defect prediction of software engineering application, and particularly relates to a cross-project software defect prediction method based on supervised expression learning.
Background
Software defect prediction techniques predict defects that may exist in the current software project by learning and building a prediction model from historical defect data. They help testers find defects quickly and greatly improve software testing efficiency, and have therefore become a research hotspot in the field of software engineering.
The general approach to software defect prediction is to extract various features from software code (such as the Halstead, McCabe, and CK metrics, the MOOD metrics, code-change metrics, and other object-oriented metrics), represent every code segment as a feature vector, label each segment according to whether it actually contains a defect, feed the feature vectors and labels to a machine learning model for training, and finally obtain a software defect prediction model that predicts possible defects in new software code.
Most past software defect prediction methods build the prediction model with traditional machine learning. To perform well, a traditional machine learning method requires that the training and test samples follow the same or similar data distributions, that positive and negative samples be relatively balanced, and that labeled training samples be sufficient. In practice, however, manual labeling is so costly that labeled samples available for training are very scarce; moreover, because software defects occur with very low probability, most labeled samples are non-defective and defective samples account for only a tiny fraction. Scarce labeled data and class imbalance have therefore become the two biggest challenges for software defect prediction technology.
For class imbalance, most current work relies on data resampling, such as random oversampling or synthesizing artificial minority-class samples. For the scarcity of training data, one current solution is to train the prediction model with defect data from other projects, namely the cross-project defect prediction technique. Because labeled samples are scarce, the labeled data collected from a single project is not enough to train a machine learning model; the basic idea of cross-project defect prediction is to train the model on defect data from other projects (also called the source project or source domain) and then apply the trained model to the software project to be predicted (also called the target project or target domain), thereby alleviating the scarcity of training data to some extent.
However, one difficulty of cross-project software defect prediction is that the training data and the test data often do not follow the same or similar distributions, contrary to the assumption of traditional machine learning models, so those models cannot be used directly for cross-project defect prediction. In recent years, transfer learning methods have gradually been applied to the cross-project defect prediction task. One of the most widely used is Transfer Component Analysis (TCA), an unsupervised representation learning method that cannot exploit the label information of the source-domain samples while learning the representation. In addition, such methods split unsupervised feature learning and classifier training into separate stages in a divide-and-conquer manner: first learn hidden-layer representations of the source- and target-project samples, then retrain a machine learning classifier in the new feature space. The divide-and-conquer strategy has an inherent problem: solving each sub-problem optimally does not guarantee an optimal solution to the global problem. Features learned in the early stage may be ill-suited to training the classifier in the later stage, degrading the actual predictive ability of the final defect prediction model.
Disclosure of Invention
One object of the present invention is to remedy the above shortcomings by providing a cross-project software defect prediction method based on supervised representation learning. The method uses a migration self-encoder (a transfer autoencoder) with a dual-coding-layer structure, which can exploit the label information of the source-domain samples while learning the hidden-layer feature representation, and thus constitutes supervised representation learning. In addition, by adjusting the loss function of the network, unsupervised pre-training and supervised fine-tuning of the network are realized respectively; after unsupervised pre-training yields a preliminary hidden-layer feature representation, a migration cross-validation method provides a reasonable division of the training and validation sets under the transfer learning setting, and the model hyper-parameters are selected according to the prediction performance on the validation set.
Another object of the invention is to provide a deep learning model named the migration self-encoder. The model offers an end-to-end learning mode: the learning process is not manually split into sub-problems, and the deep model directly learns the mapping from raw input to desired output. Compared with a divide-and-conquer strategy, end-to-end learning has the advantage of joint optimization and is more likely to reach the global optimum. Experiments show that this supervised representation learning method improves the effectiveness of cross-project software defect prediction.
The technical scheme of the invention is as follows: a cross-project software defect prediction method based on supervised expression learning comprises the following steps:
step 1), defining a target item to be predicted and a source item used for training a model, and carrying out preprocessing operations such as standardization or normalization on the original data of the source item and the target item;
step 2), inputting the feature vectors of all samples in the source project and the target project into a migration self-encoder, preliminarily training the migration self-encoder in an unsupervised pre-training mode, and obtaining preliminary hidden layer feature representations of all samples in the source project and the target project through a feature coding layer of the migration self-encoder;
The migration self-encoder is a novel autoencoder with a dual-coding-layer structure. The two coding layers are a feature coding layer and a label coding layer. The first coding layer, the feature coding layer, encodes the feature vectors of all samples in the source and target projects into a hidden-layer feature representation; the label coding layer classifies the samples on the basis of that representation. During training, the supervised learning of the source-project samples is realized by minimizing the label loss term of the source-project samples. Meanwhile, the model weights are shared between the source and target projects, so samples of the target project can be fed directly into the trained model, and the final prediction result is obtained from the output of the label coding layer, achieving the goal of transfer learning.
Step 3), using the preliminary hidden-layer feature representations obtained in step 2), selecting by means of the migration cross-validation method the portion of source-project samples (for example, 1/3) whose hidden-layer feature distribution is closest to that of the target-project samples as the validation set, with the remaining source-project samples as the training set;
Step 4), considering that the defective and non-defective classes in the training set are severely imbalanced, performing oversampling (such as random oversampling or synthetic-minority oversampling) on the training-set samples;
Step 5), further fine-tuning the migration self-encoder on the oversampled training set obtained in step 4), and completing the training of the model by selecting the model hyper-parameters and an early-stopping strategy according to the prediction performance on the validation set;
Step 6), after training of the migration self-encoder is complete, inputting the preprocessed data of the target project into it and obtaining the final prediction result from the label coding layer of the network.
The migration self-encoder in steps 2) and 5) adopts different forms of the loss function. Step 2) is unsupervised: no label information is introduced during training, and the loss function consists of a reconstruction error term and a hidden-layer feature distribution difference term. By minimizing this loss, the network learns hidden-layer feature representations of all samples that reconstruct the input well and bring the hidden-layer feature distribution of the source-project samples close to that of the target-project samples. Step 5) is supervised: label information of the source-project samples is introduced, and the loss function consists of four terms: the reconstruction error term, the hidden-layer feature distribution difference term, the label loss term of the source-project samples, and a regularization term. The pre-training and fine-tuning stages thus realize the unsupervised and supervised training modes respectively by excluding or including the label loss term in the loss function.
Compared with existing software defect prediction methods, the cross-project software defect prediction method based on supervised representation learning has the following advantages: it breaks the assumption of traditional machine learning that training and test sets must share the same or similar distributions, and can transfer information from related projects to improve learning on the current project's data. Moreover, unlike current cross-project defect prediction methods based on unsupervised representation learning, the migration self-encoder adopted by the invention fully exploits the label information of the source-domain samples while learning the hidden-layer feature representation, and realizes feature learning and model construction in a more thorough end-to-end manner, further improving cross-project software defect prediction performance.
Drawings
FIG. 1 supervised representation learning method based on migratory autocoder
FIG. 2 cross-project software defect prediction method based on supervised expression learning
Detailed Description
The invention will be further described with reference to the accompanying drawings. First, a migration auto-encoder used in the present invention will be described in detail with reference to fig. 1.
The described migration self-encoder is a novel autoencoder with a dual-coding-layer structure. The two coding layers are a feature coding layer and a label coding layer. The first coding layer, the feature coding layer, encodes the feature vectors of all samples in the source and target projects into a hidden-layer feature representation; the label coding layer classifies the samples on the basis of that representation. During training, the supervised learning of the source-project samples is realized by minimizing the label loss term of the source-project samples. Meanwhile, the model weights are shared between the source and target projects, so samples of the target project can be fed directly into the trained model, and the final prediction result is obtained from the output of the label coding layer, achieving the goal of transfer learning.
The specific structure of the migration self-encoder is as follows:
Given a labeled source-domain data set D_s = {(x_i^(s), y_i^(s))}, i = 1, ..., n_s, with x_i^(s) ∈ R^{m×1} and y_i^(s) ∈ {0, 1}, and a target-domain data set to be predicted D_t = {x_j^(t)}, j = 1, ..., n_t, m represents the number of features of an input sample, 0 indicates the non-defect class and 1 indicates the defect class, and n_s and n_t represent the numbers of samples of the source and target domains respectively. The loss function for the migration self-encoder is as follows:
L = L_rec(x, x̂) + α·Γ(ξ^(s), ξ^(t)) + β·L_label + γ·L_reg
where L_rec is the reconstruction error term, Γ(ξ^(s), ξ^(t)) is the hidden-layer feature distribution difference term, L_label is the label loss term of the source-domain samples, and L_reg is the regularization term.
The 1st hidden layer of the model is the feature coding layer, which has k (k ≤ m) nodes; its output is ξ = f(W_1 x + b_1). The weight parameter of this layer is W_1 ∈ R^{k×m} and its bias parameter is b_1 ∈ R^{k×1}. The 2nd layer of the network is the label coding layer, which has 2 nodes; its output is z ∈ R^{2×1}, with weight parameter W_2 ∈ R^{2×k} and bias parameter b_2 ∈ R^{2×1}. For a test sample x, the probability that it belongs to a certain class can be estimated through a softmax over z:
p(y = c | x) = exp(z_c) / (exp(z_0) + exp(z_1)), c ∈ {0, 1}
Thus, after model training is complete, the output of the label coding layer can be used to predict the target-domain samples. The output of the 3rd hidden layer, ξ̂ = f(W'_2 z + b'_2), is the reconstruction of the feature coding layer; its weight parameter is W'_2 ∈ R^{k×2} and its bias parameter is b'_2 ∈ R^{k×1}. The output of the last layer is the reconstruction of the input sample, x̂ = f(W'_1 ξ̂ + b'_1), with weight parameter W'_1 ∈ R^{m×k} and bias parameter b'_1 ∈ R^{m×1}. Further, f is the nonlinear sigmoid activation function.
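The forward pass of the dual-coding-layer structure described above can be sketched in NumPy as follows. This is a minimal illustration with randomly initialized weights; the class and function names are our own and are not taken from the patent.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

class MigrationAutoencoderSketch:
    """Forward pass of the dual-coding-layer chain: x -> xi -> z -> xi_hat -> x_hat."""
    def __init__(self, m, k, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.W1 = rng.normal(0, 0.1, (k, m));  self.b1 = np.zeros((k, 1))   # feature coding layer
        self.W2 = rng.normal(0, 0.1, (2, k));  self.b2 = np.zeros((2, 1))   # label coding layer
        self.W2p = rng.normal(0, 0.1, (k, 2)); self.b2p = np.zeros((k, 1))  # reconstructs xi
        self.W1p = rng.normal(0, 0.1, (m, k)); self.b1p = np.zeros((m, 1))  # reconstructs x

    def forward(self, x):
        xi = sigmoid(self.W1 @ x + self.b1)            # hidden-layer feature representation
        z = softmax(self.W2 @ xi + self.b2)            # class-probability output (2 nodes)
        xi_hat = sigmoid(self.W2p @ z + self.b2p)      # reconstructed hidden features
        x_hat = sigmoid(self.W1p @ xi_hat + self.b1p)  # reconstructed input sample
        return xi, z, xi_hat, x_hat
```

Because the same weights serve both source- and target-project samples, a trained instance can score target samples directly through the `z` output.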
The 2nd term of the loss function is the distribution difference term, defined here via the KL divergence:
Γ(ξ^(s), ξ^(t)) = D_KL(P_s || P_t) + D_KL(P_t || P_s) (6)
where P_s ∈ R^{k×1} and P_t ∈ R^{k×1} denote the hidden-layer feature distributions of the source-domain and target-domain samples. The KL divergence is an asymmetric measure of the difference between two probability distributions. Given two different probability distributions P ∈ R^{k×1} and Q ∈ R^{k×1}, the information loss when P is approximated by Q is defined as D_KL(P || Q) = Σ_i P_i log(P_i / Q_i). Here the symmetrized form D_KL(P || Q) + D_KL(Q || P) is used to measure the distribution difference between the source and target domains. By shrinking the value of this term, the distribution difference between the source and target domains in the new representation space can be minimized. The label loss term is defined as the cross-entropy between the output of the label coding layer and the true labels of the source-domain samples:
L_label = -(1/n_s) Σ_{i=1}^{n_s} [ y_i^(s) log p(y = 1 | x_i^(s)) + (1 - y_i^(s)) log p(y = 0 | x_i^(s)) ]
there are 3 coefficients to be selected for the entire loss function: α, β and γ. Together with the number n of hidden layer neuron nodes of the encoder, these belong to the hyper-parameters of the model. The value range of n is not set to be [10,50], and the value interval is 5; the value range of alpha is not set to [10,20,50,100,200], the value range of beta is set to [50,100,200,500,1000], and the value range of gamma is [0.0001,0.001,0.01,0.1 ]. In order to improve the efficiency of searching the hyper-parameters, a random searching mode is adopted, and the maximum searching frequency is 200.
The selection of the hyper-parameters is determined by cross-validation. The following describes the specific process by which migration cross-validation partitions the training set and the validation set. The feature transformation adopted by the invention is obtained from the self-encoder network, so the Nonlinear Distribution Diversity (NDD) is defined on the hidden-layer features:
NDD = || (1/n_s) Σ_{i=1}^{n_s} α_i ξ(x_i^(s)) − (1/n_t) Σ_{j=1}^{n_t} ξ(x_j^(t)) ||²
Here the weights {α_i : x_i ∈ X_s} of the source-domain samples are adjusted to minimize the NDD distance:
min_α NDD subject to 0 ≤ α_i ≤ B
where B is an upper bound (set to 1) to avoid α diverging to infinity; the optimal α is the value that minimizes the NDD. Finally, the {α_i} are sorted from large to small, and in that order 1/3 of the source-domain samples are selected as the validation set, with the remaining 2/3 used as the training set. After the data set is divided, random oversampling is performed on the training-set samples to mitigate the effect of class imbalance.
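The division into validation and training sets can be approximated as below. This is a simplified, hypothetical stand-in that ranks source samples by the Euclidean distance of their hidden features to the target-domain centroid, rather than solving the weighted NDD optimization described above.

```python
import numpy as np

def trcv_split(h_src, h_tgt, val_frac=1/3):
    """Split source samples: the val_frac of samples closest (in hidden-feature
    space) to the target-domain centroid become the validation set; the rest
    form the training set. Simplified stand-in for the weighted-NDD scheme."""
    centroid = h_tgt.mean(axis=0)
    dist = np.linalg.norm(h_src - centroid, axis=1)
    order = np.argsort(dist)                 # closest to target first
    n_val = int(round(len(h_src) * val_frac))
    return order[n_val:], order[:n_val]      # (train indices, validation indices)
```

The intent matches the text: the validation set should resemble the target-project distribution so that validation performance is a meaningful proxy for cross-project performance.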
A cross-project software defect prediction method based on supervised expression learning is shown in fig. 2. The technical scheme of the invention is explained in detail below with reference to fig. 2, and the specific implementation steps are as follows:
1. Define the target project and the source projects, and preprocess them. The invention is a cross-project defect prediction method: the project currently to be predicted is the target project, and the other projects used for training are the source projects. To unify dimensions, min-max normalization preprocessing is applied to the source project and the target project respectively, so that every dimension of their input samples lies within [0, 1]. min(x_·j) and max(x_·j) denote the minimum and maximum values of the j-th feature dimension:
x'_ij = (x_ij − min(x_·j)) / (max(x_·j) − min(x_·j))
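The column-wise min-max normalization of step 1 can be sketched directly; the guard for constant columns (to avoid division by zero) is our own addition.

```python
import numpy as np

def min_max_normalize(X):
    """Column-wise min-max scaling to [0, 1], applied separately to the
    source-project and target-project feature matrices."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard constant columns
    return (X - lo) / span
```

Normalizing source and target projects separately keeps each project's features on a comparable [0, 1] scale even when their raw metric ranges differ.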
2. Unsupervised pre-training of the network to obtain the preliminary hidden-layer feature representations of the samples. All raw data of the source and target projects are input to the migration self-encoder, which is preliminarily trained in an unsupervised pre-training mode; the loss function at this stage has no regularization term and no label loss term, only the reconstruction error term and the distribution difference term. The initial learning rate during pre-training is fixed at 0.01 and the number of iterations at 500.
3. The training set and validation set are partitioned by a migration cross validation method. And dividing the data set by a migration cross-validation method on the basis of the preliminary hidden layer feature representation of all the samples of the source item and the target item obtained in the step 2, wherein 1/3 source item samples closest to the hidden layer feature distribution of the target domain serve as a validation set, and the rest 2/3 samples serve as a training set. The details of migration cross-validation are as described above.
4. Oversample the training-set samples. Considering that defective and non-defective samples in the training set are severely imbalanced, oversampling is applied to the training-set samples. The invention mainly adopts random oversampling: minority-class samples are randomly selected and simply duplicated until the total number of minority-class samples equals or approaches that of the majority class. This oversampling alleviates the class imbalance problem.
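Random oversampling as described in step 4 can be sketched as follows (the function name is illustrative).

```python
import numpy as np

def random_oversample(X, y, rng=None):
    """Randomly duplicate minority-class samples until both classes are equal in size."""
    if rng is None:
        rng = np.random.default_rng(0)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    need = counts.max() - counts.min()
    idx = np.flatnonzero(y == minority)
    extra = rng.choice(idx, size=need, replace=True)   # sample duplicates with replacement
    keep = np.concatenate([np.arange(len(y)), extra])
    return X[keep], y[keep]
```

Because the duplicates are exact copies, this changes only the class proportions seen by the loss, not the feature space itself (unlike synthetic-minority methods such as SMOTE).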
5. Supervised fine-tuning of the self-encoder network. The migration self-encoder is further fine-tuned on the oversampled training set, and the training of the model is completed by selecting the model hyper-parameters and an early-stopping strategy with the help of the validation set. The learning rate during supervised fine-tuning is 0.001 and the maximum number of training iterations is 5000. Every fixed number of iterations during training, the classification performance of the current model (chiefly the Bal value) is checked to determine whether training should be stopped early. The Bal value is a composite index that balances the detection rate and the false-alarm rate in a classification problem. Take the binary confusion matrix of Table 1 as an example:
TABLE 1
| | Predicted defective | Predicted non-defective |
|---|---|---|
| Actually defective | TP | FN |
| Actually non-defective | FP | TN |
Based on Table 1, the detection rate is pd = TP / (TP + FN) and the false-alarm rate is pf = FP / (FP + TN); the Bal value is commonly defined as Bal = 1 − sqrt((pf² + (1 − pd)²) / 2), which is highest when pd is high and pf is low.
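The Bal value can be computed from the confusion matrix as follows. The formula used here is the one commonly adopted in defect-prediction studies (distance from the ideal point pf = 0, pd = 1); if the patent's original image defines Bal differently, the sketch should be adjusted accordingly.

```python
import math

def bal(tp, fn, fp, tn):
    """Balance metric combining detection rate pd = TP/(TP+FN) and
    false-alarm rate pf = FP/(FP+TN), as commonly used in defect prediction:
    Bal = 1 - sqrt((pf^2 + (1 - pd)^2) / 2)."""
    pd = tp / (tp + fn)
    pf = fp / (fp + tn)
    return 1.0 - math.sqrt((pf ** 2 + (1.0 - pd) ** 2) / 2.0)
```

A perfect classifier (pd = 1, pf = 0) scores Bal = 1, while an inverted one (pd = 0, pf = 1) scores 0, which is why the value suits early stopping on the validation set.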
The whole fine-tuning process covers not only the model parameters but also the hyper-parameters. Hyper-parameter selection uses random search, with a set maximum number of random search trials (for example, 200). After each hyper-parameter selection, the network is retrained, the classification performance (Bal value) of the current model on the validation set is checked at fixed intervals, and the current best model is saved. When the maximum number of trials is reached, the best model is selected as the final defect prediction model.
6. And inputting the target item data to the self-encoder to obtain a prediction result. And inputting the target item after preprocessing to the migration self-encoder network, and obtaining a final prediction result by a label coding layer of the network.
The above steps can be organized into a complete process as shown in table 2 below:
TABLE 2
The foregoing describes the cross-project software defect prediction method based on supervised representation learning of the present invention in detail, but the specific implementation of the invention is obviously not limited thereto. Various obvious modifications may be made by those skilled in the art without departing from the spirit of the invention and the scope of the appended claims.
Claims (4)
1. A cross-project software defect prediction method based on supervised expression learning is characterized by comprising the following steps: the method comprises the following steps:
step 1), defining a target item to be predicted and a source item used for training a model, and carrying out standardization or normalization preprocessing operation on original data of the source item and the target item;
step 2), inputting the feature vectors of all samples in the source project and the target project into a migration self-encoder, preliminarily training the migration self-encoder in an unsupervised pre-training mode, and obtaining preliminary hidden layer feature representations of all samples in the source project and the target project through a feature coding layer of the migration self-encoder;
step 3), on the basis of obtaining the preliminary feature representation in the step 2), selecting a part of samples which are distributed most closely to the hidden layer feature representation of the target project sample from the hidden layer feature representation of the source project sample as a verification set by means of a migration cross-validation method, and taking the rest source project samples as a training set;
step 4), oversampling processing is carried out on the training set samples;
step 5), continuing supervised fine-tuning on the training set oversampled in step 4), and completing the training of the model by selecting the model hyper-parameters and an early-stopping strategy according to the prediction performance on the validation set;
step 6), after training of the migration self-encoder is finished, inputting the preprocessed sample data of the target project into the migration self-encoder and obtaining the final prediction result from its label coding layer;
the migration self-encoder is a self-encoder with a double-encoding layer structure; the double coding layers are a characteristic coding layer and a label coding layer; the first layer of coding layer is a feature coding layer and is responsible for coding the feature vectors of all samples in the source project and the target project into hidden layer feature representation, and the label coding layer realizes the classification of the samples on the basis of the hidden layer feature representation;
the migration self-encoder adopts different forms of loss functions; the model pre-training process and the fine-tuning process respectively realize two training modes of no supervision and supervised by adjusting the loss function to enable the loss function to contain or not contain the label loss item.
2. The method of claim 1, wherein the cross-project software defect prediction method based on supervised expression learning comprises: in the unsupervised training mode, label information is not introduced in the training process, and the loss function consists of a reconstruction error term and a hidden layer characteristic distribution difference term; by minimizing the loss function, the network can learn the hidden layer signature representation of all samples.
3. The method of claim 1, wherein in the supervised training mode the label information of the source project samples is introduced during training, and the loss function then consists of four terms: a reconstruction error term, a hidden-layer feature distribution difference term, a label loss term for the source project samples, and a regularization loss term.
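The four-term supervised loss of claim 3 can be sketched as follows. Binary cross-entropy for the label loss and an L2 weight penalty for the regularization term are assumptions; the claim names the terms but not their exact form:

```python
import numpy as np

def cross_entropy(y, p, eps=1e-12):
    # label loss on the labelled source project samples
    # (binary cross-entropy assumed)
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def l2_penalty(weights):
    # regularization loss term (L2 weight decay assumed)
    return sum(float(np.sum(W ** 2)) for W in weights)

def supervised_loss(recon, dist_diff, y_src, p_src, weights,
                    lam_dist=1.0, lam_label=1.0, lam_reg=1e-4):
    # the four terms of claim 3: reconstruction + distribution difference
    # + source-sample label loss + regularization
    return (recon
            + lam_dist * dist_diff
            + lam_label * cross_entropy(y_src, p_src)
            + lam_reg * l2_penalty(weights))

y = np.array([0.0, 1.0])          # source-sample labels
p = np.array([0.1, 0.9])          # predicted defect probabilities
loss = supervised_loss(recon=0.5, dist_diff=0.2,
                       y_src=y, p_src=p, weights=[np.ones((2, 2))])
```

During fine-tuning, only the trade-off weights (`lam_*`, hypothetical names here) distinguish how strongly each term steers the network.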
4. The method of claim 1, wherein the migration cross-validation method of step 3) selects, according to the feature distribution difference, the part of the training data that is closest to the target project data distribution as a validation set, and takes the remaining training data as a training set; the feature transformation used is derived from the migration self-encoder.
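One concrete reading of claim 4's split, sketched below: "closest to the target distribution" is interpreted as the Euclidean distance from each source sample's hidden-layer features to the mean of the target samples' hidden-layer features. Both the distance measure and the validation fraction are assumptions:

```python
import numpy as np

def transfer_cross_validation_split(H_src, H_tgt, val_fraction=0.2):
    """Split source samples by closeness to the target distribution.

    H_src, H_tgt: hidden-layer features produced by the migration
    self-encoder for source and target samples, respectively.
    """
    center = H_tgt.mean(axis=0)
    dist = np.linalg.norm(H_src - center, axis=1)
    n_val = max(1, int(len(H_src) * val_fraction))
    order = np.argsort(dist)
    val_idx = order[:n_val]     # most target-like samples -> validation set
    train_idx = order[n_val:]   # remaining samples -> training set
    return train_idx, val_idx

# target hidden features cluster at the origin; source sample 0 is closest
H_src = np.array([[0.0, 0.0], [5.0, 5.0], [0.1, 0.1], [9.0, 9.0], [0.2, 0.0]])
H_tgt = np.zeros((3, 2))
train_idx, val_idx = transfer_cross_validation_split(H_src, H_tgt, 0.2)
```

Because the validation set mimics the target distribution, validation performance becomes a usable proxy for cross-project performance when tuning hyperparameters and early stopping.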
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910915935.5A CN110751186B (en) | 2019-09-26 | 2019-09-26 | Cross-project software defect prediction method based on supervised expression learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110751186A CN110751186A (en) | 2020-02-04 |
CN110751186B true CN110751186B (en) | 2022-04-08 |
Family
ID=69277087
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910915935.5A Active CN110751186B (en) | 2019-09-26 | 2019-09-26 | Cross-project software defect prediction method based on supervised expression learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110751186B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111582325B (en) * | 2020-04-20 | 2023-04-07 | 华南理工大学 | Multi-order feature combination method based on automatic feature coding |
CN111860592A (en) * | 2020-06-16 | 2020-10-30 | 江苏大学 | Solar cell defect classification detection method under condition of few samples |
CN112148605B (en) * | 2020-09-22 | 2022-05-20 | 华南理工大学 | Software defect prediction method based on spectral clustering and semi-supervised learning |
CN112199280B (en) * | 2020-09-30 | 2022-05-20 | 三维通信股份有限公司 | Method and apparatus for predicting software defects, storage medium, and electronic apparatus |
CN112346974B (en) * | 2020-11-07 | 2023-08-22 | 重庆大学 | Depth feature embedding-based cross-mobile application program instant defect prediction method |
CN112527670B (en) * | 2020-12-18 | 2022-06-03 | 武汉理工大学 | Method for predicting software aging defects in project based on Active Learning |
CN113673251B (en) * | 2021-08-09 | 2024-07-26 | 浙江浙能数字科技有限公司 | Multi-coding system mutual migration method based on unsupervised generation network |
CN113778811A (en) * | 2021-09-28 | 2021-12-10 | 重庆邮电大学 | Fault monitoring method and system based on deep convolution migration learning software system |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109416719A (en) * | 2016-04-22 | 2019-03-01 | 谭琳 | Method for determining the defects of software code He loophole |
CN108459955B (en) * | 2017-09-29 | 2020-12-22 | 重庆大学 | Software defect prediction method based on deep self-coding network |
US10521224B2 (en) * | 2018-02-28 | 2019-12-31 | Fujitsu Limited | Automatic identification of relevant software projects for cross project learning |
CN108984613A (en) * | 2018-06-12 | 2018-12-11 | 北京航空航天大学 | A kind of defect report spanned item mesh classification method based on transfer learning |
CN110162475B (en) * | 2019-05-27 | 2023-04-18 | 浙江工业大学 | Software defect prediction method based on deep migration |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110751186B (en) | Cross-project software defect prediction method based on supervised expression learning | |
CN109408389B (en) | Code defect detection method and device based on deep learning | |
CN111914644B (en) | Dual-mode cooperation based weak supervision time sequence action positioning method and system | |
CN112069310B (en) | Text classification method and system based on active learning strategy | |
CN110349597A (en) | A kind of speech detection method and device | |
CN105740984A (en) | Product concept performance evaluation method based on performance prediction | |
CN110647830A (en) | Bearing fault diagnosis method based on convolutional neural network and Gaussian mixture model | |
CN111290947B (en) | Cross-software defect prediction method based on countermeasure judgment | |
CN111680788A (en) | Equipment fault diagnosis method based on deep learning | |
CN113342597B (en) | System fault prediction method based on Gaussian mixture hidden Markov model | |
Wan et al. | Supervised representation learning approach for cross-project aging-related bug prediction | |
CN115049627B (en) | Steel surface defect detection method and system based on domain self-adaptive depth migration network | |
CN117171700A (en) | Drilling overflow prediction combined model based on deep learning and model timely silence updating and migration learning method | |
Yang et al. | Zte-predictor: Disk failure prediction system based on lstm | |
CN111723021B (en) | Defect report automatic allocation method based on knowledge base and representation learning | |
CN116089894B (en) | Unknown fault diagnosis method for water chilling unit based on semi-supervised countermeasure variation automatic coding | |
CN117056226A (en) | Cross-project software defect number prediction method based on transfer learning | |
CN117171713A (en) | Cross self-adaptive deep migration learning method and system based on bearing service life | |
CN115599698A (en) | Software defect prediction method and system based on class association rule | |
WO2023172270A1 (en) | Platform for automatic production of machine learning models and deployment pipelines | |
CN113592028A (en) | Method and system for identifying logging fluid by using multi-expert classification committee machine | |
CN114330500A (en) | Storm platform-based online parallel diagnosis method and system for power grid power equipment | |
Yao et al. | Defect Prediction Technology of Aerospace Software Based on Deep Neural Network and Process Measurement | |
CN109919464B (en) | Aging screening method applied to high-power laser | |
CN105354201B (en) | The method and system screened and eliminate false positive results |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||