CN110751186A - Cross-project software defect prediction method based on supervised expression learning - Google Patents

Cross-project software defect prediction method based on supervised expression learning Download PDF

Info

Publication number
CN110751186A
CN110751186A (Application CN201910915935.5A)
Authority
CN
China
Prior art keywords
project
training
encoder
migration
cross
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910915935.5A
Other languages
Chinese (zh)
Other versions
CN110751186B (en
Inventor
郑征
万晓晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Beijing University of Aeronautics and Astronautics
Original Assignee
Beijing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Aeronautics and Astronautics filed Critical Beijing University of Aeronautics and Astronautics
Priority to CN201910915935.5A priority Critical patent/CN110751186B/en
Publication of CN110751186A publication Critical patent/CN110751186A/en
Application granted granted Critical
Publication of CN110751186B publication Critical patent/CN110751186B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3447Performance evaluation by modeling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning


Abstract

The invention discloses a cross-project software defect prediction method based on supervised expression learning, which comprises the following steps: (1) selecting a defect data set and preprocessing the defect data; (2) training a migration self-encoder, which comprises a feature coding layer and a label coding layer, in an unsupervised pre-training mode; (3) using a migration cross-validation method to select, from the hidden-layer feature representations of the source project samples, those whose distribution is closest to that of the target project samples as the validation set, with the remaining samples forming the training set; (4) oversampling the training-set samples; (5) fine-tuning the migration self-encoder, with model hyper-parameters and an early-stopping strategy selected on the validation set; (6) inputting the preprocessed target project data into the migration self-encoder and obtaining the final prediction result from the output of the label coding layer. The method introduces the label information of the source project samples into the feature representation learning process and improves the prediction performance of cross-project software defect prediction models.

Description

Cross-project software defect prediction method based on supervised expression learning
Technical Field
The invention belongs to the technical field of software defect prediction of software engineering application, and particularly relates to a cross-project software defect prediction method based on supervised expression learning.
Background
Software defect prediction techniques predict defects that may exist in a current software project by learning and building a prediction model from historical defect data. The method can help testers to quickly find defects and greatly improve the software testing efficiency, so that the method becomes a research hotspot in the field of current software engineering.
The general approach to software defect prediction is to extract various features from software code, such as the Halstead, McCabe, CK, and MOOD metrics, code-change metrics, and other object-oriented metrics; represent each code segment as a feature vector; label each code segment according to whether it actually contains a defect; and feed the feature vectors and labels to a machine learning model for training. The resulting software defect prediction model is then used to predict possible defects in new software code.
Most past software defect prediction methods build the prediction model with traditional machine learning techniques. To obtain good performance, traditional machine learning methods require that the training and test samples follow the same or similar data distributions, that the positive and negative samples be relatively balanced, and that sufficient labeled samples be available for training. In practice, however, manual labeling is so difficult that labeled samples available for model training are very scarce; moreover, because the probability of a software defect is extremely low, most labeled samples are non-defective, and defective samples account for only a very small fraction. The scarcity of labeled data and the class imbalance have therefore become the two biggest challenges for software defect prediction technology.
For class imbalance, most current work relies on data resampling, such as random oversampling or artificially synthesizing minority-class samples. For the scarcity of training data, one current solution is to train the prediction model with defect data from different projects, known as cross-project defect prediction. Because labeled samples are scarce, the labeled data collected within a single project is not sufficient to train a machine learning model. The basic idea of cross-project defect prediction is to train the prediction model on defect data from other projects (called the source project or source domain) and then apply the trained model to the software project to be predicted (called the target project or target domain), which alleviates the scarcity of training data to some extent.
However, one difficulty in cross-project software defect prediction is that the training data and the test data often do not follow the same or similar distributions, which violates the assumption of traditional machine learning models; such models therefore cannot be used directly for cross-project defect prediction. In recent years, transfer learning methods have gradually been applied to the cross-project software defect prediction task. One of the most widely used is Transfer Component Analysis (TCA), an unsupervised representation learning method whose drawback is that it cannot exploit the label information of source-domain samples while learning the representation. In addition, such methods separate the unsupervised feature learning process from the classifier training process in a divide-and-conquer manner: they first learn hidden-layer representations of the source and target project samples and then retrain a machine learning classifier in the new feature space. The divide-and-conquer approach has an inherent problem: although each sub-problem can be solved optimally in turn, optimizing the sub-problems does not guarantee an optimal solution to the global problem. Features learned in the early stage may be unsuitable for training the classifier in the later stage, which can degrade the actual predictive ability of the final software defect prediction model.
Disclosure of Invention
One object of the present invention is to remedy the shortcomings of the above methods by providing a cross-project software defect prediction method based on supervised expression learning. The method uses a migration self-encoder with a dual-coding-layer structure, which can exploit the label information of source-domain samples while learning the hidden-layer feature representation, and therefore constitutes a supervised representation learning approach. In addition, by adjusting the loss function of the network, both unsupervised pre-training and supervised fine-tuning of the network are realized: after unsupervised pre-training yields a preliminary hidden-layer feature representation, a migration cross-validation method provides a reasonable split of the training and validation sets in the transfer learning setting, and the model hyper-parameters are selected according to the prediction performance on the validation set.
Another object of the invention is to provide a deep learning model, called the migration self-encoder, that offers an end-to-end learning mode: the learning process involves no artificial division into sub-problems, and the deep model directly learns the mapping from the raw input to the desired output. Compared with a divide-and-conquer strategy, this end-to-end learning mode benefits from synergy and is more likely to reach the global optimum. Experiments show that the supervised expression learning method can improve the effect of cross-project software defect prediction.
The technical scheme of the invention is as follows: a cross-project software defect prediction method based on supervised expression learning comprises the following steps:
step 1), defining a target item to be predicted and a source item used for training a model, and carrying out preprocessing operations such as standardization or normalization on the original data of the source item and the target item;
step 2), inputting the feature vectors of all samples in the source project and the target project into a migration self-encoder, preliminarily training the migration self-encoder in an unsupervised pre-training mode, and obtaining preliminary hidden layer feature representations of all samples in the source project and the target project through a feature coding layer of the migration self-encoder;
The migration self-encoder is a novel self-encoder with a dual-coding-layer structure. The dual coding layers are a feature coding layer and a label coding layer: the first coding layer, the feature coding layer, encodes the feature vectors of all samples in the source and target projects into hidden-layer feature representations, and the label coding layer classifies the samples on the basis of those representations. During training, a supervised learning process on the source project samples is realized by minimizing the label loss term of the source project samples. Meanwhile, the model weights are shared between the source and target projects, so target project samples can be fed directly into the trained model, and the final prediction result is obtained from the output of its label coding layer, achieving the goal of transfer learning.
Step 3), selecting a part of samples (for example, 1/3) which are distributed most closely to the hidden layer feature representation of the target item sample from the hidden layer feature representations of the source item samples by the aid of the migration cross-validation method through the initial hidden layer feature representation obtained in the step 2) as a validation set, and taking the rest source item samples as a training set;
step 4), considering that the samples of the training set are seriously unbalanced in the defect type and the non-defect type, carrying out oversampling treatment (such as a random oversampling method or a manual synthesis oversampling method) on the samples of the training set;
step 5), further fine-tuning the migration self-encoder on the over-sampling processed training set obtained in the step 4), and selecting a model hyper-parameter and early stopping a strategy to realize the training of the model by means of the prediction performance on the verification set;
and 6) after the training of the migration self-encoder is finished, inputting the preprocessed data of the target project into the migration self-encoder, and obtaining a final prediction result by a label coding layer of the network.
The migration self-encoder in step 2) and step 5) adopts loss functions of different forms. Step 2) is an unsupervised training mode: no label information is introduced during training, and the loss function consists of a reconstruction error term and a hidden-layer feature distribution difference term. By minimizing this loss, the network learns hidden-layer feature representations of all samples that reconstruct the inputs well and that bring the hidden-layer feature distributions of the source and target project samples close together. Step 5) is a supervised training mode: the label information of the source project samples is introduced during training, and the loss function consists of four terms, including a reconstruction error term, a hidden-layer feature distribution difference term, a label loss term on the source project samples, and a regularization loss term. The pre-training and fine-tuning processes thus realize the unsupervised and supervised training modes respectively by excluding or including the label loss term in the loss function.
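The pre-training and fine-tuning modes differ only in which terms enter the loss. A minimal sketch of the switch (scalar placeholders stand in for the actual network loss terms; the weighting coefficients are the model's α, β, γ hyper-parameters):

```python
def total_loss(recon, dist, label, reg, alpha, beta, gamma, supervised):
    # Unsupervised pre-training: reconstruction error + distribution difference only.
    # Supervised fine-tuning: additionally includes the source-label loss
    # and the regularization term.
    loss = recon + alpha * dist
    if supervised:
        loss += beta * label + gamma * reg
    return loss
```

The same network and the same optimizer can then be reused for both phases, with only the loss composition toggled.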
Compared with the existing software defect prediction method, the cross-project software defect prediction method based on supervised expression learning has the advantages that: the invention breaks through the assumption that the traditional machine learning method requires the training set and the test set to be distributed the same or similar, and can transfer information from related items to improve the learning of the current software item data. Moreover, different from the current cross-project defect prediction method of unsupervised representation learning, the migration self-encoder adopted by the invention can fully utilize the label information of the source domain sample in the process of learning the hidden layer feature representation, and can realize the feature learning and the model construction of a further-in-place end-to-end mode, thereby further improving the cross-project software defect prediction performance.
Drawings
FIG. 1 supervised representation learning method based on migratory autocoder
FIG. 2 cross-project software defect prediction method based on supervised expression learning
Detailed Description
The invention will be further described with reference to the accompanying drawings. First, a migration auto-encoder used in the present invention will be described in detail with reference to fig. 1.
The described migration self-encoder is a novel self-encoder with a dual-coding-layer structure. The dual coding layers are a feature coding layer and a label coding layer: the first coding layer, the feature coding layer, encodes the feature vectors of all samples in the source and target projects into hidden-layer feature representations, and the label coding layer classifies the samples on the basis of those representations. During training, a supervised learning process on the source project samples is realized by minimizing the label loss term of the source project samples. Meanwhile, the model weights are shared between the source and target projects, so target project samples can be fed directly into the trained model, and the final prediction result is obtained from the output of its label coding layer, achieving the goal of transfer learning.
The specific structure of the migration self-encoder is as follows:
Given a labeled source-domain data set $D_s=\{(x_i^{(s)},y_i^{(s)})\}_{i=1}^{n_s}$ and a target-domain data set to be predicted $D_t=\{x_j^{(t)}\}_{j=1}^{n_t}$, where $x_i^{(s)},x_j^{(t)}\in\mathbb{R}^{m}$ and $m$ represents the number of features of an input sample, and $y_i^{(s)}\in\{0,1\}$, where 0 indicates the non-defect class and 1 indicates the defect class. $n_s$ and $n_t$ represent the numbers of samples in the source and target domains, respectively. The loss function of the migration self-encoder is as follows:

$$\mathcal{L}(\theta)=J(\theta)+\alpha\,\Gamma(\xi^{(s)},\xi^{(t)})+\beta\,J_{label}(\theta)+\gamma\,\Omega(\theta)\tag{1}$$
The 1st term is the reconstruction error term of the source-domain and target-domain samples:

$$J(\theta)=\frac{1}{n_s}\sum_{i=1}^{n_s}\left\|\hat{x}_i^{(s)}-x_i^{(s)}\right\|^2+\frac{1}{n_t}\sum_{j=1}^{n_t}\left\|\hat{x}_j^{(t)}-x_j^{(t)}\right\|^2\tag{2}$$
where $\hat{x}_i^{(s)}$ and $\hat{x}_j^{(t)}$ denote the reconstructed outputs of the corresponding samples. The 1st hidden layer of the model is the feature coding layer, which has $k$ ($k\le m$) nodes; its output is

$$\xi=f(W_1 x+b_1)\in\mathbb{R}^{k\times 1}\tag{3}$$

with weight parameter $W_1\in\mathbb{R}^{k\times m}$ and bias parameter $b_1\in\mathbb{R}^{k\times 1}$. The 2nd layer of the network is the label coding layer, which has 2 nodes and output

$$z=f(W_2\xi+b_2)\in\mathbb{R}^{2\times 1}\tag{4}$$

with weight parameter $W_2\in\mathbb{R}^{2\times k}$ and bias parameter $b_2\in\mathbb{R}^{2\times 1}$. For a test sample $x$, the probability that it belongs to class $j$ can be estimated as

$$p(y=j\mid x)=\frac{\exp(z_j)}{\sum_{l=1}^{2}\exp(z_l)}\tag{5}$$

Thus, after model training is complete, the output of the label coding layer can be used to predict the target-domain samples. The output of the 3rd hidden layer, $\hat{\xi}=f(W'_2 z+b'_2)$, is the reconstructed output of the feature coding layer, with weight parameter $W'_2\in\mathbb{R}^{k\times 2}$ and bias parameter $b'_2\in\mathbb{R}^{k\times 1}$. The output of the last layer is the reconstructed output of the sample, $\hat{x}=f(W'_1\hat{\xi}+b'_1)$, with weight parameter $W'_1\in\mathbb{R}^{m\times k}$ and bias parameter $b'_1\in\mathbb{R}^{m\times 1}$. Here $f$ is the nonlinear sigmoid activation function.
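Under the layer shapes just described, the forward pass of the dual-coding-layer network can be sketched in NumPy (a hedged illustration with randomly initialized weights; the actual model is of course trained by backpropagation on the loss above):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_params(m, k, seed=0):
    # W1: feature coding layer (k x m); W2: label coding layer (2 x k);
    # W2p / W1p: the mirrored decoding layers of the migration self-encoder.
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0, 0.1, (k, m)), "b1": np.zeros((k, 1)),
        "W2": rng.normal(0, 0.1, (2, k)), "b2": np.zeros((2, 1)),
        "W2p": rng.normal(0, 0.1, (k, 2)), "b2p": np.zeros((k, 1)),
        "W1p": rng.normal(0, 0.1, (m, k)), "b1p": np.zeros((m, 1)),
    }

def forward(p, x):
    # x: (m, 1) column vector, matching the patent's notation.
    xi = sigmoid(p["W1"] @ x + p["b1"])            # hidden feature representation
    z = sigmoid(p["W2"] @ xi + p["b2"])            # label coding layer output
    prob = np.exp(z) / np.exp(z).sum()             # class probabilities (softmax)
    xi_hat = sigmoid(p["W2p"] @ z + p["b2p"])      # reconstruction of xi
    x_hat = sigmoid(p["W1p"] @ xi_hat + p["b1p"])  # reconstruction of x
    return xi, prob, x_hat
```

Because the same parameters process source and target samples, the weight sharing across domains described above is automatic.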
The 2 nd term of the loss function is the distribution variance term, defined here as KL divergence:
Γ(ξ(s)(t))=DKL(Ps||Pt)+DKL(Pt||Ps) (6)
wherein the content of the first and second substances,
Figure BDA0002216113650000065
Figure BDA0002216113650000066
wherein KL divergence is a measure of the difference between two probability distributionsThe divergence measure is called. Suppose that two different probability distributions P ∈ R are givenk×1And Q ∈ Rk×1When P is estimated approximately by Q, the loss of information from P to Q is defined as
Figure BDA0002216113650000067
Here, D is usedKL(P||Q)+DKL(Q | P) to measure the distribution difference between the source domain and the target domain. By narrowing the value of the term, the distribution difference of the source domain and the target domain in the new characterization space can be minimized. The label loss term is defined as follows:
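A sketch of the symmetric KL distribution difference term computed from batches of hidden representations (normalizing the mean activations into probability vectors, and the small smoothing constant `eps`, are illustration assumptions):

```python
import numpy as np

def distribution_difference(xi_s, xi_t, eps=1e-8):
    # xi_s: (n_s, k) and xi_t: (n_t, k) hidden-layer representations.
    # Mean activations are normalized into probability vectors, then the
    # symmetric KL divergence D(Ps||Pt) + D(Pt||Ps) is returned.
    ps = xi_s.mean(axis=0)
    ps = ps / ps.sum()
    pt = xi_t.mean(axis=0)
    pt = pt / pt.sum()
    kl = lambda p, q: float(np.sum(p * np.log((p + eps) / (q + eps))))
    return kl(ps, pt) + kl(pt, ps)
```

The term is zero when the two mean hidden distributions coincide and grows as they diverge, which is what the pre-training objective drives down.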
$$J_{label}=-\frac{1}{n_s}\sum_{i=1}^{n_s}\sum_{j\in\{0,1\}}\mathbb{1}\{y_i^{(s)}=j\}\log p(y_i^{(s)}=j\mid x_i^{(s)})\tag{10}$$

where the class scores are produced through $W_{2,j}$, the $j$-th row of $W_2$, in the label coding layer of eq. (4) and (5). The final term is the regularization loss term of the model:

$$\Omega(\theta)=\|W_1\|_F^2+\|W_2\|_F^2+\|W'_1\|_F^2+\|W'_2\|_F^2\tag{11}$$
the whole loss function has 3 coefficients α and gamma which need to be selected, and the number n of hidden layer neuron nodes of the encoder, which belong to the hyper-parameters of the model, the value range of n is set to be [10,50], the value interval is 5, the value range of α is set to be [10,20,50,100,200], the value range of β is set to be [50,100,200,500,1000], the value range of gamma is set to be [0.0001,0.001,0.01,0.1], in order to improve the hyper-parameter search efficiency, a random search mode is adopted, and the maximum search frequency is 200.
The hyper-parameters are selected by cross-validation. The specific procedure of the migration cross-validation split into training and validation sets is as follows. The feature transformation used by the invention is produced by the self-encoder network, so the Nonlinear Distribution Diversity (NDD) is defined as:

$$\mathrm{NDD}(X_s,X_t)=\left\|\frac{1}{n_s}\sum_{i=1}^{n_s}\alpha_i\,\xi(x_i^{(s)})-\frac{1}{n_t}\sum_{j=1}^{n_t}\xi(x_j^{(t)})\right\|^2\tag{12}$$

The weights $\alpha_i$ of the source-domain samples $x_i\in X_s$ are adjusted to minimize the NDD distance:

$$\min_{\alpha}\ \mathrm{NDD}(X_s,X_t)\quad\text{s.t.}\quad 0\le\alpha_i\le B\tag{13}$$

where $B$ is an upper-bound value (set to 1) that prevents $\alpha$ from diverging to infinity; the optimal $\alpha$ is the minimizer of (13). Finally, the $\{\alpha_i\}$ are sorted in descending order, and the 1/3 of the source-domain samples with the largest $\alpha_i$ are selected as the validation set, with the remaining 2/3 forming the training set. After the data set is divided, a random oversampling operation is performed on the training-set samples to mitigate the effects of class imbalance.
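The migration cross-validation split can be sketched as follows (a simplified illustration: the weights are fitted by projected gradient descent on the NDD objective rather than by an exact solver, and the highest-weighted third of the source samples becomes the validation set):

```python
import numpy as np

def transfer_cv_split(xi_s, xi_t, val_frac=1/3, B=1.0, steps=200, lr=0.1):
    # Learn per-sample weights alpha on the source hidden representations so
    # that their weighted mean matches the target mean (the NDD criterion),
    # then use the highest-weighted fraction as the validation set.
    n_s = xi_s.shape[0]
    mu_t = xi_t.mean(axis=0)
    alpha = np.ones(n_s)
    for _ in range(steps):
        diff = (alpha @ xi_s) / n_s - mu_t          # NDD residual
        grad = (xi_s @ diff) * (2.0 / n_s)          # d NDD / d alpha
        alpha = np.clip(alpha - lr * grad, 0.0, B)  # projected gradient step
    order = np.argsort(-alpha)                      # largest weights first
    n_val = int(round(val_frac * n_s))
    return order[:n_val], order[n_val:]             # validation idx, training idx
```

Samples whose hidden representation pulls the weighted source mean toward the target mean keep large weights and end up in the validation set, mimicking the target distribution.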
A cross-project software defect prediction method based on supervised expression learning is shown in fig. 2. The technical scheme of the invention is explained in detail below with reference to fig. 2, and the specific implementation steps are as follows:
1. Define the target project and the source project, and preprocess them. The invention is a cross-project defect prediction method: the project currently to be predicted is the target project, and the other projects used for training are source projects. To unify dimensions, min-max normalization is applied to the source project and the target project respectively, so that every dimension of their input samples lies in [0, 1]. Let $\min(x_{\cdot j})$ and $\max(x_{\cdot j})$ denote the minimum and maximum values of the $j$-th dimension; then

$$\tilde{x}_{ij}=\frac{x_{ij}-\min(x_{\cdot j})}{\max(x_{\cdot j})-\min(x_{\cdot j})}\tag{14}$$
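A sketch of this min-max preprocessing, scaling each project independently (the guard against constant columns is an implementation assumption, since eq. (14) would otherwise divide by zero):

```python
import numpy as np

def min_max_normalize(X_source, X_target):
    # Column-wise min-max scaling; each project is scaled independently so
    # that every feature dimension falls in [0, 1], as in step 1.
    def scale(X):
        lo, hi = X.min(axis=0), X.max(axis=0)
        span = np.where(hi > lo, hi - lo, 1.0)  # guard constant columns
        return (X - lo) / span
    return scale(X_source), scale(X_target)
```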
2. Unsupervised pre-training of the network to obtain preliminary hidden-layer feature representations of the samples. All raw data of the source and target projects are input into the migration self-encoder, which is preliminarily trained in an unsupervised pre-training mode; the loss function at this point has no regularization term and no label loss term, only the reconstruction error term and the distribution difference term. The initial learning rate during pre-training is fixed at 0.01, and the number of iterations is fixed at 500.
3. Partition the training set and validation set by the migration cross-validation method. On the basis of the preliminary hidden-layer feature representations of all source and target project samples obtained in step 2, the data set is divided by the migration cross-validation method: the 1/3 of source project samples closest to the hidden-layer feature distribution of the target domain serve as the validation set, and the remaining 2/3 serve as the training set. The details of migration cross-validation are as described above.
4. Oversample the training-set samples. Considering that the defect-class and non-defect-class samples in the training set are severely imbalanced, oversampling is applied to the training-set samples. The invention mainly adopts random oversampling: minority-class samples are randomly selected and simply duplicated until the total number of minority-class samples is equal or close to that of the majority class. This oversampling alleviates the class imbalance problem.
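Random oversampling as described can be sketched as follows (indices of minority-class samples are duplicated until the classes balance; SMOTE-style synthesis would also fit the "artificial synthesis" variant mentioned earlier):

```python
import random

def random_oversample(X, y, seed=42):
    # Randomly duplicate minority-class samples until the two classes
    # contain the same number of samples.
    rng = random.Random(seed)
    idx0 = [i for i, lab in enumerate(y) if lab == 0]
    idx1 = [i for i, lab in enumerate(y) if lab == 1]
    minority = idx1 if len(idx1) < len(idx0) else idx0
    need = abs(len(idx0) - len(idx1))
    extra = [rng.choice(minority) for _ in range(need)]
    keep = list(range(len(y))) + extra
    return [X[i] for i in keep], [y[i] for i in keep]
```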
5. Supervised fine-tuning of the self-encoder network. The migration self-encoder is further fine-tuned on the oversampled training set, and model training is completed by selecting the model hyper-parameters and an early-stopping strategy based on the prediction performance on the validation set. The learning rate during supervised fine-tuning is 0.001, and the maximum number of training iterations is 5000. Every fixed number of iterations, the classification performance (mainly the Bal value) of the current model is checked to determine whether training should stop early. The Bal value is a composite index that balances the probability of detection and the probability of false alarm in a classification problem. Take the binary confusion matrix of Table 1 as an example:

TABLE 1

                         Predicted defective    Predicted non-defective
  Actually defective     TP                     FN
  Actually non-defective FP                     TN

$$PD=\frac{TP}{TP+FN},\qquad PF=\frac{FP}{FP+TN}\tag{15, 16}$$

$$Bal=1-\frac{\sqrt{(0-PF)^2+(1-PD)^2}}{\sqrt{2}}\tag{17}$$
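The Bal value, combining the probability of detection (PD) and the probability of false alarm (PF), can be computed directly from a confusion matrix using the standard definitions from the defect-prediction literature:

```python
import math

def bal_score(tp, fn, fp, tn):
    # PD: probability of detection; PF: probability of false alarm.
    # Bal is the normalized Euclidean distance to the ideal point (PF=0, PD=1),
    # subtracted from 1 so that higher is better.
    pd = tp / (tp + fn)
    pf = fp / (fp + tn)
    return 1.0 - math.sqrt((0.0 - pf) ** 2 + (1.0 - pd) ** 2) / math.sqrt(2.0)
```

A perfect classifier (PD = 1, PF = 0) reaches Bal = 1, while the worst case (PD = 0, PF = 1) gives Bal = 0.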
The whole parameter fine-tuning process includes fine-tuning of the model parameters as well as of the hyper-parameters. Hyper-parameters are selected by random search: a maximum number of random search trials is set (for example, 200). After each selection of hyper-parameters, the network is retrained; at fixed intervals, the classification performance (Bal value) of the current model is verified on the validation set, and the best model so far is saved. When the maximum number of search trials is reached, the best model is selected as the final defect prediction model.
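The random hyper-parameter search can be sketched as follows (the `evaluate` callback, standing in for retraining the network and measuring its Bal value on the validation set, is a hypothetical placeholder; the search grid follows the value ranges given in the description):

```python
import random

def random_search(evaluate, n_trials=200, seed=7):
    # Random search over the hyper-parameter grid: the number of feature
    # coding layer nodes k in [10, 50] step 5, and the loss coefficients
    # alpha, beta, gamma.
    rng = random.Random(seed)
    space = {
        "k": list(range(10, 55, 5)),
        "alpha": [10, 20, 50, 100, 200],
        "beta": [50, 100, 200, 500, 1000],
        "gamma": [0.0001, 0.001, 0.01, 0.1],
    }
    best, best_bal = None, -1.0
    for _ in range(n_trials):
        params = {name: rng.choice(vals) for name, vals in space.items()}
        bal = evaluate(params)  # Bal of the retrained model on the validation set
        if bal > best_bal:
            best, best_bal = params, bal
    return best, best_bal
```

Random search samples the grid uniformly, which is usually far cheaper than an exhaustive sweep over all 9 × 5 × 5 × 4 combinations.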
6. Input the target project data into the self-encoder to obtain the prediction result. The preprocessed target project is input into the migration self-encoder network, and the final prediction result is obtained from the network's label coding layer.
The above steps can be organized into a complete process as shown in table 2 below:
TABLE 2
The foregoing describes the cross-project software defect prediction method based on supervised expression learning according to the present invention in detail, but it is obvious that the specific implementation form of the present invention is not limited thereto. It will be apparent to those skilled in the art that various obvious changes may be made therein without departing from the spirit of the invention and the scope of the appended claims.

Claims (6)

1. A cross-project software defect prediction method based on supervised expression learning, characterized in that the method comprises the following steps:
step 1), defining a target item to be predicted and a source item used for training a model, and carrying out standardization or normalization preprocessing operation on original data of the source item and the target item;
step 2), inputting the feature vectors of all samples in the source project and the target project into a migration self-encoder, preliminarily training the migration self-encoder in an unsupervised pre-training mode, and obtaining preliminary hidden layer feature representations of all samples in the source project and the target project through a feature coding layer of the migration self-encoder;
step 3), on the basis of obtaining the preliminary feature representation in the step 2), selecting a part of samples which are distributed most closely to the hidden layer feature representation of the target project sample from the hidden layer feature representation of the source project sample as a verification set by means of a migration cross-validation method, and taking the rest source project samples as a training set;
step 4), oversampling processing is carried out on the training set samples;
step 5), continuing fine tuning the migration self-encoder on the training set subjected to the oversampling processing in the step 4), and selecting a model hyper-parameter and stopping a strategy in advance to finish the training of the model by virtue of the prediction performance on the verification set;
and 6) after the training of the migration self-encoder is finished, inputting the sample data of the target item after the preprocessing to the migration self-encoder, and obtaining a final prediction result by a label encoding layer of the migration self-encoder.
2. The method of claim 1, wherein the cross-project software defect prediction method based on supervised expression learning comprises: the migration self-encoder is a self-encoder with a double-encoding layer structure; the double coding layers are a characteristic coding layer and a label coding layer; the first layer of coding layer is a feature coding layer and is responsible for coding feature vectors of all samples in a source project and a target project into hidden layer feature representation, and the label coding layer realizes classification of the samples on the basis of the hidden layer feature representation.
3. The method of claim 1, wherein the cross-project software defect prediction method based on supervised expression learning comprises: the migration self-encoder adopts different forms of loss functions; the model pre-training process and the fine-tuning process respectively realize two training modes of no supervision and supervised by adjusting the loss function to enable the loss function to contain or not contain the label loss item.
4. The method of claim 3, wherein the cross-project software defect prediction method based on supervised expression learning comprises: in the unsupervised training mode, label information is not introduced in the training process, and the loss function consists of a reconstruction error term and a hidden layer characteristic distribution difference term; by minimizing the loss function, the network can learn the hidden layer signature representation of all samples.
5. The method of claim 3, wherein the cross-project software defect prediction method based on supervised expression learning comprises: the supervised training mode is that the training process introduces the label information of the source item sample, and the loss function at the moment consists of 4 items of contents, including a reconstruction error item, a hidden layer feature distribution difference item, a label loss item of the source item sample and a regular loss item.
6. The cross-project software defect prediction method based on supervised expression learning according to claim 1, characterized in that: the migration cross-validation method in step 3) selects, according to the feature distribution difference, the part of the training data whose distribution is close to that of the target-project data as the validation set, and uses the remaining training data as the training set; the feature transformation used is obtained from the trained migration self-encoder.
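A minimal sketch of the migration cross-validation split in claim 6, assuming the distribution difference is measured per sample as Euclidean distance to the target-project centroid in the encoded feature space. The distance measure, the validation fraction, and all names here are hypothetical illustrations.

```python
import numpy as np

def migration_cv_split(H_train, H_target, val_fraction=0.2):
    """Rank encoded training samples by closeness to the target-project centroid;
    the closest fraction becomes the validation set, the rest the training set."""
    centroid = H_target.mean(axis=0)
    dist = np.linalg.norm(H_train - centroid, axis=1)
    order = np.argsort(dist)
    n_val = max(1, int(len(H_train) * val_fraction))
    return order[n_val:], order[:n_val]  # (training indices, validation indices)
```

Here `H_train` and `H_target` would be the hidden-layer representations produced by the already-trained migration self-encoder, so that the validation set mimics the target project's feature distribution.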
CN201910915935.5A 2019-09-26 2019-09-26 Cross-project software defect prediction method based on supervised expression learning Active CN110751186B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910915935.5A CN110751186B (en) 2019-09-26 2019-09-26 Cross-project software defect prediction method based on supervised expression learning

Publications (2)

Publication Number Publication Date
CN110751186A true CN110751186A (en) 2020-02-04
CN110751186B CN110751186B (en) 2022-04-08

Family

ID=69277087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910915935.5A Active CN110751186B (en) 2019-09-26 2019-09-26 Cross-project software defect prediction method based on supervised expression learning

Country Status (1)

Country Link
CN (1) CN110751186B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582325A (en) * 2020-04-20 2020-08-25 华南理工大学 Multi-order feature combination method based on automatic feature coding
CN111860592A (en) * 2020-06-16 2020-10-30 江苏大学 Solar cell defect classification detection method under condition of few samples
CN112148605A (en) * 2020-09-22 2020-12-29 华南理工大学 Software defect prediction method based on spectral clustering and semi-supervised learning
CN112199280A (en) * 2020-09-30 2021-01-08 三维通信股份有限公司 Defect prediction method and apparatus, storage medium, and electronic apparatus
CN112346974A (en) * 2020-11-07 2021-02-09 重庆大学 Cross-mobile application program instant defect prediction method based on depth feature embedding
CN112527670A (en) * 2020-12-18 2021-03-19 武汉理工大学 Method for predicting software aging defects in project based on Active Learning
CN113778811A (en) * 2021-09-28 2021-12-10 重庆邮电大学 Fault monitoring method and system based on deep convolution migration learning software system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108459955A (en) * 2017-09-29 2018-08-28 重庆大学 Software Defects Predict Methods based on depth autoencoder network
CN108984613A (en) * 2018-06-12 2018-12-11 北京航空航天大学 A kind of defect report spanned item mesh classification method based on transfer learning
US20190138731A1 (en) * 2016-04-22 2019-05-09 Lin Tan Method for determining defects and vulnerabilities in software code
CN110162475A (en) * 2019-05-27 2019-08-23 浙江工业大学 A kind of Software Defects Predict Methods based on depth migration
US20190265970A1 (en) * 2018-02-28 2019-08-29 Fujitsu Limited Automatic identification of relevant software projects for cross project learning

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
何吉元 et al.: "A Semi-supervised Ensemble Cross-project Software Defect Prediction Method", Journal of Software (《软件学报》) *
倪超 et al.: "Cross-project Defect Prediction Method Based on Feature Transfer and Instance Transfer" *
刘树毅: "Cross-project Software Defect Prediction Based on Feature Transfer", China Doctoral and Masters' Theses Full-text Database *
宫丽娜 et al.: "Research Progress on Software Defect Prediction Techniques", Journal of Software (《软件学报》) *
张天伦 et al.: "A Software Defect Report Classification Method Based on Cost-sensitive Extreme Learning Machine", Journal of Software (《软件学报》) *
彭思琪: "Research on Cross-project Software Defect Prediction Methods Based on Training Data Selection", China Masters' Theses Full-text Database *
毛发贵 et al.: "Cross-project Software Defect Prediction Based on Instance Transfer", Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》) *
陈翔 et al.: "A Survey of Cross-project Software Defect Prediction Methods", Chinese Journal of Computers (《计算机学报》) *
陈雅: "Research on Software Defect Prediction Methods Based on Feature Selection and Instance Transfer", China Doctoral and Masters' Theses Full-text Database *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582325A (en) * 2020-04-20 2020-08-25 华南理工大学 Multi-order feature combination method based on automatic feature coding
CN111582325B (en) * 2020-04-20 2023-04-07 华南理工大学 Multi-order feature combination method based on automatic feature coding
CN111860592A (en) * 2020-06-16 2020-10-30 江苏大学 Solar cell defect classification detection method under condition of few samples
CN112148605A (en) * 2020-09-22 2020-12-29 华南理工大学 Software defect prediction method based on spectral clustering and semi-supervised learning
CN112199280A (en) * 2020-09-30 2021-01-08 三维通信股份有限公司 Defect prediction method and apparatus, storage medium, and electronic apparatus
WO2022068200A1 (en) * 2020-09-30 2022-04-07 三维通信股份有限公司 Defect prediction method and apparatus, storage medium, and electronic device
CN112346974A (en) * 2020-11-07 2021-02-09 重庆大学 Cross-mobile application program instant defect prediction method based on depth feature embedding
CN112346974B (en) * 2020-11-07 2023-08-22 重庆大学 Depth feature embedding-based cross-mobile application program instant defect prediction method
CN112527670A (en) * 2020-12-18 2021-03-19 武汉理工大学 Method for predicting software aging defects in project based on Active Learning
CN113778811A (en) * 2021-09-28 2021-12-10 重庆邮电大学 Fault monitoring method and system based on deep convolution migration learning software system

Also Published As

Publication number Publication date
CN110751186B (en) 2022-04-08

Similar Documents

Publication Publication Date Title
CN110751186B (en) Cross-project software defect prediction method based on supervised expression learning
CN109408389B (en) Code defect detection method and device based on deep learning
CN111914644B (en) Dual-mode cooperation based weak supervision time sequence action positioning method and system
CN110349597A (en) A kind of speech detection method and device
CN105740984A (en) Product concept performance evaluation method based on performance prediction
CN106203534A (en) A kind of cost-sensitive Software Defects Predict Methods based on Boosting
CN110647830A (en) Bearing fault diagnosis method based on convolutional neural network and Gaussian mixture model
Wan et al. Supervised representation learning approach for cross-project aging-related bug prediction
CN113342597B (en) System fault prediction method based on Gaussian mixture hidden Markov model
CN111290947A (en) Cross-software defect prediction method based on countermeasure judgment
CN114064459A (en) Software defect prediction method based on generation countermeasure network and ensemble learning
CN116089894B (en) Unknown fault diagnosis method for water chilling unit based on semi-supervised countermeasure variation automatic coding
CN117171713A (en) Cross self-adaptive deep migration learning method and system based on bearing service life
CN117056226A (en) Cross-project software defect number prediction method based on transfer learning
CN115049627B (en) Steel surface defect detection method and system based on domain self-adaptive depth migration network
CN115292820A (en) Method for predicting residual service life of urban rail train bearing
CN113592028A (en) Method and system for identifying logging fluid by using multi-expert classification committee machine
CN114943328A (en) SARIMA-GRU time sequence prediction model based on BP neural network nonlinear combination
CN109919464B (en) Aging screening method applied to high-power laser
CN113239021B (en) Data migration method for predicting residual life of similar products
CN116756634A (en) Intelligent building fault diagnosis method and device based on field self-adaption
Jing Neural Network-based Pattern Recognition in the Framework of Edge Computing
CN115599698A (en) Software defect prediction method and system based on class association rule
JP2024516440A (en) Systems and methods for data classification
CN117171700A (en) Drilling overflow prediction combined model based on deep learning and model timely silence updating and migration learning method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant