WO2022068200A1 - Defect prediction method and apparatus, storage medium and electronic apparatus - Google Patents

Defect prediction method and apparatus, storage medium and electronic apparatus

Info

Publication number
WO2022068200A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
network
intrinsic
target
coding vector
Prior art date
Application number
PCT/CN2021/091757
Other languages
English (en)
French (fr)
Inventor
韩璐
严军荣
Original Assignee
三维通信股份有限公司
Priority date
Filing date
Publication date
Application filed by 三维通信股份有限公司
Publication of WO2022068200A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/36 Preventing errors by testing or debugging software
    • G06F 11/362 Software debugging
    • G06F 11/3628 Software debugging of optimised code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/211 Selection of the most significant subset of features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Definitions

  • the present invention relates to the field of communications, and in particular, to a defect prediction method and device, a storage medium and an electronic device.
  • Embodiments of the present invention provide a defect prediction method and device, a storage medium, and an electronic device to at least solve the technical problem of heterogeneity in the data structures of the source domain and the target domain during defect prediction in the related art.
  • a defect prediction method, comprising: encoding a source domain data set and a target domain data set respectively through a first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set;
  • inputting the first intrinsic coding vector and the second intrinsic coding vector respectively into a second target network to obtain a first potential coding vector corresponding to the first intrinsic coding vector and a second potential coding vector corresponding to the second intrinsic coding vector;
  • determining a first feature vector from the first intrinsic coding vector and the first potential coding vector, and determining a second feature vector from the second intrinsic coding vector and the second potential coding vector;
  • using a target classifier to classify the second feature vector to obtain a classification result, wherein the classification result is set to indicate whether the second feature vector has defects.
  • a defect prediction apparatus, including: a first processing unit configured to encode the source domain data set and the target domain data set respectively through the first target network to obtain the first intrinsic coding vector corresponding to the source domain data set and the second intrinsic coding vector corresponding to the target domain data set;
  • a second processing unit configured to input the first intrinsic coding vector and the second intrinsic coding vector respectively into the second target network to obtain a first potential coding vector corresponding to the first intrinsic coding vector and a second potential coding vector corresponding to the second intrinsic coding vector;
  • a third processing unit configured to determine the first feature vector from the first intrinsic coding vector and the first potential coding vector, and to determine the second feature vector from the second intrinsic coding vector and the second potential coding vector;
  • a fourth processing unit configured to use the target classifier to classify the second feature vector to obtain a classification result, wherein the classification result is set to indicate whether the second feature vector has defects.
  • a computer-readable storage medium is also provided, where a computer program is stored in the computer-readable storage medium, wherein the computer program is configured to execute the above-mentioned defect prediction method when running.
  • an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the above-mentioned defect prediction method through the computer program.
  • the source domain data set and the target domain data set are encoded respectively through the first target network to obtain the first intrinsic coding vector corresponding to the source domain data set and the second intrinsic coding vector corresponding to the target domain data set; the first intrinsic coding vector and the second intrinsic coding vector are input respectively into the second target network to obtain the first potential coding vector corresponding to the first intrinsic coding vector and the second potential coding vector corresponding to the second intrinsic coding vector; the first feature vector is determined from the first intrinsic coding vector and the first potential coding vector, and the second feature vector from the second intrinsic coding vector and the second potential coding vector; and the second feature vector is classified to obtain a classification result, wherein the classification result is set to indicate whether the second feature vector has defects.
  • FIG. 1 is a schematic diagram of an application environment of a defect prediction method according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of an optional defect prediction method according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of another optional defect prediction method according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of an optional defect prediction apparatus according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of an optional electronic device according to an embodiment of the present invention.
  • FIG. 1 is a block diagram of the hardware structure of a mobile terminal running the defect prediction method according to an embodiment of the present invention.
  • the mobile terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA) and a memory 104 configured to store data; the mobile terminal may further include a transmission device 106 and an input/output device 108 configured for communication functions.
  • FIG. 1 is only a schematic diagram, which does not limit the structure of the above-mentioned mobile terminal.
  • the mobile terminal may also include more or fewer components than those shown in FIG. 1, or have a different configuration than that shown in FIG. 1 .
  • the memory 104 may be configured to store computer programs, for example, software programs and modules of application software, such as computer programs corresponding to the defect prediction method in the embodiment of the present invention.
  • the processor 102 runs the computer programs stored in the memory 104, thereby executing various functional applications and data processing, that is, implementing the above method.
  • Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • the memory 104 may further include memory located remotely from the processor 102, and these remote memories may be connected to the mobile terminal through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
  • The transmission device 106 is configured to receive or transmit data via a network.
  • the specific example of the above-mentioned network may include a wireless network provided by a communication provider of the mobile terminal.
  • the transmission device 106 includes a network adapter (Network Interface Controller, NIC for short), which can be connected to other network devices through a base station so as to communicate with the Internet.
  • the transmission device 106 may be a radio frequency (Radio Frequency, RF for short) module, which is configured to communicate with the Internet in a wireless manner.
  • the above method may be set in the scenario of software defect prediction, which is not limited in this embodiment.
  • the flow of the above-mentioned defect prediction method may include the steps:
  • Step S202: Encode the source domain data set and the target domain data set respectively through the first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set.
  • Step S204: Input the first intrinsic coding vector and the second intrinsic coding vector respectively into the second target network to obtain a first potential coding vector corresponding to the first intrinsic coding vector and a second potential coding vector corresponding to the second intrinsic coding vector.
  • Step S206: Determine a first feature vector from the first intrinsic coding vector and the first potential coding vector, and determine a second feature vector from the second intrinsic coding vector and the second potential coding vector.
  • Step S208: Use the target classifier to classify the second feature vector to obtain a classification result, wherein the classification result is set to indicate whether the second feature vector has defects.
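The data flow of steps S202 to S208 can be sketched as follows. Everything here is illustrative: the single-layer linear stand-in encoders, the dimensions, and the concatenation used to form the feature vectors are assumptions, since the patent's actual networks are trained deep autoencoders and its combining formula is not reproduced in this text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the trained networks: one linear layer plays the role of
# each encoder. Dimensions are illustrative (26 metrics as in RELINK).
n_metrics, d_intrinsic, d_latent = 26, 8, 4
W_src = rng.normal(size=(n_metrics, d_intrinsic))   # first target network, source branch
W_tgt = rng.normal(size=(n_metrics, d_intrinsic))   # first target network, target branch
W_glob = rng.normal(size=(d_intrinsic, d_latent))   # second target network (global encoder)

X_source = rng.normal(size=(100, n_metrics))  # source domain data set
X_target = rng.normal(size=(50, n_metrics))   # target domain data set

# Step S202: intrinsic coding vectors from the first target network
C_src = np.tanh(X_source @ W_src)
C_tgt = np.tanh(X_target @ W_tgt)

# Step S204: potential (latent) coding vectors from the second target network
G_src = np.tanh(C_src @ W_glob)
G_tgt = np.tanh(C_tgt @ W_glob)

# Step S206: feature vectors combine intrinsic and potential codings
# (concatenation is an assumption; the patent's combining formula is not shown)
F_src = np.concatenate([C_src, G_src], axis=1)
F_tgt = np.concatenate([C_tgt, G_tgt], axis=1)

# Step S208 would classify F_tgt with the target classifier.
print(F_src.shape, F_tgt.shape)   # (100, 12) (50, 12)
```

In the actual method the branch encoders and the global encoder are trained networks; the sketch only shows how the four step outputs relate to one another.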
  • the source domain data set and the target domain data set are encoded respectively through the first target network to obtain the first intrinsic coding vector corresponding to the source domain data set and the second intrinsic coding vector corresponding to the target domain data set; the first intrinsic coding vector and the second intrinsic coding vector are input respectively into the second target network to obtain the first potential coding vector corresponding to the first intrinsic coding vector and the second potential coding vector corresponding to the second intrinsic coding vector; the first feature vector is determined from the first intrinsic coding vector and the first potential coding vector, and the second feature vector from the second intrinsic coding vector and the second potential coding vector; the target classifier is used to classify the second feature vector to obtain a classification result, wherein the classification result is set to indicate whether the second feature vector has defects.
  • the source domain data set and the target domain data set are respectively encoded by the first target network, so as to obtain the first intrinsic coding vector corresponding to the source domain data set and the second intrinsic coding vector corresponding to the target domain data set.
  • the above method further includes: constructing a source domain network, wherein the first target network includes the source domain network; inputting the source domain data set into the source domain network; and determining the first network parameter corresponding to the source domain network;
  • the first network parameter is set to represent the reconstruction error between the data input to the source domain network and the data output from the source domain network: Q1 = Σ_{i=1..N_s} ||x_i^(source) − x̂_i^(source)||², where Q1 is the first network parameter, x_i^(source) is the i-th input of the source domain network, x̂_i^(source) is its reconstructed output, N_s is the number of source domain samples, and M+1 is the number of layers of the source domain network.
  • obtaining the first intrinsic coding vector corresponding to the source domain data set and the second intrinsic coding vector corresponding to the target domain data set includes: when Q1 is less than the first threshold, determining the first intrinsic coding vector C^(M/2, source) as the output of the M/2-th layer of the source domain network; and when Q2 is less than the second threshold, determining the second intrinsic coding vector C^(M/2, target) as the output of the M/2-th layer of the target domain network.
  • inputting the first intrinsic coding vector and the second intrinsic coding vector respectively into the second target network to obtain the first potential coding vector corresponding to the first intrinsic coding vector and the second potential coding vector corresponding to the second intrinsic coding vector includes: constructing a global encoding network, wherein the second target network includes the global encoding network; and inputting the first intrinsic coding vector and the second intrinsic coding vector into the global encoding network to obtain the first potential coding vector and the second potential coding vector; after obtaining the first potential coding vector and the second potential coding vector, the method further includes: determining the third network parameter corresponding to the global encoding network.
  • the first feature vector is determined from the first intrinsic coding vector and the first potential coding vector, and the second feature vector is determined from the second intrinsic coding vector and the second potential coding vector;
  • the above method further includes: determining the target network parameters of the second target network;
  • determining the feature vectors includes: determining the first feature vector from the first intrinsic coding vector C^(M/2, source) and the first potential coding vector, and determining the second feature vector from the second intrinsic coding vector C^(M/2, target) and the second potential coding vector.
  • the above classifier is a random forest classifier.
  • the method may include the following steps:
  • this application implements unsupervised cross-project defect prediction based on a dual-coding network. Assume X^(source) denotes the source project domain (source domain), x_i^(source) denotes the i-th sample in the source domain, and N_s denotes the number of samples in the source domain. Assume X^(target) denotes the target project domain (target domain), x_i^(target) denotes the i-th sample in the target domain, and N_t denotes the number of samples in X^(target). L denotes the potential common complete representation to be learned for the source and target domains.
  • Step 1: Build a fully connected deep neural network for the source domain and the target domain, respectively, to automatically encode the data in each domain.
  • For the source domain network: M+1 is the number of layers of the source domain network; let x_i^(source) denote the i-th input of the source domain network; for any input sample x^(source) of the network, let x̂^(source) denote its reconstructed output after learning by the network.
  • The optimal encoding representation of the data in the source domain can then be obtained by minimizing the reconstruction error between the input and output of the source domain network: min Σ_{i=1..N_s} ||x_i^(source) − x̂_i^(source)||² (2)
  • Likewise, for the target domain network: M+1 is the number of layers of the target domain network; let x_i^(target) denote the i-th input of the target domain network; for any input sample x^(target), let x̂^(target) denote its reconstructed output after learning by the network.
  • The optimal encoding representation of the data in the target domain can be obtained by minimizing the reconstruction error between the input and output of the target domain network: min Σ_{i=1..N_t} ||x_i^(target) − x̂_i^(target)||² (4)
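Step 1's criterion, minimizing a domain autoencoder's reconstruction error, can be sketched with a deliberately small linear autoencoder trained by gradient descent. The real networks are deep, fully connected and non-linear; the single-layer structure, learning rate and sizes below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 26))             # one domain's samples, 26 metrics as in RELINK

# Minimal linear autoencoder (illustrative; the patent uses a deep network
# with M+1 layers and takes the M/2-th layer output as the intrinsic code).
d_code = 8
W_enc = rng.normal(scale=0.1, size=(26, d_code))
W_dec = rng.normal(scale=0.1, size=(d_code, 26))

def recon_error(X, W_enc, W_dec):
    """Q = sum_i ||x_i - x_hat_i||^2, the criterion being minimized."""
    return float(np.sum((X @ W_enc @ W_dec - X) ** 2))

q_initial = recon_error(X, W_enc, W_dec)
lr = 0.01
for _ in range(300):
    C = X @ W_enc                          # intrinsic encoding of the samples
    err = C @ W_dec - X                    # reconstruction residual
    grad_dec = C.T @ err / len(X)          # gradient w.r.t. decoder weights
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

q_final = recon_error(X, W_enc, W_dec)     # drops below q_initial as training proceeds
C_intrinsic = X @ W_enc                    # intrinsic coding vectors of the domain
```

The same loop, run once on the source data and once on the target data, corresponds to minimizing (2) and (4) respectively.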
  • Step 2: Based on the obtained network parameters, generate the intrinsic coding representations of the corresponding domains. Specifically, based on formulas (2) and (4), the network parameters of the source domain network and the target domain network are learned respectively; based on these parameters, the intrinsic coding representations C^(M/2, source) and C^(M/2, target) of all samples in the source and target domains are obtained, as shown in equations (5) and (6), respectively.
  • Step 3: A global encoding network is constructed, which takes the intrinsic encoding representations of the respective domains as input and learns their shared latent complete encoding representation. Specifically, in order to ensure that the learned L can completely reconstruct the source domain encoding representation C^(M/2, source) and the target domain encoding representation C^(M/2, target), rather than simply learning a common space from the source domain and target domain encodings, a fully connected network is constructed as the global encoding network to learn the common latent complete representation space.
  • Step 4: Based on the obtained network parameters of the global encoding network, the deep feature representations of the source domain and the target domain are obtained.
  • the specific method is to jointly learn the internal coding network and the global coding network of each domain, learn the optimal network parameters, and solve the deep feature representation of the source and target domains.
  • the objective function for solving the deep feature representation of the source and target domains is generated as follows:
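The patent's objective function itself is not reproduced in this text. As a hedged numerical sketch of such a joint objective, a shared global encoder/decoder pair can be fitted so that one common latent representation reconstructs the intrinsic codings of both domains at once; the linear maps, dimensions and learning rate below are assumptions, not the patent's actual network.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical intrinsic codings from step 2 (dimensions illustrative).
C_src = rng.normal(size=(100, 8))    # stands in for C^(M/2, source)
C_tgt = rng.normal(size=(50, 8))     # stands in for C^(M/2, target)

# Shared global encoder/decoder: both domains map into one latent space,
# and the objective penalizes the reconstruction error of BOTH domain
# codings from that latent space, mirroring the requirement that the
# learned representation can fully reconstruct both C^(M/2, ·) matrices.
d_latent = 4
W_g = rng.normal(scale=0.1, size=(8, d_latent))   # global encoder
W_r = rng.normal(scale=0.1, size=(d_latent, 8))   # decoder back to coding space

def objective(W_g, W_r):
    e_src = C_src @ W_g @ W_r - C_src
    e_tgt = C_tgt @ W_g @ W_r - C_tgt
    return float(np.sum(e_src ** 2) + np.sum(e_tgt ** 2))

obj_before = objective(W_g, W_r)
lr = 0.01
C = np.vstack([C_src, C_tgt])        # joint learning over both domains
for _ in range(300):
    err = C @ W_g @ W_r - C
    grad_r = W_g.T @ C.T @ err / len(C)
    grad_g = C.T @ (err @ W_r.T) / len(C)
    W_r -= lr * grad_r
    W_g -= lr * grad_g

obj_after = objective(W_g, W_r)      # decreases as the shared space is learned
L_src = C_src @ W_g                  # shared latent codings (potential vectors)
L_tgt = C_tgt @ W_g
```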
  • Step 5: For each deep feature representation of the target domain, use a random forest classifier for classification to obtain the final prediction result. Specifically, first, each network parameter is obtained according to formula (8); then, the deep feature representations of all samples in the source domain and the target domain are obtained; finally, each sample in the target domain is classified using a random forest classifier, predicting whether it is defective or not.
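Step 5 can be sketched with scikit-learn's RandomForestClassifier, training on source-domain deep features (the only labeled data in the CPDP setting) and predicting defect-proneness for target-domain samples. The feature dimensions and labels below are synthetic placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)

# Hypothetical deep feature representations from step 4. Labels exist only
# for the source domain; the target domain is unlabeled.
F_source = rng.normal(size=(120, 12))            # deep features, source samples
y_source = rng.integers(0, 2, size=120)          # 1 = defective, 0 = clean
F_target = rng.normal(size=(40, 12))             # deep features, target samples

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(F_source, y_source)        # train on labeled source-domain features
y_pred = clf.predict(F_target)     # predict defect-proneness of each target sample
```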
  • an experiment is performed on RELINK, one of the public data sets commonly used for software defect prediction, to illustrate its beneficial effects.
  • the number of sample metrics in the RELINK dataset is 26, consisting of code complexity metrics and other object-oriented metrics.
  • RELINK contains 3 projects: APACHE, SAFE and ZXING; the details are shown in Table 1 (RELINK dataset).
  • the experimental settings are as follows: each of the 3 projects is selected in turn as the target project, and each of the remaining 2 projects in turn as the source project, giving source-target combinations such as APACHE-SAFE, APACHE-ZXING and SAFE-ZXING.
  • the results reported in this experiment are the average of the target project results.
  • when evaluating cross-project defect prediction performance, the experiment uses the F-measure and the recall rate pd as evaluation indices, where F-measure = 2 × pd × precision / (pd + precision). The larger the F-measure and pd values, the better the cross-project defect prediction performance.
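Using the standard definitions (pd is recall, and the F-measure is the harmonic mean of pd and precision), the two indices can be computed from confusion-matrix counts:

```python
def pd_and_f_measure(tp, fn, fp):
    """Recall (pd) and F-measure from confusion-matrix counts.

    tp: defective samples predicted defective
    fn: defective samples predicted clean
    fp: clean samples predicted defective
    """
    pd = tp / (tp + fn)                      # probability of detection (recall)
    precision = tp / (tp + fp)
    f_measure = 2 * pd * precision / (pd + precision)
    return pd, f_measure

pd, fm = pd_and_f_measure(tp=30, fn=10, fp=20)
# pd = 0.75, precision = 0.6, F-measure = 2*0.75*0.6/1.35 ≈ 0.667
```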
  • Table 2 (F-measure (Fm) and pd of each method on the RELINK data set) lists the cross-project defect prediction performance, in terms of F-measure and pd, of the method of the present invention and the comparison methods on the RELINK data set.
  • the cross-project defect prediction performance of the method of the present invention is better than that of the Principal Component Analysis (PCA), Canonical Correlation Analysis (CCA) and Transfer Defect Learning (TCA) methods.
  • the PCA method mainly considers dimensionality reduction of the samples and does not pay much attention to maximally retaining the internal information of the samples during the dimensionality reduction process, so its performance is not as good as that of the method of the present invention; compared with CCA and TCA, the method of the present invention can extract sample characteristics more deeply, which shows the superiority of this method.
  • first, cross-project software defect prediction does not depend on a large amount of historical data from the same project; second, there is no need to worry about the hidden danger of outdated projects.
  • CPDP can perform metric analysis on the latest software warehouse, and guarantee the defect prediction performance according to the potential feature relationship between projects.
  • transfer learning is an important option: it transfers the knowledge learned from the source project to a related but different target domain.
  • the invention combines the idea of transfer learning with deep learning technology to solve the problem of distribution differences across projects.
  • the deep autoencoder is a deep neural network.
  • the present invention applies a deep dual autoencoder network in the field of CPDP for the first time, aiming to pass metric elements (such as the number of lines of code, loop structure, recursion depth, etc.) through a multi-layer neural network and combine them into high-level, abstract and complex deep features, and then use the obtained deep features for modeling to improve the performance of software defect prediction.
  • a defect prediction apparatus is also provided. As shown in FIG. 4 , the apparatus includes:
  • the first processing unit 402 is configured to encode the source domain data set and the target domain data set respectively through the first target network to obtain the first intrinsic coding vector corresponding to the source domain data set and the second intrinsic coding vector corresponding to the target domain data set;
  • the second processing unit 404 is configured to input the first intrinsic coding vector and the second intrinsic coding vector respectively into the second target network to obtain the first potential coding vector corresponding to the first intrinsic coding vector and the second potential coding vector corresponding to the second intrinsic coding vector;
  • the third processing unit 406 is configured to determine the first feature vector from the first intrinsic coding vector and the first potential coding vector, and to determine the second feature vector from the second intrinsic coding vector and the second potential coding vector;
  • the fourth processing unit 408 is configured to use the target classifier to classify the second feature vector to obtain a classification result, wherein the classification result is configured to indicate whether the second feature vector has defects.
  • the source domain data set and the target domain data set are encoded respectively through the first target network to obtain the first intrinsic coding vector corresponding to the source domain data set and the second intrinsic coding vector corresponding to the target domain data set; the first intrinsic coding vector and the second intrinsic coding vector are input respectively into the second target network to obtain the first potential coding vector corresponding to the first intrinsic coding vector and the second potential coding vector corresponding to the second intrinsic coding vector; the first feature vector is determined from the first intrinsic coding vector and the first potential coding vector, and the second feature vector from the second intrinsic coding vector and the second potential coding vector; the target classifier is used to classify the second feature vector to obtain a classification result, wherein the classification result is set to indicate whether the second feature vector has defects.
  • the above-mentioned apparatus further includes: a fifth processing unit configured to construct a source domain network, wherein the first target network includes the source domain network, to input the source domain data set into the source domain network, and to determine the first network parameter corresponding to the source domain network;
  • a sixth processing unit configured to construct a target domain network, wherein the first target network includes the target domain network, to input the target domain data set into the target domain network, and to determine the second network parameter corresponding to the target domain network, wherein the second network parameter is set to represent the reconstruction error between the data input to the target domain network and the data output from the target domain network;
  • the first processing unit is further configured to determine, when Q1 is less than the first threshold, the first intrinsic coding vector C^(M/2, source), and, when Q2 is less than the second threshold, the second intrinsic coding vector C^(M/2, target).
  • the above-mentioned second processing unit is further configured to construct a global encoding network, wherein the above-mentioned second target network includes the above-mentioned global encoding network; the above-mentioned first intrinsic encoding vector and the above-mentioned second intrinsic encoding vector Input to the above-mentioned global coding network respectively, to obtain the first potential coding vector corresponding to the above-mentioned first intrinsic coding vector and the second potential coding vector corresponding to the above-mentioned second intrinsic coding vector;
  • the above-mentioned device further includes: a seventh processing unit configured to determine the third network parameter corresponding to the global encoding network, wherein C^(M/2, source) is the first intrinsic coding vector and G^(L, source) is the first potential coding vector.
  • the above-mentioned apparatus further includes: an eighth processing unit configured to determine, after the first feature vector is determined from the first intrinsic coding vector and the first potential coding vector and the second feature vector is determined from the second intrinsic coding vector and the second potential coding vector, the target network parameters of the second target network.
  • the third processing unit is further configured to determine the first feature vector from the first intrinsic coding vector and the first potential coding vector, and to determine the second feature vector from the second intrinsic coding vector and the second potential coding vector.
  • the above classifier is a random forest classifier.
  • a computer-readable storage medium is also provided, where a computer program is stored in the computer-readable storage medium, wherein the computer program is configured to execute, when running, the steps in any one of the above method embodiments.
  • the above-mentioned computer-readable storage medium may be configured to store a computer program configured to perform the following steps:
  • the source domain data set and the target domain data set are respectively encoded by the first target network, and the first intrinsic coding vector corresponding to the above-mentioned source domain data set and the second intrinsic coding vector corresponding to the above-mentioned target domain data set are obtained;
  • the first intrinsic coding vector and the second intrinsic coding vector are respectively input into the second target network, and the first potential coding vector corresponding to the first intrinsic coding vector and the second potential coding vector corresponding to the second intrinsic coding vector are obtained.
  • the above-mentioned storage medium may be configured to store a computer program configured to perform the following steps:
  • the storage medium may include: a flash disk, a ROM (Read-Only Memory), a RAM (Random Access Memory), a magnetic disk or an optical disk, and the like.
  • an electronic device configured to implement the above defect prediction method.
  • the electronic device includes a memory 502 and a processor 504; the memory 502 stores a computer program, and the processor 504 is configured to execute the steps in any one of the above method embodiments through the computer program.
  • the above-mentioned electronic apparatus may be located in at least one network device among multiple network devices of a computer network.
  • the above-mentioned processor may be configured to execute the following steps through a computer program:
  • the source domain data set and the target domain data set are respectively encoded by the first target network, and the first intrinsic coding vector corresponding to the above-mentioned source domain data set and the second intrinsic coding vector corresponding to the above-mentioned target domain data set are obtained;
  • the above-mentioned first intrinsic coding vector and the above-mentioned second intrinsic coding vector are respectively input to the second target network, and the first latent coding vector corresponding to the above-mentioned first intrinsic coding vector and the second latent coding vector corresponding to the above-mentioned second intrinsic coding vector are obtained.
  • FIG. 5 is for illustration only; the electronic device may also be terminal equipment such as a smart phone (such as an Android phone, an iOS phone, etc.), a tablet computer, a handheld computer, a mobile Internet device (Mobile Internet Devices, MID), or a PAD.
  • FIG. 5 does not limit the structure of the above electronic device.
  • the electronic device may also include more or less components than those shown in FIG. 5 (eg, network interfaces, etc.), or have a different configuration than that shown in FIG. 5 .
  • the memory 502 may be configured to store software programs and modules, such as the program instructions/modules corresponding to the defect prediction method and apparatus in the embodiments of the present invention; by running the software programs and modules stored in the memory 502, the processor 504 executes various functional applications and data processing, thereby implementing the above defect prediction method.
  • Memory 502 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • memory 502 may further include memory located remotely from processor 504, and these remote memories may be connected to the terminal through a network.
  • the memory 502 may specifically, but not exclusively, be configured to store information such as the target height of the target object.
  • the above-mentioned memory 502 may include, but is not limited to, the first processing unit 402 , the second processing unit 404 , the third processing unit 406 , and the fourth processing unit 408 in the above-mentioned defect prediction apparatus.
  • it may also include, but is not limited to, other module units in the above-mentioned defect prediction apparatus, which will not be repeated in this example.
  • the above-mentioned transmission device 506 is configured to receive or send data via a network.
  • Specific examples of the above-mentioned networks may include wired networks and wireless networks.
  • the transmission device 506 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices and routers through a network cable so as to communicate with the Internet or a local area network.
  • the transmission device 506 is a radio frequency (Radio Frequency, RF) module, which is configured to communicate with the Internet in a wireless manner.
  • the above-mentioned electronic device further includes: a connection bus 508 configured to connect various module components in the above-mentioned electronic device.
  • the above-mentioned terminal or server may be a node in a distributed system, wherein the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting the multiple nodes through network communication.
  • a peer-to-peer (P2P, Peer To Peer) network can be formed between nodes, and any form of computing equipment, such as servers, terminals and other electronic devices can become a node in the blockchain system by joining the peer-to-peer network.
  • the storage medium may include: a flash disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
  • if the integrated units in the above-mentioned embodiments are implemented in the form of software functional units and sold or used as independent products, they may be stored in the above-mentioned computer-readable storage medium.
  • the essence of the technical solution of the present invention, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to cause one or more computer devices (which may be personal computers, servers, or network devices, etc.) to perform all or part of the steps of the methods of the various embodiments of the present invention.
  • the disclosed client may be implemented in other manners.
  • the device embodiments described above are only illustrative; for example, the division of units is only a logical function division, and in actual implementation there may be other division methods: multiple units or components may be combined or integrated into another system, or some features can be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of units or modules, and may be in electrical or other forms.
  • units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the defect prediction method and device, storage medium, and electronic device provided by the embodiments of the present invention have the following beneficial effect: they solve the technical problem in the related art that, in defect prediction, the data structures of the source domain and the target domain are heterogeneous.


Abstract

The present invention discloses a defect prediction method and apparatus, a storage medium, and an electronic apparatus. The method includes: encoding a source domain data set and a target domain data set respectively through a first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set; inputting the first intrinsic coding vector and the second intrinsic coding vector respectively into a second target network to obtain a first latent coding vector corresponding to the first intrinsic coding vector and a second latent coding vector corresponding to the second intrinsic coding vector; determining a first feature vector from the first intrinsic coding vector and the first latent coding vector, and a second feature vector from the second intrinsic coding vector and the second latent coding vector; and classifying the second feature vector with a target classifier to obtain a classification result.

Description

Defect Prediction Method and Apparatus, Storage Medium, and Electronic Apparatus - Technical Field
The present invention relates to the field of communications, and in particular to a defect prediction method and apparatus, a storage medium, and an electronic apparatus.
Background
We have entered the information age: from mobile phones and smart televisions to aircraft and high-speed trains, technological products all depend on software for their control. People's lives are changing dramatically with the emergence of high-tech products, and society is developing in an increasingly technological and intelligent direction. At the same time, every industry relies more and more heavily on software systems. People also pay increasing attention to the quality of software products; only highly reliable software products ultimately win users' support and recognition.
In the development of a software project, defects severely undermine software quality. If latent defects in the software are not resolved promptly, various unexpected results may occur while the software is running: at best, delays that stall the project; at worst, threats to the lives and property of users across society, causing heavy and direct economic losses to enterprises or users. Software defect prediction (SDP) techniques are better suited to the early stages of project development, identifying latent defects in software program modules in advance. Predicting the defects that will exist in software, and reporting the related information, is of great significance for software quality and for optimizing testing resources. However, SDP techniques cannot be applied to every stage of software defect prediction.
Each time a new project is developed, defect prediction must be performed for that new project. Among the solutions for cross-project defect prediction (CPDP), transfer learning is an important option: knowledge learned from a source project is transferred to a related but different target domain. In this way, the defect prediction time for the source project can be shortened; however, the data structures of the source domain and the target domain are heterogeneous.
For this problem in the related art, namely the heterogeneity of the source-domain and target-domain data structures in defect prediction, no effective technical solution has yet been proposed.
Summary of the Invention
Embodiments of the present invention provide a defect prediction method and apparatus, a storage medium, and an electronic apparatus, to solve at least the technical problem in the related art that the data structures of the source domain and the target domain are heterogeneous in defect prediction.
According to one aspect of the embodiments of the present invention, a defect prediction method is provided, including: encoding a source domain data set and a target domain data set respectively through a first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set; inputting the first intrinsic coding vector and the second intrinsic coding vector respectively into a second target network to obtain a first latent coding vector corresponding to the first intrinsic coding vector and a second latent coding vector corresponding to the second intrinsic coding vector; determining a first feature vector from the first intrinsic coding vector and the first latent coding vector, and a second feature vector from the second intrinsic coding vector and the second latent coding vector; and classifying the second feature vector with a target classifier to obtain a classification result, where the classification result indicates whether the second feature vector has a defect.
According to another aspect of the embodiments of the present invention, a defect prediction apparatus is further provided, including: a first processing unit, configured to encode a source domain data set and a target domain data set respectively through a first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set; a second processing unit, configured to input the first intrinsic coding vector and the second intrinsic coding vector respectively into a second target network to obtain a first latent coding vector corresponding to the first intrinsic coding vector and a second latent coding vector corresponding to the second intrinsic coding vector; a third processing unit, configured to determine a first feature vector from the first intrinsic coding vector and the first latent coding vector, and a second feature vector from the second intrinsic coding vector and the second latent coding vector; and a fourth processing unit, configured to classify the second feature vector with a target classifier to obtain a classification result, where the classification result indicates whether the second feature vector has a defect.
According to a further aspect of the embodiments of the present invention, a computer-readable storage medium is further provided, storing a computer program, where the computer program is configured to execute, when run, the above defect prediction method.
According to a further aspect of the embodiments of the present invention, an electronic apparatus is further provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the above defect prediction method through the computer program.
Through the present invention, a source domain data set and a target domain data set are respectively encoded through a first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set; the first intrinsic coding vector and the second intrinsic coding vector are respectively input into a second target network to obtain a first latent coding vector corresponding to the first intrinsic coding vector and a second latent coding vector corresponding to the second intrinsic coding vector; a first feature vector is determined from the first intrinsic coding vector and the first latent coding vector, and a second feature vector is determined from the second intrinsic coding vector and the second latent coding vector; and the second feature vector is classified with a target classifier to obtain a classification result, where the classification result indicates whether the second feature vector has a defect. In this way, the technical problem in the related art that the data structures of the source domain and the target domain are heterogeneous in defect prediction is solved.
Brief Description of the Drawings
The drawings described here provide further understanding of the present invention and form part of this application. The exemplary embodiments of the present invention and their descriptions explain the present invention and do not unduly limit it. In the drawings:
FIG. 1 is a schematic diagram of an application environment of a defect prediction method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of an optional defect prediction method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of another optional defect prediction method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an optional defect prediction apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an optional electronic apparatus according to an embodiment of the present invention.
Detailed Description
To help those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings in the embodiments of the present invention. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that the terms "first", "second", and so on in the specification, claims, and the above drawings of the present invention are used to distinguish similar objects and do not necessarily describe a particular order or sequence. Data so used may be interchanged where appropriate, so that the embodiments of the present invention described here can be implemented in orders other than those illustrated or described here. Moreover, the terms "include" and "have" and any of their variants are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to the steps or units explicitly listed, and may include other steps or units not explicitly listed or inherent to the process, method, product, or device.
The method embodiments provided in the embodiments of this application may be executed in a mobile terminal, a computer terminal, or a similar computing apparatus. Taking execution on a mobile terminal as an example, FIG. 1 is a hardware block diagram of a mobile terminal for a defect prediction method according to an embodiment of the present invention. As shown in FIG. 1, the mobile terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 configured to store data. The mobile terminal may further include a transmission device 106 configured for communication and an input/output device 108. Those of ordinary skill in the art will understand that the structure shown in FIG. 1 is only illustrative and does not limit the structure of the mobile terminal. For example, the mobile terminal may include more or fewer components than shown in FIG. 1 or have a different configuration.
The memory 104 may be configured to store computer programs, for example, software programs and modules of application software, such as the computer program corresponding to the defect prediction method in the embodiments of the present invention. By running the computer program stored in the memory 104, the processor 102 executes various functional applications and data processing, that is, implements the above method. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the mobile terminal through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is configured to receive or send data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the mobile terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station so as to communicate with the Internet. In one example, the transmission device 106 may be a radio frequency (Radio Frequency, RF) module configured to communicate with the Internet wirelessly.
Optionally, the above method may be applied in software defect prediction scenarios; this embodiment places no limitation on this.
Optionally, as an optional implementation, as shown in FIG. 2, the flow of the defect prediction method may include the following steps:
Step S202: encode a source domain data set and a target domain data set respectively through a first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set.
Step S204: input the first intrinsic coding vector and the second intrinsic coding vector respectively into a second target network to obtain a first latent coding vector corresponding to the first intrinsic coding vector and a second latent coding vector corresponding to the second intrinsic coding vector.
Step S206: determine a first feature vector from the first intrinsic coding vector and the first latent coding vector, and a second feature vector from the second intrinsic coding vector and the second latent coding vector.
Step S208: classify the second feature vector with a target classifier to obtain a classification result, where the classification result indicates whether the second feature vector has a defect.
Through this embodiment, the source domain data set and the target domain data set are encoded into intrinsic coding vectors, latent coding vectors are derived through the second target network, feature vectors are determined from the intrinsic and latent coding vectors, and the second feature vector is classified by the target classifier to obtain a classification result indicating whether it has a defect. In this way, the technical problem in the related art that the data structures of the source domain and the target domain are heterogeneous in defect prediction is solved.
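As a rough illustration of how the four stages above (encode each domain, derive latent codes through a second, shared network, concatenate the two codes into feature vectors, and classify the target features) fit together, the following Python sketch wires them end to end. The linear "encoders", their dimensions, and the nearest-centroid classifier at the end are stand-ins invented for illustration only; the patent itself uses trained deep autoencoders and a random forest classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    """Stand-in for a trained encoder: a linear map followed by tanh."""
    return np.tanh(x @ w)

# Toy source/target data sets with heterogeneous feature dimensions (26 vs 20).
x_src = rng.normal(size=(40, 26))
x_tgt = rng.normal(size=(30, 20))

# Stage 1: domain-specific encoders map both domains to same-sized intrinsic codes,
# which is what resolves the heterogeneity of the raw feature spaces.
w_src, w_tgt = rng.normal(size=(26, 8)), rng.normal(size=(20, 8))
c_src, c_tgt = encode(x_src, w_src), encode(x_tgt, w_tgt)

# Stage 2: a shared global encoder maps intrinsic codes to latent codes.
w_glob = rng.normal(size=(8, 4))
g_src, g_tgt = encode(c_src, w_glob), encode(c_tgt, w_glob)

# Stage 3: each feature vector combines the intrinsic and latent codes.
f_src = np.hstack([c_src, g_src])
f_tgt = np.hstack([c_tgt, g_tgt])

# Stage 4: classify the target features. A nearest-centroid rule over the
# labelled source features stands in for the patent's random forest.
y_src = (x_src[:, 0] > 0).astype(int)          # toy source labels
centroids = np.stack([f_src[y_src == k].mean(axis=0) for k in (0, 1)])
dists = np.linalg.norm(f_tgt[:, None, :] - centroids[None], axis=2)
y_pred = dists.argmin(axis=1)
print(f_tgt.shape, y_pred.shape)
```

Every target sample ends up with a prediction even though no target labels were used, which is the point of the unsupervised cross-project setting.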
In an optional embodiment, before the source domain data set and the target domain data set are respectively encoded through the first target network to obtain the first intrinsic coding vector corresponding to the source domain data set and the second intrinsic coding vector corresponding to the target domain data set, the method further includes: constructing a source domain network, where the first target network includes the source domain network; inputting the source domain data set into the source domain network; and determining a first network parameter corresponding to the source domain network by the following formula:
Figure PCTCN2021091757-appb-000001
where the first network parameter represents the reconstruction error between the data input into the source domain network and the data output by the source domain network, Q1 is the first network parameter, and the i-th input of the source domain network is
Figure PCTCN2021091757-appb-000002
Figure PCTCN2021091757-appb-000003
the source domain network is
Figure PCTCN2021091757-appb-000004
M+1 is the number of layers of the source domain network,
Figure PCTCN2021091757-appb-000005
Figure PCTCN2021091757-appb-000006
is the reconstructed output representation after learning by the source domain network, and
Figure PCTCN2021091757-appb-000007
constructing a target domain network, where the first target network includes the target domain network; inputting the target domain data set into the target domain network; and determining a second network parameter corresponding to the target domain network by the following formula:
Figure PCTCN2021091757-appb-000008
where the second network parameter represents the reconstruction error between the data input into the target domain network and the data output by the target domain network, Q2 is the second network parameter, and the i-th input of the target domain network is
Figure PCTCN2021091757-appb-000009
Figure PCTCN2021091757-appb-000010
the target domain network is
Figure PCTCN2021091757-appb-000011
M+1 is the number of layers of the target domain network,
Figure PCTCN2021091757-appb-000012
Figure PCTCN2021091757-appb-000013
is the reconstructed output representation after learning by the target domain network, and
Figure PCTCN2021091757-appb-000014
In an optional embodiment, encoding the source domain data set and the target domain data set respectively through the first target network to obtain the first intrinsic coding vector corresponding to the source domain data set and the second intrinsic coding vector corresponding to the target domain data set includes: when Q1 is less than a first threshold, determining the first intrinsic coding vector by the following formula:
Figure PCTCN2021091757-appb-000015
where C^(M/2, source) is the first intrinsic coding vector; and when Q2 is less than a second threshold, determining the second intrinsic coding vector by the following formula:
Figure PCTCN2021091757-appb-000016
where C^(M/2, target) is the second intrinsic coding vector.
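The intrinsic coding vector above is the activation of the middle layer (layer M/2) of a symmetric autoencoder, read out once the reconstruction error falls below a threshold. A minimal Python sketch of that extraction is shown below; the layer sizes and the randomly initialized weights are invented for illustration and stand in for trained parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

def forward(x, weights):
    """Run x through every layer, returning the activation of each layer."""
    acts = [x]
    for w in weights:
        acts.append(np.tanh(acts[-1] @ w))
    return acts

# A toy symmetric autoencoder: M = 4 weight layers, 26 -> 12 -> 6 -> 12 -> 26.
dims = [26, 12, 6, 12, 26]
weights = [rng.normal(scale=0.1, size=(a, b)) for a, b in zip(dims, dims[1:])]

x = rng.normal(size=(5, 26))          # five samples from one domain
acts = forward(x, weights)

m = len(weights)                      # M = 4
intrinsic = acts[m // 2]              # C^(M/2): the bottleneck activation
print(intrinsic.shape)                # five 6-dimensional intrinsic codes
```

The decoder half (layers M/2+1 to M) is only needed during training to compute the reconstruction error; at extraction time only the encoder half up to the bottleneck is evaluated.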
In an optional embodiment, inputting the first intrinsic coding vector and the second intrinsic coding vector respectively into the second target network to obtain the first latent coding vector corresponding to the first intrinsic coding vector and the second latent coding vector corresponding to the second intrinsic coding vector includes: constructing a global encoding network, where the second target network includes the global encoding network; and inputting the first intrinsic coding vector and the second intrinsic coding vector respectively into the global encoding network to obtain the first latent coding vector corresponding to the first intrinsic coding vector and the second latent coding vector corresponding to the second intrinsic coding vector. After the first intrinsic coding vector and the second intrinsic coding vector are respectively input into the global encoding network to obtain the first latent coding vector corresponding to the first intrinsic coding vector and the second latent coding vector corresponding to the second intrinsic coding vector, the method further includes: determining a third network parameter corresponding to the global encoding network by the following formula:
Figure PCTCN2021091757-appb-000017
where C^(M/2, source) is the first intrinsic coding vector, G^(L, source) is the first latent coding vector, C^(M/2, target) is the second intrinsic coding vector, G^(L, target) is the second latent coding vector, the global encoding network is
Figure PCTCN2021091757-appb-000018
Figure PCTCN2021091757-appb-000019
and L is the number of layers of the global encoding network.
In an optional embodiment, before the first feature vector is determined from the first intrinsic coding vector and the first latent coding vector and the second feature vector is determined from the second intrinsic coding vector and the second latent coding vector, the method further includes: determining a target network parameter of the second target network by the following formula:
Figure PCTCN2021091757-appb-000020
In an optional embodiment, determining the first feature vector from the first intrinsic coding vector and the first latent coding vector, and the second feature vector from the second intrinsic coding vector and the second latent coding vector, includes: determining the first feature vector by the following formula:
Figure PCTCN2021091757-appb-000021
where the first feature vector is
Figure PCTCN2021091757-appb-000022
and determining the second feature vector by the following formula:
Figure PCTCN2021091757-appb-000023
where the second feature vector is
Figure PCTCN2021091757-appb-000024
Optionally, the classifier is a random forest classifier.
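The classifier applied to the second feature vectors is a random forest. In practice `sklearn.ensemble.RandomForestClassifier` would be the natural off-the-shelf choice; the toy ensemble below (bootstrapped one-feature decision stumps combined by majority vote) only illustrates the bagging-and-voting principle behind random forests and is not the patent's implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_stump(x, y):
    """Pick a random feature and threshold; orient the stump to fit y."""
    j = rng.integers(x.shape[1])
    t = rng.choice(x[:, j])
    pred = (x[:, j] > t).astype(int)
    flip = (pred != y).mean() > 0.5   # flip the stump if it is worse than chance
    return j, t, flip

def forest_predict(stumps, x):
    """Majority vote over all stumps."""
    votes = np.stack([
        (x[:, j] > t).astype(int) ^ flip for j, t, flip in stumps
    ])
    return (votes.mean(axis=0) > 0.5).astype(int)

# Toy "feature vectors": the label is determined by feature 0.
x = rng.normal(size=(200, 5))
y = (x[:, 0] > 0).astype(int)

stumps = []
for _ in range(50):                        # 50 bagging rounds
    idx = rng.integers(0, len(x), len(x))  # bootstrap: sample with replacement
    stumps.append(fit_stump(x[idx], y[idx]))

acc = (forest_predict(stumps, x) == y).mean()
print(acc)
```

Real random forests grow full decision trees and also subsample features at every split; the stump version keeps only the bootstrap and the vote, which is enough to see why an ensemble of weak learners can beat any single one.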
The flow of the defect prediction method is described below with reference to an optional example. As shown in FIG. 3, the method may include the following steps:
As shown in FIG. 3, this application performs unsupervised cross-project defect prediction based on a dual encoding network. Let
Figure PCTCN2021091757-appb-000025
denote the source project domain (source domain), where
Figure PCTCN2021091757-appb-000026
denotes the i-th sample in the source domain and N_s denotes the number of samples in the source domain. Let
Figure PCTCN2021091757-appb-000027
denote the target project domain (target domain), where
Figure PCTCN2021091757-appb-000028
denotes the i-th sample in the target domain and N_t denotes the number of samples in X^(target). I denotes the latent common complete representation to be learned for the source domain and the target domain. The specific steps are as follows:
Step 1: construct fully connected deep neural networks for the source domain and the target domain respectively, and automatically encode the in-domain data. Specifically, for the source domain network:
define the source domain network as
Figure PCTCN2021091757-appb-000029
where
Figure PCTCN2021091757-appb-000030
and M+1 is the number of layers of the source domain network. Let
Figure PCTCN2021091757-appb-000031
denote the i-th input of the source domain network. For any input sample of this network
Figure PCTCN2021091757-appb-000032
let
Figure PCTCN2021091757-appb-000033
denote the reconstructed output representation of
Figure PCTCN2021091757-appb-000034
after learning by this network; then
Figure PCTCN2021091757-appb-000035
The optimal encoding representation of the data in the source domain
Figure PCTCN2021091757-appb-000036
can be obtained by minimizing the reconstruction error between the input and output of the source domain network:
Figure PCTCN2021091757-appb-000037
Likewise, for the target domain network:
define the target domain network as
Figure PCTCN2021091757-appb-000038
where
Figure PCTCN2021091757-appb-000039
and M+1 is the number of layers of the target domain network. Let
Figure PCTCN2021091757-appb-000040
denote the i-th input of the target domain network. For any input sample of this network
Figure PCTCN2021091757-appb-000041
let
Figure PCTCN2021091757-appb-000042
denote the reconstructed output representation of
Figure PCTCN2021091757-appb-000043
after learning by this network; then
Figure PCTCN2021091757-appb-000044
The optimal encoding representation of the data in the target domain
Figure PCTCN2021091757-appb-000045
can be obtained by minimizing the reconstruction error between the input and output of the target domain network:
Figure PCTCN2021091757-appb-000046
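Formulas (2) and (4) above (shown in the equation images) both minimize the summed squared reconstruction error between a network's inputs and outputs. The numpy snippet below illustrates the quantity being minimized for a one-layer tied-weight linear autoencoder; the actual networks in the patent are deeper, nonlinear, and trained by optimization rather than evaluated once with random weights.

```python
import numpy as np

rng = np.random.default_rng(3)

x = rng.normal(size=(50, 26))            # N samples, 26 metrics each

# One-layer tied-weight "autoencoder": encode with W, decode with W^T.
w = rng.normal(scale=0.1, size=(26, 6))
code = x @ w                             # in-domain encoding representation
x_hat = code @ w.T                       # reconstructed output

# Q: summed squared reconstruction error over all samples, the objective
# that training would drive down by adjusting W.
q = np.sum((x - x_hat) ** 2)
print(q)
```

Gradient descent on `q` with respect to `w` (or, for this linear special case, a PCA solution) is what "minimizing the reconstruction error" means operationally.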
Step 2: based on the solved network parameters, generate the intrinsic encoding representations of the respective domains. Specifically, based on formulas (2) and (4), the network parameters of the source domain network and the target network are learned separately. Based on the network parameters
Figure PCTCN2021091757-appb-000047
Figure PCTCN2021091757-appb-000048
the intrinsic encoding representations of all samples of the source domain and the target domain are obtained, as shown in formulas (5) and (6) respectively:
Figure PCTCN2021091757-appb-000049
Figure PCTCN2021091757-appb-000050
Step 3: construct a global encoding network that takes the intrinsic encoding representations of the respective domains as input and learns their shared latent complete encoding representation. Specifically, to ensure that the learned I can completely reconstruct the source domain encoding representation C^(M/2, source) and the target domain encoding representation C^(M/2, target), rather than simply learning a common space for the source and target encodings, a fully connected network is constructed as the global encoding network to learn their shared latent complete representation space. Let
Figure PCTCN2021091757-appb-000051
denote the global encoding network,
Figure PCTCN2021091757-appb-000052
denote the network parameters of the source domain channel,
Figure PCTCN2021091757-appb-000053
denote the network parameters of the target domain channel, L denote the number of layers of the network, and G^(0) = I denote the input of the network; then the objective function of the global encoding network can be expressed as:
Figure PCTCN2021091757-appb-000054
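Step 3's objective (the equation image above) asks the global encoding network, fed the shared representation I, to reconstruct both domains' intrinsic codes through separate per-domain channel parameters. The sketch below evaluates that two-term objective with single-layer channels and fixed random parameters, purely to make the structure of the loss concrete; real training would jointly minimize it over I and both channels.

```python
import numpy as np

rng = np.random.default_rng(4)

n, d_code, d_latent = 10, 6, 4
c_src = rng.normal(size=(n, d_code))        # C^(M/2, source)
c_tgt = rng.normal(size=(n, d_code))        # C^(M/2, target)
i_shared = rng.normal(size=(n, d_latent))   # shared complete representation I

# One-layer channels: separate parameters for the source and target channels.
w_src = rng.normal(scale=0.1, size=(d_latent, d_code))
w_tgt = rng.normal(scale=0.1, size=(d_latent, d_code))

g_src = np.tanh(i_shared @ w_src)           # G^(L, source)
g_tgt = np.tanh(i_shared @ w_tgt)           # G^(L, target)

# Two-term objective: both domains' intrinsic codes must be reconstructible
# from the same I, which is what makes I a *complete* shared representation.
loss = np.sum((c_src - g_src) ** 2) + np.sum((c_tgt - g_tgt) ** 2)
print(loss)
```

Because both terms share the input I but not the channel weights, driving the loss down forces I to carry enough information for both domains at once, instead of collapsing to whatever the two codes happen to have in common.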
Step 4: based on the solved network parameters of the global encoding network, obtain the deep feature representations of the source domain and the target domain. Specifically, the in-domain encoding networks and the global encoding network are learned jointly to obtain the optimal network parameters and solve for the deep feature representations of the source domain and the target domain. Based on formulas (2), (4), and (7), the objective function for solving the deep feature representations of the source domain and the target domain is generated as follows:
Figure PCTCN2021091757-appb-000055
Step 5: for each deep feature representation of the target domain, classify with a random forest classifier to obtain the final prediction result. Specifically, first, according to formula (8), each network parameter
Figure PCTCN2021091757-appb-000056
is obtained, and then the deep feature representations of the source domain and the target domain are obtained:
Figure PCTCN2021091757-appb-000057
Figure PCTCN2021091757-appb-000058
Figure PCTCN2021091757-appb-000059
Figure PCTCN2021091757-appb-000060
denote the deep feature representations of the i-th sample of the source domain and the target domain respectively; then, each sample in the target domain
Figure PCTCN2021091757-appb-000061
is classified with a random forest classifier to predict whether it is defective.
The beneficial effects of the present invention are illustrated below with specific experiments.
In the embodiments of the present invention, experiments are conducted on RELINK, one of the public data sets commonly used for software defect prediction, to illustrate the beneficial effects. Each sample in the RELINK data set has 26 metrics, consisting of code complexity metrics and other object-oriented metrics. RELINK contains 3 projects, APACHE, SAFE, and ZXING, as detailed in Table 1 (the RELINK data set). The experiment is set up as follows: each of the 3 projects is selected in turn as the target project, and each of the remaining 2 projects is selected in turn as the source project, giving the source-target combinations APACHE--SAFE, APACHE--ZXING, and SAFE--ZXING. The results reported in this experiment are the averages over the target projects. Cross-project defect prediction performance is evaluated with the F-measure and the recall rate pd; the F-measure combines recall pd with precision, that is, F-measure = 2*pd*precision/(pd+precision). Larger F-measure and pd values indicate better cross-project defect prediction performance.
Table 1
Project name  Features  Samples  Defective samples  Proportion of defective samples
APACHE  26  194  98  50.52%
SAFE  26  56  22  39.29%
ZXING  26  399  118  29.57%
In the embodiments of the present invention, an unsupervised learning method is adopted. Table 2 (the F-measure (Fm) and pd of each method on the RELINK data set) lists the F-measure and pd of the method of the present invention and of the comparison methods when performing cross-project defect prediction on the RELINK data set. As Table 2 shows, the cross-project defect prediction performance of the method of the present invention is superior to the Principal Component Analysis (PCA), Canonical Correlation Analysis (CCA), and defect transfer learning (TCA) methods. PCA mainly considers dimensionality reduction of the samples and pays little attention to maximally retaining the samples' internal information during reduction, so its performance is inferior to the method of the present invention; compared with CCA and TCA, the method of the present invention extracts sample features more deeply, demonstrating the superiority of this method.
Table 2
Figure PCTCN2021091757-appb-000062
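The evaluation metric above combines recall (pd) and precision as F-measure = 2*pd*precision/(pd+precision), i.e. their harmonic mean. A small helper with a worked case:

```python
def f_measure(pd: float, precision: float) -> float:
    """Harmonic mean of recall (pd) and precision."""
    if pd + precision == 0.0:
        return 0.0          # convention: F = 0 when both are zero
    return 2.0 * pd * precision / (pd + precision)

# Worked example: pd = 0.6, precision = 0.5 gives 2*0.3/1.1, about 0.545.
print(round(f_measure(0.6, 0.5), 3))
```

Because the harmonic mean is dominated by the smaller of the two inputs, a method cannot score well on F-measure by inflating recall at the cost of precision or vice versa, which is why it is preferred over either metric alone for imbalanced defect data.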
In addition to the above embodiments, the present invention may have other implementations. All technical solutions formed by equivalent substitution or equivalent transformation fall within the protection scope claimed by the present invention.
Through this embodiment, cross-project software defect prediction (CPDP), a key research direction in the SDP field, neither depends on a large amount of historical data from the same project nor risks that data being outdated. When rapid software updates render large amounts of software project data obsolete, CPDP can perform metric analysis on the latest software repositories and guarantee defect prediction performance based on the latent feature relationships between projects. Among CPDP solutions, transfer learning is an important option: knowledge learned from a source project is transferred to a related but different target domain. The present invention combines the idea of transfer learning with deep learning techniques to solve the cross-project distribution difference problem. A deep autoencoder is a deep neural network; the present invention applies a deep dual-autoencoder network to the CPDP field for the first time, aiming to combine metrics (such as lines of code, loop structures, recursion depth, etc.) through multi-layer neural networks into high-level, abstract, complex deep features, and then to model with the obtained deep features so as to improve all aspects of software defect prediction performance.
It should be noted that, for simplicity of description, the foregoing method embodiments are expressed as a series of action combinations; however, those skilled in the art should know that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
According to a further aspect of the embodiments of the present invention, a defect prediction apparatus is further provided. As shown in FIG. 4, the apparatus includes:
a first processing unit 402, configured to encode a source domain data set and a target domain data set respectively through a first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set;
a second processing unit 404, configured to input the first intrinsic coding vector and the second intrinsic coding vector respectively into a second target network to obtain a first latent coding vector corresponding to the first intrinsic coding vector and a second latent coding vector corresponding to the second intrinsic coding vector;
a third processing unit 406, configured to determine a first feature vector from the first intrinsic coding vector and the first latent coding vector, and a second feature vector from the second intrinsic coding vector and the second latent coding vector;
a fourth processing unit 408, configured to classify the second feature vector with a target classifier to obtain a classification result, where the classification result indicates whether the second feature vector has a defect.
Through this embodiment, the units 402 to 408 perform the above encoding, latent encoding, feature determination, and classification; in this way, the technical problem in the related art that the data structures of the source domain and the target domain are heterogeneous in defect prediction is solved.
As an optional technical solution, the apparatus further includes: a fifth processing unit, configured to construct a source domain network, where the first target network includes the source domain network, input the source domain data set into the source domain network, and determine a first network parameter corresponding to the source domain network by the following formula:
Figure PCTCN2021091757-appb-000063
Figure PCTCN2021091757-appb-000064
where the first network parameter represents the reconstruction error between the data input into the source domain network and the data output by the source domain network, Q1 is the first network parameter, and the i-th input of the source domain network is
Figure PCTCN2021091757-appb-000065
the source domain network is
Figure PCTCN2021091757-appb-000066
Figure PCTCN2021091757-appb-000067
M+1 is the number of layers of the source domain network,
Figure PCTCN2021091757-appb-000068
Figure PCTCN2021091757-appb-000069
is the reconstructed output representation after learning by the source domain network, and
Figure PCTCN2021091757-appb-000070
and a sixth processing unit, configured to construct a target domain network, where the first target network includes the target domain network, input the target domain data set into the target domain network, and determine a second network parameter corresponding to the target domain network by the following formula:
Figure PCTCN2021091757-appb-000071
where the second network parameter represents the reconstruction error between the data input into the target domain network and the data output by the target domain network, Q2 is the second network parameter, and the i-th input of the target domain network is
Figure PCTCN2021091757-appb-000072
the target domain network is
Figure PCTCN2021091757-appb-000073
M+1 is the number of layers of the target domain network,
Figure PCTCN2021091757-appb-000074
Figure PCTCN2021091757-appb-000075
is the reconstructed output representation after learning by the target domain network, and
Figure PCTCN2021091757-appb-000076
As an optional technical solution, the first processing unit is further configured to: when Q1 is less than a first threshold, determine the first intrinsic coding vector by the following formula:
Figure PCTCN2021091757-appb-000077
where C^(M/2, source) is the first intrinsic coding vector; and when Q2 is less than a second threshold, determine the second intrinsic coding vector by the following formula:
Figure PCTCN2021091757-appb-000078
where C^(M/2, target) is the second intrinsic coding vector.
As an optional technical solution, the second processing unit is further configured to construct a global encoding network, where the second target network includes the global encoding network, and to input the first intrinsic coding vector and the second intrinsic coding vector respectively into the global encoding network to obtain the first latent coding vector corresponding to the first intrinsic coding vector and the second latent coding vector corresponding to the second intrinsic coding vector; the apparatus further includes: a seventh processing unit, configured to determine a third network parameter corresponding to the global encoding network by the following formula:
Figure PCTCN2021091757-appb-000079
where C^(M/2, source) is the first intrinsic coding vector, G^(L, source) is the first latent coding vector, C^(M/2, target) is the second intrinsic coding vector, G^(L, target) is the second latent coding vector, the global encoding network is
Figure PCTCN2021091757-appb-000080
Figure PCTCN2021091757-appb-000081
and L is the number of layers of the global encoding network.
As an optional technical solution, the apparatus further includes: an eighth processing unit, configured to determine a target network parameter of the second target network by the following formula before the first feature vector is determined from the first intrinsic coding vector and the first latent coding vector and the second feature vector is determined from the second intrinsic coding vector and the second latent coding vector:
Figure PCTCN2021091757-appb-000082
Figure PCTCN2021091757-appb-000083
As an optional technical solution, the third processing unit is further configured to determine the first feature vector by the following formula:
Figure PCTCN2021091757-appb-000084
where the first feature vector is
Figure PCTCN2021091757-appb-000085
and to determine the second feature vector by the following formula:
Figure PCTCN2021091757-appb-000086
where the second feature vector is
Figure PCTCN2021091757-appb-000087
As an optional technical solution, the classifier is a random forest classifier.
According to a further aspect of the embodiments of the present invention, a computer-readable storage medium is further provided. The computer-readable storage medium stores a computer program, where the computer program is configured to execute, when run, the steps in any one of the above method embodiments.
Optionally, in this embodiment, the computer-readable storage medium may be configured to store a computer program configured to perform the following steps:
S1: encode a source domain data set and a target domain data set respectively through a first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set;
S2: input the first intrinsic coding vector and the second intrinsic coding vector respectively into a second target network to obtain a first latent coding vector corresponding to the first intrinsic coding vector and a second latent coding vector corresponding to the second intrinsic coding vector;
S3: determine a first feature vector from the first intrinsic coding vector and the first latent coding vector, and a second feature vector from the second intrinsic coding vector and the second latent coding vector;
S4: classify the second feature vector with a target classifier to obtain a classification result, where the classification result indicates whether the second feature vector has a defect.
Optionally, in this embodiment, those of ordinary skill in the art will understand that all or part of the steps in the methods of the above embodiments may be completed by instructing the relevant hardware of the terminal device through a program, and the program may be stored in a computer-readable storage medium. The storage medium may include: a flash disk, a ROM (Read-Only Memory), a RAM (Random Access Memory), a magnetic disk or an optical disk, and the like.
According to a further aspect of the embodiments of the present invention, an electronic apparatus configured to implement the above defect prediction method is further provided. As shown in FIG. 5, the electronic apparatus includes a memory 502 and a processor 504; the memory 502 stores a computer program, and the processor 504 is configured to execute the steps in any one of the above method embodiments through the computer program.
Optionally, in this embodiment, the electronic apparatus may be located in at least one of multiple network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps through the computer program:
S1: encode a source domain data set and a target domain data set respectively through a first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set;
S2: input the first intrinsic coding vector and the second intrinsic coding vector respectively into a second target network to obtain a first latent coding vector corresponding to the first intrinsic coding vector and a second latent coding vector corresponding to the second intrinsic coding vector;
S3: determine a first feature vector from the first intrinsic coding vector and the first latent coding vector, and a second feature vector from the second intrinsic coding vector and the second latent coding vector;
S4: classify the second feature vector with a target classifier to obtain a classification result, where the classification result indicates whether the second feature vector has a defect.
Optionally, those of ordinary skill in the art will understand that the structure shown in FIG. 5 is only illustrative; the electronic apparatus may also be terminal equipment such as a smartphone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (Mobile Internet Devices, MID), or a PAD. FIG. 5 does not limit the structure of the electronic apparatus. For example, the electronic apparatus may include more or fewer components than shown in FIG. 5 (such as a network interface) or have a configuration different from that shown in FIG. 5.
The memory 502 may be configured to store software programs and modules, such as the program instructions/modules corresponding to the defect prediction method and apparatus in the embodiments of the present invention. By running the software programs and modules stored in the memory 502, the processor 504 executes various functional applications and raw data information transmission, that is, implements the above defect prediction method. The memory 502 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 502 may further include memory located remotely from the processor 504, and such remote memory may be connected to the terminal through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 502 may specifically, but not exclusively, be configured to store information such as the target height of the target object. As an example, as shown in FIG. 5, the memory 502 may include, but is not limited to, the first processing unit 402, the second processing unit 404, the third processing unit 406, and the fourth processing unit 408 of the above defect prediction apparatus. It may further include, but is not limited to, other module units of the above defect prediction apparatus, which are not repeated in this example.
Optionally, the transmission device 506 is configured to receive or send data via a network. Specific examples of the network may include wired networks and wireless networks. In one example, the transmission device 506 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices and routers through a network cable so as to communicate with the Internet or a local area network. In one example, the transmission device 506 is a radio frequency (Radio Frequency, RF) module configured to communicate with the Internet wirelessly.
In addition, the electronic apparatus further includes: a connection bus 508, configured to connect the module components in the electronic apparatus.
In other embodiments, the terminal or server may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting the multiple nodes through network communication. The nodes may form a peer-to-peer (P2P, Peer To Peer) network, and any form of computing device, such as a server, a terminal, or another electronic device, can become a node in the blockchain system by joining the peer-to-peer network.
Optionally, in this embodiment, those of ordinary skill in the art will understand that all or part of the steps in the methods of the above embodiments may be completed by instructing the relevant hardware of the terminal device through a program, and the program may be stored in a computer-readable storage medium. The storage medium may include: a flash disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the superiority or inferiority of the embodiments.
If the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as independent products, they may be stored in the above computer-readable storage medium. Based on this understanding, the essence of the technical solution of the present invention, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the methods of the embodiments of the present invention.
In the above embodiments of the present invention, the description of each embodiment has its own focus; for parts not detailed in one embodiment, refer to the relevant descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other ways. The apparatus embodiments described above are only illustrative; for example, the division of units is only a logical functional division, and there may be other division methods in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or in other forms.
Units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above are only preferred implementations of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.
Industrial Applicability
As described above, the defect prediction method and apparatus, storage medium, and electronic apparatus provided by the embodiments of the present invention have the following beneficial effect: they solve the technical problem in the related art that the data structures of the source domain and the target domain are heterogeneous in defect prediction.

Claims (10)

  1. A defect prediction method, comprising:
    encoding a source domain data set and a target domain data set respectively through a first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set;
    inputting the first intrinsic coding vector and the second intrinsic coding vector respectively into a second target network to obtain a first latent coding vector corresponding to the first intrinsic coding vector and a second latent coding vector corresponding to the second intrinsic coding vector;
    determining a first feature vector from the first intrinsic coding vector and the first latent coding vector, and determining a second feature vector from the second intrinsic coding vector and the second latent coding vector;
    classifying the second feature vector with a target classifier to obtain a classification result, wherein the classification result indicates whether the second feature vector has a defect.
  2. The method according to claim 1, wherein before the source domain data set and the target domain data set are respectively encoded through the first target network to obtain the first intrinsic coding vector corresponding to the source domain data set and the second intrinsic coding vector corresponding to the target domain data set, the method further comprises:
    constructing a source domain network, wherein the first target network comprises the source domain network;
    inputting the source domain data set into the source domain network;
    determining a first network parameter corresponding to the source domain network by the following formula:
    Figure PCTCN2021091757-appb-100001
    wherein the first network parameter represents the reconstruction error between the data input into the source domain network and the data output by the source domain network, Q1 is the first network parameter, and the i-th input of the source domain network is
    Figure PCTCN2021091757-appb-100002
    the source domain network is
    Figure PCTCN2021091757-appb-100003
    M+1 is the number of layers of the source domain network,
    Figure PCTCN2021091757-appb-100004
    Figure PCTCN2021091757-appb-100005
    is the reconstructed output representation after learning by the source domain network,
    Figure PCTCN2021091757-appb-100006
    constructing a target domain network, wherein the first target network comprises the target domain network;
    inputting the target domain data set into the target domain network;
    determining a second network parameter corresponding to the target domain network by the following formula:
    Figure PCTCN2021091757-appb-100007
    wherein the second network parameter represents the reconstruction error between the data input into the target domain network and the data output by the target domain network, Q2 is the second network parameter, and the i-th input of the target domain network is
    Figure PCTCN2021091757-appb-100008
    the target domain network is
    Figure PCTCN2021091757-appb-100009
    Figure PCTCN2021091757-appb-100010
    M+1 is the number of layers of the target domain network,
    Figure PCTCN2021091757-appb-100011
    Figure PCTCN2021091757-appb-100012
    is the reconstructed output representation after learning by the target domain network,
    Figure PCTCN2021091757-appb-100013
  3. The method according to claim 2, wherein encoding the source domain data set and the target domain data set respectively through the first target network to obtain the first intrinsic coding vector corresponding to the source domain data set and the second intrinsic coding vector corresponding to the target domain data set comprises:
    when Q1 is less than a first threshold, determining the first intrinsic coding vector by the following formula:
    Figure PCTCN2021091757-appb-100014
    wherein C^(M/2, source) is the first intrinsic coding vector;
    when Q2 is less than a second threshold, determining the second intrinsic coding vector by the following formula:
    Figure PCTCN2021091757-appb-100015
    wherein C^(M/2, target) is the second intrinsic coding vector.
  4. The method according to claim 2, wherein inputting the first intrinsic coding vector and the second intrinsic coding vector respectively into the second target network to obtain the first latent coding vector corresponding to the first intrinsic coding vector and the second latent coding vector corresponding to the second intrinsic coding vector comprises:
    constructing a global encoding network, wherein the second target network comprises the global encoding network;
    inputting the first intrinsic coding vector and the second intrinsic coding vector respectively into the global encoding network to obtain the first latent coding vector corresponding to the first intrinsic coding vector and the second latent coding vector corresponding to the second intrinsic coding vector;
    after the first intrinsic coding vector and the second intrinsic coding vector are respectively input into the global encoding network to obtain the first latent coding vector corresponding to the first intrinsic coding vector and the second latent coding vector corresponding to the second intrinsic coding vector, the method further comprises:
    determining a third network parameter corresponding to the global encoding network by the following formula:
    Figure PCTCN2021091757-appb-100016
    wherein C^(M/2, source) is the first intrinsic coding vector, G^(L, source) is the first latent coding vector, C^(M/2, target) is the second intrinsic coding vector, G^(L, target) is the second latent coding vector, and the global encoding network is
    Figure PCTCN2021091757-appb-100017
    L is the number of layers of the global encoding network.
  5. The method according to claim 4, wherein before the first feature vector is determined from the first intrinsic coding vector and the first latent coding vector and the second feature vector is determined from the second intrinsic coding vector and the second latent coding vector, the method further comprises:
    determining a target network parameter of the second target network by the following formula:
    Figure PCTCN2021091757-appb-100018
  6. The method according to claim 4, wherein determining the first feature vector from the first intrinsic coding vector and the first latent coding vector and determining the second feature vector from the second intrinsic coding vector and the second latent coding vector comprises:
    determining the first feature vector by the following formula:
    Figure PCTCN2021091757-appb-100019
    wherein the first feature vector is
    Figure PCTCN2021091757-appb-100020
    determining the second feature vector by the following formula:
    Figure PCTCN2021091757-appb-100021
    wherein the second feature vector is
    Figure PCTCN2021091757-appb-100022
  7. The method according to any one of claims 1 to 6, wherein the classifier is a random forest classifier.
  8. A defect prediction apparatus, comprising:
    a first processing unit, configured to encode a source domain data set and a target domain data set respectively through a first target network to obtain a first intrinsic coding vector corresponding to the source domain data set and a second intrinsic coding vector corresponding to the target domain data set;
    a second processing unit, configured to input the first intrinsic coding vector and the second intrinsic coding vector respectively into a second target network to obtain a first latent coding vector corresponding to the first intrinsic coding vector and a second latent coding vector corresponding to the second intrinsic coding vector;
    a third processing unit, configured to determine a first feature vector from the first intrinsic coding vector and the first latent coding vector, and determine a second feature vector from the second intrinsic coding vector and the second latent coding vector;
    a fourth processing unit, configured to classify the second feature vector with a target classifier to obtain a classification result, wherein the classification result indicates whether the second feature vector has a defect.
  9. A computer-readable storage medium, comprising a stored program, wherein the program, when run, executes the method according to any one of claims 1 to 7.
  10. An electronic apparatus, comprising a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the method according to any one of claims 1 to 7 through the computer program.
PCT/CN2021/091757 2020-09-30 2021-04-30 WO2022068200A1 (zh) Defect prediction method and apparatus, storage medium, and electronic apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011065824.9 2020-09-30
CN202011065824.9A CN112199280B (zh) 2020-09-30 2020-09-30 软件的缺陷预测方法和装置、存储介质和电子装置

Publications (1)

Publication Number Publication Date
WO2022068200A1 true WO2022068200A1 (zh) 2022-04-07

Family

ID=74012896

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/091757 WO2022068200A1 (zh) 2020-09-30 2021-04-30 缺陷预测方法和装置、存储介质和电子装置

Country Status (2)

Country Link
CN (1) CN112199280B (zh)
WO (1) WO2022068200A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112199280B (zh) 2020-09-30 2022-05-20 三维通信股份有限公司 Software defect prediction method and device, storage medium and electronic device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160292069A1 (en) * 2014-07-06 2016-10-06 International Business Machines Corporation Utilizing semantic clusters to Predict Software defects
CN110659207A (zh) * 2019-09-02 2020-01-07 北京航空航天大学 Heterogeneous cross-project software defect prediction method based on kernel spectral mapping transfer ensemble
CN110751186A (zh) * 2019-09-26 2020-02-04 北京航空航天大学 Cross-project software defect prediction method based on supervised representation learning
CN111198820A (zh) * 2020-01-02 2020-05-26 南京邮电大学 Cross-project software defect prediction method based on a shared-hidden-layer autoencoder
CN112199280A (zh) * 2020-09-30 2021-01-08 三维通信股份有限公司 Defect prediction method and device, storage medium and electronic device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017181286A1 (en) * 2016-04-22 2017-10-26 Lin Tan Method for determining defects and vulnerabilities in software code
US20180150742A1 (en) * 2016-11-28 2018-05-31 Microsoft Technology Licensing, Llc. Source code bug prediction
CN111290947B (zh) * 2020-01-16 2022-06-14 华南理工大学 Cross-software defect prediction method based on adversarial discrimination
CN111522743B (zh) * 2020-04-17 2021-10-22 北京理工大学 Software defect prediction method based on a gradient boosting tree support vector machine


Also Published As

Publication number Publication date
CN112199280B (zh) 2022-05-20
CN112199280A (zh) 2021-01-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21873863

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21873863

Country of ref document: EP

Kind code of ref document: A1
