CN112965894B - Defect positioning method based on context awareness - Google Patents

Defect positioning method based on context awareness

Info

Publication number
CN112965894B
CN112965894B (application CN202110152656.5A)
Authority
CN
China
Prior art keywords
node
test case
defect
program
statement
Prior art date
Legal status
Active
Application number
CN202110152656.5A
Other languages
Chinese (zh)
Other versions
CN112965894A (en)
Inventor
雷晏
张卓
刘春燕
谢欢
鄢萌
徐玲
徐洲
Current Assignee
Chongqing University
Original Assignee
Chongqing University
Priority date
Filing date
Publication date
Application filed by Chongqing University
Priority to CN202110152656.5A
Publication of CN112965894A
Application granted
Publication of CN112965894B
Status: Active

Classifications

    • G06F 11/3676: Test management for coverage analysis
    • G06F 11/362: Software debugging
    • G06F 11/368: Test management for test version control, e.g. updating test cases to a new software version
    • G06N 3/045: Combinations of networks
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • Y02P 90/30: Computing systems specially adapted for manufacturing

Abstract

The invention relates to a context-aware defect localization method. A program slicing technique is used to construct the defect context, which can be expressed as a directed graph, namely a program dependency graph, whose nodes are statements directly or indirectly associated with the failure and whose edges are the association relations between those statements. Based on this graph, each node is embedded as a node representation vector using one-hot coding, a GNN captures the dependency relations between statements, and CAN is trained with test cases on top of these node representation vectors, so that more accurate node representation vectors are obtained. Finally, a virtual test case set is constructed such that every statement in the defect context of the defective target program is covered by exactly one test case and every test case covers exactly one defect context statement. Feeding this test case set into the trained GNN yields the suspicious value of every statement. The method analyses the defect context and incorporates it into the suspiciousness evaluation to improve defect localization; experimental analysis shows that it significantly improves the effectiveness of defect localization.

Description

Defect positioning method based on context awareness
Technical Field
The invention relates to a defect positioning method, in particular to a defect positioning method based on context awareness.
Background
Automatic software debugging plays an essential role in reducing the labor and time developers spend during testing and can greatly lighten their burden. Researchers have therefore proposed many software defect localization methods that assist developers in locating defects in programs by analyzing the program executions that produce unexpected output. Among them, spectrum-based fault localization (SFL) is one of the most popular defect localization methods.
Spectrum-based fault localization (SFL) uses program coverage information and test case results to build a defect localization model and computes, for each executable statement in the program, a suspicious value of its being the defective statement. SFL defines an information model called the program spectrum, which is the input for computing statement suspicious values. The program spectrum records the run-time information and the test case results after the program has been executed. Assume a program P contains N statements and its test case set T contains M test cases, at least one of which fails (see FIG. 1). X_ij = 1 indicates that test case i executed statement j, and X_ij = 0 indicates that test case i did not execute statement j. The M×N matrix records the execution information of every statement under the test case set T. The error vector e records the test case results, where e_i = 1 indicates that test case i is a failed test case and e_i = 0 indicates that it is a successful one. Based on the program spectrum, the suspicious value formulas defined by SFL use four parameters, a_np, a_nf, a_ep and a_ef, which are bound to a specific statement: a_np is the number of passed test cases that do not execute the statement, a_nf the number of failed test cases that do not execute it, a_ep the number of passed test cases that execute it, and a_ef the number of failed test cases that execute it. In practice we expect the truly faulty statement to have a high a_ef value and a low a_ep value; when a_ef is maximal and a_ep is minimal, the statement is executed by all failed test cases and by no successful test case, and every suspicious value formula returns its maximum value for that statement. In the general case, different suspicious value formulas output different suspicious values.
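As a concrete illustration of these four counters, the sketch below derives a_np, a_nf, a_ep and a_ef from a coverage matrix X and an error vector e and plugs them into the Ochiai formula, one of the classical SFL formulas compared in the experiments later in this document; the NumPy helper functions are illustrative, not part of the patent.

```python
import numpy as np

def spectrum_counters(X, e):
    """X: (M, N) 0/1 coverage matrix; e: (M,) error vector (1 = failed test)."""
    failed, passed = (e == 1), (e == 0)
    a_ef = X[failed].sum(axis=0)   # failed tests that execute each statement
    a_ep = X[passed].sum(axis=0)   # passed tests that execute each statement
    a_nf = failed.sum() - a_ef     # failed tests that skip each statement
    a_np = passed.sum() - a_ep     # passed tests that skip each statement
    return a_ef, a_ep, a_nf, a_np

def ochiai(X, e):
    # Ochiai suspiciousness: a_ef / sqrt((a_ef + a_nf) * (a_ef + a_ep)).
    a_ef, a_ep, a_nf, _ = spectrum_counters(X, e)
    denom = np.sqrt((a_ef + a_nf) * (a_ef + a_ep))
    return np.divide(a_ef, denom, out=np.zeros_like(denom, dtype=float), where=denom > 0)
```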
FIG. 2 illustrates defect localization using deep learning techniques. The model includes an input layer, a deep learning component and an output layer. At the input layer, the test case coverage information matrix and the test case result vector serve as training samples and their corresponding labels: from the M×N matrix and the corresponding result vector, each step takes h rows of the M×N matrix as the model input and the corresponding entries of the result vector as labels, with starting row i ∈ {1, 1+h, 1+2h, …}.
At the deep learning component layer, MLP-FL, CNN-FL and BiLSTM-FL use a multi-layer perceptron, a convolutional neural network and a bidirectional long short-term memory network, respectively, as the deep learning component. At the output layer, a sigmoid function is used, which maps its input to a value between 0 and 1; the values of the sigmoid output vector y generally differ from those of the result vector e. The BP algorithm then continually updates the model parameters through repeated iterative training, so that the difference between the result vector e and the output vector y keeps shrinking. Although these SFL approaches achieve good localization results, they still have a limitation: their suspiciousness models do not use the defect context. The defect context is a set of program statements that have direct or indirect data or control dependencies with the output statement of a failed test case. In fact, the defect context shows the propagation process and mechanism of defects in the program, which is of great importance for understanding the program and locating defects. Thus, relying solely on test case results and coverage information, without taking into account the complex inherent relationships in the defect context, can affect the accuracy of defect localization.
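A minimal Python sketch of the MLP-FL-style pipeline just described, a coverage matrix in and a sigmoid score out, trained with back-propagation; the layer sizes, optimizer and batch height h are illustrative assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

def train_mlp_fl(cov, err, h=16, epochs=50):
    # cov: (M, N) float tensor of coverage rows; err: (M,) float result vector e.
    M, N = cov.shape
    model = nn.Sequential(nn.Linear(N, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.BCELoss()
    for _ in range(epochs):
        for i in range(0, M, h):            # start rows i in {1, 1+h, 1+2h, ...}
            x = cov[i:i + h]                # h rows of the M x N coverage matrix
            y = err[i:i + h].unsqueeze(1)   # matching entries of result vector e
            opt.zero_grad()
            loss = loss_fn(model(x), y)     # shrink the gap between output and e
            loss.backward()                 # back-propagation updates parameters
            opt.step()
    return model
```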
Disclosure of Invention
Aiming at the problems existing in the prior art, the technical problem to be solved by the invention is: how to accurately localize defects using the defect context.
In order to solve the technical problem, the invention adopts the following technical scheme: a defect positioning method based on context awareness, comprising the following steps:
S100, data extraction and preparation, with the following specific steps:
S110, given a program containing N statements, a test case set T containing M test cases runs the program; T contains at least one failed test case, and the output of each test case in T is known;
each test case in T is handed to the program for execution to obtain its statement coverage information, so that the statement coverage information of all test cases forms an M×N coverage information matrix, and the known outputs of all the test cases form a result vector;
S120, constructing a defect context, which is the set of statements influencing the program error: a statement set is determined with a dynamic slicing technique from the output statement of one failed test case in the test case set T, and the statements in this set have dependency relations with the output statement of the selected failed test case;
S130: the program statement dependency graph and the adjacency matrix A are constructed by the following steps:
S131: constructing a program statement dependency graph by means of the dynamic slicing technique, where a node of the graph represents one statement in the defect context and an edge represents the association relation between two statements;
the program statement dependency graph is denoted G = (V, ξ), where V is the set of nodes and ξ the set of edges; a node v_i ∈ V represents one statement in the defect context, and an edge (v_i, v_j) ∈ ξ represents the association relation between the two statements v_i and v_j;
S132: the association relations between nodes of the program statement dependency graph are represented by an adjacency matrix A, where A_ij is the element in row i, column j of A; A_ij = 1 means there is a directed edge a_ij from node i to node j, A_ij = 0 means there is no such edge, so A_ij reflects whether information flows from node i to node j;
S140: let the defect context contain K statements; the N statements of the program are examined and those not in the defect context are discarded, yielding an M×K matrix that records the execution information of the K defect-context statements under the test case set T; each row of the M×K matrix represents one test case, and the element y_ij in row i, column j of the M×K matrix is 0 or 1, where y_ij = 1 indicates that the j-th statement of the defect context is executed by test case i and y_ij = 0 indicates that it is not;
S200: the specific steps of the model training process are as follows:
S210: a node representation vector is embedded for every node of the program statement dependency graph via one-hot coding and initialized as the one-hot vector vector_i^(1) ∈ R^d, where the length of the node representation vector equals the number of nodes d and the superscript (1) marks the first round of iteration; after t rounds of iteration each node i has fully collected the information of all its neighbor nodes, yielding the iterated representation m_i^(t) ∈ R^d of the node;
S220: take row i of the M×K matrix; when the element in column j of row i is 1, i.e., y_ij = 1, replace y_ij with the node representation vector vector_j (the j here is the column index, the same j as in y_ij); when the element in column j of row i is 0, i.e., y_ij = 0, replace y_ij with a zero vector;
S230: take row i of the M×K matrix processed by S220 as the input of a neural network model to obtain a value o_i; the difference between o_i and the known output e_i of the test case represented by row i gives the loss value, and during iteration a back-propagation algorithm updates the parameters of the neural network model and the node vectors vector_1 to vector_K according to the loss value. The specific iteration count S is set empirically and differs from program to program; if the number of iterations is smaller than S, the next step is executed, otherwise S300 is executed;
S231: the node representation vectors are updated as follows:
taking node i as the central node, a neural network a: R^d × R^d -> R assigns each neighbor node j a weight δ_ij according to the importance of neighbor node j to the central node i, and the correlation coefficients of all neighbor nodes are normalized with a softmax function:
δ_ij = exp(a(vector_i, vector_j)) / Σ_{k∈N_i} exp(a(vector_i, vector_k))
where a denotes the neural network a: R^d × R^d -> R;
S232: compute the interaction information m_i^(t) of the central node i:
m_i^(t) = Σ_{j∈N_i} A_ij · δ_ij · vector_j^(t-1)
where A_ij indicates, in the adjacency matrix A, whether information flows from node i to node j, vector_j is the node representation vector of neighbor node j, and N_i is the set of all neighbor nodes of the central node i;
S233: the GRU is a gated recurrent unit; in round t of the node representation iteration, the GRU takes the node's representation from the previous round, vector_i^(t-1), and the interaction information m_i^(t) as input and outputs the new node representation vector of node i:
vector_i^(t) = GRU(vector_i^(t-1), m_i^(t))
return to S220 after updating the node representation vectors;
S300: performing defect localization on a defective target program:
for a given faulty program, we want to find out which of the statements it contains caused the failure; to this end each statement receives a suspicious value between 0 and 1, and the larger the suspicious value, the higher the probability that the statement caused the failure. In this way we find the location of the defect in the program.
S310: construct a defect context for the defective target program by the method of S120, and construct a K×K virtual test case set, where
K is the number of statements in the defect context, each virtual test case covers exactly one defect context statement, and each row of the K×K virtual test case set is one virtual test case;
S320: take row i of the K×K virtual test case set; when the element in column j of row i is 1, i.e., y'_ij = 1, replace y'_ij with the node representation vector vector_j; when the element in column j of row i is 0, i.e., y'_ij = 0, replace y'_ij with a zero vector;
S330: take row i of the K×K virtual test case set processed by S320 as the input of the trained neural network model to obtain a value o'_i; o'_i is the suspicious value;
S340: traverse every row of the K×K virtual test case set to obtain K suspicious values in the range 0-1; the larger the suspicious value, the higher the possibility that the statement makes the target program fail.
Preferably, the method for constructing the defect context in S120 is specifically as follows:
the defect context is the set of statements influencing the program error; a statement set is determined with a dynamic slicing technique from the output statement of one failed test case in the test case set T, and the statements in this set have dependency relations with the output statement of the selected failed test case;
the following slicing criterion is used:
failSC=(outStm,incorrectVar,failEx)
where outStm is the output statement, incorrectVar is a variable in that statement whose value is incorrect, and failEx is the failed execution path.
Compared with the prior art, the invention has at least the following advantages:
The method performs context-aware neural defect localization: it analyses the fault context and incorporates it into the suspiciousness evaluation to improve defect localization. It models the defect context with a program slice, represents it as a program dependency graph, and then builds a graph neural network to analyse and learn the complex relations among the statements in the defect context; eventually a model is learned that evaluates the suspiciousness of each statement. The inventors conducted experiments on 12 real large programs and compared the method of the present invention with 10 state-of-the-art defect localization methods. The results show that the method can significantly improve the effectiveness of defect localization; for example, 4.62%, 20%, 29.23%, 49.23% and 64.62% of the defects are localized within the top 1, 3, 5, 10 and 20 positions, respectively.
Drawings
FIG. 1 is the statement coverage information matrix of M test cases.
FIG. 2 is a diagram of suspiciousness evaluation using a neural network.
FIG. 3 is a schematic diagram of the method of the present invention.
FIG. 4 is a virtual test case set.
Fig. 5 is an example of the method of the present invention.
FIGS. 6 (a) -6 (d) are graphs comparing the EXAM values of CAN and 10 defect localization methods.
Fig. 7 (a) and 7 (b) are RImp comparison diagrams for CAN.
Detailed Description
The present invention will be described in further detail below.
The invention uses learning capability to construct a model that can simulate the complex association relations in the defect context, thereby integrating the context of the erroneous statement into the defect localization technique. We therefore propose a context-aware defect localization technique, CAN for short. Notably, CAN simulates the defect context by building a program dependency graph, which exhibits a set of statements that interact through data dependencies and control dependencies. CAN uses graph neural network techniques to simulate the propagation of the defect within the defect context and fuses the defect context into the defect localization system, so that the position of the defective statement can be localized accurately. Experiments show that on 12 large real programs CAN achieves a very accurate localization effect, localizing 49.23% of defects within the TOP-10. CAN therefore significantly improves the efficacy of defect localization techniques.
Graph neural networks (GNNs) can model the nodes of a graph structure and the association relations between them; the model is trained iteratively through message passing between nodes until it converges, so that classification or regression problems can be solved. The defect context is expressed as graph-structured data, namely a program dependency graph, which comprises nodes and edges: the nodes are statements of the program, and the edges are the association relations among the statements, including data dependencies and control dependencies. By learning the complex associations of the elements in the defect context, GNNs can integrate the defect context into the defect localization system.
The method of the invention models the defect context with a program dependency graph, uses graph neural networks (GNNs) to analyse and understand it, and then blends the defect context into the defect localization system to localize the program defect. Specifically, CAN first uses a program slicing technique to construct the defect context, expressed as a directed graph, namely a program dependency graph, whose nodes are statements directly or indirectly associated with the failure and whose edges are the association relations between the statements. Based on this graph, CAN uses a GNN to capture the dependency relations between statements and then generates the corresponding node representation vectors, relationships that conventional defect localization methods such as SFL do not represent well. On top of these node representation vectors, CAN is trained with test cases, yielding more accurate statement representations. Finally, CAN uses virtual test cases to evaluate, for each statement, its suspiciousness of being the defective statement.
Referring to fig. 3, a defect localization method based on context awareness includes the following steps:
S100, data extraction and preparation, with the following specific steps:
S110, given a program containing N statements, a test case set T containing M test cases runs the program; T contains at least one failed test case, and the output of each test case in T is known.
Each test case in T is handed to the program for execution to obtain its statement coverage information, so that the statement coverage information of all test cases forms an M×N coverage information matrix and the known outputs of all the test cases form a result vector.
The execution information of program P is represented by the coverage information matrix and the test case result vector. Based on the defective program and this information model (coverage information matrix and test case result vector), CAN uses a graph neural network model to integrate the defect context into the defect localization system. The graph neural network model includes an input layer, a deep learning component and an output layer. At the input layer, the coverage information matrix of the test cases serves as the training samples and the result vector of the test cases as the corresponding labels: from the M×N coverage information matrix and the corresponding result vector, each step takes h rows of the matrix as the model input and the corresponding entries of the result vector as labels, with starting row i ∈ {1, 1+h, 1+2h, …}. (By comparison, MLP-FL, CNN-FL and BiLSTM-FL use a multi-layer perceptron, a convolutional neural network and a bidirectional long short-term memory network, respectively, as the deep learning component.) At the output layer, a sigmoid function maps the input to a value between 0 and 1; the values of the output vector y generally differ from those of the result vector e. The BP (back-propagation) algorithm then continually updates the model parameters through repeated iterative training so that the difference between the result vector e and the output vector y keeps shrinking.
S120, constructing a defect context, which is the set of statements influencing the program error: a statement set is determined with a dynamic slicing technique from the output statement of one failed test case in the test case set T, and the statements in this set have dependency relations with the output statement of the selected failed test case.
The method for constructing the defect context is as follows:
since the defect context is strongly related to a failed execution, the statements that directly or indirectly affect the computation of the failure's erroneous output value through dynamic data/control dependences are included in the defect context, which is therefore constructed with a dynamic slicing method. The defect context is the set of statements influencing the program error; a statement set is determined with a dynamic slicing technique from the output statement of one failed test case in the test case set T, and the statements in this set have dependency relations with the output statement of the selected failed test case. In a specific implementation, these dependencies may be direct or indirect data or control dependencies.
The following slicing criterion is used:
failSC=(outStm,incorrectVar,failEx)
where outStm is the output statement, incorrectVar is a variable in that statement whose value is incorrect, and failEx is the failed execution path.
It should be noted that dynamic slicing is an existing technique.
A slicing criterion failSC = (outStm, incorrectVar, failEx) is selected at random from among all failed test cases' executions, and the defect context is acquired from it: after computing the slice with the failSC = (outStm, incorrectVar, failEx) slicing algorithm, the defect context of the program is obtained, and the program statement dependency graph is then constructed from this defect context.
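Purely as an illustrative data structure (the patent treats dynamic slicing itself as prior art), the slicing criterion could be carried around as a small record; the field names and types below are assumptions:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SlicingCriterion:
    """failSC = (outStm, incorrectVar, failEx); types are illustrative."""
    out_stm: int          # id of the failing output statement outStm
    incorrect_var: str    # name of the variable whose value is wrong (incorrectVar)
    fail_ex: List[int]    # statement ids along the failed execution path failEx
```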
S130: the program statement dependency graph and the adjacency matrix A are constructed by the following steps:
S131: constructing a program statement dependency graph by means of the dynamic slicing technique, where a node of the graph represents one statement in the defect context and an edge represents the association relation between two statements;
the program statement dependency graph is denoted G = (V, ξ), where V is the set of nodes and ξ the set of edges; a node v_i ∈ V represents one statement in the defect context, and an edge (v_i, v_j) ∈ ξ represents the association relation between the two statements v_i and v_j. The edges are the association relations among the statements, including data dependencies and control dependencies; since the failed test case covers the defect context statements, such relations exist, and the neural network is used to find the nonlinear relationship between the test case coverage information and the test case results.
S132: the association relations between nodes of the program statement dependency graph are represented by an adjacency matrix A, where A_ij is the element in row i, column j of A; A_ij = 1 means there is a directed edge a_ij from node i to node j, A_ij = 0 means there is no such edge, so A_ij reflects whether information flows from node i to node j;
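A straightforward sketch of S132, assuming the nodes are indexed 0..K-1 and the dependency edges are available as (i, j) pairs:

```python
import numpy as np

def build_adjacency(edges, K):
    # edges: (i, j) pairs, one per directed data/control dependence of the
    # program statement dependency graph.
    A = np.zeros((K, K), dtype=np.int64)
    for i, j in edges:
        A[i, j] = 1   # A_ij = 1: information can flow from node i to node j
    return A
```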
S140: let the defect context contain K statements; the N statements of the program are examined and those not in the defect context are discarded, yielding an M×K matrix that records the execution information of the K defect-context statements under the test case set T; each row of the M×K matrix represents one test case, and the element y_ij in row i, column j of the M×K matrix is 0 or 1, where y_ij = 1 indicates that the j-th statement of the defect context is executed by test case i and y_ij = 0 indicates that it is not;
because the coverage information matrix cannot show the association relations among statements or the propagation process of defects in the program, the program statement dependency graph needs to be constructed.
S200: the specific steps of the model training process are as follows:
S210: a node representation vector is embedded for every node of the program statement dependency graph via one-hot coding and initialized as the one-hot vector vector_i^(1) ∈ R^d, where the length of the node representation vector equals the number of nodes d and the superscript (1) marks the first round of iteration; after t rounds of iteration each node i has fully collected the information of all its neighbor nodes, yielding the iterated representation m_i^(t) ∈ R^d of the node;
S220: take row i of the M×K matrix; when the element in column j of row i is 1, i.e., y_ij = 1, replace y_ij with the node representation vector vector_j (the j here is the column index, the same j as in y_ij); when the element in column j of row i is 0, i.e., y_ij = 0, replace y_ij with a zero vector;
y_ij = 1 means node v_j is executed by test case T_i, and the representation vector vector_j of node v_j is input into the neural network; y_ij = 0 means node v_j is not executed by test case T_i, and a zero vector is input into this iteration of the neural network.
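A small NumPy sketch of S140 and S220 together, assuming the coverage matrix and node vectors are arrays; context_coverage and build_model_input are hypothetical helper names:

```python
import numpy as np

def context_coverage(X, context_idx):
    # S140: keep only the K defect-context columns of the (M, N) coverage matrix X.
    return X[:, context_idx]

def build_model_input(row, vectors):
    # S220: a covered statement (y_ij = 1) contributes its node representation
    # vector_j; an uncovered one (y_ij = 0) contributes a zero vector.
    # row: length-K 0/1 coverage of one test case; vectors: (K, d) node vectors.
    return np.where(row[:, None] == 1, vectors, 0.0)
```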
S230: take row i of the M×K matrix processed by S220 as the input of a neural network model to obtain a value o_i; the difference between o_i and the known output e_i of the test case represented by row i gives the loss value, and during iteration a back-propagation algorithm updates the parameters of the neural network model and the node vectors vector_1 to vector_K according to the loss value. The specific iteration count S is set empirically and differs from program to program; if the number of iterations is smaller than S, the next step is executed, otherwise S300 is executed;
S231: the node representation vectors are updated as follows:
taking node i as the central node, a neural network a: R^d × R^d -> R assigns each neighbor node j a weight δ_ij according to the importance of neighbor node j to the central node i, and the correlation coefficients of all neighbor nodes are normalized with a softmax function:
δ_ij = exp(a(vector_i, vector_j)) / Σ_{k∈N_i} exp(a(vector_i, vector_k))
where a denotes the neural network a: R^d × R^d -> R;
S232: compute the interaction information m_i^(t) of the central node i:
m_i^(t) = Σ_{j∈N_i} A_ij · δ_ij · vector_j^(t-1)
where A_ij indicates, in the adjacency matrix A, whether information flows from node i to node j, vector_j is the node representation vector of neighbor node j, and N_i is the set of all neighbor nodes of the central node i;
S233: the GRU is a gated recurrent unit; in round t of the node representation iteration, the GRU takes the node's representation from the previous round, vector_i^(t-1), and the interaction information m_i^(t) as input and outputs the new node representation vector of node i:
vector_i^(t) = GRU(vector_i^(t-1), m_i^(t))
Returning to S220 after updating the node representation vector;
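Below is a compact PyTorch sketch of one propagation round (S231-S233), under the assumption that the attention network a is a single linear layer over the concatenated vectors of nodes i and j; the class name and the -inf masking are illustrative choices, not the patent's implementation.

```python
import torch
import torch.nn as nn

class ContextGNNLayer(nn.Module):
    """One propagation round of S231-S233: attention, message, GRU update."""
    def __init__(self, d):
        super().__init__()
        self.att = nn.Linear(2 * d, 1)   # the neural network a: R^d x R^d -> R
        self.gru = nn.GRUCell(d, d)      # fuses m_i^(t) with vector_i^(t-1)

    def forward(self, vectors, adj):
        # vectors: (K, d) node representation vectors; adj: (K, K) matrix A
        K, d = vectors.shape
        src = vectors.unsqueeze(1).expand(K, K, d)             # vector_i
        dst = vectors.unsqueeze(0).expand(K, K, d)             # vector_j
        raw = self.att(torch.cat([src, dst], dim=-1)).squeeze(-1)
        raw = raw.masked_fill(adj == 0, float("-inf"))         # keep only j in N_i
        delta = torch.softmax(raw, dim=1)                      # S231: softmax weights
        delta = torch.nan_to_num(delta)                        # isolated nodes -> 0
        m = delta @ vectors                                    # S232: m_i = sum_j delta_ij vector_j
        return self.gru(m, vectors)                            # S233: GRU update
```

Stacking t such rounds and feeding the prepared rows of S220 into a linear-plus-sigmoid output head reproduces the training loop of S230.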
According to the node representation updating process above, CAN trains the model through continuous iteration, and the node representation vectors are updated in each iteration. Suppose CAN uses row i of the M×K coverage matrix, [y_i1, y_i2, …, y_iK], and the corresponding test case result e_i. y_ij = 1 means node v_j is executed by test case T_i, and CAN inputs its representation vector vector_j into this iteration of the model; y_ij = 0 means node v_j is not executed by test case T_i, and CAN inputs a zero vector into this iteration of the model.
After test case T_i is executed, we select the set of statements strongly related to the defective output and input their node representation vectors into the GNN model, where they are iteratively trained; the node representation vectors of statements not covered by T_i remain unchanged, and only those of covered statements are updated. After the node representation update is completed, the model outputs a value o_i between 0 and 1 (i ∈ {1, 2, …, M}) through a linear transformation layer.
CAN iterates continuously, taking each row of the M×K matrix and the corresponding test case result as input and using the back-propagation algorithm to update the model parameters and the node vectors vector_1 to vector_K. The goal is to continually narrow the difference between the values in the output o and the test case result vector e. The algorithm computes forward from the input layer to the output layer and then updates the parameters and node representations backwards. CAN trains with a dynamically adjusted learning rate, which has two advantages: a larger learning rate at the start makes the loss drop quickly, and a smaller learning rate later keeps the training from missing the optimum.
In the formula for computing LR below, an Epoch means one complete pass over the training data, LR is the learning rate, DropRate is the factor applied at each adjustment of the learning rate, and EpochDrop is the frequency with which the learning rate is updated. We set the initial learning rate LR0 to 0.01 and DropRate to 0.98; EpochDrop is set according to the size of the test case set.

LR = LR0 * DropRate^((Epoch+1)/EpochDrop)
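A one-function sketch of this step-decay schedule; the floor division and the default epoch_drop = 10 are assumptions (the patent ties EpochDrop to the size of the test case set):

```python
def learning_rate(epoch, lr0=0.01, drop_rate=0.98, epoch_drop=10):
    # LR = LR0 * DropRate^((Epoch+1)/EpochDrop), stepped every epoch_drop epochs.
    return lr0 * drop_rate ** ((epoch + 1) // epoch_drop)
```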
S300: performing defect positioning on a defective target program:
for a given faulty program, we want to find out which of the statements it contains caused the failure; to this end each statement receives a suspicious value between 0 and 1, and the larger the suspicious value, the higher the probability that the statement caused the failure. In this way we find the location of the defect in the program.
S310: construct a defect context for the defective target program by the method of S120, and construct a K×K virtual test case set (see FIG. 4), where K is the number of statements in the defect context and each virtual test case covers exactly one defect context statement, so that K test cases in total, i.e., the K×K virtual test case set, are constructed;
each row of the K×K virtual test case set is one virtual test case;
S320: take row i of the K×K virtual test case set; when the element in column j of row i is 1, i.e., y'_ij = 1, replace y'_ij with the node representation vector vector_j (the j here is the column index, the same j as in y'_ij); when the element in column j of row i is 0, i.e., y'_ij = 0, replace y'_ij with a zero vector;
S330: take row i of the K×K virtual test case set processed by S320 as the input of the trained neural network model to obtain a value o'_i; o'_i is the suspicious value;
S340: traverse every row of the K×K virtual test case set to obtain K suspicious values in the range 0-1; the larger the suspicious value, the higher the possibility that the statement makes the target program fail, i.e., the statement with the larger suspicious value is more likely to be the location of the program's defect, and the defect is thereby localized.
One example of CAN
FIG. 5 is an example showing how CAN works; program P in the example contains 8 statements, one of which, S_4, is defective. FIG. 5(a) shows program P with the defective statement S_4. FIG. 5(b) shows 6 test cases, of which T_2 and T_3 are failed test cases. FIG. 5(c) shows the slice of program P under test case T_3; the result contains 6 of the 8 statements. We can see that in the slicing result, S_1, S_3, S_4, S_5 and S_8 influence the variable z in S_8. FIG. 5(d) shows the dependency graph of program P, including the program's control dependencies and data dependencies.
FIG. 5(e) shows the training process of CAN. CAN converts the program dependency graph into an adjacency matrix and inputs it into the GNN model; CAN then trains the model with the coverage information and the test case result vector. In the example, 6 vectors represent the 6 nodes; for example, S_1 is a node of program P and vector1 is the node representation of S_1. Specifically, according to test case T_1 = [1,1,1,1,1,1] and its result 0 (the rightmost vector of FIG. 5(b)), we input (vector1, vector2, vector3, vector4, vector5, vector6) and the result 0 into the model; according to test case T_2 = [1,1,1,1,1,1] and its result 1, we input the vectors after the first round of iteration and the result 1 into the model; according to test case T_3 = [1,1,1,1,1,1] and its result 1, we input the vectors after the previous iteration and the result 1 into the model; according to test cases T_4, T_5 and T_6, each [1,1,1,1,0,1] with result 0, we input (vector1, vector2, vector3, vector4, zero vector, vector6) after the previous iteration and the result 0 into the model. The network is trained repeatedly until the loss is small enough, i.e., the convergence condition is reached. After training, the model reflects the complex nonlinear relationship between the statement representations and the test case coverage information and test case results.
Finally, CAN constructs virtual test cases (see FIG. 5(f)); there are 6 of them, and each contains only one covered statement. A virtual test case is input into the trained model, and the output of the model is the suspicious value of the statement covered by that virtual test case. For example, inputting the virtual test case v1 = [1,0,0,0,0,0] into the trained model outputs 0.6, the suspicious value of statement S_1. Similarly, we can calculate the suspicious values of the other statements. Since the defect context does not contain S_2 and S_7, CAN gives S_2 and S_7 the lowest suspicious value, 0. As shown in FIG. 5(g), the final suspiciousness ranking is (S_4, S_1, S_3, S_5, S_6, S_8, S_2, S_7); the true defect statement S_4 ranks first.
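Putting S310-S340 together, the scoring pass can be sketched as below, where model stands for the trained network of S200 plus its output layer and build_model_input for the row-preparation step of S320 (both hypothetical names from the earlier sketches):

```python
import numpy as np

def rank_statements(model, vectors):
    # vectors: (K, d) trained node representation vectors; model maps one
    # prepared (K, d) input to a suspicious value in [0, 1].
    K = vectors.shape[0]
    virtual = np.eye(K, dtype=np.int64)      # row i covers only context statement i
    suspicious = np.array([model(build_model_input(row, vectors)) for row in virtual])
    return np.argsort(-suspicious), suspicious   # most suspicious statements first
```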
Experimental test
A. Experimental construction
To verify the effectiveness of CAN, it was compared with 10 state-of-the-art defect localization methods: MLP-FL, CNN-FL, BiLSTM-FL, Ochiai, ER5, GP02, GP03, Dstar, GP19 and ER1'. The experiments used large subject programs widely used in the defect localization field, whose code sizes range from 5.4 thousand to 491 thousand lines.
Table 1 summarizes the characteristics of these subject programs. For each program, the "description" column of Table 1 describes the subject program; the "versions" column gives the number of defective versions; the "code size" column gives the number of lines of code; "test cases" gives the number of test cases. The first 4 programs (chart, math, lang and time) are from Defects4J (http://defects4j.org); python, gzip and libtiff are from ManyBugs (http://repairbenchmarks.cs.umass.edu/ManyBugs/); space and the 4 versions of nanoxml are from SIR (http://sir.unl.edu/portal/index.php).
Experimental environment: an I5-2640 CPU, 64 GB of memory and a 12 GB NVIDIA TITAN X Pascal GPU; the operating system is Ubuntu 16.04.3.
B. Evaluation method
To verify the effectiveness of CAN, we use three widely used evaluation metrics: Top-N, EXAM and RImp. Top-N shows the best-case localization effect, while EXAM and RImp show the overall localization effect.
Table 1. Subject programs

Program name | Description | Versions | Code size (KLOC) | Test cases
python | General-purpose language | 8 | 407 | 355
gzip | Data compression | 5 | 491 | 12
libtiff | Image processing | 12 | 77 | 78
space | ADL interpreter | 35 | 6.1 | 13585
nanoxml_v1 | XML parser | 7 | 5.4 | 206
nanoxml_v2 | XML parser | 7 | 5.7 | 206
nanoxml_v3 | XML parser | 10 | 8.4 | 206
nanoxml_v5 | XML parser | 7 | 8.8 | 206
chart | JFreeChart | 26 | 96 | 2205
math | Apache Commons Math | 106 | 85 | 3602
lang | Apache Commons Lang | 65 | 22 | 2245
time | Joda-Time | 27 | 53 | 4130
Specifically, top-N shows the positioning accuracy, i.e. how many proportion of defect versions in the positioning result of a defect positioning method position the real defect sentences in the first N. The higher the value of Top-N indicates that there are more true defect statements located in the first N bits. Expm is defined as the percentage of sentences that have been checked when a truly erroneous sentence was found. Lower value of Exam indicates better defect localization performance. RImp is defined as the sum of the number of all statements that CAN find all versions of a program for error checking divided by the sum of the number of all statements that CAN find all versions of the program for error checking by another defect localization method. For CAN, a lower rim value represents better positioning performance.
Top-N. Our experiments used Top-N (N = 1, 3, 5, 10, 20) to compare CAN with the 10 defect localization methods above. Table 2 shows the Top-N distribution of the 11 defect localization methods. In Table 2, CAN gives the best performance in all five Top-N scenarios: specifically, CAN localizes 4.62% of the defective versions at Top-1, 20% at Top-3, 29.23% at Top-5, 49.23% at Top-10 and 64.62% at Top-20.
TABLE 2 Top-N comparison
To compare CAN with the other defect localization methods, we plotted four EXAM comparison graphs, FIGS. 6(a)-6(d). The ordinate in FIGS. 6(a)-6(d) is the proportion of statements examined across all versions of the defective programs, and the abscissa is the percentage of versions in which the defective statement has been found. A point in FIGS. 6(a)-6(d) represents the percentage of all defective versions whose defect can be found by examining a given proportion of the executable code. The results in FIGS. 6(a)-6(d) show that the CAN curve lies well above those of the other 10 defect localization methods, so the localization performance of CAN is clearly superior to that of the other 10 methods.
To further verify the experimental results, we used RImp in two scenarios to evaluate CAN. FIG. 7 shows the distribution of RImp: FIG. 7(a) compares RImp across the 10 defect localization methods, and FIG. 7(b) compares RImp across the 12 subject programs.
In FIG. 7(a), the RImp value is below 100% for every defect localization method, which means CAN outperforms all of the compared methods. The total number of statements that must be examined drops to between 12.99% (vs. BiLSTM-FL) and 34.63% (vs. Dstar) of the baseline. Equivalently, the saving in examined statements ranges from a maximum of 87.01% (100% - 12.99%) against BiLSTM-FL to a minimum of 65.37% (100% - 34.63%) against Dstar. Thus, compared with the other defect localization methods, CAN saves 65.37% to 87.01% of the statements examined when locating all defective statements.
In FIG. 7(b), the RImp value is below 100% for every subject program, which means CAN improves localization accuracy significantly on all subjects. The number of statements to be examined drops to between 1.75% (python) and 57% (nanoxml_v1) of the baseline; that is, compared with the 10 defect localization methods, locating all defects of all defective versions requires on average only 1.75% of the statements for python and 57% for nanoxml_v1. The maximum saving of CAN is thus 98.25% (100% - 1.75%) on python and the minimum is 43% (100% - 57%) on nanoxml_v1, so CAN saves 43% to 98.25% of the statements to be examined across all subjects.
As can be seen from the RImp comparison, the number of statements to be examined drops markedly once CAN is used, indicating that CAN clearly improves the efficiency of defect localization.
Since RImp only shows an overall improvement ratio and may omit specific details, we additionally performed a statistical analysis with the Wilcoxon signed-rank test to further verify the effectiveness of the invention. The Wilcoxon signed-rank test is a non-parametric statistical test of the difference between paired data, e.g., F(x) and G(y); given a significance level φ, a conclusion can be drawn from the two-tailed or one-tailed p-value. For the two-tailed p-value: if p ≥ φ, the hypothesis H0 (no difference between F(x) and G(y)) is accepted; otherwise H1 (F(x) and G(y) differ) is accepted. The one-tailed p-value has two cases, one-tailed (right) and one-tailed (left). For one-tailed (right): if p ≥ φ, H0 (F(x) is not better than G(y)) is accepted; otherwise H1 (F(x) is better than G(y)) is accepted. For one-tailed (left): if p ≥ φ, H0 (F(x) is not worse than G(y)) is accepted; otherwise H1 (F(x) is worse than G(y)) is accepted.
In the experiment, we take the EXAM values of all defective versions under CAN as F(x) and those under a defect localization method FL1 as G(y). If p < 0.05, the hypothesis H1, that CAN's EXAM values are significantly smaller than those of FL1, is accepted; this means CAN localizes better than FL1, which we denote BETTER. Conversely, H0, that CAN's EXAM values are not smaller than those of FL1, is accepted, meaning the localization efficacy of CAN is not higher than that of FL1.
Table 3 shows the results of the Wilcoxon signed-rank tests; it can be seen that most of CAN's EXAM values are significantly smaller than those of the other defect localization methods. For the A-test, the farther the A statistic of the two compared methods deviates from 0.5, the greater the difference between them: A-test values above 0.64 or below 0.36 indicate a "medium" difference, and values above 0.71 or below 0.29 a "large" difference. Table 3 shows that CAN mostly achieves "large" differences. Therefore, the localization efficacy of CAN is higher than that of the other defect localization methods.
Table 3. Statistical analysis of CAN and the 10 defect localization methods
Therefore, based on the above experimental results and analysis, we conclude that CAN significantly improves the effectiveness of defect localization, and that graph neural networks have great potential for understanding the defect context and improving defect localization.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered by the scope of the claims of the present invention.

Claims (2)

1. The defect positioning method based on context awareness is characterized by comprising the following steps:
s100, data extraction and preparation, wherein the specific steps are as follows:
s110, a program containing N sentences is set, a test case set T containing M test cases runs the program, the test case set T at least contains one failed test case, and the output of each test case in the test case set T is known;
Giving each test case in T to program execution to obtain statement coverage information of each test case in the program, so that the statement coverage information of the test case in the program forms an M multiplied by N coverage information matrix, and the known outputs of all the test cases form a result vector;
s120, constructing a defect context, wherein the defect context is a set of sentences influencing program errors, and determining a sentence set from output sentences of one failed test case in a test case set T by using a dynamic slicing technology, wherein the sentences in the sentence set have a dependency relationship with the output sentences of the selected failed test case;
s130: the program statement dependency graph and the adjacency matrix A are constructed by the following steps:
s131: constructing a program statement dependency graph by adopting a dynamic slicing technology, wherein nodes in the program statement dependency graph represent one statement in a defect context, and edges in the program statement dependency graph represent association relations between two statements;
the program statement dependency graph is expressed as G, G= (V, ζ), V represents the set of nodes in the program statement dependency graph, ζ represents the set of edges in the program statement dependency graph, and one node V i Representing a statement, v, in the context of the defect i E V, an edge (V i ,v j ) Representing two sentences v i And v j Association relation between (v) i ,v j )∈ξ;
S132: the association relation between nodes on the program statement dependency graph is represented by an adjacency matrix A, wherein A ij Elements representing the ith row and jth column in A, A ij =1 means that there is a directed edge a between node i and node j pointing from node i to node j ij ;A ij =0 indicates that there is no directed edge between node i and node j pointing from node i to node j, a ij Reflecting whether there is information flowing from node i to node j;
s140: providing a K sentences in the defect context, checking N sentences of the program, discarding sentences not in the defect context to obtain an MxK matrix, wherein the MxK matrix records the execution information of the K sentences in the defect context in the test case set T, and each row of the MxK matrix represents one test case and y ij The value of the element in the jth column of the ith row of the MxK matrix is 0 or 1, when y ij =1 indicates that the jth statement in the defect context is executed by test case i, y ij =0 indicates that no execution is performed;
S200: model training, the specific steps being as follows:
S210: assign every node of the program statement dependency graph a node representation vector, initialized through one-hot coding as vector_i^(1) ∈ R^d, where the length d of the node representation vector equals the number of nodes and the superscript (1) marks the first round of iteration; after t rounds of iteration, each node i has fully collected the information of all its adjacent nodes, yielding the iterated representation m_i^(t) ∈ R^d of the node;
S220: take the i-th row of the M × K matrix; when the value of the j-th column element of that row is 1, i.e. y_ij = 1, replace y_ij with the node representation vector vector_j; when the value is 0, i.e. y_ij = 0, replace y_ij with a zero vector;
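A sketch of the S210/S220 encoding, with the node representation vectors initialized one-hot (so that d = K in the first round):

    import numpy as np

    def encode_row(row, node_vectors):
        # Replace y_ij = 1 with the representation vector of node j and
        # y_ij = 0 with a zero vector, giving a K x d model input.
        d = node_vectors.shape[1]
        return np.stack([node_vectors[j] if row[j] == 1 else np.zeros(d)
                         for j in range(len(row))])

    node_vectors = np.eye(5)                 # one-hot initialization for K = 5 nodes
    x = encode_row([1, 0, 0, 1, 0], node_vectors)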
S230: taking the ith row in the M multiplied by K matrix processed by S220 as the input of a neural network model to obtain an o i Value of o i Value and known output e of test case represented by ith row i Obtaining a loss value by value difference;
during iteration, a back propagation algorithm is used for parameters and node vector vectors of the neural network model according to loss values 1 To vector K Updating, wherein the specific iteration times are set according to the experience values, if the iteration times are smaller than the preset value, the next step is executed, otherwise, the S300 is executed;
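A hedged PyTorch sketch of one S230 training step; the claim does not fix the loss function or optimizer, so binary cross-entropy and Adam are stand-ins here, and the feed-forward scorer abstracts away the graph propagation of S231 to S233:

    import torch
    import torch.nn as nn

    K = d = 5                                    # toy sizes; d = K after one-hot init
    node_vectors = nn.Parameter(torch.eye(K))    # trainable node representation vectors
    model = nn.Sequential(nn.Flatten(), nn.Linear(K * d, 1), nn.Sigmoid())
    optimizer = torch.optim.Adam(list(model.parameters()) + [node_vectors], lr=1e-3)
    loss_fn = nn.BCELoss()

    def train_step(rows, expected):
        # rows: M x K float tensor of 0/1 coverage over the defect context;
        # expected: the M known test outcomes e_i as floats.
        inputs = rows.unsqueeze(-1) * node_vectors  # y_ij = 1 -> vector_j, else zeros
        o = model(inputs).squeeze(-1)               # one o_i per test case row
        loss = loss_fn(o, expected)                 # difference between o_i and e_i
        optimizer.zero_grad()
        loss.backward()                             # back-propagation updates weights
        optimizer.step()                            # ... and the node vectors alike
        return loss.item()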
S231: the node representation vectors are updated as follows:
Taking node i as the central node, a neural network a: R^d × R^d → R assigns each neighbor node j a weight δ_ij according to the importance of neighbor node j to the central node i, and the correlation coefficients of all neighbor nodes are normalized with a softmax function:
δ_ij = exp(a(vector_i, vector_j)) / Σ_{k ∈ N_i} exp(a(vector_i, vector_k))
where a denotes the neural network a: R^d × R^d → R;
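One concrete reading of S231, assuming the scoring network a(x, y) is a bilinear form x^T W y (the claim leaves the architecture of a open):

    import torch

    def attention_weights(vec_i, neighbor_vecs, W):
        # delta_ij = softmax over neighbors j of a(vector_i, vector_j).
        scores = neighbor_vecs @ (W @ vec_i)  # one raw score per neighbor node j
        return torch.softmax(scores, dim=0)   # normalized correlation coefficients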
S232: compute the interaction information for the central node i by aggregating the representation vectors of its neighbor nodes, weighted by the attention coefficients and the adjacency matrix:
m_i^(t) = Σ_{j ∈ N_i} δ_ij · A_ij · vector_j^(t-1)
where A_ij indicates, in the adjacency matrix A, whether information flows from node i to node j, vector_j denotes the node representation vector of neighbor node j, and N_i denotes all neighbor nodes of the central node i;
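A sketch of the S232 aggregation as stated above, where A is the 0/1 adjacency tensor and delta holds the S231 attention weights laid out over all K nodes:

    import torch

    def interaction_information(i, A, delta, node_vectors):
        # m_i = sum_j delta_ij * A_ij * vector_j; A masks absent edges.
        weights = delta * A[i].to(delta.dtype)
        return weights @ node_vectors         # weighted sum of neighbor vectors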
S233: the GRU is a gated recurrent unit; at the t-th round of node representation iteration, the GRU takes the previous representation vector_i^(t-1) of node i and the interaction information m_i^(t) as input, and outputs the new node representation vector of node i:
vector_i^(t) = GRU(vector_i^(t-1), m_i^(t))
Returning to S220 after updating the node representation vector;
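A sketch of the S233 update with torch.nn.GRUCell, treating the previous representation vector_i^(t-1) as the hidden state and the interaction information m_i^(t) as the input:

    import torch
    import torch.nn as nn

    d = 5                                     # node vector length (toy value)
    gru = nn.GRUCell(input_size=d, hidden_size=d)

    def update_node(vec_prev, m_i):
        # vector_i^(t) = GRU(vector_i^(t-1), m_i^(t)); GRUCell wants a batch dim.
        return gru(m_i.unsqueeze(0), vec_prev.unsqueeze(0)).squeeze(0)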
S300: perform defect positioning on the defective target program:
For a given faulty program, the goal is to find which of the statements contained in the faulty program caused the program fault; to this end, each statement receives a suspiciousness value between 0 and 1, and the larger the suspiciousness value, the higher the probability that the statement caused the fault, so that the location of the defect in the program can be found;
S310: construct a defect context for the defective target program by the method of S120, and construct a K × K virtual test case set, where K is the number of statements in the defect context; each virtual test case covers exactly one statement of the defect context, and each row of the K × K virtual test case set is one virtual test case;
S320: take the i-th row of the K × K virtual test case set; when the value of the j-th column element of that row is 1, i.e. y'_ij = 1, replace y'_ij with the node representation vector vector_j; when the value is 0, i.e. y'_ij = 0, replace y'_ij with a zero vector;
S330: take the i-th row of the K × K virtual test case set processed in S320 as the input of the trained neural network model to obtain a value o'_i; o'_i is the suspiciousness value;
S340: traverse every row of the K × K virtual test case set to obtain K suspiciousness values, each ranging from 0 to 1; the larger the suspiciousness value, the higher the possibility that the corresponding statement caused the target program to fail.
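A sketch of steps S310 to S340, reusing a trained model and node_vectors of the kind set up in the S230 sketch above (both hypothetical); the K × K virtual test case set is simply an identity matrix, since virtual test case i covers exactly one statement:

    import torch

    def rank_statements(model, node_vectors, K):
        # Score every defect-context statement and rank by suspiciousness o_i'.
        virtual = torch.eye(K)                        # row i covers only statement i
        with torch.no_grad():
            inputs = virtual.unsqueeze(-1) * node_vectors  # encode rows as in S320
            suspiciousness = model(inputs).squeeze(-1)     # o_i' in [0, 1]
        return sorted(range(K), key=lambda j: -float(suspiciousness[j]))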
2. The context-aware defect positioning method of claim 1, wherein the method for constructing the defect context in S120 specifically comprises the following steps:
The defect context is a statement set influencing the program error; a statement set is determined, by means of a dynamic slicing technique, from the output statement of one failed test case in the test case set T, and the statements in this set have a dependency relationship with the output statement of the selected failed test case;
The following slicing criterion is used:
failSC = (outStm, incorrectVar, failEx)
where outStm is the output statement, incorrectVar is an incorrect variable in that statement, and failEx is the failed execution path.
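As a hedged illustration, the slicing criterion can be carried as a simple record; the field values below are invented for demonstration:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class SlicingCriterion:
        # failSC = (outStm, incorrectVar, failEx)
        out_stm: int        # index of the output statement
        incorrect_var: str  # incorrect variable observed in that statement
        fail_ex: tuple      # failed execution path as a statement index sequence

    failSC = SlicingCriterion(out_stm=42, incorrect_var="result",
                              fail_ex=(1, 2, 5, 42))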
CN202110152656.5A 2021-02-04 2021-02-04 Defect positioning method based on context awareness Active CN112965894B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110152656.5A CN112965894B (en) 2021-02-04 2021-02-04 Defect positioning method based on context awareness

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110152656.5A CN112965894B (en) 2021-02-04 2021-02-04 Defect positioning method based on context awareness

Publications (2)

Publication Number Publication Date
CN112965894A CN112965894A (en) 2021-06-15
CN112965894B true CN112965894B (en) 2023-07-07

Family

ID=76275004

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110152656.5A Active CN112965894B (en) 2021-02-04 2021-02-04 Defect positioning method based on context awareness

Country Status (1)

Country Link
CN (1) CN112965894B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113791976B (en) * 2021-09-09 2023-06-20 南京大学 Method and device for enhancing defect positioning based on program dependence
CN115629995B (en) * 2022-12-21 2023-03-14 中南大学 Software defect positioning method, system and equipment based on multi-dependency LSTM

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572474A (en) * 2015-01-30 2015-04-29 南京邮电大学 Dynamic slicing based lightweight error locating implementation method
CN109144882A (en) * 2018-09-19 2019-01-04 哈尔滨工业大学 A kind of software fault positioning method and device based on program invariants
CN110515826A (en) * 2019-07-03 2019-11-29 杭州电子科技大学 A kind of software defect positioning method based on number frequency spectrum and neural network algorithm
EP3696771A1 (en) * 2019-02-13 2020-08-19 Robert Bosch GmbH System for processing an input instance, method, and medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572474A (en) * 2015-01-30 2015-04-29 南京邮电大学 Dynamic slicing based lightweight error locating implementation method
CN109144882A (en) * 2018-09-19 2019-01-04 哈尔滨工业大学 A kind of software fault positioning method and device based on program invariants
EP3696771A1 (en) * 2019-02-13 2020-08-19 Robert Bosch GmbH System for processing an input instance, method, and medium
CN110515826A (en) * 2019-07-03 2019-11-29 杭州电子科技大学 A kind of software defect positioning method based on number frequency spectrum and neural network algorithm

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Toward Location-Enabled IoT (LE-IoT): IoT Positioning Techniques, Error Sources, and Error Mitigation; You Li et al.; IEEE Internet of Things Journal; pp. 4035-4062 *
Research on Context-Based Fault Localization Methods (in Chinese); Zhang Xu; China Masters' Theses Full-text Database, Information Science and Technology; I138-281 *
Context-Enhanced Fault Localization Technique (in Chinese); Zhang Zhuo et al.; Journal of Software; pp. 266-281 *

Also Published As

Publication number Publication date
CN112965894A (en) 2021-06-15

Similar Documents

Publication Publication Date Title
Zhang et al. Data transformation in cross-project defect prediction
CN112965894B (en) Defect positioning method based on context awareness
CN109086799A (en) A kind of crop leaf disease recognition method based on improvement convolutional neural networks model AlexNet
CN111860982A (en) Wind power plant short-term wind power prediction method based on VMD-FCM-GRU
CN112418212A (en) Improved YOLOv3 algorithm based on EIoU
CN105760295A (en) Multi-defect positioning method based on search algorithm
CN112668809B (en) Method for establishing autism children rehabilitation effect prediction model
Land Measurements of software maintainability
EP4075281A1 (en) Ann-based program test method and test system, and application
CN115629998B (en) Test case screening method based on KMeans clustering and similarity
Ouni et al. Multiobjective optimization for software refactoring and evolution
CN113011509B (en) Lung bronchus classification method and device, electronic equipment and storage medium
CN114494756A (en) Improved clustering algorithm based on Shape-GIoU
CN112785585B (en) Training method and device for image video quality evaluation model based on active learning
CN113592008A (en) System, method, equipment and storage medium for solving small sample image classification based on graph neural network mechanism of self-encoder
CN112862063A (en) Complex pipe network leakage positioning method based on deep belief network
CN114266352B (en) Model training result optimization method, device, storage medium and equipment
KR20190109194A (en) Apparatus and method for learning neural network capable of modeling uncerrainty
CN114678083A (en) Training method and prediction method of chemical genetic toxicity prediction model
CN113096070A (en) Image segmentation method based on MA-Unet
Hsu et al. Testing monotonicity of conditional treatment effects under regression discontinuity designs
Sokolova et al. Computing lower and upper bounds on the probability of causal statements
CN111914952A (en) AD characteristic parameter screening method and system based on deep neural network
CN111581086A (en) Hybrid software error positioning method and system based on RankNet
Sakieh et al. Rules versus layers: which side wins the battle of model calibration?

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant