CN112965894A — Defect positioning method based on context sensing — Google Patents
Publication number: CN112965894A
Application number: CN202110152656.5A
Authority: CN (China)
Prior art keywords: node, statement, test case, defect, program
Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications

 G—PHYSICS
 G06—COMPUTING; CALCULATING OR COUNTING
 G06F—ELECTRIC DIGITAL DATA PROCESSING
 G06F11/00—Error detection; Error correction; Monitoring
 G06F11/36—Preventing errors by testing or debugging software
 G06F11/3668—Software testing
 G06F11/3672—Test management
 G06F11/3676—Test management for coverage analysis

 G—PHYSICS
 G06—COMPUTING; CALCULATING OR COUNTING
 G06F—ELECTRIC DIGITAL DATA PROCESSING
 G06F11/00—Error detection; Error correction; Monitoring
 G06F11/36—Preventing errors by testing or debugging software
 G06F11/362—Software debugging

 G—PHYSICS
 G06—COMPUTING; CALCULATING OR COUNTING
 G06F—ELECTRIC DIGITAL DATA PROCESSING
 G06F11/00—Error detection; Error correction; Monitoring
 G06F11/36—Preventing errors by testing or debugging software
 G06F11/3668—Software testing
 G06F11/3672—Test management
 G06F11/368—Test management for test version control, e.g. updating test cases to a new software version

 G—PHYSICS
 G06—COMPUTING; CALCULATING OR COUNTING
 G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
 G06N3/00—Computing arrangements based on biological models
 G06N3/02—Neural networks
 G06N3/04—Architecture, e.g. interconnection topology
 G06N3/045—Combinations of networks

 G—PHYSICS
 G06—COMPUTING; CALCULATING OR COUNTING
 G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
 G06N3/00—Computing arrangements based on biological models
 G06N3/02—Neural networks
 G06N3/08—Learning methods
 G06N3/084—Backpropagation, e.g. using gradient descent

 Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
 Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
 Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
 Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
 Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention relates to a context-aware defect positioning method that uses program slicing to construct a defect context. The context is represented as a directed graph, a program dependency graph, whose nodes are statements directly or indirectly associated with the failure and whose edges are the associations between those statements. Based on this graph, CAN embeds a representation vector for each node using one-hot encoding, uses a GNN to capture the dependencies between statements, and trains on the node representation vectors with test cases to obtain more accurate representations. Finally, a virtual test case set is constructed so that each statement in the defect context of the faulty target program is covered by exactly one test case and each test case covers exactly one defect-context statement. Feeding this test case set into the trained GNN yields the suspicious value of each statement. The method analyzes the defect context and incorporates it into suspiciousness evaluation to improve defect localization; experimental analysis shows that it can significantly improve the effectiveness of defect localization.
Description
Technical Field
The invention relates to a defect positioning method, and in particular to a defect positioning method based on context awareness.
Background
Automatic software debugging techniques play an essential role in sparing developers time-consuming and labor-intensive manual work during testing, and can greatly reduce their burden. Researchers have therefore proposed many software defect localization methods that help developers find bugs in programs by analyzing program executions that produce unexpected outputs. Among them, spectrum-based fault localization (SFL) is one of the most popular defect localization methods.
A spectrum-based fault localization method (SFL) uses program coverage information and test case results to build a defect localization model and to compute, for each executable statement in a program, a suspicious value indicating how likely that statement is to be defective. SFL defines an information model called the program spectrum, which is the input for computing statement suspiciousness. The program spectrum records the execution information of the program and the test case results. Suppose a program P contains N statements and its test case set T contains M test cases, at least one of which fails, as shown in FIG. 1. X_ij = 1 indicates that test case i executed statement j, and X_ij = 0 indicates that it did not; the M×N matrix records the execution information of each statement under the test case set T. The error vector e records the test case results: e_i = 1 denotes that test case i failed, and e_i = 0 denotes that it passed. Based on the program spectrum, SFL suspiciousness formulas use four parameters, a_np, a_nf, a_ep and a_ef, each bound to a specific statement. They count the passing or failing test cases that execute or skip the statement: a_np is the number of passing test cases that do not execute the statement, a_nf the number of failing test cases that do not execute it, a_ep the number of passing test cases that execute it, and a_ef the number of failing test cases that execute it.
In practice, we expect a truly faulty statement to have a high a_ef and a low a_ep. When a_ef is maximal and a_ep is minimal, no passing test case executes the statement while every failing test case does, and every suspiciousness formula should return its maximum value for that statement. In the typical case, different suspiciousness formulas output different suspicious values.
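The four spectrum parameters and a suspiciousness formula can be sketched in a few lines of Python. The Ochiai formula is used here purely as an illustrative example from the SFL literature; the patent itself does not fix a particular formula, and the small coverage matrix is invented for the demonstration.

```python
import math

def spectrum_counts(cov, results, j):
    """Compute a_ef, a_ep, a_nf, a_np for statement column j.

    cov     : M x N list of 0/1 rows (X_ij = 1 iff test i executed stmt j)
    results : length-M list (e_i = 1 iff test case i failed)
    """
    a_ef = a_ep = a_nf = a_np = 0
    for row, e in zip(cov, results):
        if row[j] == 1:          # statement j was executed by this test
            if e == 1: a_ef += 1
            else:      a_ep += 1
        else:                    # statement j was not executed
            if e == 1: a_nf += 1
            else:      a_np += 1
    return a_ef, a_ep, a_nf, a_np

def ochiai(cov, results, j):
    """Ochiai suspiciousness: a_ef / sqrt((a_ef + a_nf) * (a_ef + a_ep))."""
    a_ef, a_ep, a_nf, _ = spectrum_counts(cov, results, j)
    denom = math.sqrt((a_ef + a_nf) * (a_ef + a_ep))
    return a_ef / denom if denom else 0.0

cov = [[1, 0], [1, 1], [0, 1]]   # toy spectrum: 3 tests x 2 statements
res = [1, 0, 0]                  # only test case 0 failed
score0 = ochiai(cov, res, 0)     # statement 0 is executed by the failure
score1 = ochiai(cov, res, 1)     # statement 1 only by passing tests
```

With this toy spectrum, statement 0 (executed by the only failing test) scores higher than statement 1 (executed only by passing tests), matching the intuition stated above.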
FIG. 2 shows an architecture for defect localization using deep learning. The model includes an input layer, a deep learning component, and an output layer. At the input layer, the test case coverage matrix serves as the training samples and the test case result vector as the corresponding labels. From the M×N matrix and the result vector, h rows of the matrix are fed to the model at a time, with the corresponding entries of the result vector as labels, starting from the i-th row. At the deep learning component layer, MLP-FL, CNN-FL and BiLSTM-FL use a multilayer perceptron, a convolutional neural network and a bidirectional long short-term memory network, respectively. At the output layer, a sigmoid function is used, which maps its input to a value between 0 and 1. The output vector y generally differs from the result vector e, so the model parameters are repeatedly updated by iterative training with the back-propagation (BP) algorithm, continuously reducing the difference between e and y. Although SFL achieves good localization results, it still has limitations: its suspiciousness models do not use the defect context. A defect context is a set of program statements that have direct or indirect data or control dependencies on the output statement of a failed test case. The defect context reveals how the defect propagates through the program, which is essential for understanding the program and locating the defect. Relying only on test case results and coverage information, without taking into account the complex inherent relations in the defect context, therefore limits the accuracy of defect localization.
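The pipeline of FIG. 2 can be illustrated with a deliberately tiny stand-in model: a single sigmoid layer trained by gradient descent, rather than the deeper MLP-FL/CNN-FL/BiLSTM-FL components. The coverage matrix, labels and hyperparameters below are all invented for the demonstration; this is a sketch of the training/scoring loop, not the patent's implementation.

```python
import numpy as np

def train_fl_model(cov, e, epochs=2000, lr=0.5):
    """cov: M x N coverage matrix; e: length-M result vector (1 = failed)."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=cov.shape[1])
    b = 0.0
    for _ in range(epochs):
        y = 1.0 / (1.0 + np.exp(-(cov @ w + b)))   # sigmoid output layer
        grad = y - e                               # dLoss/dlogit for BCE loss
        w -= lr * cov.T @ grad / len(e)            # back-propagation step
        b -= lr * grad.mean()
    return w, b

def suspiciousness(w, b, n):
    """Score each statement via a one-hot 'virtual test' covering only it."""
    virt = np.eye(n)
    return 1.0 / (1.0 + np.exp(-(virt @ w + b)))

cov = np.array([[1, 0, 1],
                [0, 1, 1],
                [1, 1, 0],
                [0, 1, 0]], dtype=float)   # statement 2 is the seeded bug:
e = np.array([1, 1, 0, 0], dtype=float)    # exactly the failing tests run it
w, b = train_fl_model(cov, e)
scores = suspiciousness(w, b, 3)
```

Because statement 2 is executed by both failing tests and by no passing test, the trained model assigns it the highest score, which is the behavior the architecture above is designed to learn.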
Disclosure of Invention
Aiming at the problems in the prior art, the technical problems to be solved by the invention are as follows: how to accurately locate defects using the defect context.
In order to solve the above technical problems, the invention adopts the following technical scheme. A context-aware defect positioning method comprises the following steps:
s100, data extraction and preparation, which comprises the following specific steps:
s110, given a program containing N statements, a test case set T containing M test cases is run against the program, where T contains at least one failed test case and the output of every test case in T is known;
each test case in T is executed against the program to obtain its statement coverage; the per-test statement coverage forms an M×N coverage matrix, and the known outputs of all test cases form a result vector;
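For illustration, per-test statement coverage of the kind S110 requires can be collected in Python with the standard-library trace hook. The `mid` function and the instrumentation below are illustrative stand-ins, not part of the patent; real implementations would use a coverage tool for the language under test.

```python
import sys

def run_with_coverage(func, *args):
    """Execute func(*args); return (result, set of covered line offsets)."""
    covered = set()
    code = func.__code__

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is code:
            # record the line as an offset from the function's first line
            covered.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, covered

def mid(a, b, c):          # a classic small example program
    m = c                  # offset 1
    if b < c:              # offset 2
        if a < b:          # offset 3
            m = b          # offset 4
    return m               # offset 5

out, lines = run_with_coverage(mid, 1, 3, 2)
```

Running every test case this way and stacking the resulting 0/1 coverage rows yields exactly the M×N matrix described above.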
s120, constructing the defect context, i.e. the set of statements that influence the program error: using dynamic slicing, a statement set is determined starting from the output statement of a failed test case in the test case set T, where every statement in the set has a dependency relation with the selected output statement of the failed test case;
s130: constructing a program statement dependency graph and an adjacency matrix A, and specifically comprising the following steps:
s131: building a program statement dependency graph by adopting a dynamic slicing technology, wherein a node in the program statement dependency graph represents a statement in a defect context, and an edge in the program statement dependency graph represents an association relation between two statements;
the program statement dependency graph is denoted G = (V, ξ), where V is the set of nodes in the graph and ξ the set of edges; a node v_i ∈ V represents a statement in the defect context, and an edge (v_i, v_j) ∈ ξ represents the association between the two statements v_i and v_j;
S132: the associations between nodes of the program statement dependency graph are represented by an adjacency matrix A, where A_ij is the element in row i, column j of A. A_ij = 1 indicates that there is a directed edge from node i to node j; A_ij = 0 indicates that there is no such edge. A_ij thus reflects whether information flows from node i to node j;
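The adjacency matrix of S132 is a direct transcription of the edge list; a minimal sketch (the node count and edges are illustrative):

```python
import numpy as np

def adjacency(num_nodes, edges):
    """edges: iterable of (i, j) meaning a dependence edge from node i to j."""
    A = np.zeros((num_nodes, num_nodes), dtype=int)
    for i, j in edges:
        A[i, j] = 1          # A_ij = 1: information can flow from i to j
    return A

A = adjacency(4, [(0, 1), (0, 2), (2, 3)])
```

Note that A is directed: A[0, 1] = 1 does not imply A[1, 0] = 1, matching the directed-edge definition above.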
s140: let the defect context contain K statements. The N statements of the program are examined and those not in the defect context are discarded, yielding an M×K matrix that records the execution of the K context statements under the test case set T. Each row of the M×K matrix represents one test case; y_ij, the element in row i, column j, is 0 or 1: y_ij = 1 denotes that the j-th statement in the defect context is executed by test case i, and y_ij = 0 denotes that it is not;
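Step S140 amounts to selecting the slice columns from the full coverage matrix; a sketch (the matrix and the context column indices are illustrative):

```python
import numpy as np

def restrict_to_context(cov_mn, context_cols):
    """Keep only the columns whose statements appear in the defect context."""
    return cov_mn[:, context_cols]

cov = np.array([[1, 0, 1, 1],     # M x N coverage matrix (M=2, N=4)
                [0, 1, 1, 0]])
context_cols = [0, 2]             # suppose statements 0 and 2 are in the slice
cov_mk = restrict_to_context(cov, context_cols)   # the M x K matrix (K=2)
```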
s200: the specific steps of the model training process are as follows:
s210: a node representation vector is embedded into every node of the program statement dependency graph via one-hot encoding, initialized to the one-hot vector vector_i^(1), vector_i^(1) ∈ R^d, whose length equals the number of nodes d; the superscript (1) marks the first iteration round. After t rounds of iteration, each node i has fully collected the information of all its adjacent nodes, yielding the iterated representation m_i^(t), m_i^(t) ∈ R^d.
S220: take the i-th row of the M×K matrix. When the j-th element of that row is 1, i.e. y_ij = 1, replace y_ij with the representation vector of node j; when it is 0, i.e. y_ij = 0, replace y_ij with a zero vector;
S230: feed the i-th row of the M×K matrix processed by S220 into the neural network model to obtain a value o_i. Subtracting o_i from the known output e_i of the test case represented by row i yields a loss value; using this loss, the back-propagation algorithm iteratively updates the parameters of the neural network model and the node representation vectors vector_1 through vector_K. A specific iteration count S is set according to empirical values and differs from program to program; if the number of iterations is less than S, execute the next step, otherwise execute S300;
s231: the method for updating the node representation vector comprises the following steps:
with node i as the central node, a neural network a: R^d × R^d → R assigns each neighbor node j a weight δ_ij according to the importance of neighbor j to central node i, and the correlation coefficients of all neighbor nodes are normalized with a softmax function;
where a denotes the neural network a: R^d × R^d → R;
where A_ij indicates, in the adjacency matrix A, whether information flows from central node i to neighbor node j, vector_j is the node representation vector of neighbor node j, and N_i is the set of all neighbor nodes of central node i;
s233: a gated recurrent unit (GRU) is used: in round t of the node-representation iteration, the GRU takes the neighbor vector vector_j^(t-1) and the aggregated message m_i^(t) as input and outputs the new representation vector_i^(t) of node i;
After the node representation vector is updated, returning to S220;
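The attention step of S231 can be sketched in numpy. A GAT-style scoring scheme is assumed here (a small network `a` scores each (center, neighbor) pair, softmax normalizes the scores); the shared transform `W`, the attention parameter `a_vec`, and the tanh nonlinearity are illustrative choices, not taken from the patent.

```python
import numpy as np

def attention_weights(vec_i, neighbor_vecs, W, a_vec):
    """Return the normalized weights delta_ij over the neighbors of node i."""
    scores = []
    for vec_j in neighbor_vecs:
        # a: R^d x R^d -> R, realized as a_vec . [W vec_i ; W vec_j]
        z = np.concatenate([W @ vec_i, W @ vec_j])
        scores.append(np.tanh(a_vec @ z))
    scores = np.array(scores)
    exp = np.exp(scores - scores.max())   # numerically stable softmax
    return exp / exp.sum()

W = np.eye(3)                             # illustrative shared transform
a_vec = np.ones(6)                        # illustrative attention parameters
delta = attention_weights(np.array([1., 0., 0.]),
                          [np.array([0., 1., 0.]), np.array([0., 0., 1.])],
                          W, a_vec)
```

With these symmetric toy parameters both neighbors score equally, so the softmax returns uniform weights; in the trained model the weights differ according to each neighbor's learned importance.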
s300: and (3) carrying out defect positioning on the defective target program:
for a given faulty program, we want to find which of the statements it contains causes the failure. To this end each statement receives a suspicious value between 0 and 1; the larger the suspicious value, the higher the probability that the statement causes the program failure. We thereby find the location of the defect in the program.
S310: construct the defect context of the faulty target program by the method of S120, and build a K×K virtual test case set, where K is the number of statements in the defect context. Each virtual test case covers exactly one defect-context statement, and each row of the K×K virtual test case set is one virtual test case;
s320: take the i'-th row of the K×K virtual test case set. When the j'-th element of that row is 1, i.e. y'_i'j' = 1, replace it with the representation vector of node j'; when it is 0, i.e. y'_i'j' = 0, replace it with a zero vector;
S330: feed the i'-th row of the K×K virtual test case set processed by S320 into the trained neural network model to obtain a value o_i'; this o_i' is the suspicious value;
s340: traverse every row of the K×K virtual test case set to obtain K suspicious values in the range 0 to 1; the larger a suspicious value, the higher the possibility that the corresponding statement causes the target program to fail.
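Steps S310 through the final traversal can be expressed compactly: because each virtual test covers exactly one context statement, the K×K virtual test set is just the identity matrix. The `model` below is a stand-in scorer (any trained model mapping a coverage row to a value in (0, 1)); the scores are invented for the demonstration.

```python
import numpy as np

def rank_statements(model, k):
    """Score all K context statements with one-hot virtual test cases."""
    virtual_tests = np.eye(k, dtype=int)       # row i' covers only stmt i'
    scores = np.array([model(row) for row in virtual_tests])
    order = np.argsort(-scores)                # most suspicious first
    return scores, order

# Stand-in model: pretend statement 1 is the most suspicious of K = 3.
scores, order = rank_statements(lambda row: 0.9 if row[1] else 0.2, 3)
```

The returned `order` is the ranked list a developer would inspect, starting from the statement with the largest suspicious value.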
Preferably, the method for constructing the defect context in S120 specifically includes:
using dynamic slicing, a statement set is determined from the output statement of a failed test case in the test case set T, where the statements in the set have dependency relations with the selected output statement of the failed test case;
the following slicing criteria were used:
failSC＝(outStm,incorrectVar,failEx)
wherein outStm is an output statement, a variable in the statement is incorrectVar, incorrectVar represents an erroneous Var, and failEx is a failed execution path.
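The slicing criterion is a simple record with three fields; a minimal transcription (the Python type and the sample field values are illustrative, not part of the patent's implementation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SlicingCriterion:
    out_stm: int          # line number of the failing output statement
    incorrect_var: str    # the variable holding the wrong value
    fail_ex: tuple        # the failed execution path (sequence of lines)

# failSC = (outStm, incorrectVar, failEx), with made-up example values
fail_sc = SlicingCriterion(out_stm=12, incorrect_var="m",
                           fail_ex=(1, 2, 5, 12))
```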
Compared with the prior art, the invention has at least the following advantages:
the method of the invention is a neural defect localization based on contextaware to analyze the fault context and bring it into suspicion evaluation to improve the defect localization; the method utilizes program slices to model the defect context, represents the defect context as a program dependency graph, and then constructs a graph neural network to analyze and learn the complex relation between sentences in the defect context; a model is eventually learned that evaluates whether each error statement is suspect. The inventors have performed experiments on 12 actual large procedures and compared the method of the present invention with 10 recent defect localization methods. The results show that the method of the present invention can significantly improve the effectiveness of defect localization, for example, the 4.61%, 20%, 29.23%, 49.23% and 64.62% failures are located at the first 1, first 3, first 5, first 10 and first 10 bits, respectively.
Drawings
FIG. 1 is a statement coverage information matrix for M test cases.
FIG. 2 is a diagram of suspiciousness evaluation using a neural network.
FIG. 3 is a schematic diagram of the method of the present invention.
FIG. 4 is a set of virtual test cases.
Fig. 5 is an example of the method of the present invention.
FIGS. 6(a)-6(d) compare the EXAM values of CAN with those of 10 defect localization methods.
FIGS. 7(a) and 7(b) are RImp plots for CAN.
Detailed Description
The present invention is described in further detail below.
The invention uses learning ability to construct a model that can simulate the complex association relations in the defect context, thereby integrating the context of the faulty statement into the defect localization technique. We therefore propose a context-aware defect localization technique, abbreviated CAN. Notably, CAN models the defect context by building a program dependency graph, which exposes a set of interacting (data-dependent and control-dependent) statements. Using neural network techniques, CAN simulates the propagation of the defect within the defect context and merges the context into the defect localization system, so that the position of the defective statement can be located accurately. Experiments show that on 12 large real programs CAN achieves a very accurate localization effect, positioning 49.23% of defects within the Top-10. CAN therefore significantly improves the effectiveness of defect localization.
Graph neural networks (GNNs) model the nodes of a graph structure and the association relations between them; through message passing between nodes, the model is trained iteratively until convergence and can then solve classification or regression problems. The invention represents the defect context as graph-structured data, namely a program dependency graph, whose nodes are statements of the program and whose edges are the association relations between statements, including data dependencies and control dependencies. GNNs can learn the complex association relations between the elements of the defect context, thereby merging the defect context into the defect localization system.
The method models the defect context with the program dependency graph and uses graph neural networks (GNNs) to analyze and understand it, fusing the defect context into the defect localization system to locate program defects. Specifically, CAN first constructs the defect context using program slicing; the context is a directed graph, the program dependency graph, whose nodes are statements directly or indirectly associated with the failure and whose edges are the associations between statements. Based on this graph, CAN uses GNNs to capture the dependencies between statements and then generate the corresponding node representation vectors, which conventional defect localization methods (e.g., SFL) do not represent well. CAN trains on the node representation vectors using test cases, thereby obtaining more accurate statement representations. Finally, CAN evaluates the suspicious value of each statement as a defective statement using virtual test cases.
Referring to fig. 3, a defect location method based on context sensing includes the following steps:
s100, data extraction and preparation, which comprises the following specific steps:
s110, given a program containing N statements, a test case set T containing M test cases is run against the program, where T contains at least one failed test case and the output of every test case in T is known.
Each test case in T is executed against the program to obtain its statement coverage; the per-test statement coverage forms an M×N coverage matrix, and the known outputs of all test cases form a result vector.
The execution information of program P is represented by the coverage matrix and the test case result vector. On the basis of the faulty program and this information model (coverage matrix and result vector), CAN uses a graph neural network model to fuse the defect context into the defect localization system. The graph neural network model comprises an input layer, a deep learning component and an output layer. At the input layer, the coverage matrix of the test cases serves as the training samples and the result vector of the test cases as the corresponding labels. From the M×N coverage matrix and the corresponding result vector, h rows at a time are fed to the model with the corresponding entries of the result vector as labels, the starting row being the i-th row, i ∈ {1, 1+h, 1+2h, …}. At the deep learning component layer, MLP-FL, CNN-FL and BiLSTM-FL use a multilayer perceptron, a convolutional neural network and a bidirectional long short-term memory network, respectively. At the output layer, the spectrum-based method uses a sigmoid function to map the input to a value between 0 and 1. The output vector y generally differs from the result vector e, so the parameters of the model are continuously updated by repeated iterative training with the back-propagation (BP) algorithm, continuously reducing the difference between e and y.
S120, constructing the defect context, i.e. the set of statements that influence the program error: using dynamic slicing, a statement set is determined starting from the output statement of a failed test case in the test case set T, where every statement in the set has a dependency relation with the selected output statement of the failed test case.
The method for constructing the defect context specifically comprises the following steps:
Because the defect context is strongly correlated with a failed execution, the statements that directly or indirectly affect the computation of the faulty output value through dynamic data or control dependence are included in the defect context; the defect context is therefore constructed with dynamic slicing. Using dynamic slicing, a statement set is determined from the output statement of a failed test case in the test case set T, where the statements in the set have dependency relations with that output statement. In a concrete implementation, these dependencies may be direct or indirect data or control dependencies.
The following slicing criterion is used:

failSC = (outStm, incorrectVar, failEx)

where outStm is the output statement, incorrectVar is the incorrect variable in that statement, and failEx is the failed execution path.
It should be noted that dynamic slicing is an existing technique.
A slicing criterion failSC = (outStm, incorrectVar, failEx) is built from one randomly selected failed execution, and the defect context is obtained from it. After computing with the dynamic slicing algorithm on this criterion, the defect context of the program is obtained, and the program statement dependency graph is then constructed from that context.
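The graph traversal underlying slicing can be sketched as backward reachability over the dependence edges: every statement from which the failing output is reachable lands in the defect context. This is only the traversal component; real dynamic slicing also restricts itself to the concrete failed execution (failEx), which the sketch omits. The edge set and statement numbers are invented for the example.

```python
def backward_slice(edges, out_stm):
    """edges: set of (src, dst) dependence edges; returns the defect context
    as the set of statements that can reach out_stm (out_stm included)."""
    preds = {}
    for src, dst in edges:
        preds.setdefault(dst, set()).add(src)

    context, work = {out_stm}, [out_stm]
    while work:                      # standard worklist reachability
        node = work.pop()
        for p in preds.get(node, ()):
            if p not in context:
                context.add(p)
                work.append(p)
    return context

# Statements 1-5 feed the failing output at 5; 6 and 7 are unrelated.
ctx = backward_slice({(1, 2), (2, 4), (3, 4), (4, 5), (6, 7)}, 5)
```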
S130: constructing a program statement dependency graph and an adjacency matrix A, and specifically comprising the following steps:
s131: building a program statement dependency graph by adopting a dynamic slicing technology, wherein a node in the program statement dependency graph represents a statement in a defect context, and an edge in the program statement dependency graph represents an association relation between two statements;
the program statement dependency graph is denoted G = (V, ξ), where V is the set of nodes in the graph and ξ the set of edges; a node v_i ∈ V represents a statement in the defect context, and an edge (v_i, v_j) ∈ ξ represents the association between the two statements v_i and v_j. One node of the graph represents one statement of the defect context; the edges are the association relations between statements, including data dependencies and control dependencies. Failed test cases necessarily cover defect-context statements, so such relations must exist, and the neural network together with the data correlations is used to find the nonlinear relation between test case coverage information and test case results.
S132: the associations between nodes of the program statement dependency graph are represented by an adjacency matrix A, where A_ij is the element in row i, column j of A. A_ij = 1 indicates that there is a directed edge from node i to node j; A_ij = 0 indicates that there is no such edge. A_ij thus reflects whether information flows from node i to node j;
s140: let the defect context contain K statements. The N statements of the program are examined and those not in the defect context are discarded, yielding an M×K matrix that records the execution of the K context statements under the test case set T. Each row of the M×K matrix represents one test case; y_ij, the element in row i, column j, is 0 or 1: y_ij = 1 denotes that the j-th statement in the defect context is executed by test case i, and y_ij = 0 denotes that it is not;
since the coverage information matrix cannot show the incidence relation between statements and the propagation process of defects in the program, a program statement dependency graph needs to be constructed.
S200: the specific steps of the model training process are as follows:
s210: a node representation vector is embedded into every node of the program statement dependency graph via one-hot encoding, initialized to the one-hot vector vector_i^(1), vector_i^(1) ∈ R^d, whose length equals the number of nodes d; the superscript (1) marks the first iteration round. After t rounds of iteration, each node i has fully collected the information of all its adjacent nodes, yielding the iterated representation m_i^(t), m_i^(t) ∈ R^d.
S220: take the i-th row of the M×K matrix. When the j-th element of that row is 1, i.e. y_ij = 1, replace y_ij with the representation vector of node j; when it is 0, i.e. y_ij = 0, replace y_ij with a zero vector;
S230: feed the h-th row of the M×K matrix processed by S220 into the neural network model to obtain a value o_h. Subtracting o_h from the known output e_h of the test case represented by row h yields a loss value; using this loss, the back-propagation algorithm iteratively updates the parameters of the neural network model and the node representation vectors vector_1 through vector_K. A specific iteration count S is set according to empirical values and differs from program to program; if the number of iterations is less than S, execute the next step, otherwise execute S300;
s231: the method for updating the node representation vector comprises the following steps:
with node i as the central node, a neural network a: R^d × R^d → R assigns each neighbor node j a weight δ_ij according to the importance of neighbor j to central node i, and the correlation coefficients of all neighbor nodes are normalized with a softmax function;
where a denotes the neural network a: R^d × R^d → R;
where A_ij indicates, in the adjacency matrix A, whether information flows from central node i to neighbor node j, vector_j is the node representation vector of neighbor node j, and N_i is the set of all neighbor nodes of central node i;
s233: a gated recurrent unit (GRU) is used: in round t of the node-representation iteration, the GRU takes the neighbor vector vector_j^(t-1) and the aggregated message m_i^(t) as input and outputs the new representation vector_i^(t) of node i;
After the node representation vectors are updated, return to S220;
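The S233 update can be pictured with a generic GRU cell; the gate weights below are random placeholders rather than trained parameters, so this only illustrates the data flow (previous representation plus aggregated message in, new representation out):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_update(prev_vec, message, W):
    """One GRU step fusing a node's previous representation with the
    aggregated neighbor message m_i^(t) into the new representation."""
    x = np.concatenate([prev_vec, message])
    z = sigmoid(W["z"] @ x)                                        # update gate
    r = sigmoid(W["r"] @ x)                                        # reset gate
    h = np.tanh(W["h"] @ np.concatenate([r * prev_vec, message]))  # candidate
    return (1 - z) * prev_vec + z * h

d = 4
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(d, 2 * d)) for k in ("z", "r", "h")}  # placeholder weights
new_vec = gru_update(np.zeros(d), np.ones(d), W)
```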
According to the above node-representation updating process, the CAN trains the model iteratively, and each iteration updates the node representation vectors. Suppose the CAN uses the ith row of the M×K coverage matrix ([y_i1, y_i2, …, y_iK]) and the corresponding test case result e_i. For y_ij, y_ij = 1 means that node v_j is executed by test case T_i, and the CAN feeds the node representation vector vector_j of v_j into this iteration of the model; y_ij = 0 means that node v_j is not executed by test case T_i, and the CAN feeds a zero vector into this iteration of the model.
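The y_ij masking rule can be sketched as follows; the function and array names are illustrative, not taken from the patent:

```python
import numpy as np

def mask_coverage_row(coverage_row, node_vectors):
    """Build the model input for one test case: statement j contributes
    its node representation vector_j if covered (y_ij = 1), otherwise a
    zero vector (y_ij = 0)."""
    K, d = node_vectors.shape
    masked = np.zeros((K, d))
    for j, y in enumerate(coverage_row):
        if y == 1:
            masked[j] = node_vectors[j]  # covered statement: use vector_j
        # uncovered statement: row stays the zero vector
    return masked

row = np.array([1, 0, 1])                        # covers statements 1 and 3
vecs = np.arange(6, dtype=float).reshape(3, 2)   # toy node vectors, d = 2
out = mask_coverage_row(row, vecs)
```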
After test case T_i is executed, the set of statements strongly related to the defect output is selected and input into the GNN model, and the node representation vectors of the model are iteratively trained. During training, the node representation vectors of statements not covered by T_i remain unchanged; only the node representation vectors of covered statements are updated. After the node representations are updated, the model outputs a value o_i between 0 and 1 through a linear transformation layer (i ∈ {1, 2, …, M}).
The CAN repeatedly takes each row of the M×K matrix and the corresponding entry of the test case result vector as input, and during iteration uses the back propagation algorithm to update the parameters of the model and the node representation vectors vector_1 to vector_K. The goal is to continuously narrow the difference between the values in the output o and the test case result vector e. The algorithm computes forward from the input layer to the output layer, then updates the parameters and node representations in reverse. The CAN trains with a dynamically adjusted learning rate, which has two advantages: at the beginning a larger learning rate reduces the loss quickly, and later a smaller learning rate prevents training from missing the optimum.
In the following formula for calculating LR, one Epoch means the training data has been trained through once, LR is the learning rate, DropRate is the factor by which the learning rate is adjusted each time, and EpochDrop is the frequency with which the learning rate is updated. We set the initial learning rate to 0.01 and DropRate to 0.98. EpochDrop is set according to the size of the test case set.
LR = LR_0 * DropRate^((Epoch+1)/EpochDrop)
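The schedule can be written as a small function; `epoch_drop=10` is an assumed placeholder, since the patent sets EpochDrop according to the size of the test case set:

```python
def learning_rate(epoch, lr0=0.01, drop_rate=0.98, epoch_drop=10):
    """Step-decay schedule LR = LR_0 * DropRate^((Epoch + 1) / EpochDrop),
    with the initial learning rate 0.01 and DropRate 0.98 given above."""
    return lr0 * drop_rate ** ((epoch + 1) / epoch_drop)

lr_first, lr_late = learning_rate(0), learning_rate(199)  # decays over time
```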
S300: performing defect positioning on the defective target program:
For a given faulty program, we want to find which of the statements it contains causes the program failure. To this end, each statement is given a suspicious value between 0 and 1: the larger the suspicious value, the higher the probability that the statement causes the program failure. We then locate the defect in the program accordingly.
S310: constructing a defect context for the defective target program by the method of S120, and constructing a K×K virtual test case set, see FIG. 4, where K is the number of statements of the defect context; each virtual test case covers exactly one statement of the defect context, so that a total of K virtual test cases, i.e. the K×K virtual test case set, is constructed;
each row in the K×K virtual test case set is one virtual test case;
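Since every virtual test case covers exactly one defect-context statement, the K×K virtual test case set is just a K×K identity matrix; a minimal sketch:

```python
import numpy as np

def build_virtual_tests(K):
    """Row i' is the virtual test case covering only statement i' of the
    defect context (cf. S310)."""
    return np.eye(K, dtype=int)

vt = build_virtual_tests(4)  # 4 virtual test cases for a 4-statement context
```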
S320: take the i'th row of the K×K virtual test case set. When the value of the j'th column element in the i'th row is 1, i.e. y'_ij = 1, replace y'_ij with the node representation vector vector_j' (vector_j' here is indexed by the same j as y'_ij); when the value of the j'th column element in the i'th row is 0, i.e. y'_ij = 0, replace y'_ij with a zero vector;
S330: take the i'th row of the K×K virtual test case set processed in S320 as the input of the trained neural network model to obtain a value o'_i, said o'_i being a suspicious value;
S340: traverse each row of the K×K virtual test case set to obtain K suspicious values, each in the range 0 to 1. The larger a suspicious value, the more likely the corresponding statement is to cause the target program to fail, i.e. the more likely it is to be the location of the defect; the defect is thereby located.
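Scoring and ranking with the virtual test cases can be sketched as below; `model` is a stand-in for the trained network, and the toy lambda is purely illustrative:

```python
import numpy as np

def rank_statements(model, K):
    """Feed each one-hot virtual test case through the trained model and
    rank defect-context statements by descending suspicious value."""
    virtual_tests = np.eye(K)
    suspicious = np.array([model(row) for row in virtual_tests])
    return np.argsort(-suspicious), suspicious  # most suspicious first

# Toy stand-in model: suspiciousness of statement j is (j + 1) / 4.
order, scores = rank_statements(lambda row: (row.argmax() + 1) / 4, 4)
```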
An example of the CAN
FIG. 5 shows a working example of the CAN. Program P in the example contains 8 statements, among which S_4 is the defective statement. FIG. 5(a) shows program P with the defective statement S_4. FIG. 5(b) shows 6 test cases, of which T_2 and T_3 are failed test cases. FIG. 5(c) shows the result of dynamically slicing program P with test case T_3; the slice contains 6 of the 8 statements. We can see that in the slicing result, S_1, S_3, S_4, S_5 and S_8 influence the variable z in S_8. FIG. 5(d) shows the graph representation of program P, which includes both the control dependences and the data dependences of the program.
FIG. 5(e) shows the CAN training process. The CAN converts the program dependency graph into an adjacency matrix and inputs it into the GNN model, and then trains the model with the coverage information and the test case result vector. In the example, 6 vectors represent the 6 nodes; for example, S_1 is a node of program P and vector1 is the node representation of S_1. Specifically, according to test case T_1 = [1,1,1,1,1,1] and its result 0 (the rightmost vector of FIG. 5(b)), we input (vector1, vector2, vector3, vector4, vector5, vector6) together with the result 0 into the model; according to test case T_2 = [1,1,1,1,1,1] and its result 1, we input the vectors after the first iteration together with the result 1; according to test case T_3 = [1,1,1,1,1,1] and its result 1, we input the vectors after the previous iteration together with the result 1; according to test cases T_4 = T_5 = T_6 = [1,1,1,1,0,1], each with result 0, we input (vector1, vector2, vector3, vector4, zero vector, vector6) of the previous iteration together with the result 0. The network is trained repeatedly until the loss is sufficiently small, i.e. the convergence condition is reached. After training, the model reflects the complex nonlinear relationship between the statement representations, the test case coverage information and the test case results.
Finally, the CAN constructs a virtual test case set (see FIG. 5(f)) containing 6 test cases, each covering exactly one statement. Each virtual test case is input into the trained model, and the model's output is the suspicious value of the statement covered by that test case. For example, we input the virtual test case VT1 = [1,0,0,0,0,0] into the trained model, and the output 0.6 is the suspicious value of statement S_1. Similarly, we can calculate the suspicious values of the other statements. Since the defect context does not include S_2 and S_7, the CAN assigns S_2 and S_7 the lowest suspicious value, 0. As can be seen from FIG. 5(g), the final suspicious-value ranking is (S_4, S_1, S_3, S_5, S_6, S_8, S_2, S_7); the true defective statement S_4 is ranked first.
Experimental testing
A. Experimental setup
To verify the effectiveness of the CAN, it was compared with 10 state-of-the-art defect localization methods: MLP-FL, CNN-FL, BiLSTM-FL, Ochiai, ER5, GP02, GP03, Dstar, GP19 and ER1'. Furthermore, the experiments used large subject programs widely used in the defect localization field, whose code sizes range from 5.4 to 491 thousand lines.
Table 1 summarizes the characteristics of these subject programs. For each program, the "description" column of Table 1 describes the subject program; the "number of versions" column gives the number of defective versions; the "code lines (thousands)" column gives the code size; and the "number of test cases" column gives the number of test cases. The first 4 programs (chart, math, lang and time) are from Defects4J (http://defects4j.org); python, gzip and libtiff are from ManyBugs (http://repairbenchmarks.cs.umass.edu/ManyBugs/); space and the 4 versions of nanoxml are from SIR (http://sir.unl.edu/portal/index.php).
The experimental environment is as follows: an Intel i5-2640 CPU, 64 GB of memory, a 12 GB NVIDIA TITAN X Pascal GPU, and Ubuntu 16.04.3 as the operating system.
B. Evaluation method
To evaluate the CAN, we used three widely used evaluation metrics: TopN, EXAM and RImp. TopN shows the best-case effect of defect localization, while EXAM and RImp show the overall localization effect.
TABLE 1 Experimental subject programs

Program name | Description | Number of versions | Code lines (thousands) | Number of test cases
python | General | 8 | 407 | 355
gzip | Data compression | 5 | 491 | 12
libtiff | Image processing | 12 | 77 | 78
space | ADL interpreter | 35 | 6.1 | 13585
nanoxml_v1 | XML parser | 7 | 5.4 | 206
nanoxml_v2 | XML parser | 7 | 5.7 | 206
nanoxml_v3 | XML parser | 10 | 8.4 | 206
nanoxml_v5 | XML parser | 7 | 8.8 | 206
chart | JFreeChart | 26 | 96 | 2205
math | Apache Commons Math | 106 | 85 | 3602
lang | Apache Commons Lang | 65 | 22 | 2245
time | Joda-Time | 27 | 53 | 4130
Specifically, TopN measures localization accuracy, i.e., for how many defective versions a defect localization method ranks the real defective statement within the first N positions; a higher TopN value means more real defective statements are ranked within the first N. EXAM is defined as the percentage of statements that must be examined before the real defective statement is found; lower EXAM values indicate better localization performance. RImp is defined as the total number of statements the CAN must examine to find all defects in all versions of a program, divided by the total number of statements another defect localization method must examine to find all defects in all versions of that program; for the CAN, lower RImp values represent better localization performance.
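The three metrics can be sketched directly from these definitions; all inputs below are hypothetical numbers for illustration:

```python
def exam(rank, total_statements):
    """EXAM: percentage of statements examined when the real defective
    statement (at 1-based position `rank`) is found."""
    return 100.0 * rank / total_statements

def top_n(ranks, n):
    """TopN: number of defective versions whose real defective statement
    is ranked within the first n positions."""
    return sum(1 for r in ranks if r <= n)

def rimp(examined_by_can, examined_by_other):
    """RImp: statements the CAN must examine over all versions, divided
    by the statements another method must examine, as a percentage."""
    return 100.0 * sum(examined_by_can) / sum(examined_by_other)

ranks = [1, 4, 12]  # hypothetical ranks of the defect in 3 versions
```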
TopN: our experiments used TopN (N = 1, 3, 5, 10, 20) to compare the CAN with the 10 defect localization methods. Table 2 shows the TopN distributions of the 11 defect localization methods. In Table 2, the CAN achieves the best performance in all five TopN settings. Specifically, the CAN locates 4.62% of the defective versions at Top1, 20% at Top3, 29.23% at Top5, 49.23% at Top10, and 64.62% at Top20.
TABLE 2 TopN comparison
To compare the CAN with the other defect localization methods, we plot the EXAM values in FIG. 6(a) to FIG. 6(d), where the ordinate represents the proportion of statements that have been examined in all versions of the defective programs and the abscissa represents the percentage of versions whose defective statement has been found. A point in FIG. 6(a) to 6(d) represents the percentage of versions whose defective statement can be found by examining a certain proportion of the executable code. The results in FIG. 6(a) to 6(d) show that the CAN curves lie well above those of the other 10 defect localization methods, indicating that the localization performance of the CAN is significantly better than all 10 of them.
To further validate the experimental results, we used RImp in two scenarios to evaluate the CAN. FIG. 7 shows the distribution of RImp in the two scenarios: FIG. 7(a) compares RImp across the 10 defect localization methods, and FIG. 7(b) compares RImp across the 12 experimental subject programs.
In FIG. 7(a), the RImp value is below 100% for all defect localization methods, which means the CAN outperforms all of these baselines. The total number of statements that must be examined is reduced to between 12.99% (versus BiLSTM-FL) and 34.63% (versus Dstar). Equivalently, the maximum saving in statements to examine is 87.01% (100% - 12.99% = 87.01%), against BiLSTM-FL, and the minimum saving is 65.37% (100% - 34.63% = 65.37%), against Dstar. Thus, compared with the other defect localization methods, the CAN saves 65.37% to 87.01% of the statements that must be examined to locate all defective statements.
In FIG. 7(b), the RImp value is below 100% for all subject programs, which means the CAN yields a significant improvement in localization accuracy on every subject. The number of statements that must be examined is reduced to between 1.75% (python) and 57% (nanoxml_v1). That is, compared with the 10 baseline methods, when locating the defective statements in all defective versions, the CAN reduces the statements to examine to 1.75% for python and to 57% for nanoxml_v1. The maximum saving is 98.25% (100% - 1.75% = 98.25%), on python, and the minimum saving is 43% (100% - 57% = 43%), on nanoxml_v1. This indicates that the CAN saves 43% to 98.25% of the statements to be examined across all subjects.
As can be seen from the RImp comparison graphs, the number of statements to be examined is substantially reduced with the CAN, which shows that the CAN significantly improves the efficiency of defect localization.
To further verify the effectiveness of the present invention, we used the Wilcoxon signed-rank test for statistical analysis, since RImp only demonstrates the magnitude of the improvement: it is an overall-effect comparison and may miss specific details. The Wilcoxon signed-rank test is a non-parametric statistic used to test the difference between a pair of samples, for example F(x) and G(y); given a significance level φ, a result can be obtained from the two-tailed or one-tailed p-value. For the two-tailed p-value: if p ≥ φ, the hypothesis H_0 (there is no difference between F(x) and G(y)) is accepted; otherwise the hypothesis H_1 (there is a difference between F(x) and G(y)) is accepted. The one-tailed p-value has two cases: one-tailed (right) and one-tailed (left). For one-tailed (right): if p ≥ φ, H_0 (F(x) is not better than G(y)) is accepted; otherwise H_1 (F(x) is better than G(y)) is accepted. For one-tailed (left): if p ≥ φ, H_0 (F(x) is not worse than G(y)) is accepted; otherwise H_1 (F(x) is worse than G(y)) is accepted.
In the experiment, the EXAM values of the CAN over all defective versions are taken as F(x), and the EXAM values of a compared defect localization method FL1 as G(y). If p < 0.05, the hypothesis H_1 (the EXAM values of the CAN are significantly smaller than those of FL1) is accepted, which means the CAN has better localization performance than FL1; we denote this by BETTER. Otherwise H_0 (the EXAM values of the CAN are not significantly smaller than those of FL1) is accepted, which means the CAN is not more effective than FL1.
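A one-sided test of this form can be run with SciPy's `scipy.stats.wilcoxon`; the EXAM values below are made-up illustrative numbers, not the experimental data:

```python
from scipy.stats import wilcoxon

# Hypothetical EXAM values (% of code examined) per defective version.
exam_can = [1.2, 3.4, 0.8, 2.5, 4.1, 1.9, 2.2, 3.0]
exam_fl1 = [5.6, 7.1, 2.3, 6.0, 9.4, 4.8, 5.5, 7.7]

# H1: the CAN's EXAM values are significantly smaller than FL1's.
stat, p = wilcoxon(exam_can, exam_fl1, alternative="less")
verdict = "BETTER" if p < 0.05 else "not significantly better"
```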
Table 3 shows the results of the Wilcoxon signed-rank test: for most comparisons, the EXAM values of the CAN are significantly smaller than those of the other defect localization methods. For the A-test, the further the A statistic of two compared methods deviates from 0.5, the greater the difference between them: A-test values greater than 0.64 or less than 0.36 indicate a "medium" difference, and values greater than 0.71 or less than 0.29 indicate a "large" difference. Table 3 shows that most of the CAN's comparisons are "large" differences. Therefore, the localization effectiveness of the CAN is higher than that of the other defect localization methods.
TABLE 3 Statistical analysis of the CAN and the 10 defect localization methods
Therefore, based on the above experimental results and analysis, we can conclude that the CAN significantly enhances the effectiveness of defect localization, and that neural networks have great potential for understanding the defect context and improving defect localization.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope, and all such modifications are intended to be covered by the claims of the present invention.
Claims (2)
1. A defect positioning method based on context sensing is characterized by comprising the following steps:
S100: data extraction and preparation, specifically comprising:
S110: given a program containing N statements, running the program with a test case set T containing M test cases, the test case set T containing at least one failed test case, the output of each test case in T being known;
each test case in T is executed by the program to obtain its statement coverage information, so that the statement coverage information of all test cases forms an M×N coverage information matrix and the known outputs of all test cases form a result vector;
S120: constructing a defect context, the defect context being the set of statements influencing the program error: a statement set is determined by dynamic slicing from an output statement of a failed test case in the test case set T, the statements in the set having dependency relations with the selected output statement of the failed test case;
S130: constructing a program statement dependency graph and an adjacency matrix A, specifically comprising:
S131: building the program statement dependency graph by dynamic slicing, wherein a node in the program statement dependency graph represents a statement in the defect context, and an edge represents the association between two statements;
the program statement dependency graph is denoted G = (V, ξ), where V is the set of nodes and ξ is the set of edges; a node v_i represents a statement in the defect context, v_i ∈ V, and an edge (v_i, v_j) represents the association between two statements v_i and v_j, (v_i, v_j) ∈ ξ;
S132: the association between nodes of the program statement dependency graph is represented by an adjacency matrix A, where A_ij is the element in row i and column j of A; A_ij = 1 indicates that there is a directed edge a_ij from node i to node j, and A_ij = 0 indicates that there is no such directed edge; A_ij reflects whether information flows from node i to node j;
S140: let the defect context contain K statements. The N statements of the program are checked and the statements not in the defect context are discarded, giving an M×K matrix that records the execution information of the K defect-context statements under the test case set T; each row of the M×K matrix represents one test case, and y_ij, the element in row i and column j, takes the value 0 or 1: y_ij = 1 indicates that the jth statement of the defect context is executed by test case i, and y_ij = 0 indicates that it is not executed;
S200: the model training process, specifically comprising the following steps:
S210: each node of the program statement dependency graph is embedded with a node representation vector, initialized as a one-hot encoding vector_i^(1), vector_i^(1) ∈ R^d, where the length d of the node representation vector is the number of nodes and the superscript (1) indicates the first iteration round; after t rounds of iteration, each node i has fully collected the information of all its adjacent nodes, giving the iterative representation m_i^(t) of the node, m_i^(t) ∈ R^d.
S220: take the ith row of the M×K matrix. When the value of the jth column element in the ith row is 1, i.e. y_ij = 1, replace y_ij with the node representation vector vector_j (this vector_j is indexed by the same j as y_ij and is distinct from the neighbor-node vector_j of S231); when the value of the jth column element in the ith row is 0, i.e. y_ij = 0, replace y_ij with a zero vector;
S230: take the ith row of the M×K matrix processed in S220 as the input of the neural network model to obtain a value o_i; subtract the known output e_i of the test case represented by row i from o_i to obtain a loss value;
during iteration, update the parameters of the neural network model and the node representation vectors vector_1 to vector_K by the back propagation algorithm according to the loss value; the iteration count is set according to an empirical value, and if the number of iterations is less than the preset value, execute the next step, otherwise execute S300;
S231: the method for updating the node representation vector is as follows:
Taking node i as the central node, a neural network a: R^d × R^d → R assigns a different weight δ_ij to each neighbor node j according to the importance of neighbor node j to central node i, and the correlation coefficients of all neighbor nodes are normalized with a softmax function;
wherein a denotes the neural network a: R^d × R^d → R;
wherein A_ij indicates, in the adjacency matrix A of central node i and neighbor node j, whether information flows from node i to node j; vector_j denotes the node representation vector of neighbor node j; and N_i denotes all neighbor nodes of central node i;
S233: the GRU is a gated recurrent unit; in the tth round of node-representation iteration, the GRU takes the vector vector_j^(t-1) of neighbor node j and m_i^(t) as input, and outputs the new node representation vector vector_i^(t) of node i;
After the node representation vectors are updated, return to S220;
S300: performing defect positioning on the defective target program:
for a given faulty program, we want to find which of the statements it contains causes the program failure; to this end, each statement is given a suspicious value between 0 and 1, where the larger the suspicious value, the higher the probability that the statement causes the program failure, and the location of the defect in the program is then found accordingly;
S310: constructing a defect context for the defective target program by the method of S120, and constructing a K×K virtual test case set, where
K is the number of statements of the defect context; each virtual test case covers exactly one statement of the defect context, and each row of the K×K virtual test case set is one virtual test case;
S320: take the i'th row of the K×K virtual test case set; when the value of the j'th column element in the i'th row is 1, i.e. y'_ij = 1, replace y'_ij with the node representation vector vector_j'; when the value of the j'th column element in the i'th row is 0, i.e. y'_ij = 0, replace y'_ij with a zero vector;
S330: take the i'th row of the K×K virtual test case set processed in S320 as the input of the trained neural network model to obtain a value o'_i, said o'_i being a suspicious value;
S340: traverse each row of the K×K virtual test case set to obtain K suspicious values, each in the range 0 to 1; the larger the suspicious value, the more likely the corresponding statement is to cause the target program to fail.
2. The context-aware defect positioning method according to claim 1, wherein the method for constructing the defect context in S120 is specifically:
determining, by dynamic slicing, a statement set from an output statement of a failed test case in the test case set T, wherein the statements in the statement set have dependency relations with the selected output statement of the failed test case;
the following slicing criterion is used:
failSC = (outStm, incorrectVar, failEx)
wherein outStm is the output statement, incorrectVar is the incorrect variable in that statement, and failEx is the failed execution path.
Priority Applications (1)
Application Number  Priority Date  Filing Date  Title 

CN202110152656.5A CN112965894B (en)  20210204  20210204  Defect positioning method based on context awareness 
Publications (2)
Publication Number  Publication Date 

CN112965894A true CN112965894A (en)  20210615 
CN112965894B CN112965894B (en)  20230707 
Cited By (2)
Publication number  Priority date  Publication date  Assignee  Title 

CN113791976A (en) *  2021-09-09  2021-12-14  Nanjing University  Method and device for enhancing defect positioning based on program dependence
CN115629995A (en) *  2022-12-21  2023-01-20  Central South University  Software defect positioning method, system and equipment based on multi-dependency LSTM
Citations (4)
Publication number  Priority date  Publication date  Assignee  Title 

CN104572474A (en) *  2015-01-30  2015-04-29  Nanjing University of Posts and Telecommunications  Dynamic slicing based lightweight error locating implementation method
CN109144882A (en) *  2018-09-19  2019-01-04  Harbin Institute of Technology  A kind of software fault positioning method and device based on program invariants
CN110515826A (en) *  2019-07-03  2019-11-29  Hangzhou Dianzi University  A kind of software defect positioning method based on number frequency spectrum and neural network algorithm
EP3696771A1 (en) *  2019-02-13  2020-08-19  Robert Bosch GmbH  System for processing an input instance, method, and medium

Also Published As
Publication number  Publication date 

CN112965894B (en)  2023-07-07 
Similar Documents
Publication  Publication Date  Title 

CN112965894A (en)  Defect positioning method based on context sensing  
Zhang et al.  Data transformation in cross-project defect prediction  
CN108563555B (en)  Fault change code prediction method based on four-target optimization  
US20090217246A1 (en)  Evaluating Software Programming Skills  
CN105760295A (en)  Multi-defect localization method based on search algorithm  
CN115629998B (en)  Test case screening method based on K-Means clustering and similarity  
EP4075281A1 (en)  ANN-based program test method and test system, and application  
Land  Measurements of software maintainability  
CN112668809A (en)  Method for establishing autism child rehabilitation effect prediction model and method and system for predicting autism child rehabilitation effect  
CN114936158A (en)  Software defect localization method based on graph convolutional neural network  
CN111160667A (en)  Method and device for improving robustness of food safety prediction model  
CN115185736A (en)  Microservice call chain anomaly detection method and device based on graph convolutional neural network  
Peacock et al.  Automatic equivalent mutants classification using abstract syntax tree neural networks  
CN112119410A (en)  System and method for debugging neural networks with coverage-guided fuzz testing  
Amorim et al.  A new word embedding approach to evaluate potential fixes for automated program repair  
Zhang et al.  Improving Fault Localization Using Model-domain Synthesized Failing Test Generation  
CN114787831A (en)  Improving accuracy of classification models  
Heris et al.  Effectiveness of weighted neural network on accuracy of software fault localization  
CN114638999A (en)  Training method of image quality evaluation model, data uploading processing method and device  
KR20190109194A (en)  Apparatus and method for learning neural network capable of modeling uncertainty  
Lv et al.  Probabilistic diagnosis of clustered faults for hypercube-based multiprocessor system  
CN111880957A (en)  Program error localization method based on random forest model  
Schelter et al.  Proactively Screening Machine Learning Pipelines with ArgusEyes  
Song et al.  QNNRepair: Quantized Neural Network Repair  
CN112598118B (en)  Method, device, storage medium and equipment for processing abnormal labeling in supervised learning 
Legal Events
Date  Code  Title  Description 

PB01  Publication  
SE01  Entry into force of request for substantive examination  
GR01  Patent grant 