CN103198016A - Software error positioning method based on joint dependent probability modeling

Info

Publication number: CN103198016A
Application number: CN2013100999976A
Authority: CN
Inventors: 苏小红; 龚丹丹; 马培军; 王甜甜; 赵玲玲; 王煜
Original assignee: Harbin Institute of Technology
Current assignee: Harbin Institute of Technology
Priority date: 2013-03-26
Filing date: 2013-03-26
Publication date: 2013-07-10
Anticipated expiration: 2033-03-26
Also published as: CN103198016B

Abstract

The invention provides a software error localization method based on joint dependency probability modeling and relates to the field of computer program analysis. The method addresses the low localization accuracy of traditional software error localization methods. It comprises: step one, executing the correct (passing) test cases and the erroneous (failing) test cases separately, and building a joint dependency probability model for each; step two, on the basis of step one, calculating the suspiciousness of each node from the joint dependency probability models; and step three, sorting the error localization information in descending order of suspiciousness and treating nodes with higher suspiciousness as more likely to be faulty, which completes the localization of software errors based on joint dependency probability modeling. The method is applied in the field of computer program analysis.

Description

Software error localization method based on joint dependency probability modeling

Technical field

The present invention relates to the field of computer program analysis.

Background technology

As computer software is widely used in fields such as the economy, the military and commerce, its reliability is receiving increasing attention. However, as software systems grow more complex, software often does not run as expected; in other words, it does not always run reliably, which adversely affects computer application systems and can even cause enormous economic losses and catastrophic consequences. Ensuring high software quality and reliability has therefore become an indispensable part of system development and maintenance.

The main cause of unreliable software is errors in the program source code. Programming is a complex activity: it is difficult to derive every possible execution path of a program and to predict the environmental factors that may affect the program or be affected by it. Even if a program appears to execute correctly, errors may still exist that surface only in rare situations or when specific conditions are met. Software errors are therefore a problem in urgent need of a solution.

Software testing and debugging are important stages of the software development process. Together they identify and eliminate software errors effectively: testing exposes errors, and debugging eliminates them. However, errors are often eliminated during debugging more slowly than they are found during testing. Many automated software testing tools exist, yet debugging is still mostly done by manual analysis, which is a difficult and time-consuming task for two reasons. (1) The error must first be located. In some cases the point at which the developer observes the failure during execution is far from the faulty code, and a great deal of time and effort is needed to find the code that caused the error. (2) The error must then be understood. Locating the error is only the first step of debugging; the error in the statement must next be eliminated by modifying the source code appropriately. In some cases it is not obvious how the statement should be modified, and the developer must manually analyze the debugging environment to understand why the statement is wrong, then find a way to fix the code without introducing new errors in the process.

If software debugging could be automated, that is, if a computer could automatically find the location of errors in the program code, analyze their causes and then correct them, software reliability could be ensured more effectively. Automatic software error localization uses a computer to analyze the program source code or the runtime states produced during execution, computes and analyzes abnormal conditions in the program, and isolates them as suspicious code. Code unrelated to the error is filtered out automatically and only the relevant code that needs further debugging is kept. This narrows the search range for the faulty code, helps developers identify faulty statements more quickly, and improves debugging efficiency. The purpose of the present invention is therefore, against the background of practical software reliability needs, to study automatic software error localization, to lay a theoretical foundation for software debugging and error correction, and to improve software quality, reliability, understandability and maintainability.

Summary of the invention

The present invention addresses the low localization accuracy of traditional software error localization methods and provides a software error localization method based on joint dependency probability modeling.

The software error localization method based on joint dependency probability modeling comprises the following steps:

Step 1: execute the correct (passing) test cases and the erroneous (failing) test cases separately, and build a joint dependency probability model for the passing test cases and for the failing test cases respectively;

Step 2: on the basis of step 1, calculate the suspiciousness of each node according to the joint dependency probability models;

Step 3: sort the error localization information in descending order of suspiciousness, and treat nodes with higher suspiciousness as more likely to be faulty, which completes the localization of software errors based on joint dependency probability modeling.

Effects of the invention:

The basic idea of the present invention is as follows: the joint dependency of a node expresses well the data dependency between the node and its parent nodes under different execution states, which helps error localization. If the joint dependency of a node occurs frequently during the execution of failing test cases but rarely or never during the execution of passing test cases, that joint dependency is likely to be faulty. The suspiciousness of the joint dependency of each statement is calculated on this basis, and software errors are thereby located effectively.

The software error localization method based on joint dependency probability modeling of the present invention can effectively locate software errors related to data dependencies. Compared with the error localization methods SBI, SOBER and Tarantula, localization accuracy can be improved by more than 15%, and the method is suitable for error localization in large-scale program code.

Description of drawings

Fig. 1 is a flow chart of the method of the present invention;

Fig. 2 is a schematic diagram of building the joint dependency probability model in embodiment one;

Fig. 3 is an example of a program control flow graph and data dependency graph in embodiment one;

Fig. 4 is an example of program control independent paths and data independent paths in embodiment one.

Embodiment

Embodiment one: this embodiment is described with reference to Fig. 1 to Fig. 4. The software error localization method based on joint dependency probability modeling of this embodiment comprises the following steps:

In this embodiment, the upper left part of Fig. 3 is the program void distance(), the lower left part is the data dependency graph corresponding to this program, and the right side is the control flow graph corresponding to this program;

The left side of Fig. 4 shows the test cases that are run, the middle part shows the control independent paths obtained when the test cases are executed, and the right part shows the data independent paths obtained when the test cases are executed. For example, 5(true) in a control independent path indicates that node 5 is executed and that its state when executed is true; 5(true, [d1(i), d2(n)]) in a data independent path indicates that when node 5 is executed its execution state is true, its data depends on node 1 and node 2, and the data dependency variables are i and n respectively;

In step E of this embodiment, for example, n(5(true, [d1(i), d2(n)])) denotes the number of times 5(true, [d1(i), d2(n)]) occurs in the data independent path, and n(5(true)) denotes the total number of times the state of node 5 is true.
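By way of illustration only, the counts n(·) described above may be obtained by tallying path entries. The following Python sketch assumes that each data independent path entry is stored as a tuple (node, state, dependencies), where each dependency is a (parent node, variable) pair. This tuple layout and the sample entries are hypothetical and are not the contents of Fig. 4.

```python
from collections import Counter

# Each entry: (node id, execution state, data dependencies).
# A dependency d1(i) is written here as the pair (1, "i").
# Hypothetical sample path, for illustration only.
data_path = [
    (5, True, ((1, "i"), (2, "n"))),
    (5, True, ((1, "i"), (2, "n"))),
    (5, True, ((8, "i"), (2, "n"))),
    (5, False, ((8, "i"), (2, "n"))),
]

# n(node(state, data dependency)): occurrences of the complete entry.
n_joint = Counter(data_path)

# n(node(state)): occurrences of the node in that state,
# regardless of which data dependencies were observed.
n_state = Counter((node, state) for node, state, _ in data_path)

entry = (5, True, ((1, "i"), (2, "n")))
print(n_joint[entry], n_state[(5, True)])
```

Storing entries as hashable tuples lets both counts come from a single pass over the recorded path.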

Effects of this embodiment:

The basic idea of this embodiment is as follows: the joint dependency of a node expresses well the data dependency between the node and its parent nodes under different execution states, which helps error localization. If the joint dependency of a node occurs frequently during the execution of failing test cases but rarely or never during the execution of passing test cases, that joint dependency is likely to be faulty. The suspiciousness of the joint dependency of each statement is calculated on this basis, and software errors are thereby located effectively.

The software error localization method based on joint dependency probability modeling of this embodiment can effectively locate software errors related to data dependencies. Compared with the error localization methods SBI, SOBER and Tarantula, localization accuracy can be improved by more than 15%, and the method is suitable for error localization in large-scale program code.

Embodiment two: this embodiment differs from embodiment one in that the method of building the joint dependency probability model is specifically:

A. First, build the control flow graph of the program and record the control dependencies between statements;

B. Then build the data dependency graph of the program and record the data dependencies between statements;

C. Next, run the test cases and capture the control independent paths and data independent paths of the nodes by instrumentation;

D. According to the control flow graph and the control independent paths of the nodes, calculate the state dependency probability of each node;

Here, the probability that each node is executed is denoted P(node). For a branch node, the probabilities that its state is true and false are additionally recorded on the basis of the execution probability, denoted P(node(true)) and P(node(false));

For each node, the probability P(node) that the node is executed is calculated by the following formula:

$$P(\text{node}) = \frac{n(\text{node})}{n(\text{para}(\text{node}))} \times P(\text{para}(\text{node})) \tag{1}$$

where P(para(node)) is the probability that the parent node of node is executed, n(node) is the number of times node is executed in the control independent path, and n(para(node)) is the number of times the parent node of node is executed in the control independent path;

For a branch node, the probabilities that the node state is true and false, namely P(node(true)) and P(node(false)), are calculated on the basis of the execution probability:

$$P(\text{node}) = P(\text{node}(\text{true})) + P(\text{node}(\text{false})) \tag{2}$$

where

$$P(\text{node}(\text{false})) = \frac{n(\text{node}(\text{false}))}{n(\text{node})} \times P(\text{node}) = \frac{n(\text{node}(\text{false}))}{n(\text{node}(\text{true})) + n(\text{node}(\text{false}))} \times P(\text{node})$$

and n(node(true)) and n(node(false)) are the numbers of times the execution state of node is true and false, respectively, in the control independent path. By formula (2), P(node(true)) is the remaining share of P(node), that is, the same expression with n(node(true)) in the numerator;
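By way of illustration only, formulas (1) and (2) may be sketched in Python as follows. This is one possible reading, which assumes that the entry node (a node without a parent) has execution probability 1 and that a zero denominator yields probability 0; neither case is specified in the source. The function names, the `parent` mapping and the sample counts are hypothetical.

```python
def execution_probability(node, parent, n_exec, cache=None):
    """Formula (1): P(node) = n(node) / n(para(node)) * P(para(node)).

    Assumption: a node without a parent (the entry node) has P = 1.
    """
    if cache is None:
        cache = {}
    if node in cache:
        return cache[node]
    par = parent.get(node)
    if par is None:
        p = 1.0
    elif n_exec.get(par, 0) == 0:
        # Assumption: if the parent was never executed, the node was not reached.
        p = 0.0
    else:
        p_parent = execution_probability(par, parent, n_exec, cache)
        p = n_exec.get(node, 0) / n_exec[par] * p_parent
    cache[node] = p
    return p


def state_probabilities(p_node, n_true, n_false):
    """Split P(node) of a branch node into (P(node(true)), P(node(false)))."""
    total = n_true + n_false
    if total == 0:
        return 0.0, 0.0
    return n_true / total * p_node, n_false / total * p_node


# Hypothetical counts: node 1 is the entry node, node 5 is a branch node under it.
parent = {1: None, 5: 1}
n_exec = {1: 4, 5: 4}
p5 = execution_probability(5, parent, n_exec)
p5_true, p5_false = state_probabilities(p5, n_true=3, n_false=1)
```

The two state probabilities always sum to P(node), which is the constraint stated by formula (2).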

E. According to the data dependency graph and the data independent paths of the nodes, calculate the conditional dependency probability P(data dependency | state dependency) of each node:

$$P(\text{data dependency} \mid \text{state dependency}) = \frac{n(\text{node}(\text{state dependency}, \text{data dependency}))}{n(\text{node}(\text{state dependency}))} \tag{3}$$

where n(node(state dependency, data dependency)) is the number of times, in the data independent path, that node has the state "state dependency" together with the data dependency "data dependency", and n(node(state dependency)) is the total number of times node has the state "state dependency" in the data independent path;

F. According to the state dependency probability and the conditional dependency probability of each node, calculate the joint dependency probability of the node by the following formula:

$$\text{joint dependency probability} = \text{state dependency probability} \times \text{conditional dependency probability} \tag{4}$$
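By way of illustration only, formulas (3) and (4) may be combined into one small computation per node entry. The Python sketch below assumes that the state dependency probability and the two counts are already available; the function names and the sample values are hypothetical, and returning 0 for a zero denominator is an assumption that the source does not address.

```python
def conditional_dependency_probability(n_state_and_data, n_state):
    """Formula (3): n(node(state, data dependency)) / n(node(state)).

    Assumption: returns 0.0 when the state never occurred.
    """
    if n_state == 0:
        return 0.0
    return n_state_and_data / n_state


def joint_dependency_probability(p_state, n_state_and_data, n_state):
    """Formula (4): state dependency probability * conditional dependency probability."""
    return p_state * conditional_dependency_probability(n_state_and_data, n_state)


# Hypothetical values: P(node(true)) = 0.75, and the dependency pattern was
# observed in 2 of the 3 executions in which the node had state true.
p_joint = joint_dependency_probability(0.75, 2, 3)
```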

G. Build the joint dependency probability model.

Definition (joint dependency probability model): the joint dependency probability model of program code P is a triple (D, S, R), where:

(1) D = (N, E) is the data dependency graph of P, N is the node set, and E is the set of data dependency edges, which represent the data dependency relations of the program;

(2) S is the mapping from nodes to states;

(3) R is the joint dependency probability of the nodes.
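By way of illustration only, the triple (D, S, R) may be held in a small container such as the one below. The field layout is an assumption: the source does not prescribe how S or R is stored, so here S maps each node to the set of states observed for it and R is keyed by (node, state, dependencies). The class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class JointDependencyModel:
    """Illustrative container for the triple (D, S, R)."""
    nodes: set = field(default_factory=set)      # N: node set of D
    edges: set = field(default_factory=set)      # E: (parent, child, variable) edges of D
    states: dict = field(default_factory=dict)   # S: node -> set of observed states
    probs: dict = field(default_factory=dict)    # R: (node, state, deps) -> probability

    def add_observation(self, node, state, deps, probability):
        """Record one joint dependency of `node` together with its probability."""
        self.nodes.add(node)
        for parent, variable in deps:
            self.nodes.add(parent)
            self.edges.add((parent, node, variable))
        self.states.setdefault(node, set()).add(state)
        self.probs[(node, state, tuple(deps))] = probability


# Hypothetical entry: node 5 in state true depends on node 1 via i and node 2 via n.
model = JointDependencyModel()
model.add_observation(5, True, [(1, "i"), (2, "n")], 0.5)
```

One such model would be built for the passing test cases and another for the failing test cases, so that their R components can be compared.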

The other steps and parameters are the same as in embodiment one.

Embodiment three: this embodiment differs from embodiment two in that the method of calculating the suspiciousness of the execution state of each node is specifically:

Calculate the suspiciousness suspicious_score of the joint dependency of each node:

$$\text{suspicious\_score}(\text{node}) = \frac{P_{failed}(\text{joint dependency})}{P_{passed}(\text{joint dependency})} \tag{5}$$

where P_{passed}(joint dependency) is the joint dependency probability of node when the passing test cases are executed, and P_{failed}(joint dependency) is the joint dependency probability of node when the failing test cases are executed.
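By way of illustration only, formula (5) and the subsequent descending sort may be sketched in Python as follows. The source does not say how a joint dependency that never occurs in passing runs is scored; this sketch assumes a small `epsilon` in place of a zero denominator, so that such a dependency receives a large finite score. The function names, the keys and the probabilities are hypothetical.

```python
def suspicious_score(p_failed, p_passed, epsilon=1e-6):
    """Formula (5): P_failed(joint dependency) / P_passed(joint dependency).

    Assumption: `epsilon` replaces a zero (or tiny) denominator.
    """
    return p_failed / max(p_passed, epsilon)


def rank(failed, passed):
    """Return (key, score) pairs sorted by descending suspiciousness."""
    scores = {
        key: suspicious_score(p, passed.get(key, 0.0))
        for key, p in failed.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical joint dependency probabilities keyed by (node, state).
failed = {(5, True): 0.6, (7, True): 0.4, (9, False): 0.2}
passed = {(5, True): 0.2, (7, True): 0.4}
ranking = rank(failed, passed)
for key, score in ranking:
    print(key, score)
```

In this sample, the dependency seen only in failing runs is ranked first, which matches the basic idea that such dependencies are the most likely to be faulty.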

The other steps and parameters are the same as in embodiment two.

Embodiment four: this embodiment differs from embodiment three in that each node corresponds to one statement in the program. The other steps and parameters are the same as in embodiment three.

Embodiment five: this embodiment differs from embodiment four in that the branch nodes correspond to the selection and loop statements in the program. The other steps and parameters are the same as in embodiment four.

Claims

1. A software error localization method based on joint dependency probability modeling, characterized in that the method comprises the following steps:

2. The software error localization method based on joint dependency probability modeling according to claim 1, characterized in that the method of building the joint dependency probability model is specifically:

$$P(\text{node}) = \frac{n(\text{node})}{n(\text{para}(\text{node}))} \times P(\text{para}(\text{node})) \tag{1}$$

$$P(\text{node}) = P(\text{node}(\text{true})) + P(\text{node}(\text{false})) \tag{2}$$

where

$$P(\text{node}(\text{false})) = \frac{n(\text{node}(\text{false}))}{n(\text{node})} \times P(\text{node}) = \frac{n(\text{node}(\text{false}))}{n(\text{node}(\text{true})) + n(\text{node}(\text{false}))} \times P(\text{node})$$

$$P(\text{data dependency} \mid \text{state dependency}) = \frac{n(\text{node}(\text{state dependency}, \text{data dependency}))}{n(\text{node}(\text{state dependency}))} \tag{3}$$

G. Build the joint dependency probability model

(2) S is the mapping from nodes to states;

(3) R is the joint dependency probability of the nodes.

3. The software error localization method based on joint dependency probability modeling according to claim 2, characterized in that the method of calculating the suspiciousness of the execution state of each node is specifically:

$$\text{suspicious\_score}(\text{node}) = \frac{P_{failed}(\text{joint dependency})}{P_{passed}(\text{joint dependency})} \tag{5}$$

4. The software error localization method based on joint dependency probability modeling according to claim 3, characterized in that each node corresponds to one statement in the program.

5. The software error localization method based on joint dependency probability modeling according to claim 4, characterized in that the branch nodes correspond to the selection and loop statements in the program.