CN116467164A - Software debugging method, system, electronic equipment and storage medium - Google Patents

Info

Publication number
CN116467164A
Authority
CN
China
Prior art keywords
code
name
function
software
variable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310193606.0A
Other languages
Chinese (zh)
Inventor
秦亮
钱辉
张文昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
709th Research Institute of CSSC
Original Assignee
709th Research Institute of CSSC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 709th Research Institute of CSSC
Priority to CN202310193606.0A
Publication of CN116467164A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/36: Preventing errors by testing or debugging software
    • G06F11/362: Software debugging
    • G06F11/3624: Software debugging by performing operations on the source code, e.g. via a compiler
    • G06F11/3636: Software debugging by tracing the execution of the program
    • G06F11/3644: Software debugging by instrumenting at runtime
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a software debugging method, a system, electronic equipment and a storage medium. The method performs front-end compilation of the source code of a software system to generate intermediate representation code, and statically analyzes the context environment information of each function based on that code to obtain a system-wide variable lineage graph. When the intermediate states of a business process must be tracked from input to output, the variable lineage graph is analyzed to obtain the propagation path of the input message data through the software system, the changes in its value, and its dependency and influence relationships with other data, and a full-link state tracing table recording the propagation track of the message data is generated. For each record item in the full-link state tracing table, intermediate code that collects the record item's data is generated automatically by an instrumentation tool and inserted into the target file during compilation and linking. When the software system runs, the inserted code is invoked to emit the record item data, which shows the full-link running state of the business process.

Description

Software debugging method, system, electronic equipment and storage medium
Technical Field
The invention belongs to the field of software debugging, and in particular relates to a software debugging method, a system, electronic equipment and a storage medium.
Background
Unknowable and uncontrollable software behavior is a major cause of software debugging difficulties. The execution flow and behavior of software are determined by context environment information. The context environment is the set of data resources visible within the scope of the current task (usually a function or method), including global variables, local variables, parameter variables and values; it both drives the task and represents the result of running it. When tasks execute in sequence along a call chain, data dependencies and data transfers exist between the associated context environments, and these drive the internal state of the software system to change continuously until it approaches the target state.
The running of software can be regarded as a series of transformations of the software state, where the state at a program point consists of the variables visible in scope at that point and their values there. When the software system receives external message data, it outputs a final result through a series of state changes; when the result does not meet expectations, the changes in the intermediate state of the software system must be collected for root cause analysis. A software system is a huge state machine whose internal states cannot all be enumerated, so usually only the intermediate state data related to the message data need to be screened out. The techniques commonly used at present to obtain intermediate state data are mainly log printing and dynamic tracing.
Log printing relies on manually screening the execution path and adding print statements before and after accesses to the data resources of interest, with output to the screen or a file. Dynamic tracing relies on the interrupt mechanism of the operating system kernel: using a debugging tool such as GDB, the program is suspended by means of features such as breakpoints and single-step execution, and the data resources in registers and on the stack are then observed.
Practice has shown that each of these techniques has limitations. Log printing requires a cumbersome sequence of code modification, compiling and linking, uploading and replacing, and system restart, and because the source code must be changed it hinders version control. Dynamic tracing can only obtain the current or adjacent context, so it is difficult to outline the state transitions and the changes over time and space of the program across the whole business flow; moreover, interrupting the current thread disturbs the normal execution order of a multi-threaded program.
To cope with ever larger software, increasingly complex distributed architectures and application scenarios that run uninterrupted 24x7, a more convenient, efficient and global software debugging method is needed to collect and evaluate the internal state changes of a software system under specified conditions.
Disclosure of Invention
In view of the defects of the prior art, the object of the invention is to provide a software debugging method, a system, electronic equipment and a storage medium, so as to solve the problems that existing software debugging methods involve a cumbersome process, collect insufficient information, hinder version control and interfere with the running program.
In order to achieve the above object, in a first aspect, the present invention provides a software debugging method, including the steps of:
compiling the source code of the software into intermediate representation code in IR format using the clang front-end compiler of the LLVM compiler framework;
analyzing the context environment of every function in the intermediate representation code, and sorting out the interdependencies of variables within and between functions to generate a variable lineage graph; the variable lineage graph is a directed attribute graph with three types of nodes (variables, values and expressions) and six types of edges (variable read, variable write, value definition, value use, function call input and function call output);
for a business process of the software, taking the input data as the start point and the output data as the end point, analyzing the propagation path of the internal state of the business process in the variable lineage graph based on a graph reachability algorithm, and taking each edge in the propagation path as a record item, the set of record items forming a full-link state tracing table of the business process;
and for each record item in the full-link state tracing table, automatically generating an IR code block for collecting the record item's data, and inserting the IR code block at a designated position in the software IR code during the optimization stage of compiling and linking the software source code, so that when the software runs the business process the inserted IR code block records the intermediate state data of the business process, allowing the software to be debugged and the source of a fault to be located accurately.
In an alternative example, the context environment of a function refers to the data resources visible within the scope of the currently running function, including: global variables, local variables, parameter variables and values.
In an alternative example, each edge in the propagation path is taken as a record item, specifically: the content of each record item is the set of attributes of the two nodes associated with the edge.
In an alternative example, different types of edges use different record item descriptions, wherein: variable read edges and variable write edges are described by a seven-tuple; value definition edges and value use edges are described by a six-tuple; function call input edges are described by a ten-tuple; function call output edges are described by a nine-tuple;
the seven-tuple includes: file name, function name, variable name, value name, type, IR code and source code;
the six-tuple includes: file name, function name, expression IR code, expression source code, value name and type;
the ten-tuple includes: source file name, source function name, destination file name, destination function name, function call IR code, function call source code, formal parameter name, actual parameter name, type and parameter sequence number;
the nine-tuple includes: source file name, source function name, destination file name, destination function name, function call IR code, function call source code, called function return value name, calling function return value name and type.
In a second aspect, the present invention provides a software debugging system comprising:
an intermediate representation code generating unit, configured to compile the source code of the software into intermediate representation code in IR format using the clang front-end compiler of the LLVM compiler framework;
a variable lineage graph generating unit, configured to analyze the context environment of every function in the intermediate representation code, and sort out the interdependencies of variables within and between functions to generate a variable lineage graph; the variable lineage graph is a directed attribute graph with three types of nodes (variables, values and expressions) and six types of edges (variable read, variable write, value definition, value use, function call input and function call output);
a full-link state tracing table construction unit, configured to, for a business process of the software, take the input data as the start point and the output data as the end point, analyze the propagation path of the internal state of the business process in the variable lineage graph based on a graph reachability algorithm, and take each edge in the propagation path as a record item, the set of record items forming a full-link state tracing table of the business process;
and a code instrumentation unit, configured to automatically generate, for each record item in the full-link state tracing table, an IR code block for collecting the record item's data, and to insert the IR code block at a designated position in the software IR code during the optimization stage of compiling and linking the software source code, so that when the software runs the business process the inserted IR code block records the intermediate state data of the business process, allowing the software to be debugged and the source of a fault to be located accurately.
In an alternative example, the function context environment analyzed by the variable lineage graph generating unit refers to the data resources visible within the scope of the currently running function, including: global variables, local variables, parameter variables and values.
In an alternative example, the full-link state tracing table construction unit takes each edge in the propagation path as a record item, specifically: the content of each record item is the set of attributes of the two nodes associated with the edge.
In an alternative example, different types of edges in the variable lineage graph use different record item descriptions, wherein: variable read edges and variable write edges are described by a seven-tuple; value definition edges and value use edges are described by a six-tuple; function call input edges are described by a ten-tuple; function call output edges are described by a nine-tuple;
the seven-tuple includes: file name, function name, variable name, value name, type, IR code and source code;
the six-tuple includes: file name, function name, expression IR code, expression source code, value name and type;
the ten-tuple includes: source file name, source function name, destination file name, destination function name, function call IR code, function call source code, formal parameter name, actual parameter name, type and parameter sequence number;
the nine-tuple includes: source file name, source function name, destination file name, destination function name, function call IR code, function call source code, called function return value name, calling function return value name and type.
In a third aspect, the present invention provides an electronic device, comprising: a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to implement the method as provided in the first aspect above when executing the computer program.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method as provided in the first aspect above.
In general, the above technical solutions conceived by the present invention have the following beneficial effects compared with the prior art:
the invention provides a software debugging method, a system, electronic equipment and a storage medium, wherein a variable lineage graph is generated by static analysis of the software code, revealing the dependency and influence relationships among variables, presenting the data flow within the software comprehensively, and improving the understanding and insight of development and maintenance personnel into the software system;
the invention provides a software debugging method, a system, electronic equipment and a storage medium which, for a business process to be debugged, automatically derive from the variable lineage graph which key paths and transformations the input data pass through to finally produce the output data, and which variable data and value data exert influence or are influenced along those paths, and gather this information into a full-link state tracing table, improving the effectiveness, accuracy and completeness of debugging information collection and reducing the heavy workload and incompleteness of manual code walkthroughs;
the invention provides a software debugging method, a system, electronic equipment and a storage medium which parse the formalized full-link state tracing table and automatically generate the debugging data collection code, avoiding the tedium and repetition of writing debugging code by hand;
the invention provides a software debugging method, a system, electronic equipment and a storage medium, wherein the debugging data collection code is inserted into the target file during the optimization stage of source code compiling and linking, so that the source code is not intruded upon and the code view is left unchanged.
Drawings
FIG. 1 is a flowchart of a software debugging method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the principle of the variable lineage graph provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the architecture of the variable lineage graph provided by an embodiment of the present invention;
FIG. 4 is a diagram of the implementation steps of the software debugging method provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of the software debugging system provided by an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The invention provides a software debugging method. The method performs front-end compilation of the source code of a software system to generate intermediate representation code, and statically analyzes the context environment information of each function based on that code to obtain a system-wide variable lineage graph. When the intermediate states of a specific business process must be tracked from input to output, the variable lineage graph is analyzed to obtain the propagation path of the input message data through the software system, the changes in its value, and its dependency and influence relationships with other data, and a full-link state tracing table recording the propagation track of the message data is generated. For each record item in the full-link state tracing table, intermediate code that collects the record item's data is generated automatically by the instrumentation tool and inserted into the target file during compilation and linking. When the software system runs, the inserted code is invoked to emit the record item data for full-link tracing of the running state of the business process.
FIG. 1 is a flowchart of a software debugging method provided by an embodiment of the present invention; as shown in FIG. 1, the method comprises the following steps:
S101, compiling the source code of the software into intermediate representation code in IR format using the clang front-end compiler of the LLVM compiler framework;
S102, analyzing the context environment of every function in the intermediate representation code, and sorting out the interdependencies of variables within and between functions to generate a variable lineage graph; the variable lineage graph is a directed attribute graph with three types of nodes (variables, values and expressions) and six types of edges (variable read, variable write, value definition, value use, function call input and function call output);
S103, for a business process of the software, taking the input data as the start point and the output data as the end point, analyzing the propagation path of the internal state of the business process in the variable lineage graph based on a graph reachability algorithm, and taking each edge in the propagation path as a record item, the set of record items forming a full-link state tracing table of the business process;
S104, for each record item in the full-link state tracing table, automatically generating an IR code block for collecting the record item's data, and inserting the IR code block at a designated position in the software IR code during the optimization stage of compiling and linking the software source code, so that when the software runs the business process the inserted IR code block records the intermediate state data of the business process, allowing the software to be debugged and the source of a fault to be located accurately.
Specifically, the source code of the software system is front-end compiled to generate intermediate representation code, and the context environment information of each function is statically analyzed based on that code to obtain a system-wide variable lineage graph. The intermediate representation code uses the intermediate language IR (Intermediate Representation) of the LLVM compiler framework. The context environment information refers to the data resources visible in the scope of the currently running function, including global variables, local variables, parameter variables and values, and it controls the execution flow of the function.
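As a non-limiting sketch, the front-end compilation described above can be driven from a script. The example below only assembles the clang command line from the four options listed in step S1 of the specific embodiment (-g, -c, -fno-discard-value-names, -emit-llvm) and does not invoke the compiler; the helper name clang_ir_command and the file names demo.c and demo.bc are hypothetical and do not come from the original text.

```python
import shlex


def clang_ir_command(source, output, clang="clang"):
    """Build the argv list for front-end compilation to LLVM IR (illustrative helper)."""
    return [
        clang,
        "-g",                         # keep debug information
        "-c",                         # compile only, do not link
        "-fno-discard-value-names",   # keep named IR values in the output
        "-emit-llvm",                 # emit LLVM IR instead of native code
        source,
        "-o",
        output,
    ]


# Hypothetical file names, for illustration only.
command = clang_ir_command("demo.c", "demo.bc")
print(shlex.join(command))
```

With -c, clang writes LLVM bitcode; substituting -S for -c yields textual IR instead.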
In addition, the variable lineage graph is a directed attribute graph. It has three types of nodes: variable (Variable), value (Value) and expression (Expr). It has six types of edges: variable read (Read), variable write (Write), value definition (Def), value use (Use), function call input (CallInput) and function call output (CallOutput). The attribute triple of a "variable" node is (name, type, scope). The attribute quadruple of a "value" node is (name, type, category, scope), where the value domain of the element "category" is {constant, parameter value, temporary value}. The attribute triple of an "expression" node is (IR code, source code, scope). The attribute quintuple of a "variable read" edge is (IR code, source code, scope, variable, value), and that of a "variable write" edge is (IR code, source code, scope, variable, value). The attribute quintuple of a "value definition" edge is (IR code, source code, scope, expression, value), and that of a "value use" edge is (IR code, source code, scope, expression, value). The attribute octuple of a "function call input" edge is (IR code, source code, scope, calling function, called function, formal parameter value, actual parameter value, parameter sequence number). The attribute seven-tuple of a "function call output" edge is (IR code, source code, scope, calling function, called function, calling function return value, called function return value). The principle of the variable lineage graph is shown schematically in FIG. 2.
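By way of illustration only, the directed attribute graph described above could be held in a structure such as the following. The class names, the key field and the choice of frozen dataclasses are assumptions made for this sketch; the original text prescribes only the node types, edge types and attribute tuples. Edges are oriented in the direction of data flow, which is also an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    VARIABLE = "Variable"
    VALUE = "Value"
    EXPR = "Expr"


class EdgeKind(Enum):
    READ = "Read"
    WRITE = "Write"
    DEF = "Def"
    USE = "Use"
    CALL_INPUT = "CallInput"
    CALL_OUTPUT = "CallOutput"


@dataclass(frozen=True)
class Node:
    kind: NodeKind
    key: str            # hypothetical unique identifier, e.g. "main:x"
    attrs: tuple = ()   # attribute tuple as listed in the text


@dataclass(frozen=True)
class Edge:
    kind: EdgeKind
    src: Node
    dst: Node
    attrs: tuple = ()


@dataclass
class LineageGraph:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_edge(self, kind, src, dst, attrs=()):
        """Register both end nodes and append a directed edge src -> dst."""
        self.nodes.update((src, dst))
        edge = Edge(kind, src, dst, attrs)
        self.edges.append(edge)
        return edge

    def successors(self, node):
        """Nodes reachable from `node` over one outgoing edge."""
        return [e.dst for e in self.edges if e.src == node]


# A Read edge from a variable to the temporary value loaded from it.
g = LineageGraph()
x = Node(NodeKind.VARIABLE, "main:x", ("x", "i32", "main"))
v = Node(NodeKind.VALUE, "main:%0", ("%0", "i32", "temporary value", "main"))
g.add_edge(EdgeKind.READ, x, v,
           ("%0 = load i32, ptr %x, align 4", "y = x + 1;", "main", "x", "%0"))
```

Keeping the attribute tuples opaque lets the same container serve all three node types and all six edge types.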
When the source code is compiled into IR code, the lvalue (i.e. the address) of a source code variable is mapped to a variable in the IR, namely a Variable node; the IR variable is named by appending the suffix ".addr" to the source code variable name. Changes in the rvalue (i.e. the state) of the source code variable are mapped to a series of values in the IR, namely Value nodes. The instructions inside IR functions mainly comprise memory reads, memory writes, computation expressions and function calls. A memory read assigns the current value in the variable's memory to a temporarily generated value (a Read edge). A memory write writes a value into the variable's memory space (a Write edge). A computation expression is an Expr node, which consumes operand values (Use edges) and produces a new value (a Def edge). A function call passes actual parameter values to the formal parameter values of the called function by value copy or pointer reference (CallInput edges), and the called function returns the newly computed value to the calling function (a CallOutput edge).
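The mapping from memory reads, memory writes and computation expressions to Read, Write, Use and Def edges can be illustrated with a toy extractor. This is a sketch under stated assumptions, not the patented implementation: it pattern-matches simplified textual IR lines (opaque-pointer syntax, a handful of binary operators) with regular expressions, whereas a real tool would inspect in-memory instructions inside an LLVM Pass. The function name first_pass, the "expr:" key prefix and the orientation of Use and Def edges along the data flow are assumptions.

```python
import re

LOAD = re.compile(r"^(%[\w.]+) = load [^,]+, \S+ ([%@][\w.]+)")
STORE = re.compile(r"^store \S+ ([^,\s]+), \S+ ([%@][\w.]+)")
BINOP = re.compile(r"^(%[\w.]+) = (add|sub|mul) (?:nsw |nuw )*\S+ ([^,\s]+), ([^,\s]+)")


def category_of(operand):
    """Constants have no % or @ sigil; everything else is treated as temporary."""
    return "temporary value" if operand.startswith(("%", "@")) else "constant"


def first_pass(ir_lines):
    """Return (edges, value categories) for one function body."""
    edges, categories = [], {}
    for raw in ir_lines:
        line = raw.strip()
        if m := LOAD.match(line):
            value, variable = m.groups()
            categories.setdefault(value, "temporary value")
            edges.append(("Read", variable, value))      # Variable -> Value
        elif m := STORE.match(line):
            value, variable = m.groups()
            categories.setdefault(value, category_of(value))
            edges.append(("Write", value, variable))     # Value -> Variable
        elif m := BINOP.match(line):
            result, _op, lhs, rhs = m.groups()
            expr = f"expr:{line}"                        # Expr node keyed by its IR text
            for operand in (lhs, rhs):
                categories.setdefault(operand, category_of(operand))
                edges.append(("Use", operand, expr))     # operand Value -> Expr
            categories.setdefault(result, "temporary value")
            edges.append(("Def", expr, result))          # Expr -> result Value
    return edges, categories


body = [
    "%0 = load i32, ptr %x.addr, align 4",
    "%add = add nsw i32 %0, 1",
    "store i32 %add, ptr %y, align 4",
]
edges, categories = first_pass(body)
```

For the three-instruction body above, the extractor yields one Read edge, two Use edges, one Def edge and one Write edge, with the literal 1 categorised as a constant.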
Read and Write edges reflect the binding between the lvalue and the rvalue of a variable; Def and Use edges reflect the series of transformations between values carried out through Expr nodes; CallInput and CallOutput edges reflect the transfer of values between functions. The variable lineage graph built from these nodes and edges can trace the full-link journey of data from its definition, through processing, fusion and circulation, to the generation of the result; by establishing direct and indirect dependencies among variables, it can also reveal the steps by which a variable comes to hold a specific value. The architecture of the variable lineage graph is shown schematically in FIG. 3.
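The transfer of values between functions can be sketched in the same toy style. The example below assumes a dictionary that maps each internal function name to its formal parameter names and its simplified textual IR body; the function name second_pass, the "function:value" key format and the zero-based parameter index are assumptions for illustration, and calls to functions outside the dictionary are skipped.

```python
import re

CALL = re.compile(r"^(?:(%[\w.]+) = )?call \S+ @([\w.]+)\((.*)\)")
RET = re.compile(r"^ret \S+ (\S+)")


def second_pass(functions):
    """functions: name -> (formal parameter names, body lines). Returns call edges."""
    edges = []
    for caller, (_formals, body) in functions.items():
        for line in body:
            m = CALL.match(line.strip())
            if not m or m.group(2) not in functions:
                continue  # not a call, or a call to an external function
            result, callee, arg_text = m.groups()
            formals, callee_body = functions[callee]
            # The last token of each argument is the value name, e.g. "i32 noundef %0".
            actuals = [a.split()[-1] for a in arg_text.split(",") if a.strip()]
            for index, (actual, formal) in enumerate(zip(actuals, formals)):
                edges.append(("CallInput", f"{caller}:{actual}", f"{callee}:{formal}", index))
            if result is not None:
                for callee_line in callee_body:
                    r = RET.match(callee_line.strip())
                    if r:  # "ret void" has no value and does not match
                        edges.append(("CallOutput", f"{callee}:{r.group(1)}", f"{caller}:{result}"))
    return edges


functions = {
    "inc": (["%n"], ["%add = add nsw i32 %n, 1", "ret i32 %add"]),
    "main": ([], [
        "%0 = load i32, ptr %x, align 4",
        "%call = call i32 @inc(i32 noundef %0)",
        "call void @abort()",
        "ret i32 %call",
    ]),
}
call_edges = second_pass(functions)
```

In this example the call to inc produces one CallInput edge from main's %0 to inc's %n and one CallOutput edge from inc's %add to main's %call, while the call to the external abort is ignored.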
Specifically, based on the variable lineage graph, the propagation path of the input and output data of a specific business process through the software system, the changes in their values, and their dependency and influence relationships with other data can be obtained by analysis. Taking the input data as the start point and the output data as the end point, the propagation path is computed in the variable lineage graph based on a graph reachability algorithm. Each edge in the propagation path is taken as a record item whose content is the attribute set of the two nodes associated with the edge, and the set of record items is the full-link state tracing table that records the running track of the business process.
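The original text does not name a particular graph reachability algorithm. One simple possibility, shown here as an assumption, is to keep every edge whose tail is reachable from the input node and whose head can reach the output node, using two breadth-first searches. The node names and the helper names reachable and propagation_edges are hypothetical.

```python
from collections import defaultdict, deque


def reachable(start, adjacency):
    """Breadth-first search: all nodes reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def propagation_edges(edges, source, sink):
    """Edges (src, dst, kind) lying on some path from `source` to `sink`."""
    forward, backward = defaultdict(list), defaultdict(list)
    for src, dst, _kind in edges:
        forward[src].append(dst)
        backward[dst].append(src)
    from_source = reachable(source, forward)
    to_sink = reachable(sink, backward)
    return [e for e in edges if e[0] in from_source and e[1] in to_sink]


lineage = [
    ("msg", "%0", "Read"),
    ("%0", "expr:add", "Use"),
    ("expr:add", "%1", "Def"),
    ("%1", "out", "Write"),
    ("cfg", "%2", "Read"),     # unrelated to the msg -> out flow
    ("%2", "log", "Write"),
]
trace_table = propagation_edges(lineage, "msg", "out")
```

Each surviving edge corresponds to one record item of the tracing table; the two edges that involve cfg and log are filtered out because they lie on no path from msg to out.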
Different types of edges use different record item descriptions. Read and Write edges are described by a seven-tuple: (file name, function name, variable name, value name, IR code, type, source code). Def and Use edges are described by a six-tuple: (file name, function name, expression IR code, expression source code, value name, type). CallInput edges are described by a ten-tuple: (source file name, source function name, destination file name, destination function name, function call IR code, function call source code, formal parameter name, actual parameter name, type, parameter sequence number). CallOutput edges are described by a nine-tuple: (source file name, source function name, destination file name, destination function name, function call IR code, function call source code, called function return value name, calling function return value name, type).
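The four record item layouts can be written down as typed tuples, following the field order given in the paragraph above. The class and field names below are illustrative choices for this sketch and do not appear in the original text.

```python
from typing import NamedTuple


class ReadWriteRecord(NamedTuple):      # seven-tuple for Read and Write edges
    file_name: str
    function_name: str
    variable_name: str
    value_name: str
    ir_code: str
    value_type: str
    source_code: str


class DefUseRecord(NamedTuple):         # six-tuple for Def and Use edges
    file_name: str
    function_name: str
    expr_ir_code: str
    expr_source_code: str
    value_name: str
    value_type: str


class CallInputRecord(NamedTuple):      # ten-tuple for CallInput edges
    source_file: str
    source_function: str
    dest_file: str
    dest_function: str
    call_ir_code: str
    call_source_code: str
    formal_param_name: str
    actual_param_name: str
    value_type: str
    param_index: int


class CallOutputRecord(NamedTuple):     # nine-tuple for CallOutput edges
    source_file: str
    source_function: str
    dest_file: str
    dest_function: str
    call_ir_code: str
    call_source_code: str
    callee_return_value_name: str
    caller_return_value_name: str
    value_type: str


# Hypothetical record for a Read edge.
record = ReadWriteRecord("demo.c", "main", "x", "%0",
                         "%0 = load i32, ptr %x, align 4", "i32", "y = x + 1;")
```

Because the IR code field later designates the reference instrumentation position, it is kept verbatim rather than normalised.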
Specifically, a plug-in is developed based on the Pass mechanism of the LLVM compiler framework. The plug-in runs while the source code is compiled into intermediate code and the intermediate code is linked into the target file. It loads the generated full-link state tracing table, creates an IR code block for data collection for each record item in the table, and inserts the IR code block at a specific position in the intermediate code; the target file is generated after linking. When the software system runs, the inserted IR code is invoked and executed, emitting the record item data for analyzing and locating software faults.
The software debugging method provided by the invention can thus collect the data required for debugging efficiently and comprehensively.
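The insertion step can be pictured with a toy model that treats a function body as a list of textual IR instructions. Following the placement rule given in step S7 of the specific embodiment, a probe goes before the reference instruction for a Write edge and after it for Read, Def and Use edges. The probe text, including the function name @log_record, is hypothetical, and CallInput and CallOutput edges, which span two functions, are left out of this sketch.

```python
BEFORE = {"Write"}
AFTER = {"Read", "Def", "Use"}


def instrument(instructions, records, probe="call void @log_record(i32 {index})"):
    """Return a new instruction list with probes placed around reference instructions.

    records: list of (edge kind, reference IR code) pairs taken from the tracing table.
    The probe format is a placeholder; a real plug-in would build IR through the LLVM API.
    """
    result = []
    for inst in instructions:
        hits = [(i, kind) for i, (kind, ref) in enumerate(records) if ref == inst]
        result.extend(probe.format(index=i) for i, kind in hits if kind in BEFORE)
        result.append(inst)
        result.extend(probe.format(index=i) for i, kind in hits if kind in AFTER)
    return result


function_body = [
    "%0 = load i32, ptr %x, align 4",
    "%add = add nsw i32 %0, 1",
    "store i32 %add, ptr %y, align 4",
]
table = [
    ("Read", "%0 = load i32, ptr %x, align 4"),
    ("Write", "store i32 %add, ptr %y, align 4"),
]
instrumented = instrument(function_body, table)
```

Here the probe for the Read record lands directly after the load, and the probe for the Write record lands directly before the store.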
In a specific embodiment, the steps of the context-oriented static software debugging method of the invention are shown in FIG. 4 and include:
S1, running the clang compiler of the LLVM compiler framework to perform front-end compilation of the source code and generate IR code in intermediate representation form, with the compile options "-g -c -fno-discard-value-names -emit-llvm";
S2, traversing the instruction set of each function in the IR code for the first time based on the Pass mechanism provided by LLVM, and generating the nodes and the four edge types Def, Use, Read and Write, specifically as follows:
1) Screening out call instructions to the special function @llvm.dbg.declare, and parsing each instruction to obtain a variable defined in the function, namely a Variable node;
2) Screening out computation instructions such as add and sub, and generating an Expr node for each instruction;
3) Parsing the computation instruction to obtain its operands and result, and generating de-duplicated Value nodes for them; when an operand is a constant, the node attribute "category" is set to constant, and otherwise to temporary value;
4) Parsing the computation instruction, creating a Use edge between the Expr node and each operand Value node, and creating a Def edge between the Expr node and the result Value node;
5) Screening out load instructions, parsing each instruction to obtain the source Variable node and the destination Value node, and creating a Read edge between them;
6) Screening out store instructions, parsing each instruction to obtain the source Value node and the destination Variable node, and creating a Write edge between them;
7) Parsing the function parameters to generate Value nodes whose attribute "category" is set to parameter value;
S3, traversing the instruction set of each function in the IR code a second time based on the Pass mechanism provided by LLVM, and screening out call instructions that call internal functions; parsing each call instruction to obtain the actual parameter Value nodes, and creating a CallInput edge between each actual parameter Value node and the corresponding formal parameter Value node of the called function created in step S2; parsing the call instruction to obtain the calling function's return value Value node, screening out the ret instruction of the called function to obtain the called function's return value Value node, and creating a CallOutput edge between the called function's return value Value node and the calling function's return value Value node;
s4, correlating all the obtained nodes with edges, and storing by adopting a graph data structure to generate a variable blood edge relationship graph;
s5, aiming at a specific business process, analyzing and obtaining information such as a propagation path of input and output data of the business process in a software system, change conditions of values, mutual dependence and influence relation between the input and output data and other data, and calculating and obtaining the propagation path of the business process in a code in a variable blood-edge relation graph based on a graph reachability algorithm by taking the input data as a starting point and the output data as an ending point;
s6, taking each edge in the propagation path as a record item, wherein the content of the record item is an attribute set of the edge and two nodes associated with the edge, and the record item set is a full-link state tracing table and is stored as a full-link state tracing table file;
s7, developing a instrumentation code based on a Pass mechanism of an LLVM compiler framework and compiling and generating an instrumentation plug-in a dynamic library form, wherein the execution flow of the instrumentation code is to load the generated full-link state tracing table file, generate an IR code block for collecting variable data and value data after analyzing each record item in the table, the IR code field in the record item is used for designating a reference instrumentation position, insert the IR code block before the reference instrumentation position for a side Write, insert the IR code block after the reference instrumentation position for a side Read, def, use, insert a section of IR code block before the reference instrumentation position of a source function and before an entry instruction of a destination function for a side CallInput, and insert a section of IR code block after the reference instrumentation position of the source function and before a return instruction of the destination function for the side CallOutput; the content of the IR code block is an interface function of a call log library so as to record the address, variable data and value data of a program point;
s8, expanding a current compiling link script in the construction system, after generating an intermediate file by a clang front-end compiler, running an opt optimization tool, calling the pile inserting plug-in, inserting a debugging code into the intermediate file, and finally converting the intermediate file into an executable file of a target platform by a rear-end compiler;
s9, the instrumentation code is called and executed when the software system runs so as to throw out record item data for carrying out full-link tracking of the running state of the business process.
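The graph-construction sub-steps listed above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the implementation of the invention: a real implementation would run as an LLVM pass over IR instructions, whereas here each instruction is a hypothetical pre-parsed tuple, the names `LineageGraph`, `build_graph` and `is_constant` are invented for the example, and the edge directions (operand to Expr, Expr to result, Variable to Value for Read, Value to Variable for Write) are assumed to follow the data flow.

```python
from collections import defaultdict

# Hypothetical, pre-parsed stand-ins for LLVM IR instructions:
#   ("declare", var)                 call to @llvm.dbg.declare
#   ("add" / "sub", result, a, b)    computing instruction
#   ("load", result, var)            read a variable into a value
#   ("store", value, var)            write a value into a variable
TOY_IR = [
    ("declare", "a"),
    ("declare", "b"),
    ("load", "%1", "a"),
    ("add", "%2", "%1", "5"),
    ("store", "%2", "b"),
]


class LineageGraph:
    """Directed attribute graph: nodes carry attributes, edges carry a type."""

    def __init__(self):
        self.nodes = {}                 # name -> attribute dict
        self.edges = defaultdict(list)  # source name -> [(edge type, destination)]

    def add_node(self, name, kind, **attrs):
        # setdefault de-duplicates: the first definition of a node wins.
        self.nodes.setdefault(name, {"kind": kind, **attrs})

    def add_edge(self, src, label, dst):
        self.edges[src].append((label, dst))


def is_constant(operand):
    # Assumption: IR temporaries start with '%'; anything else is a literal.
    return not operand.startswith("%")


def value_type(operand):
    return "constant" if is_constant(operand) else "temporary"


def build_graph(instructions):
    graph = LineageGraph()
    for index, inst in enumerate(instructions):
        opcode = inst[0]
        if opcode == "declare":
            graph.add_node(inst[1], "Variable")
        elif opcode in ("add", "sub"):
            _, result, *operands = inst
            expr = f"expr{index}"
            graph.add_node(expr, "Expr", opcode=opcode)
            for operand in operands:
                graph.add_node(operand, "Value", type=value_type(operand))
                graph.add_edge(operand, "Use", expr)
            graph.add_node(result, "Value", type="temporary")
            graph.add_edge(expr, "Def", result)
        elif opcode == "load":
            _, result, var = inst
            graph.add_node(result, "Value", type="temporary")
            graph.add_edge(var, "Read", result)
        elif opcode == "store":
            _, value, var = inst
            graph.add_node(value, "Value", type=value_type(value))
            graph.add_edge(value, "Write", var)
    return graph


graph = build_graph(TOY_IR)
```

In this toy input the data flows from variable `a` through the temporaries `%1` and `%2` into variable `b`, which is the kind of chain the later path analysis follows.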
In addition, the present invention provides the following modules:
the intermediate representation code generation module is used for compiling the source code into an IR format intermediate representation code by adopting a clang front-end compiler of an LLVM compiler framework;
the variable blood-edge relation diagram generation module is used for analyzing the context environment of the function in the intermediate representation code, combing out the interdependence relation among the variables and generating a variable blood-edge relation diagram;
the full-link state tracing table construction module is used for analyzing, for a business process, information such as the propagation path of the input and output data of the business process in the software system, the changes of their values, and the interdependence and influence relationships between the input and output data and other data, calculating the propagation path in the variable blood-edge relationship graph based on a graph reachability algorithm, and taking each edge in the propagation path as a record item, wherein the content of the record item is the attribute set of the two nodes associated with the edge, and the set of record items is the full-link state tracing table;
and the code instrumentation module is used for automatically generating, for each record item in the full-link state tracing table, an IR code block for collecting record item data, and inserting the IR code block at the designated position in the software IR code in the optimization stage of compiling and linking the software source code, so that when the software runs the business process, the intermediate state data of the business process is recorded by the inserted IR code block, the software is debugged and the fault source is accurately located.
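The work of the full-link state tracing table construction module can be sketched as follows. The text names only "a graph reachability algorithm", so the choice here is an assumption: an edge is kept when its source is reachable from the input data and the output data is reachable from its destination, both computed by breadth-first search. The edge list, the function names and the simplified three-field record are hypothetical; the record items described in this document carry richer attribute tuples.

```python
from collections import deque

# Hypothetical graph as (source node, edge type, destination node) triples.
EDGES = [
    ("a", "Read", "%1"),
    ("%1", "Use", "expr0"),
    ("5", "Use", "expr0"),
    ("expr0", "Def", "%2"),
    ("%2", "Write", "b"),
    ("c", "Read", "%3"),  # unrelated to the flow from a to b
]


def reachable(start, adjacency):
    """Breadth-first search returning every node reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def propagation_edges(edges, source, sink):
    forward, backward = {}, {}
    for src, _, dst in edges:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)
    from_source = reachable(source, forward)
    to_sink = reachable(sink, backward)
    # An edge lies on some source-to-sink path exactly when its source is
    # reachable from the input and the output is reachable from its destination.
    return [e for e in edges if e[0] in from_source and e[2] in to_sink]


def trace_table(edges, source, sink):
    """One simplified record item per edge on the propagation path."""
    return [
        {"edge": label, "from": src, "to": dst}
        for src, label, dst in propagation_edges(edges, source, sink)
    ]


table = trace_table(EDGES, "a", "b")
```

The constant `5` and the variable `c` are dropped because no path from the input `a` passes through them; whether constants feeding an expression should also be recorded is a design choice that this sketch does not settle.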
Fig. 5 is a schematic diagram of a software debug system according to an embodiment of the present invention, as shown in fig. 5, including:
an intermediate representation code generating unit 510, configured to compile a source code of software to generate an intermediate representation code in IR format using a clang front-end compiler of an LLVM compiler framework;
the variable blood-edge relationship graph generating unit 520 is configured to analyze the context environment of all the functions in the intermediate representation code, and sort out the interdependencies of the variables inside the functions and between the functions to generate a variable blood-edge relationship graph; the variable blood-edge relationship graph is a directed attribute graph with three types of nodes: variables, values and expressions, and six types of edges: variable read, variable write, value definition, value use, function call input and function call output;
the full link state tracing table construction unit 530 is configured to analyze, in a variable blood-edge relationship graph, a propagation path of an internal state of a software business process based on a graph reachability algorithm with input data as a start point and output data as an end point, and use each edge in the propagation path as a record item, where a set of record items forms a full link state tracing table of the business process;
the code instrumentation unit 540 is configured to automatically generate, for each record item in the full link state tracing table, an IR code block for collecting record item data, and insert the IR code block at a specified position in the IR code of the software in the optimization stage of compiling and linking the software source code, so that when the software runs the business process, intermediate state data of the business process is recorded by the inserted IR code block and the software is debugged, thereby accurately locating the fault source.
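The placement rules that the code instrumentation unit applies per edge type can be restated as a lookup table. The following Python sketch is a hedged illustration only: the rules are taken from the method steps of this document, while the record layout (keys `edge` and `ir`), the triple format and the name `plan_probes` are assumptions made for the example.

```python
# Each rule is a list of (function role, anchor, side) triples.
#   role:   "same" for edges inside one function, else "source" / "destination"
#   anchor: "reference" (the IR code field of the record), "entry" or "return"
#   side:   insert the IR code block "before" or "after" the anchor
PLACEMENT = {
    "Write": [("same", "reference", "before")],
    "Read": [("same", "reference", "after")],
    "Def": [("same", "reference", "after")],
    "Use": [("same", "reference", "after")],
    "CallInput": [
        ("source", "reference", "before"),
        ("destination", "entry", "before"),
    ],
    "CallOutput": [
        ("source", "reference", "after"),
        ("destination", "return", "before"),
    ],
}


def plan_probes(record):
    """Return the insertion points for one tracing-table record item."""
    try:
        rules = PLACEMENT[record["edge"]]
    except KeyError:
        raise ValueError(f"unknown edge type: {record['edge']!r}") from None
    return [
        {"function": role, "anchor": anchor, "side": side,
         "reference_ir": record["ir"]}
        for role, anchor, side in rules
    ]


probes = plan_probes({"edge": "CallOutput", "ir": "%r = call i32 @f(i32 %x)"})
```

Keeping the rules in data rather than in branching code makes it easy to check them against the six edge types one by one.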
It should be understood that the detailed functional implementation of each unit may be referred to the description in the foregoing method embodiment, and will not be repeated herein.
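As an illustration of the compile-and-link flow in which the units cooperate (clang front end, opt with the instrumentation plug-in, then the back end), the command lines can be assembled as in the Python sketch below. The plug-in path `./libtrace.so` and the pass name `trace-instrument` are hypothetical, and using clang as the driver for the back-end and link step is an assumption; `-emit-llvm`, `-S`, `-g`, `-load-pass-plugin` and `-passes` are existing clang and opt options.

```python
def build_commands(source, plugin, pass_name, output):
    """Return the three build steps as argument lists (nothing is executed)."""
    stem = source.rsplit(".", 1)[0]
    ir_file = f"{stem}.ll"
    instrumented = f"{stem}.instr.ll"
    return [
        # Front end: emit textual LLVM IR with debug information.
        ["clang", "-g", "-S", "-emit-llvm", source, "-o", ir_file],
        # Optimizer: load the plug-in and run its pass over the IR.
        ["opt", f"-load-pass-plugin={plugin}", f"-passes={pass_name}",
         "-S", ir_file, "-o", instrumented],
        # Back end: compile and link the instrumented IR for the target.
        ["clang", instrumented, "-o", output],
    ]


commands = build_commands("demo.c", "./libtrace.so", "trace-instrument", "demo")
```

Each list could be passed to `subprocess.run(cmd, check=True)` in a build script; linking against the log library called by the inserted IR code blocks would need an additional, project-specific option on the last command.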
In addition, an embodiment of the present invention provides an electronic device, including: a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to implement the method in the above-described embodiments when executing the computer program.
Furthermore, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method in the above embodiments.
Based on the method in the above embodiments, an embodiment of the present invention provides a computer program product, which when run on a processor causes the processor to perform the method in the above embodiments.
Based on the method in the above embodiment, the embodiment of the present invention further provides a chip, including one or more processors and an interface circuit. Optionally, the chip may also contain a bus. Wherein:
the processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The processor may implement or perform the methods and steps disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The interface circuit can be used for sending or receiving data, instructions or information, the processor can process by utilizing the data, instructions or other information received by the interface circuit, and the processing completion information can be sent out through the interface circuit.
Optionally, the chip further comprises a memory, which may include read only memory and random access memory, and provides operating instructions and data to the processor. A portion of the memory may also include non-volatile random access memory (NVRAM).
Optionally, the memory stores executable software modules or data structures and the processor may perform corresponding operations by invoking operational instructions stored in the memory (which may be stored in an operating system).
Alternatively, the interface circuit may be configured to output the execution result of the processor.
It should be noted that, the functions corresponding to the processor and the interface circuit may be implemented by hardware design, or may be implemented by software design, or may be implemented by a combination of software and hardware, which is not limited herein.
It will be appreciated that the steps of the method embodiments described above may be performed by logic circuitry in the form of hardware in a processor or instructions in the form of software.
It should be understood that, the sequence number of each step in the foregoing embodiment does not mean the execution sequence, and the execution sequence of each process should be determined by the function and the internal logic of each process, and should not limit the implementation process of the embodiment of the present application in any way. In addition, in some possible implementations, each step in the foregoing embodiments may be selectively performed according to practical situations, and may be partially performed or may be performed entirely, which is not limited herein.
It is to be appreciated that the processor in embodiments of the present application may be a central processing unit (central processing unit, CPU), but may also be other general purpose processors, digital signal processors (digital signal processor, DSP), application specific integrated circuits (application specific integrated circuit, ASIC), field programmable gate arrays (field programmable gate array, FPGA) or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. The general purpose processor may be a microprocessor, but in the alternative, it may be any conventional processor.
The method steps in the embodiments of the present application may be implemented by hardware, or may be implemented by a processor executing software instructions. The software instructions may be comprised of corresponding software modules that may be stored in random access memory (random access memory, RAM), flash memory, read-only memory (ROM), programmable ROM (PROM), erasable programmable PROM (EPROM), electrically erasable programmable EPROM (EEPROM), registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions in accordance with embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in or transmitted across a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave, etc.). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk (SSD)), or the like.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (10)

1. A software debugging method, comprising the steps of:
compiling a source code of software to generate an intermediate representation code in an IR format by adopting a clang front-end compiler of an LLVM compiler framework;
analyzing the context environment of all the functions in the intermediate representation code, and sorting out the interdependencies of the variables inside the functions and among the functions to generate a variable blood-edge relationship graph; the variable blood-edge relationship graph is a directed attribute graph with three types of nodes: variables, values and expressions, and six types of edges: variable read, variable write, value definition, value use, function call input and function call output;
aiming at a business process of software, analyzing a propagation path of an internal state of the business process in a variable blood-edge relation graph based on a graph reachability algorithm by taking input data as a starting point and output data as an end point, taking each edge in the propagation path as a record item, and forming a full-link state traceability table of the business process by the collection of the record items;
and automatically generating an IR code block for collecting record item data for each record item in the full-link state tracing table, and inserting the IR code block to a designated position in the software IR code in the optimizing stage of compiling and linking the software source code, so that when the software runs the business process, the intermediate state data of the business process are recorded by using the inserted IR code block, debugging is carried out on the software, and the fault source is accurately positioned.
2. The method of claim 1, wherein the context of the function refers to data resources visible within the scope of the currently running function, comprising: global variables, local variables, parametric variables, and values.
3. The method according to claim 1, wherein each edge in the propagation path is taken as a record, in particular: the content of each entry is a collection of attributes for the two nodes to which the edge is associated.
4. A method according to claim 1 or 3, wherein different types of edges are described in different record modes, wherein: the variable read edge and the variable write edge are described as a seven-tuple; the value definition edge and the value use edge are described as a six-tuple; the function call input edge is described as a ten-tuple; the function call output edge is described as a nine-tuple;
the seven-tuple includes: file name, function name, variable name, value name, type, IR code and source code;
the six-tuple includes: file name, function name, expression IR code, expression source code, value name and type;
the ten-tuple includes: source file name, source function name, destination file name, destination function name, function call IR code, function call source code, formal parameter name, actual parameter name, type and parameter serial number;
the nine-tuple includes: source file name, source function name, destination file name, destination function name, function call IR code, function call source code, called function return value name, calling function return value name and type.
5. A software debug system, comprising:
an intermediate representation code generating unit, configured to compile a source code of software to generate an intermediate representation code in IR format using a clang front-end compiler of an LLVM compiler framework;
the variable blood-edge relationship graph generating unit is used for analyzing the context environment of all the functions in the intermediate representation code, and sorting out the interdependencies of the variables inside the functions and among the functions to generate a variable blood-edge relationship graph; the variable blood-edge relationship graph is a directed attribute graph with three types of nodes: variables, values and expressions, and six types of edges: variable read, variable write, value definition, value use, function call input and function call output;
the full-link state tracing table construction unit is used for analyzing, for a business process of the software, the propagation path of the internal state of the business process in the variable blood-edge relationship graph based on a graph reachability algorithm, taking input data as a starting point and output data as an end point, and taking each edge in the propagation path as a record item, wherein the set of the record items forms the full-link state tracing table of the business process;
and the code instrumentation unit is used for automatically generating, for each record item in the full-link state tracing table, an IR code block for collecting record item data, and inserting the IR code block at the designated position in the software IR code in the optimization stage of compiling and linking the software source code, so that when the software runs the business process, the intermediate state data of the business process is recorded by the inserted IR code block, the software is debugged and the fault source is accurately located.
6. The system according to claim 5, wherein the function context analyzed by the variable blood relationship graph generating unit refers to data resources visible within a scope of a current running function, comprising: global variables, local variables, parametric variables, and values.
7. The system according to claim 5, wherein the full link state traceback table construction unit regards each edge in the propagation path as a record item, specifically: the content of each entry is a collection of attributes for the two nodes to which the edge is associated.
8. The system of claim 5 or 7, wherein different types of edges in the variable blood-edge relationship graph are described by different record items, wherein: the variable read edge and the variable write edge are described as a seven-tuple; the value definition edge and the value use edge are described as a six-tuple; the function call input edge is described as a ten-tuple; the function call output edge is described as a nine-tuple;
the seven-tuple includes: file name, function name, variable name, value name, type, IR code and source code;
the six-tuple includes: file name, function name, expression IR code, expression source code, value name and type;
the ten-tuple includes: source file name, source function name, destination file name, destination function name, function call IR code, function call source code, formal parameter name, actual parameter name, type and parameter serial number;
the nine-tuple includes: source file name, source function name, destination file name, destination function name, function call IR code, function call source code, called function return value name, calling function return value name and type.
9. An electronic device, comprising: a memory and a processor;
the memory is used for storing a computer program;
the processor being adapted to implement the method of any of claims 1-4 when executing the computer program.
10. A computer readable storage medium, characterized in that the storage medium has stored thereon a computer program which, when executed by a processor, implements the method according to any of claims 1-4.
CN202310193606.0A 2023-03-02 2023-03-02 Software debugging method, system, electronic equipment and storage medium Pending CN116467164A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310193606.0A CN116467164A (en) 2023-03-02 2023-03-02 Software debugging method, system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116467164A true CN116467164A (en) 2023-07-21

Family

ID=87183138

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310193606.0A Pending CN116467164A (en) 2023-03-02 2023-03-02 Software debugging method, system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116467164A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination