CN111813675A - SSA structure analysis method and device, electronic equipment and storage medium - Google Patents

Publication number
CN111813675A
Authority
CN
China
Prior art keywords
cfg
node
analysis
source code
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010651559.6A
Other languages
Chinese (zh)
Inventor
张煜昆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd filed Critical Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN202010651559.6A priority Critical patent/CN111813675A/en
Publication of CN111813675A publication Critical patent/CN111813675A/en
Priority to CN202110579664.8A priority patent/CN113157597A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3604Software analysis for verifying properties of programs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/42Syntactic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/42Syntactic analysis
    • G06F8/425Lexical analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/43Checking; Contextual analysis
    • G06F8/436Semantic checking

Abstract

The application provides an SSA structure analysis method, an SSA structure analysis device, electronic equipment and a storage medium. The method comprises the following steps: dividing a plurality of CFG nodes according to a data flow analysis rule; for each CFG node, searching the data types of the source code of the programming language for the target node corresponding to that CFG node, extracting the information in the target node, and writing it into the corresponding CFG node to obtain a parsed CFG node; and associating the parsed CFG nodes with one another to obtain CFG structure data that can be used for data flow analysis. Each CFG node therefore carries the original path information of the source code and the name of the corresponding package, so cross-package data flow detection of the source code can be realized when data flow analysis is performed on the CFG structure.

Description

SSA structure analysis method and device, electronic equipment and storage medium
Technical Field
The present application relates to the technical field of programming languages, and in particular, to an SSA structure parsing method, an SSA structure parsing device, an electronic device, and a storage medium.
Background
With the continuous development of network technology, the number of applications written in different types of programming languages is increasing. To ensure the usage experience of an application, vulnerability analysis is typically performed on the application before a new application is released.
White-box testing is a commonly used vulnerability analysis technique and can generally be divided into static analysis and dynamic analysis. Static analysis mainly comprises control flow analysis, data flow analysis, and information flow analysis. When a white-box test uses data flow analysis, the source program must first be parsed into a data flow representation, which is then analyzed.
At present, when existing white-box testing software tests the golang language, specific code content cannot be associated with a specific package, so cross-package data flow tracking cannot be performed during data flow analysis. Existing white-box testing software is therefore effective only within a single golang package.
Disclosure of Invention
In view of this, an object of the present application is to provide an SSA structure parsing method, an apparatus, an electronic device, and a storage medium, which implement cross-package data flow detection of source code.
In a first aspect, an embodiment of the present invention provides a method for analyzing an SSA structure of a programming language, which is applied to an electronic device, and the method includes:
dividing a plurality of CFG nodes according to a data flow analysis rule;
aiming at each CFG node, searching a target node corresponding to each CFG node from the data type of a source code of a programming language, extracting information in the target node, writing the information into the CFG node corresponding to the target node, and obtaining the analyzed CFG node, wherein the data type of the source code comprises a plurality of AST node types and a plurality of SSA node types;
and carrying out relationship association on the analyzed CFG nodes to obtain CFG structural data which can be used for data flow analysis.
In an optional embodiment, the associating the relationship of the analyzed CFG nodes includes:
analyzing the corresponding relation of the CFG nodes according to the source codes, and connecting the CFG nodes;
connecting CFG edges according to the execution logic of the context of the source code;
and connecting the AST edges according to the inclusion relation among the CFG nodes.
In an optional embodiment, analyzing a corresponding relationship of the CFG node according to the source code, and connecting the CFG node includes:
and circularly traversing the corresponding relation of all CFG nodes in the source code, and connecting and corresponding the CFG nodes.
In an alternative embodiment, the method further comprises:
and carrying out vulnerability analysis on the CFG structure according to the vulnerability analysis rule and the data flow analysis method to obtain a vulnerability detection result.
In an optional embodiment, before searching for a target node corresponding to each CFG node from a data type of source code of a program language for each CFG node, the method further includes:
analyzing the source code to obtain a corresponding program language type, wherein the program language type comprises a go language;
and inputting the source code into a corresponding parsing engine for parsing according to the program language type, wherein the parsing engine comprises a go language parsing engine.
In an alternative embodiment, inputting the source code into a corresponding parsing engine for parsing according to the program language type includes:
and performing lexical analysis and syntactic analysis on the source code to obtain SSA structure data of the source code.
In an optional embodiment, performing lexical analysis and syntactic analysis on a source code to obtain SSA structure data of the source code includes:
performing lexical analysis on the source code to convert it into a corresponding token sequence, wherein the token sequence comprises at least one of an identifier, a keyword, a separator, an operator, a character and a comment;
performing syntactic analysis on the token sequence to construct an abstract syntax tree (AST) from it according to the syntactic features;
and converting the abstract syntax tree (AST) into SSA structure data in combination with the syntactic features of the source code.
In a second aspect, an embodiment of the present invention provides an SSA structure parsing apparatus for programming language, applied to an electronic device, where the apparatus includes:
the CFG node dividing module is used for dividing a plurality of CFG nodes according to the data flow analysis rule;
the analysis module is used for searching a target node corresponding to each CFG node from the data type of the source code of the programming language for each CFG node, extracting information in the target node and writing the information into the CFG node corresponding to the target node to obtain the analyzed CFG node, wherein the data type of the source code comprises a plurality of AST node types and a plurality of SSA node types;
and the association module is used for associating the relationship of the analyzed CFG nodes to obtain CFG structural data which can be used for data flow analysis.
In an alternative embodiment, the association module comprises:
the node connection sub-module is used for analyzing the corresponding relation of the CFG nodes according to the source codes and connecting the CFG nodes;
the CFG edge connecting sub-module is used for connecting CFG edges according to the execution logic of the context of the source code;
and the AST edge connecting submodule is used for connecting the AST edges according to the inclusion relationship among the CFG nodes.
In an optional embodiment, the node connection submodule is specifically configured to:
and circularly traversing the corresponding relation of all CFG nodes in each line of codes in the source code, and connecting and corresponding the CFG nodes.
In an alternative embodiment, the apparatus further comprises:
and the vulnerability analysis module is used for carrying out vulnerability analysis on the CFG structure according to the vulnerability analysis rule and the data flow analysis method to obtain a vulnerability detection result.
In an alternative embodiment, the apparatus further comprises:
the source code analysis module is used for analyzing the source code to obtain a corresponding program language type, wherein the program language type comprises a go language;
and the source code input module is used for inputting the source code into a corresponding analysis engine for analysis according to the program language type, wherein the analysis engine comprises a go language analysis engine.
In an alternative embodiment, the source code input module includes:
and the analysis submodule is used for performing lexical analysis and syntactic analysis on the source code to obtain SSA structural data of the source code.
In an alternative embodiment, the analysis submodule is specifically configured to:
performing lexical analysis on the source code to convert it into a corresponding token sequence, wherein the token sequence comprises at least one of an identifier, a keyword, a separator, an operator, a character and a comment;
performing syntactic analysis on the token sequence to construct an abstract syntax tree (AST) from it according to the syntactic features;
and converting the abstract syntax tree (AST) into SSA structure data in combination with the syntactic features of the source code.
In a third aspect, an embodiment of the present invention provides an electronic device, including: the electronic device comprises a processor, a storage medium and a bus, wherein the storage medium stores machine-readable instructions executable by the processor, when the electronic device runs, the processor and the storage medium are communicated through the bus, and the processor executes the machine-readable instructions to execute the steps of the method according to any one of the preceding implementation modes.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of any one of the methods in the foregoing embodiments.
Based on any one of the above aspects, the present application searches the node types included in the data types of the source code of the programming language for the target node corresponding to each CFG node, extracts the information in the target node, and writes it into the corresponding CFG node, thereby forming a CFG structure. Each CFG node therefore carries the original path information of the source code and the name of the corresponding package, so cross-package data flow detection of the source code can be realized when data flow analysis is performed on the CFG structure.
In addition, in some embodiments, data flow analysis is performed through a CFG structure to achieve white-box testing of source codes, and the false alarm rate of vulnerabilities can be reduced.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
FIG. 1 is a system architecture diagram of a data flow analysis technique provided by an embodiment of the present application;
fig. 2 is a flowchart of an SSA structure parsing method according to an embodiment of the present application;
fig. 3 is a flowchart illustrating sub-steps of step S103 in fig. 2 according to an embodiment of the present disclosure;
FIG. 4 is a diagram of a conventional CFG structure data;
fig. 5 is a second flowchart of an SSA structure parsing method according to the embodiment of the present application;
FIG. 6 is a schematic diagram of a CFG structure to be analyzed;
fig. 7 is a third flowchart of an SSA structure analysis method according to an embodiment of the present application;
fig. 8 is a functional block diagram of an SSA structure parsing apparatus 100 according to an embodiment of the present application;
fig. 9 is an architecture diagram of an electronic device provided in an embodiment of the present application.
Description of the drawings: 60-an electronic device; 61-a processor; 62-a memory; 63-bus; 100-SSA structure analysis device; 101-CFG node division module; 102-a parsing module; 103-an association module; 104-vulnerability analysis module; 105-a source code analysis module; 106-source code input module.
Detailed Description
In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of protection of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be performed out of order, and steps without logical context may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowchart.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
The term "comprising" will be used in the embodiments of the present application to indicate the presence of the features claimed hereinafter, but does not exclude the addition of further features.
First, it should be noted that program source files are written in various programming languages. A programming language is a formal language for defining computer programs and is used to send instructions to a computer so that the computer can implement various functions. Many programming languages are currently in use, such as C, C++, golang, VB, and Java. Improper processing of data by program logic, or improper invocation of an API (call interface), poses a risk to the entire program. Vulnerability detection therefore needs to be performed on program source files to ensure the security of the system.
At present, methods for detecting a bug include a white box test method and a black box test method. Here, a box refers to software (i.e., a program) to be tested. The white box, as the name implies, is visible, making it possible to clarify the structure and the operating logic inside the box when tested.
The black box test is also called a functional test, and detects whether each function of the software can be normally used through a test. In the test, the program is regarded as a black box which cannot be opened, and the test is carried out on the program interface under the condition that the internal structure and the internal characteristics of the program are not considered at all, and the black box test only checks whether the functions of the program are normally used according to the requirements specification and whether the program can properly receive input data to generate correct output information. The black box test is mainly used for testing a software interface and a software function by focusing on an external structure of a program and not considering an internal logic structure.
The application scenario of the present application is data flow analysis in white-box testing of a programming language. The data flow analysis technique is described below with reference to FIG. 1, a system architecture diagram of a data flow analysis technique provided by an embodiment of the present application. When source code is analyzed using data flow analysis, the source code is first input into a programming language parsing engine for preliminary parsing and converted into an SSA (Static Single Assignment) structure; data flow analysis is then performed on the SSA structure to obtain an analysis result.
During data flow analysis, existing white-box testing algorithms cannot match specific golang code to a specific package, so cross-package data flow detection of the source code cannot be realized.
To solve this technical problem, the application provides an SSA structure parsing method that further parses the SSA structure corresponding to the source code into a CFG (control flow graph) structure, thereby implementing cross-package data flow detection for the golang language.
Referring to fig. 2, fig. 2 is a flowchart of an SSA structure analysis method according to an embodiment of the present disclosure. The analysis method is applied to electronic equipment and comprises the following steps:
Step S101, dividing a plurality of CFG nodes according to the data flow analysis rule.
Specifically, the CFG nodes may be the different node types in the CFG structure (i.e., the control flow graph); CFG nodes with different names serve different functions and store different code information.
Step S102, aiming at each CFG node, searching a target node corresponding to each CFG node from the data type of the source code of the programming language, extracting information in the target node, writing the information into the CFG node corresponding to the target node, and obtaining the analyzed CFG node.
The data types of the source code comprise a plurality of AST node types and a plurality of SSA node types.
Specifically, the AST nodes may be the different node types included in an abstract syntax tree (AST), and the SSA (Static Single Assignment) nodes may be the different node types included in the SSA structure.
Step S103, the analyzed CFG nodes are associated in relation, so that CFG structural data which can be used for data flow analysis are obtained.
In the above steps, the source code is parsed into CFG structure data, and each CFG node in the CFG structure data contains the path information of the source code and the information of its package. Because the name of each package under the same path is unique, a method in any package can be located by the package's name and path. Parsing the source code into CFG structure data therefore makes cross-package data flow analysis of the source code possible.
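The lookup described above can be sketched as follows. This is an illustration only: the `CFGNode` fields, the `methodKey` type, and the sample paths and package names are hypothetical and are not the data structures of the application. The sketch merely shows why storing both the path and the package name in each node is enough to locate a method that lives in another package.

```go
package main

import "fmt"

// CFGNode is a hypothetical, simplified CFG node record.
type CFGNode struct {
	Kind    string // e.g. "METHOD", "CALL", "IDENTIFIER"
	Name    string // name of the code unit
	Path    string // directory of the source file
	Package string // name of the package declared in that file
}

// methodKey identifies a method by path, package name and method name.
// A package name is unique under a given path, so the key is unambiguous.
type methodKey struct{ Path, Package, Name string }

// Index maps a method key to its METHOD node.
type Index map[methodKey]*CFGNode

// BuildIndex collects all METHOD nodes into an index.
func BuildIndex(nodes []*CFGNode) Index {
	idx := make(Index)
	for _, n := range nodes {
		if n.Kind == "METHOD" {
			idx[methodKey{n.Path, n.Package, n.Name}] = n
		}
	}
	return idx
}

// Resolve finds the METHOD node that a call in another package refers to.
func (idx Index) Resolve(path, pkg, name string) (*CFGNode, bool) {
	n, ok := idx[methodKey{path, pkg, name}]
	return n, ok
}

// sampleNodes returns made-up nodes from two different packages.
func sampleNodes() []*CFGNode {
	return []*CFGNode{
		{Kind: "METHOD", Name: "Handle", Path: "app/handler", Package: "handler"},
		{Kind: "METHOD", Name: "Query", Path: "app/store", Package: "store"},
		{Kind: "CALL", Name: "store.Query", Path: "app/handler", Package: "handler"},
	}
}

func main() {
	idx := BuildIndex(sampleNodes())
	if m, ok := idx.Resolve("app/store", "store", "Query"); ok {
		fmt.Printf("call resolves to %s.%s in %s\n", m.Package, m.Name, m.Path)
	}
}
```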
It should be noted that the application scenario of the embodiment of the present application is data flow analysis in white-box testing. A data stream is an ordered sequence of bytes with a start and an end, and includes input streams and output streams. A node is a data structure representing a particular code unit and contains the details of the source code.
In implementing the above embodiment, a plurality of CFG nodes first need to be divided according to the data flow analysis rule.
In one embodiment, the CFG nodes may include: a META_DATA node, a FILE node, a METHOD node, a METHOD_PARAMETER_IN node, a METHOD_RETURN node, a MODIFIER node, a TYPE_DECL node, a TYPE_PARAMETER node, a TYPE_ARGUMENT node, a MEMBER node, a NAMESPACE_BLOCK node, a LITERAL node, a CALL node, a LOCAL node, an IDENTIFIER node, a RETURN node, a BLOCK node, a METHOD_INST node, an ARRAY_INITIALIZER node, a METHOD_REF node, a CONTROL_STRUCTURE node, etc. Different nodes have different data structures for storing different code information.
In addition, different CFG nodes carry different annotation content. For example, the "IDENTIFIER" node is annotated as "identifier/reference", meaning that the "IDENTIFIER" node is used to store an identifier; the "METHOD" node is annotated as "A method/function/procedure", meaning that the "METHOD" node is used to store a method, function or procedure.
Optionally, in an implementation of this embodiment, the data types of the source code may include multiple AST node types and multiple SSA node types. For example, the AST nodes may include: ArrayType node, BadExpr node, BasicLit node, BinaryExpr node, CallExpr node, ChanType node, CompositeLit node, Ellipsis node, FuncLit node, FuncType node, Ident node, IndexExpr node, InterfaceType node, KeyValueExpr node, MapType node, ParenExpr node, SelectorExpr node, SliceExpr node, StarExpr node, StructType node, TypeAssertExpr node, UnaryExpr node, etc.; the SSA nodes may include: Alloc node, BinOp node, Builtin node, Call node, ChangeInterface node, ChangeType node, Const node, Convert node, DebugRef node, Defer node, Extract node, Field node, FieldAddr node, FreeVar node, Function node, Global node, Go node, If node, Index node, IndexAddr node, Jump node, Lookup node, MakeChan node, MakeClosure node, MakeInterface node, MakeMap node, MakeSlice node, Next node, Panic node, Parameter node, Phi node, Range node, Return node, RunDefers node, Select node, Send node, Slice node, Store node, Type node, OpnType node, etc.
Specifically, in this embodiment, the data structures of different AST nodes and different SSA nodes are different, and the stored code information is also different.
In addition, each AST node and SSA node has its own annotation content. For example, the annotation content of the AST node "ArrayType" is "An ArrayType node represents an array or slice type", meaning that "ArrayType" is used to store an array or slice type; the annotation content of "Ident" is "An Ident node represents an identifier", meaning that the node is used to store an identifier; the annotation content of the SSA node "Send" is "The Send instruction sends X on channel Chan", meaning that the node is used to send X on a channel.
It should be noted that the above description is only an illustration of comments provided in the embodiments of the present application for some nodes in the plurality of AST node types and the plurality of SSA node types, and in this embodiment, each node includes a corresponding comment, which is not listed here.
In addition, in other implementations of this embodiment, the data types of the source code may include more (or fewer) AST node types and more (or fewer) SSA node types; for example, the data types of the source code may include 46 AST node types or 57 SSA node types. The specific numbers of AST node types and SSA node types are not limited by this application.
In the subsequent steps, for each of the CFG nodes, a target node corresponding to each CFG node is searched from the AST node type or the SSA node type included in the data type of the source code of the programming language according to the comment content of each node, and information in the target node is extracted and written in the CFG node corresponding to the target node, thereby obtaining an analyzed CFG node.
For example, for the IDENTIFIER node among the CFG nodes, the Ident node can be found among the AST node types; its annotation content is "An Ident node represents an identifier". The Ident node among the AST node types can therefore be regarded as the target node corresponding to the IDENTIFIER node among the CFG nodes. The information in the Ident node is then extracted and written into the IDENTIFIER node, which completes one round of parsing and yields the parsed IDENTIFIER node, that is, one parsed CFG node.
In the same way, the target node corresponding to each CFG node can be found among the AST nodes or the SSA nodes according to the annotation content of each AST node type and each SSA node type, and the information in each target node found is extracted and written into the corresponding CFG node. This completes the parsing of all CFG nodes, and the parsed CFG nodes contain the path information of the source code and the information of each package.
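The IDENTIFIER example can be sketched with the standard go/ast package. The sketch below is an assumption-laden illustration, not the implementation of the application: it hard-codes the Ident to IDENTIFIER correspondence instead of deriving it from annotation content, and the `CFGNode` struct, the function name `identifierNodes`, and the sample file are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// CFGNode is a hypothetical, simplified CFG node record.
type CFGNode struct {
	Kind    string // CFG node type, here always "IDENTIFIER"
	Name    string // identifier text taken from the AST node
	File    string // path of the source file
	Line    int    // line of the identifier in that file
	Package string // package declared by the file
}

// identifierNodes parses one file and writes the information of every
// *ast.Ident (the target node) into a new IDENTIFIER CFG node.
func identifierNodes(filename, src string) ([]CFGNode, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	var out []CFGNode
	ast.Inspect(f, func(n ast.Node) bool {
		if id, ok := n.(*ast.Ident); ok {
			pos := fset.Position(id.Pos())
			out = append(out, CFGNode{
				Kind:    "IDENTIFIER",
				Name:    id.Name,
				File:    pos.Filename,
				Line:    pos.Line,
				Package: f.Name.Name,
			})
		}
		return true // keep descending into child nodes
	})
	return out, nil
}

// sample is a made-up source file used only for this illustration.
const sample = "package pay\n\nvar amount = 1\n"

func main() {
	nodes, err := identifierNodes("pay/amount.go", sample)
	if err != nil {
		panic(err)
	}
	for _, n := range nodes {
		fmt.Printf("%s %q at %s:%d (package %s)\n", n.Kind, n.Name, n.File, n.Line, n.Package)
	}
}
```

Note that the package name itself is an Ident in the Go AST, so it also appears among the IDENTIFIER nodes produced by this sketch.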
And finally, associating each analyzed CFG node to obtain CFG structural data which can be used for data flow analysis.
Further, in order to represent the whole execution logic of the program source code, a plurality of CFG nodes obtained by parsing need to be associated. The association procedure of the CFG node is described in detail below.
Referring to fig. 3, fig. 3 is a flowchart illustrating the sub-steps of step S103 in fig. 2 according to an embodiment of the present disclosure. In this embodiment, step S103 specifically includes the following sub-steps:
Sub-step S1031, analyzing the correspondence between the CFG nodes according to the source code, and connecting the CFG nodes.
Sub-step S1032, connecting the CFG edges according to the execution logic of the source code context.
Sub-step S1033, connecting the AST edges according to the inclusion relationship between the CFG nodes.
In a specific embodiment, each analyzed CFG node includes information of a source code, and a correspondence between the source codes is a correspondence between each analyzed CFG node. Therefore, the source code is analyzed to obtain the corresponding relation between the CFG nodes, and then the CFG nodes are connected according to the corresponding relation between the CFG nodes.
Further, sub-step S1031 specifically includes: cyclically traversing the correspondences of all CFG nodes in the source code, and connecting the corresponding CFG nodes.
In this embodiment, an edge refers to a representation of a correspondence between different nodes. The edges of the CFG nodes are connected according to the execution logic between the source code contexts, so that the connecting lines between the CFG nodes can be used to indicate the execution logic of the entire CFG structure data.
Fig. 4 is a schematic diagram of conventional CFG structure data. A CFG structure, i.e., a control flow graph, is an abstract representation of a procedure or program; it uses an abstract data structure to represent all the paths that a program may traverse during its execution. The control flow graph can represent the possible flow of execution among all basic blocks of the source code, and can also be used to reflect the real-time execution of a procedure.
In FIG. 4, the CFG structure includes a plurality of CFG nodes (e.g., nodes A, B, C, D, E in FIG. 4) connected by directed arrows that represent the contextual execution logic of the source code. For example, the source code executes from CFG node A to CFG node B, from CFG node C to CFG node D and CFG node F, from CFG node D and CFG node F to CFG node E, from CFG node G to CFG node H and CFG node I, and from CFG node H to CFG node G.
It should be noted that fig. 4 is only a schematic diagram of the CFG structure in the present application, and in other embodiments of the present application, different source codes may correspond to different CFG structure diagrams, and are not limited herein.
Once the source code is parsed into CFG structure data, each CFG node contains the path and package information of the source code, and the direction of the arrows represents the execution logic of the source code, which facilitates the subsequent data flow analysis.
For each procedure of the program source code, the control flow graph of the procedure is generally represented by a quadruple $G = (N, E, \mathit{Entry}, \mathit{Exit})$, where $N$ is the set of CFG nodes; $E$ is the set of edges, each edge being an ordered pair of nodes $\langle n_i, n_j \rangle$ that denotes a possible path from $n_i$ to $n_j$; and $\mathit{Entry}$ and $\mathit{Exit}$ denote the entry and exit nodes of the procedure, respectively.
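The quadruple can be written down directly as a data structure. The following sketch is illustrative only: the type and field names, and the small branching graph returned by `example`, are assumptions (the graph is not the one in FIG. 4). It stores N, E, Entry and Exit, and uses a breadth-first search over the ordered pairs in E to decide which nodes are reachable.

```go
package main

import "fmt"

// Edge is an ordered pair <From, To>: execution may pass from From to To.
type Edge struct{ From, To string }

// CFG mirrors the quadruple G = (N, E, Entry, Exit).
type CFG struct {
	Nodes []string // N: set of CFG nodes
	Edges []Edge   // E: set of directed edges
	Entry string   // entry node of the procedure
	Exit  string   // exit node of the procedure
}

// Successors returns every node n_j for which an edge <n, n_j> exists.
func (g CFG) Successors(n string) []string {
	var out []string
	for _, e := range g.Edges {
		if e.From == n {
			out = append(out, e.To)
		}
	}
	return out
}

// Reachable returns the set of nodes reachable from the given node,
// computed by breadth-first search along the edges.
func (g CFG) Reachable(from string) map[string]bool {
	seen := map[string]bool{from: true}
	queue := []string{from}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, next := range g.Successors(cur) {
			if !seen[next] {
				seen[next] = true
				queue = append(queue, next)
			}
		}
	}
	return seen
}

// example builds a hypothetical if/else-shaped graph.
func example() CFG {
	return CFG{
		Nodes: []string{"entry", "cond", "then", "else", "exit"},
		Edges: []Edge{
			{"entry", "cond"},
			{"cond", "then"},
			{"cond", "else"},
			{"then", "exit"},
			{"else", "exit"},
		},
		Entry: "entry",
		Exit:  "exit",
	}
}

func main() {
	g := example()
	fmt.Println("successors of cond:", g.Successors("cond"))
	fmt.Println("exit reachable from entry:", g.Reachable(g.Entry)[g.Exit])
}
```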
In addition, in a specific embodiment, AST edges may also be connected according to the inclusion relationship between the CFG nodes. Specifically, there is a corresponding relationship between different CFG nodes, for example, one CFG node a is used to represent a method of the source code, and another CFG node B represents parameters of the method, and then the node B is included in the node a, and the inclusion relationship between the CFG nodes can be represented by connecting the AST edges through the inclusion relationship, which is convenient for subsequent analysis.
For example, if a CALL node among the CFG nodes contains the information "a = b", where a is an ident node and b may be any node, then nodes a and b are contained in the CALL node. That is, the outgoing edges of the CALL node point to node a and node b, and the incoming edge of each of nodes a and b comes from the CALL node. AST edges can be connected in this way to show the inclusion relationship among the nodes.
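The "a = b" example can be sketched as a graph that stores two kinds of edges side by side. In the Go sketch below, the names `Graph`, `EdgeKind`, `CFGEdge`, `ASTEdge`, and `buildAssignment` are hypothetical, and node b is modelled as an ident node purely for the demonstration.

```go
package main

import "fmt"

// EdgeKind separates execution-order edges from containment edges.
type EdgeKind int

const (
	CFGEdge EdgeKind = iota // execution order between nodes
	ASTEdge                 // containment: parent node -> contained node
)

type Node struct {
	ID   int
	Kind string // e.g. "CALL", "IDENT"
	Code string
}

type Edge struct {
	From, To int
	Kind     EdgeKind
}

type Graph struct {
	Nodes []Node
	Edges []Edge
}

// AddNode appends a node and returns its ID.
func (g *Graph) AddNode(kind, code string) int {
	id := len(g.Nodes)
	g.Nodes = append(g.Nodes, Node{ID: id, Kind: kind, Code: code})
	return id
}

// Connect adds a directed edge of the given kind.
func (g *Graph) Connect(from, to int, kind EdgeKind) {
	g.Edges = append(g.Edges, Edge{From: from, To: to, Kind: kind})
}

// Out lists the targets of the outgoing edges of one kind.
func (g *Graph) Out(id int, kind EdgeKind) []int {
	var out []int
	for _, e := range g.Edges {
		if e.From == id && e.Kind == kind {
			out = append(out, e.To)
		}
	}
	return out
}

// In lists the sources of the incoming edges of one kind.
func (g *Graph) In(id int, kind EdgeKind) []int {
	var in []int
	for _, e := range g.Edges {
		if e.To == id && e.Kind == kind {
			in = append(in, e.From)
		}
	}
	return in
}

// buildAssignment models the CALL node "a = b" and the nodes it contains.
func buildAssignment() (g *Graph, call, a, b int) {
	g = &Graph{}
	call = g.AddNode("CALL", "a = b")
	a = g.AddNode("IDENT", "a")
	b = g.AddNode("IDENT", "b") // b may be any node kind; IDENT is a demo choice
	g.Connect(call, a, ASTEdge)
	g.Connect(call, b, ASTEdge)
	return
}

func main() {
	g, call, a, _ := buildAssignment()
	fmt.Println("AST children of CALL:", g.Out(call, ASTEdge))
	fmt.Println("AST parent of a:", g.In(a, ASTEdge))
}
```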
Further, in order to perform vulnerability analysis on the source code, after the SSA structure of the source code is analyzed to obtain the CFG structure, the CFG structure needs to be analyzed by a corresponding data flow analysis technique, so as to obtain a detection result of the vulnerability analysis.
Specifically, as shown in fig. 5, fig. 5 is a second flowchart of the SSA structure parsing method provided in the embodiment of the present application, and after step S103, the method may further include:
Step S104, performing vulnerability analysis on the CFG structure data according to vulnerability analysis rules and a data flow analysis method to obtain a vulnerability detection result.
In a specific embodiment, with reference to fig. 1, the CFG structure data obtained by parsing the source code and the vulnerability analysis rules are input into a data flow analysis algorithm, which outputs an analysis result. The analysis result characterizes whether a code vulnerability exists in the source code and, if so, in which section of code it appears.
Specifically, in this embodiment, the vulnerability analysis rules include analysis methods for the vulnerabilities that occur most frequently when source code is written. The rules may be updated at a preset frequency so that they remain applicable to most newly emerging vulnerabilities, which improves the accuracy of vulnerability detection.
Specifically, fig. 6 shows one piece of CFG structure data to be analyzed. The process of data flow analysis is described below using reaching-definitions analysis as an example.
A definition y (an assignment to a variable x) is said to reach a program point z if there is a path from the program point immediately following y to z, and y is not killed on that path (i.e., the variable x is not reassigned to another value).
In fig. 6, the transfer function of the reaching-definitions problem is defined as $\mathit{Out}[s] = \mathit{gen}[s] \cup (\mathit{In}[s] - \mathit{kill}[s])$. The gen set of a block (i.e., $B_1$, $B_2$, $B_3$, $B_4$) contains the definitions generated within that block, and the kill set of a block contains the other definitions of the same variables, which are killed by the assignment statements within that block. For each variable, an assignment statement is added to the gen set of its own block, and the assignment statements to that variable at other positions are added to the kill set, so the gen/kill sets of all blocks can be obtained in a single scan. The constraint along paths is $\mathit{In}[B] = \bigcup_{P} \mathit{Out}[P]$, where $P$ ranges over the predecessors of $B$ (i.e., of $B_1$, $B_2$, $B_3$, $B_4$). In addition, there is the boundary condition $\mathit{Out}[\mathit{Entry}] = \emptyset$.
Loop iteration then starts: the In/Out sets of each block are updated in each round of iteration until all In[s] and Out[s] no longer change, which yields the final data flow analysis result.
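The iteration to a fixed point can be sketched as follows. This Go example is a minimal illustration and not the implementation of the embodiment: the blocks B1 to B3, the definitions d1 to d4, and their gen/kill sets are invented for the demonstration and do not reproduce fig. 6. It uses the standard reaching-definitions form Out = gen ∪ (In − kill), with In computed as the union of Out over the predecessors.

```go
package main

import (
	"fmt"
	"sort"
)

// Set is a set of definition labels such as "d1".
type Set map[string]bool

// Block carries the per-block inputs of the analysis.
type Block struct {
	Name  string
	Preds []string
	Gen   Set
	Kill  Set
}

func union(dst, src Set) {
	for k := range src {
		dst[k] = true
	}
}

func equal(a, b Set) bool {
	if len(a) != len(b) {
		return false
	}
	for k := range a {
		if !b[k] {
			return false
		}
	}
	return true
}

// ReachingDefinitions updates In/Out for every block until nothing changes.
func ReachingDefinitions(blocks []Block) (in, out map[string]Set) {
	in, out = map[string]Set{}, map[string]Set{}
	for _, b := range blocks {
		in[b.Name], out[b.Name] = Set{}, Set{}
	}
	out["Entry"] = Set{} // boundary condition: Out[Entry] is empty
	for changed := true; changed; {
		changed = false
		for _, b := range blocks {
			newIn := Set{}
			for _, p := range b.Preds {
				union(newIn, out[p]) // In[B] = union of Out[P]
			}
			newOut := Set{}
			for d := range newIn {
				if !b.Kill[d] {
					newOut[d] = true // In[B] - kill[B]
				}
			}
			union(newOut, b.Gen) // ... united with gen[B]
			if !equal(newIn, in[b.Name]) || !equal(newOut, out[b.Name]) {
				changed = true
			}
			in[b.Name], out[b.Name] = newIn, newOut
		}
	}
	return in, out
}

// demoBlocks: B1 {d1: x=1; d2: y=2}, B2 {d3: x=x+1} with a self loop,
// B3 {d4: y=x}. Purely hypothetical.
func demoBlocks() []Block {
	return []Block{
		{"B1", []string{"Entry"}, Set{"d1": true, "d2": true}, Set{"d3": true, "d4": true}},
		{"B2", []string{"B1", "B2"}, Set{"d3": true}, Set{"d1": true}},
		{"B3", []string{"B2"}, Set{"d4": true}, Set{"d2": true}},
	}
}

func sorted(s Set) []string {
	keys := make([]string, 0, len(s))
	for k := range s {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	in, out := ReachingDefinitions(demoBlocks())
	for _, name := range []string{"B1", "B2", "B3"} {
		fmt.Println(name, "In:", sorted(in[name]), "Out:", sorted(out[name]))
	}
}
```

Because the sets only grow from one round to the next and the number of definitions is finite, the loop terminates.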
With this method, data flow analysis is performed on the control flow graph (i.e., the CFG structure data) of the source code. This solves the technical problem that existing data flow analysis cannot be performed across packages, enables cross-package data tracking of the source code, reduces the false alarm rate of vulnerability detection, and improves the reliability of the detection result.
Further, since there are many different types of programming languages, such as C++, Java, and golang, and the parsing methods for different programming languages also differ, the programming language type must be determined before the source code is parsed. Referring to fig. 7, fig. 7 is a third flowchart of an SSA structure analysis method according to an embodiment of the present disclosure. In this embodiment, before step S102, the SSA structure parsing method further includes:
Step S201, analyzing the source code to obtain a corresponding program language type. Wherein the program language type comprises a go language.
Step S202, inputting the source code into a corresponding analysis engine according to the program language type for analysis. Wherein, the parsing engine comprises a go language parsing engine.
It should be noted that the program language is a formal language used to define the execution flow of computer instructions. Each programming language contains a complete set of lexical and grammatical specifications that typically include data types and data structures, instruction types and instruction controls, call mechanisms, library functions, and the like.
A program is composed of a plurality of statements, and a statement is an instruction (which may contain a plurality of operations). The statement has a prescribed keyword (command) and syntax structure, and the program language writes the program statement in a serial method. Control instructions (e.g., sequence, selection, loop, call, etc.) in a language may change the execution flow of a program to control the processing of a computer.
In a specific implementation process, before parsing the source code into CFG nodes, the source code needs to be analyzed to obtain the programming language type of the source code, i.e. to determine what programming language the source code is written in. For example, the program language type of the source code may be any one of a plurality of program languages, such as C language, golang language, and JAVA language.
Specifically, for program source files written in different languages, front-end analysis should be performed by a front-end analysis program written in the same language, so that the characteristics of the original language are retained and no details are lost, which greatly improves the accuracy of subsequent detection.
Therefore, after the source file is subjected to program language analysis to obtain a corresponding program language type (such as a golang language), a corresponding parsing engine is then obtained according to the program language type, where the parsing engine is a source code parsing algorithm written in the same language as the program language type.
For example, in one embodiment of this embodiment, after analyzing the source code and obtaining that the program language type of the source code is the golang language, the source code is input into the golang analysis engine for subsequent analysis.
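Steps S201 and S202 can be pictured as a lookup followed by a dispatch. The Go sketch below assumes that the language type is inferred from the file extension; the source does not state how detection is performed, so this choice, as well as the names `DetectLanguage`, `Dispatch`, `Engine`, and the placeholder Go engine, are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// ErrUnsupported is returned when no parsing engine is registered.
var ErrUnsupported = errors.New("unsupported language")

// DetectLanguage guesses the language type from the file extension.
// Extension-based detection is an assumption of this sketch.
func DetectLanguage(filename string) string {
	switch strings.ToLower(filepath.Ext(filename)) {
	case ".go":
		return "go"
	case ".java":
		return "java"
	case ".c", ".h":
		return "c"
	case ".cc", ".cpp", ".hpp":
		return "c++"
	}
	return ""
}

// Engine stands in for a language-specific parsing engine.
type Engine func(src []byte) (string, error)

// engines maps a language type to its engine; only Go is registered here.
var engines = map[string]Engine{
	"go": func(src []byte) (string, error) {
		// Placeholder: a real engine would lex, parse and build SSA.
		return fmt.Sprintf("go engine received %d bytes", len(src)), nil
	},
}

// Dispatch routes the source code to the engine matching its language.
func Dispatch(filename string, src []byte) (string, error) {
	lang := DetectLanguage(filename)
	engine, ok := engines[lang]
	if !ok {
		return "", fmt.Errorf("%w: %q", ErrUnsupported, filename)
	}
	return engine(src)
}

func main() {
	msg, err := Dispatch("main.go", []byte("package main"))
	fmt.Println(msg, err)
	_, err = Dispatch("Main.java", []byte("class Main {}"))
	fmt.Println(err)
}
```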
In a specific implementation process, step S202, inputting the source code into a corresponding parsing engine according to the program language type for parsing, specifically including: and performing lexical analysis and syntactic analysis on the source code to obtain SSA structure data of the source code.
First, it should be noted that SSA (Static Single Assignment) is an intermediate representation; it is called single assignment because each name in the source code is assigned only once in the SSA form. Syntax analysis, as the term is used here, is a logical phase of the compilation process whose task is to examine the context-dependent properties and the types of structurally correct source code (work that is usually attributed to semantic analysis). It examines the source code for semantic errors and collects type information for the code generation stage.
Specifically, in this embodiment, after the source code is input into the parsing engine corresponding to the programming language thereof, the parsing engine performs lexical analysis and syntactic analysis on the source code, so as to convert the source code into the SSA structure, where the SSA structure includes a plurality of SSA nodes described in the foregoing embodiments.
Lexical analysis parses the source code file and converts the character sequence in the file into a Token sequence, which facilitates subsequent processing and analysis. The input of syntax analysis is the Token sequence output by lexical analysis, which the parser processes in order. During syntax analysis, the Tokens produced by lexical analysis are reduced bottom-up or top-down according to the grammar defined by the language, and each golang source code file is finally reduced to a SourceFile structure.
Further, in this embodiment, the step of performing lexical analysis and syntactic analysis on the source code to obtain the SSA structure specifically includes the following steps:
Performing lexical analysis on the source code and converting the source code into a corresponding Token sequence, where the Token sequence includes at least one of an identifier, a keyword, a separator, an operator, a literal, and a comment.
Performing syntax analysis on the Token sequence and constructing the Token sequence into an abstract syntax tree (AST) according to the syntactic features.
Converting the abstract syntax tree into SSA structure data in combination with the syntactic features of the source code.
Specifically, the above steps are described taking the golang language as an example. The source code of the golang compiler is stored in the src/cmd/compile directory, and the golang parsing engine generally performs lexical analysis, syntax analysis, type checking, intermediate code generation, and the like.
After source code written in the golang language is input to the golang parsing engine, the engine first performs lexical analysis on it, converting the source code into a series of Tokens. A Token is one of a set of predefined, recognizable character strings, typically consisting of a name and a value, where the name is generally a lexical category such as identifier, keyword, separator, operator, literal, or comment.
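Lexical analysis of Go source can be tried with the standard library packages `go/scanner` and `go/token`. The sketch below is only an illustration of the Token categories named above: the grouping into six categories and the helpers `category`, `isSeparator`, `Tokenize`, and `CountByCategory` are assumptions of this example, since `go/token` itself groups separators together with operators.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// Tok pairs a lexical category (the "name") with the token text (the "value").
type Tok struct {
	Category string
	Text     string
}

// isSeparator singles out delimiters, which go/token counts as operators.
func isSeparator(t token.Token) bool {
	switch t {
	case token.LPAREN, token.RPAREN, token.LBRACK, token.RBRACK,
		token.LBRACE, token.RBRACE, token.COMMA, token.PERIOD,
		token.SEMICOLON, token.COLON:
		return true
	}
	return false
}

// category maps a go/token kind to one of the categories in the text.
func category(t token.Token) string {
	switch {
	case t == token.COMMENT:
		return "comment"
	case t == token.IDENT: // checked before IsLiteral, which includes IDENT
		return "identifier"
	case t.IsKeyword():
		return "keyword"
	case t.IsLiteral():
		return "literal"
	case isSeparator(t):
		return "separator"
	case t.IsOperator():
		return "operator"
	}
	return "other"
}

// Tokenize converts Go source text into a Token sequence.
func Tokenize(src string) []Tok {
	fset := token.NewFileSet()
	file := fset.AddFile("demo.go", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, scanner.ScanComments)
	var toks []Tok
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			return toks
		}
		text := lit
		if text == "" {
			text = tok.String() // operators and delimiters carry no literal text
		}
		toks = append(toks, Tok{category(tok), text})
	}
}

// CountByCategory tallies the tokens per category.
func CountByCategory(toks []Tok) map[string]int {
	counts := map[string]int{}
	for _, t := range toks {
		counts[t.Category]++
	}
	return counts
}

func main() {
	for _, t := range Tokenize("package p\n\n// answer\nvar x = 42\n") {
		fmt.Printf("%-10s %q\n", t.Category, t.Text)
	}
}
```

Note that the scanner also reports the semicolons that Go inserts automatically at line ends; this sketch counts them as separators.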
Subsequently, the golang parsing engine scans the Tokens and constructs an Abstract Syntax Tree (AST) of the source code according to the syntactic features of the golang language. An abstract syntax tree is an abstract representation of the syntactic structure of the source code, which represents the syntactic structure of a programming language as a tree. Each node in the abstract syntax tree represents an element in the source code, and each subtree represents a syntactic construct. As a common data structure, the abstract syntax tree omits characters that are not important in the source code, such as spaces, semicolons, or brackets.
Each abstract syntax tree is an exact representation of the corresponding source code and can be used to determine if there are some types of mismatch or inconsistency in a correctly structured program. Where the nodes correspond to various elements of the source code, such as expressions, statements, and so on.
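The correspondence between AST nodes and source elements can be observed with the standard library packages `go/parser` and `go/ast`. The following sketch is illustrative only; the `Summary` type, the `Summarize` function, and the sample source `demoSrc` are made up for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// Summary records a few kinds of AST nodes found in a source file.
type Summary struct {
	Funcs   []string // names of declared functions
	Assigns int      // number of assignment statements
	Calls   int      // number of call expressions
}

// Summarize parses src into an AST and walks every node once.
func Summarize(src string) (Summary, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return Summary{}, err
	}
	var s Summary
	ast.Inspect(file, func(n ast.Node) bool {
		switch node := n.(type) {
		case *ast.FuncDecl:
			s.Funcs = append(s.Funcs, node.Name.Name)
		case *ast.AssignStmt:
			s.Assigns++
		case *ast.CallExpr:
			s.Calls++
		}
		return true // keep descending into child nodes
	})
	return s, nil
}

// demoSrc is a hypothetical input file.
const demoSrc = `package p

func add(a, b int) int {
	sum := a + b
	return sum
}

func run() {
	x := add(1, 2)
	_ = x
}
`

func main() {
	s, err := Summarize(demoSrc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("functions:", s.Funcs)
	fmt.Println("assignments:", s.Assigns, "calls:", s.Calls)
}
```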
Based on all the constructed abstract syntax trees and the semantic features of the golang language, the golang parsing engine then checks the types defined and used in the abstract syntax trees in a specific order.
Each node is verified by traversing every abstract syntax tree, to ensure that no type error occurs at the current node. In addition, the type checking stage not only verifies the nodes of the tree structure but also expands and rewrites some built-in functions; for example, at this stage a call to the built-in make is replaced with a function such as makeslice or makechan according to the structure of the subtree.
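Type checking of an AST can be approximated outside the compiler with the standard library package `go/types`. This sketch only demonstrates detection of a type error; it does not show the rewriting of make into makeslice or makechan, which happens inside the compiler. The function `TypeCheck` and the two sample sources are assumptions of this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// TypeCheck parses src and type-checks the resulting AST.
// It returns nil when no syntax or type error is found.
func TypeCheck(src string) error {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return err
	}
	// No Importer is configured because the demo sources import nothing.
	conf := types.Config{}
	_, err = conf.Check("p", fset, []*ast.File{file}, nil)
	return err
}

// goodSrc uses the built-in make correctly.
const goodSrc = "package p\n\nfunc f() []int { return make([]int, 3) }\n"

// badSrc assigns a string constant to an int variable.
const badSrc = "package p\n\nvar x int = \"hello\"\n"

func main() {
	fmt.Println("good:", TypeCheck(goodSrc))
	fmt.Println("bad: ", TypeCheck(badSrc))
}
```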
After the golang parsing engine has converted the source code into abstract syntax trees and has parsed and type-checked the whole trees, the source code in the current file can be considered essentially free of compilation failures and syntax errors. The golang parsing engine then converts the input abstract syntax tree (AST) into intermediate code, i.e., an SSA structure. The SSA structure is a low-level intermediate representation with specific properties, which makes it easier to apply optimizations and ultimately to generate machine code.
In the process of converting the abstract syntax tree into the SSA structure, the handling of built-in functions (function intrinsics) is completed. These built-in functions are special functions; the golang parsing engine analyzes them one by one and decides whether to replace them with deeply optimized code.
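The single-assignment property can be illustrated on straight-line code by giving every assignment a fresh version of its target. The Go sketch below is a toy renaming pass and not the conversion performed by the golang compiler: it handles no branches and therefore inserts no phi nodes, and the `Assign` type, the `ToSSA` function, and the `name_version` naming scheme are assumptions of this example.

```go
package main

import "fmt"

// Assign is a simplified statement: Target = f(Operands...).
// Operands are variable names or literal constants.
type Assign struct {
	Target   string
	Operands []string
}

// ToSSA renames variables so that every name is assigned exactly once.
// It only supports straight-line code (no branches, hence no phi nodes).
func ToSSA(prog []Assign) []Assign {
	version := map[string]int{}
	out := make([]Assign, 0, len(prog))
	for _, a := range prog {
		ops := make([]string, len(a.Operands))
		for i, op := range a.Operands {
			if v, ok := version[op]; ok {
				ops[i] = fmt.Sprintf("%s_%d", op, v) // use the latest version
			} else {
				ops[i] = op // constant, or a name never assigned before
			}
		}
		version[a.Target]++ // each assignment creates a new version
		target := fmt.Sprintf("%s_%d", a.Target, version[a.Target])
		out = append(out, Assign{Target: target, Operands: ops})
	}
	return out
}

func main() {
	// x = 1; x = x + 2; y = x
	prog := []Assign{
		{"x", []string{"1"}},
		{"x", []string{"x", "2"}},
		{"y", []string{"x"}},
	}
	for _, a := range ToSSA(prog) {
		fmt.Println(a.Target, "<-", a.Operands)
	}
}
```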
In summary, source code of different language types can be converted into SSA structures by the parsing engine of each programming language. The SSA structure of the source code is then parsed by the SSA structure parsing method provided by the embodiments to obtain the CFG structure data of the source code, and data flow analysis is performed on the CFG structure data by a data flow analysis technique. This realizes cross-package data flow analysis of the source code, reduces the false alarm rate of vulnerability detection, and improves the reliability of vulnerability detection results.
Based on the same inventive concept, an SSA structure analysis apparatus 100 corresponding to the SSA structure analysis method is further provided in the embodiment of the present application, and since the principle of the apparatus in the embodiment of the present application to solve the problem is similar to the SSA structure analysis method described above in the embodiment of the present application, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not described again.
Referring to fig. 8, fig. 8 is a functional block diagram of an SSA structure analysis apparatus 100 according to an embodiment of the present disclosure. In this embodiment, the apparatus is applied to an electronic device, and includes: the system comprises a CFG node dividing module 101, a parsing module 102 and an association module 103.
The CFG node dividing module 101 is configured to divide a plurality of CFG nodes according to a data stream parsing rule.
The parsing module 102 is configured to, for each CFG node, search for a target node corresponding to each CFG node from a data type of a source code of a programming language, extract information in the target node, write the information in the CFG node corresponding to the target node, and obtain a parsed CFG node, where the data type of the source code includes multiple AST node types and multiple SSA node types.
And the association module 103 is configured to perform relationship association on the analyzed CFG nodes to obtain CFG structure data that can be used for data flow analysis.
Further, referring to fig. 8, in the present embodiment, the SSA structure analyzing apparatus 100 further includes:
The vulnerability analysis module 104 is configured to perform vulnerability analysis on the CFG structure according to the vulnerability analysis rule and the data flow analysis method to obtain a vulnerability detection result.
Furthermore, in this embodiment, the association module 103 includes the following sub-modules:
The node connection submodule is configured to parse the correspondence among the CFG nodes according to the source code and to connect the CFG nodes.
And the CFG edge connecting submodule is used for connecting the CFG edge according to the execution logic of the source code context.
And the AST edge connecting submodule is used for connecting the AST edges according to the inclusion relationship among the CFG nodes.
Optionally, in this embodiment, the node connection sub-module is specifically configured to: and circularly traversing the corresponding relation of all CFG nodes in each line of codes in the source code, and connecting and corresponding the CFG nodes.
Further, referring to fig. 8, in the present embodiment, the SSA structure analyzing apparatus 100 may further include:
The source code analysis module 105 is configured to analyze the source code to obtain a corresponding program language type, where the program language type includes a go language.
And a source code input module 106, configured to input a source code into a corresponding parsing engine according to a program language type for parsing, where the parsing engine includes a go language parsing engine.
Specifically, in the present embodiment, the source code input module 106 includes an analysis submodule;
the analysis submodule is used for performing lexical analysis and syntactic analysis on the source code to obtain SSA structural data of the source code.
Further, in this embodiment, the analysis submodule is specifically configured to:
performing lexical analysis on the source code, and converting the source code into a corresponding Token sequence, wherein the Token sequence comprises at least one of an identifier, a keyword, a separator, an operator, a literal and a comment;
carrying out syntactic analysis on the marker sequence Token, and constructing the marker sequence Token into an abstract syntax tree AST according to syntactic characteristics;
and converting the abstract syntax tree into SSA structural data according to the abstract syntax tree and by combining the syntax characteristics of the source code.
The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.
An embodiment of the present application further provides an electronic device 60, as shown in fig. 9, which is a schematic structural diagram of the electronic device 60 provided in the embodiment of the present application, and includes: a processor 61, a memory 62, and a bus 63. The memory 62 stores machine-readable instructions executable by the processor 61 (for example, execution instructions corresponding to functions of the CFG node dividing module 101, the parsing module 102, the associating module 103, the vulnerability analyzing module 104, the source code analyzing module 105, and the source code inputting module 106 in the apparatus in fig. 8, and the like), when the electronic device 60 runs, the processor 61 and the memory 62 communicate through the bus 63, and the machine-readable instructions are executed by the processor 61 to perform the method of any one of the first to fifth embodiments.
The present embodiment also provides a storage medium, on which a computer program is stored, and when the computer program is executed by the processor 61, the computer program performs the steps of the method of any of the above embodiments.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is executed, the method of any of the above embodiments can be performed, which solves the problem that current data flow analysis techniques cannot perform cross-package analysis, reduces the false alarm rate of vulnerability analysis, and improves the reliability of the analysis result.
In some embodiments, the processor 61 may include one or more processing cores (e.g., a single-core processor or a multi-core processor). Merely by way of example, the processor 61 may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an Application Specific Instruction-set Processor (ASIP), a Graphics Processing Unit (GPU), a Physics Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller unit, a Reduced Instruction Set Computer (RISC), a microprocessor, or the like, or any combination thereof.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the electronic device and the apparatus described above may refer to corresponding processes in the method embodiments, and are not described in detail in this application. In the several embodiments provided in the present application, it should be understood that the disclosed electronic device, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, a division of modules is merely a division of logical functions, and an actual implementation may have another division, and for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
Modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by the processor 61. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a U disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
In addition, in order to make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely with reference to the drawings in the embodiments of the present application. It should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application.
It should be understood that the operations of the flowcharts may be performed out of order, and steps that have no logical dependency on one another may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowcharts.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.

Claims (16)

1. An SSA structure analysis method is applied to electronic equipment, and is characterized by comprising the following steps:
dividing a plurality of CFG nodes according to a data stream analysis rule;
aiming at each CFG node, searching a target node corresponding to each CFG node from the data type of a source code of a programming language, extracting information in the target node, writing the information into the CFG node corresponding to the target node, and obtaining the analyzed CFG node, wherein the data type of the source code comprises a plurality of AST node types and a plurality of SSA node types;
and carrying out relationship association on the analyzed CFG nodes to obtain CFG structural data which can be used for data flow analysis.
2. The method according to claim 1, wherein the associating the analyzed CFG nodes with relationships comprises:
analyzing the corresponding relation of the CFG nodes according to the source codes, and connecting the CFG nodes;
connecting CFG edges according to the execution logic of the context of the source code;
and connecting AST edges according to the inclusion relation among the CFG nodes.
3. The method according to claim 2, wherein the parsing the correspondence of the CFG node according to the source code and connecting the CFG node comprises:
and circularly traversing the corresponding relation of all CFG nodes in the source code, and connecting and corresponding the CFG nodes.
4. The method of claim 1, further comprising:
and carrying out vulnerability analysis on the CFG structural data according to vulnerability analysis rules and a data flow analysis method to obtain a vulnerability detection result.
5. The method of claim 1, wherein before for each CFG node, finding a target node corresponding to each CFG node from a data type of source code of a programming language, the method further comprises:
analyzing the source code to obtain a corresponding program language type, wherein the program language type comprises a go language;
and inputting the source code into a corresponding analysis engine for analysis according to the program language type, wherein the analysis engine comprises a go language analysis engine.
6. The method of claim 5, wherein the inputting the source code into a corresponding parsing engine for parsing according to the programming language type comprises:
and performing lexical analysis and syntactic analysis on the source code to obtain SSA structural data of the source code.
7. The method of claim 6, wherein the lexical analysis and the syntactic analysis of the source code to obtain SSA structural data of the source code comprises:
performing lexical analysis on the source code, and converting the source code into a corresponding Token sequence, wherein the Token sequence comprises at least one of an identifier, a keyword, a separator, an operator, a literal and a comment;
carrying out syntactic analysis on the marker sequence Token, and constructing the marker sequence Token into an abstract syntax tree AST according to syntactic characteristics;
and converting the abstract syntax tree AST into SSA structural data according to the abstract syntax tree and by combining the syntax characteristics of source codes.
8. An SSA structure analysis device applied to electronic equipment is characterized by comprising:
the CFG node dividing module is used for dividing a plurality of CFG nodes according to the data stream analysis rule;
the analysis module is used for searching a target node corresponding to each CFG node from the data type of a source code of a programming language for each CFG node, extracting information in the target node and writing the information into the CFG node corresponding to the target node to obtain the analyzed CFG node, wherein the data type of the source code comprises a plurality of AST node types and a plurality of SSA node types;
and the association module is used for associating the analyzed CFG nodes with the relationship so as to obtain CFG structural data which can be used for data flow analysis.
9. The apparatus of claim 8, wherein the associating module comprises:
the node connection sub-module is used for analyzing the corresponding relation of the CFG nodes according to the source codes and connecting the CFG nodes;
the CFG edge connecting sub-module is used for connecting CFG edges according to the execution logic of the context of the source code;
and the AST edge connecting sub-module is used for connecting the AST edges according to the inclusion relationship among the CFG nodes.
10. The apparatus of claim 9, wherein the node connection submodule is specifically configured to:
and circularly traversing the corresponding relation of all CFG nodes in each line of codes in the source code, and connecting and corresponding the CFG nodes.
11. The apparatus of claim 8, further comprising:
and the vulnerability analysis module is used for carrying out vulnerability analysis on the CFG structure according to vulnerability analysis rules and a data flow analysis method to obtain a vulnerability detection result.
12. The apparatus of claim 8, further comprising:
the source code analysis module is used for analyzing the source code to obtain a corresponding program language type, wherein the program language type comprises a go language;
and the source code input module is used for inputting the source code into a corresponding analysis engine for analysis according to the program language type, wherein the analysis engine comprises a go language analysis engine.
13. The apparatus of claim 12, wherein the source code input module comprises:
and the analysis submodule is used for performing lexical analysis and syntactic analysis on the source code to obtain SSA structural data of the source code.
14. The apparatus of claim 13, wherein the analysis submodule is specifically configured to:
performing lexical analysis on the source code, and converting the source code into a corresponding Token sequence, wherein the Token sequence comprises at least one of an identifier, a keyword, a separator, an operator, a literal and a comment;
carrying out syntactic analysis on the marker sequence Token, and constructing the marker sequence Token into an abstract syntax tree AST according to syntactic characteristics;
and converting the abstract syntax tree AST into SSA structural data according to the abstract syntax tree and by combining the syntax characteristics of source codes.
15. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the method according to any one of claims 1 to 7.
16. A storage medium, having stored thereon a computer program which, when executed by a processor, performs the steps of the method according to any one of claims 1 to 7.
CN202010651559.6A 2020-07-08 2020-07-08 SSA structure analysis method and device, electronic equipment and storage medium Pending CN111813675A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010651559.6A CN111813675A (en) 2020-07-08 2020-07-08 SSA structure analysis method and device, electronic equipment and storage medium
CN202110579664.8A CN113157597A (en) 2020-07-08 2021-05-26 Structure analysis method, structure analysis device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010651559.6A CN111813675A (en) 2020-07-08 2020-07-08 SSA structure analysis method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111813675A (en) 2020-10-23

Family

ID=72842941

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202010651559.6A Pending CN111813675A (en) 2020-07-08 2020-07-08 SSA structure analysis method and device, electronic equipment and storage medium
CN202110579664.8A Pending CN113157597A (en) 2020-07-08 2021-05-26 Structure analysis method, structure analysis device, electronic equipment and storage medium

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202110579664.8A Pending CN113157597A (en) 2020-07-08 2021-05-26 Structure analysis method, structure analysis device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (2) CN111813675A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100023931A1 (en) * 2008-07-24 2010-01-28 Buqi Cheng Method and System for Intermediate Representation of Source Code
CN103729295A (en) * 2013-12-31 2014-04-16 北京理工大学 Method for analyzing taint propagation path
CN107844415A (en) * 2017-09-28 2018-03-27 西安电子科技大学 Interpolation-based model checking path reduction method and computer
CN110321458A (en) * 2019-05-21 2019-10-11 国家电网有限公司 Data flow analysis method and device based on control flow graph
CN110781086A (en) * 2019-10-23 2020-02-11 南京大学 Cross-project defect influence analysis method based on program dependency relationship and symbolic analysis

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107943481B (en) * 2017-05-23 2021-01-26 清华大学 C language program code specification construction method based on multiple models
CN109117633B (en) * 2018-08-13 2022-11-04 百度在线网络技术(北京)有限公司 Static source code scanning method and device, computer equipment and storage medium
CN109857641B (en) * 2018-12-29 2022-09-13 奇安信科技集团股份有限公司 Method and device for detecting defects of program source file
CN111240982A (en) * 2020-01-09 2020-06-05 华东师范大学 Static analysis method for source code

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
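Zeng Ming et al.: "Analysis of the Implementation Mechanism of BPF and Research on Its Performance Optimization", Computer Engineering (《计算机工程》) *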

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113010891A (en) * 2021-02-26 2021-06-22 中科天齐(山西)软件安全技术研究院有限公司 Application program safety detection method and device, electronic equipment and storage medium
CN113010890A (en) * 2021-02-26 2021-06-22 中科天齐(山西)软件安全技术研究院有限公司 Application program safety detection method and device, electronic equipment and storage medium
CN113010890B (en) * 2021-02-26 2023-02-07 中科天齐(山西)软件安全技术研究院有限公司 Application program safety detection method and device, electronic equipment and storage medium
CN113010891B (en) * 2021-02-26 2023-02-07 中科天齐(山西)软件安全技术研究院有限公司 Application program safety detection method and device, electronic equipment and storage medium
CN113590489A (en) * 2021-08-03 2021-11-02 杭州默安科技有限公司 Golike language-based IAST safety testing method and system
CN116166276A (en) * 2023-04-25 2023-05-26 芯瞳半导体技术(山东)有限公司 Control flow analysis method, device, equipment, medium and product
CN116166276B (en) * 2023-04-25 2023-07-11 芯瞳半导体技术(山东)有限公司 Control flow analysis method, device, equipment, medium and product

Also Published As

Publication number Publication date
CN113157597A (en) 2021-07-23

Similar Documents

Publication Publication Date Title
CN111813675A (en) SSA structure analysis method and device, electronic equipment and storage medium
CN108614707B (en) Static code checking method, device, storage medium and computer equipment
CN106227668B (en) Data processing method and device
JP7172435B2 (en) Representation of software using abstract code graphs
US5870590A (en) Method and apparatus for generating an extended finite state machine architecture for a software specification
US6038378A (en) Method and apparatus for testing implementations of software specifications
US9122540B2 (en) Transformation of computer programs and eliminating errors
EP0204942A2 (en) Compiler for a source program, a method of making the same and its use
US10664655B2 (en) Method and system for linear generalized LL recognition and context-aware parsing
Herfert et al. Automatically reducing tree-structured test inputs
JP7237457B2 (en) computing equipment
US11262988B2 (en) Method and system for using subroutine graphs for formal language processing
CN111694746A (en) Flash defect fuzzy evaluation tool for compilation type language AS3
CN108563561B (en) Program implicit constraint extraction method and system
He et al. Selecting context-sensitivity modularly for accelerating object-sensitive pointer analysis
Basten et al. Parse forest diagnostics with Dr. Ambiguity
WO2001069391A2 (en) Difference engine method and apparatus
Gerasimov et al. Reachability confirmation of statically detected defects using dynamic analysis
Utkin et al. Evaluating the impact of source code parsers on ML4SE models
Lester et al. Information flow analysis for a dynamically typed language with staged metaprogramming
JP2011154568A (en) Information processing apparatus, program verification method and program
Arusoaie et al. Automating abstract syntax tree construction for context free grammars
Li An empirical study on bash language usage in Github
EP3167382A1 (en) Method and system for linear generalized ll recognition and context-aware parsing
Palanisamy et al. Modelica based parser generator with good error handling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (Application publication date: 20201023)