CN106227573A - Function call path extraction method based on control flow graph - Google Patents
- Publication number: CN106227573A
- Application number: CN201610541747.7A
- Authority: CN (China)
- Legal status: Withdrawn (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/43—Checking; Contextual analysis
- G06F8/433—Dependency analysis; Data or control flow analysis
Abstract
The present invention provides a function call path extraction method based on a control flow graph, comprising: Step 1, obtaining the intermediate code of the source code, analyzing the intermediate code to identify the code blocks in it, and generating a control flow graph from them, where each node of the control flow graph is a code block and each connecting line between nodes represents the call relationship between code blocks; Step 2, analyzing the number of function calls contained in each node of the control flow graph in order to convert the control flow graph into a function call graph.
Description
Technical field
The present invention relates to the technical field of computer software, and in particular to a function call path extraction method based on a control flow graph.
Background art
A program can produce multiple function call paths because it contains selection statements with decision conditions and other control statements [1]. In programming, source code generally has three statement structures: sequence, selection, and loop. Sequential statements do not increase the number of function call paths; only selection statements and loop statements can produce additional program branches. Taking C as an example, the keywords if, for, while, and switch produce multiple execution branches. These branch-producing statements are therefore the focus when extracting a function call graph.
Methods for extracting function call paths fall broadly into two kinds. The first analyzes the source code with static analysis tools [3-5]: it extracts the method calls and the interaction information between modules in the program, determines how control flow transfers between modules, builds a function call relationship graph using an automaton or other means, and then extracts the function call paths. Static analysis means analyzing the source code without executing it; it is the counterpart of dynamic analysis, which analyzes the source code while it is being executed.
The other, more intuitive method of extracting function call paths is program instrumentation [12]. Pre-designed probe functions are inserted into the source code or assembly code; when the program is executed again, instrumentation records for function entry and function exit can be collected. From these records, information such as the program's control flow can be obtained and the function call paths extracted [13-15]. The decorate stream split algorithm [15] or other algorithms can be used for the extraction itself. At present, object code instrumentation and source code instrumentation are the mainstream instrumentation techniques in the field of automated testing [16]. However, the completeness of dynamic instrumentation depends on the choice of test cases: if the chosen test cases are incomplete, the test process will be insufficient.
Summary of the invention
In view of the problems in the prior art, the technical problem to be solved by the present invention is to provide a method for extracting function call paths based on a control flow graph that can extract function call paths simply and accurately.
To solve the above problem, an embodiment of the present invention proposes a function call path extraction method based on a control flow graph, comprising:
Step 1: obtain the intermediate code of the source code, analyze the intermediate code to identify the code blocks in it, and generate a control flow graph from them; each node of the control flow graph is a code block, and each connecting line between nodes represents the connection relationship between code blocks;
Step 2: analyze the number of function calls contained in each node of the control flow graph in order to convert the control flow graph into a function call graph.
Wherein, Step 1 specifically includes:
Step 11: use the gcc compiler to obtain intermediate code of the source code in GCC-CFG format;
Step 12: obtain the code blocks in the intermediate code, and identify the function bodies and obtain the function call information according to the following preset rules:
Rule P1: ([;][;][ ]Function)[ ][_a-zA-Z]*[a-zA-Z0-9];
Rule P2: \<bb[ ][0-9]+\>;
Rule P3: \<L[0-9]+\>;
Rule P4: ((goto)[ ]\<bb[ ][0-9]+\>);
Rule P5: ((goto)[ ]\<bb[ ][0-9]+\>)[ ]\(\<L[0-9]+\>\);
Rule P6: ([_a-zA-Z]*[a-zA-Z0-9][ ][\(]);
When a code block matches rule P1, the code block is a function declaration. For a code block that matches rule P1, an initialization operation is performed and then the parameter values related to code blocks are updated, where the parameter values are at least one of: the number of functions, the number of nodes, and the number of edges;
When a code block matches rule P2, it is the start of a bb code block; when a code block matches rule P3, it is an L code block. For a code block that matches rule P2 or rule P3, the information of the preceding code block is processed and its function call information is output, and then the parameter values related to code blocks are updated, where the parameter values are at least one of: the number of functions, the number of nodes, and the number of edges;
When a code block matches rule P4, it is a goto statement; when a code block matches rule P5, it is a goto statement in the special format. For a code block that matches rule P4 or rule P5, the block is marked as containing a jump, and an edge connecting this node to the node targeted by the goto is generated from the jump information;
When a code block matches rule P6, it is a function call, and its function call information is obtained;
Step 13: generate the control flow graph from the identified code blocks; each node of the control flow graph is a code segment, and nodes are connected by connecting lines according to the control flow information.
Wherein, Step 1 further includes: using three counters to record the number of functions, the number of nodes, and the number of edges; and using three further counters to record whether a code block contains a goto statement, whether it contains a function call, and the list of function calls it contains.
Wherein, Step 2 specifically includes:
If a code block contains no function call, the node corresponding to the code block is merged into its upper-layer or lower-layer node;
If a code block contains exactly one function call, the code block is used as a node of the function call relationship graph and connected to the other nodes according to the function call relationships;
If a code block contains multiple function calls, each function is made to correspond to one control flow graph node, these nodes are linked together according to the function call relationships, and they are then connected to the nodes of other code blocks according to the function call relationships between them.
Wherein, in Step 2, the handling of a code block with multiple function calls specifically includes:
Code block N1 contains the function calls Funs (f1, f2, ..., fn), and node N1 has one or more upper-layer nodes N0s and one or more lower-layer nodes N2s;
Each function in Funs (f1, f2, ..., fn) is made to correspond to one node, and these nodes are connected in the order of the function calls;
The connecting lines from the upper-layer nodes N0s to code block N1 are deleted, the upper-layer nodes N0s are connected to f1, the first function in order in code block N1, and fn, the last function in order, is connected to the lower-layer nodes N2s, generating a control flow, made up of nodes and connecting lines, that runs from the upper-layer nodes N0s through all function calls of code block N1 to the lower-layer nodes N2s.
Wherein, in Step 2, the handling of a code block with no function call specifically includes:
If a code block in the control flow graph contains no function call, determine whether the node corresponding to the code block is connected to only one upper-layer node and/or only one lower-layer node; if so, merge the node into the upper-layer node or into the lower-layer node; if not, merge the node into the upper-layer and lower-layer nodes simultaneously, or mark the node as a non-function.
Wherein, Step 2 further includes:
The control flow graph information obtained by the analysis is stored in json format. The json data include at least a function array representing the control flow graph information of the functions; each element of the array includes parameters recording the function name and the control flow graph information. The parameters recording the control flow graph information are further divided into parameters recording the connecting lines between nodes and parameters recording node names, and the parameters recording a node name contain a function call array. Taking a single function in the control flow graph as the basic unit, processing is carried out according to the function call information contained in the code blocks.
The beneficial effects of the above technical solution of the present invention are as follows. A function call path is a sequence of function names from the program entry point to the exit point; it combines control logic with function calls and raises the granularity of code analysis from statements to functions. To extract function call paths accurately, the embodiment of the present invention proposes a function call path extraction method based on a control flow graph: first, intermediate code containing control flow information is obtained with the gcc compiler, and the intermediate code is analyzed to extract the control flow graph of each function; then, based on the relationship between the control flow graph and the function call relationship graph, the control flow graph is processed by a specific algorithm to obtain the function call relationship graph, from which all function call paths are extracted. Experimental results show that the method proposed by the embodiment of the present invention can obtain function call paths accurately.
Brief description of the drawings
Fig. 1 is a schematic diagram of the extraction and analysis of the if_test control flow graph in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the extraction and analysis of the if_test2 control flow graph in an embodiment of the present invention;
Fig. 3 is a schematic diagram of the extraction and analysis of the for_test control flow graph in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the extraction and analysis of the while_test control flow graph in an embodiment of the present invention;
Fig. 5 is a schematic diagram of nodes that contain no function call in an embodiment of the present invention;
Fig. 6 is a schematic diagram of the control flow graph storage format in an embodiment of the present invention;
Fig. 7 is a schematic diagram of the code analysis process for the nested selection and loop function;
Fig. 8 shows the control flow graph and the function call graph of that function;
Fig. 9 is a diagram of the code analysis process for the sample recursive function containing a ternary operator;
Fig. 10 shows the control flow graphs and function call graphs of the sample recursive function containing a ternary operator.
Detailed description of the invention
To make the technical problem to be solved, the technical solutions, and the advantages of the present invention clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
Step 1, control flow graph extraction: to guarantee the accuracy of the function call relationship graph, the control flow graph of each function must be extracted accurately.
The embodiment of the present invention uses the gcc compiler to preprocess the source code, obtains intermediate code containing control flow information, and from it completes the extraction of the control flow graph.
gcc is a powerful C compiler with a large number of options for controlling compilation and linking. Among them, the "-fdump-tree" option family exposes gcc's processing of the source code; with a suitable sub-option, gcc generates well-formatted, accurate intermediate debugging information. The "cfg" sub-option generates intermediate code that resembles a control flow graph. Some simple code is now analyzed as an example: the analysis of code containing an if statement is shown in Fig. 1, and Fig. 2 is the analysis of code that returns directly after the if conditional has executed.
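As a minimal sketch of how such a dump could be requested, the snippet below only builds the gcc command line. `-fdump-tree-cfg` is the GCC option for the cfg sub-option; the helper name `cfg_dump_command` and the file name `if_test.c` are hypothetical. Running the command (for example with `subprocess.run`) requires gcc to be installed, and the name of the dump file it writes depends on the gcc version.

```python
from typing import List


def cfg_dump_command(source_file: str, compiler: str = "gcc") -> List[str]:
    """Build a gcc command line that compiles one file and dumps its CFG."""
    # -c: compile only, without linking.
    # -fdump-tree-cfg: write the control-flow-graph dump next to the output.
    return [compiler, "-c", "-fdump-tree-cfg", source_file]


# if_test.c is a hypothetical file name used only for illustration.
print(" ".join(cfg_dump_command("if_test.c")))
```

Keeping the command as a list of arguments avoids shell quoting problems if it is later passed to `subprocess.run`.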
As Fig. 1 shows, the intermediate code is divided into two parts, the function declaration and the function body; code in this format is called GCC-CFG intermediate code. The function declaration part contains index information for the function inside gcc and is of little help in obtaining the control flow graph. The function body part is the result of preprocessing the source code: it is a partitioned representation of the source code that divides the whole code into simple code blocks. The code inside a block executes sequentially or jumps to another code block via a goto statement. Here a <bb> module denotes a basic block. While analyzing the code, gcc may also merge some of the code, so that some code blocks have two or more names. The goto statements in the basic blocks reflect the execution order between code blocks. By statically analyzing the GCC-CFG intermediate code and matching code blocks and goto statements respectively, the control flow graphs on the right of Fig. 1 and Fig. 2 are obtained.
An analysis example containing a for statement is shown in Fig. 3, and sample code that uses a while loop to implement the same functionality as Fig. 3 is shown in Fig. 4. The two loop samples implement the same functionality with for and while respectively; the GCC-CFG intermediate code obtained through gcc is also identical, and the control flow graphs drawn from it are the same.
Because GCC-CFG intermediate code contains distinctive marker statements, the embodiment of the present invention analyzes it statically in a pattern-action manner. The pattern is a pattern match or rule match, and the action is what is executed once a code string (also called a token) matching the specified rule has been found. For intermediate code in GCC-CFG format, the main patterns to match are the function declaration, the basic block <bb*>, and the jump instruction goto. The rule patterns are listed below:
Table 1. Rule patterns
No. | Rule | Explanation |
P1 | ([;][;][ ]Function)[ ][_a-zA-Z]*[a-zA-Z0-9] | Matches a function declaration |
P2 | \<bb[ ][0-9]+\> | Matches the start of a bb code block |
P3 | \<L[0-9]+\> | Matches an L code block |
P4 | ((goto)[ ]\<bb[ ][0-9]+\>) | Matches a goto statement (connecting statement) |
P5 | ((goto)[ ]\<bb[ ][0-9]+\>)[ ]\(\<L[0-9]+\>\) | Matches a goto statement in the special format |
P6 | ([_a-zA-Z]*[a-zA-Z0-9][ ][\(]) | Matches a function call |
Table 1 contains six rules: the left column is the rule number, the middle column is the regular-expression-like pattern of the rule, and the right column explains the rule. Extracting the control flow graph requires finding three key elements: function declarations, basic blocks, and jump instructions. The control flow graph is extracted with the function as the processing unit. The steps for extracting the control flow graph according to rules P1-P5 are as follows:
Rules P1-P6 are each matched against the GCC-CFG intermediate code:
When rule P1 matches, a function declaration has been found, which is the function definition in the GCC-CFG intermediate code. When rule P2 or P3 matches, the match is the start of a bb code block or an L code block. Inside a function, each node of the control flow graph represents one basic code segment; in GCC-CFG intermediate code a basic code segment is written in the form <bb*> or <L*>, and rules P2 and P3 are used to find each such segment inside the function. When rule P4 or P5 matches, there is a goto statement inside the function. Inside a function, each edge of the control flow graph represents one jump. In GCC-CFG intermediate code, jumps arise in two situations: in the first, a code block contains no jump statement, so execution enters the next adjacent code block, or ends, in execution order, which produces one sequential-execution edge; in the second, a code block contains jump instructions, that is, goto statements, and each goto statement produces one jump and generates one new edge.
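The matching described above can be tried out with Python's `re` module. The sketch below is an illustration only: it transcribes rules P1-P6 into Python syntax under the assumption that the empty brackets in the rules stand for a single space, and it reports the first rule found on each line. The `classify` helper and the sample lines are hypothetical; the lines are merely written in the style of a GCC-CFG dump. P5 is tried before P4 because every P5 match begins with a P4 match.

```python
import re

# Python transcriptions of rules P1-P6 (assumed reading of the patent rules).
RULES = [
    ("P1", re.compile(r"([;][;][ ]Function)[ ][_a-zA-Z]*[a-zA-Z0-9]")),
    ("P5", re.compile(r"((goto)[ ]\<bb[ ][0-9]+\>)[ ]\(\<L[0-9]+\>\)")),
    ("P4", re.compile(r"((goto)[ ]\<bb[ ][0-9]+\>)")),
    ("P2", re.compile(r"\<bb[ ][0-9]+\>")),
    ("P3", re.compile(r"\<L[0-9]+\>")),
    ("P6", re.compile(r"([_a-zA-Z]*[a-zA-Z0-9][ ][\(])")),
]


def classify(line: str) -> str:
    """Return the name of the first rule found in the line, or 'none'."""
    for name, pattern in RULES:
        if pattern.search(line):
            return name
    return "none"


# Hypothetical lines in the style of a GCC-CFG dump.
SAMPLE = [
    ";; Function main (main)",
    "<bb 2>:",
    "f1 ();",
    "goto <bb 4>;",
    "<L1>:",
    "goto <bb 3> (<L1>);",
    "x = 1;",
]

for text in SAMPLE:
    print(classify(text), "|", text)
```

Note that a keyword followed by a parenthesis, such as `if (`, has the same shape as rule P6, so a practical matcher would need to exclude such keywords; that filtering is not shown here.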
The three steps above yield the full content of the control flow graph, but generating the function call graph requires more auxiliary information in each control flow graph node. One difference between the control flow graph and the function call relationship graph is what a node represents: a control flow graph node represents a code segment, whereas a function call relationship graph node represents a function. To convert the control flow graph into a function call relationship graph, the function call information in each basic block must be retained in its node when the control flow graph is extracted. Figs. 1-4 show that, inside a code block, function calls are simple sequential calls with no complex jumps. So rule P6 is used to match function calls while the control flow graph is being obtained, and the function call information is retained, in order, in the control flow graph node.
Different actions must be executed after different rules match, specifically:
After P1 matches, an initialization operation is performed and the counters such as the function count, node count, and edge count are updated;
After P2 or P3 matches, the information of the previous code block is processed: the node information, including the function call information in the node, is output in the required format, and finally the values related to code blocks are updated;
After P4 or P5 matches, is_bb_with_goto is set, recording that the node contains a jump, which helps the main routine decide whether to connect the current node to the next node. If there is jump information, an edge connecting this node to the node targeted by the goto is generated;
After P6 matches, the corresponding function call information is stored.
The control flow graph extraction algorithm in the embodiment of the present invention is therefore as follows:
In the algorithm above, three counters, fun_num, node_num, and edge_num, are kept for functions and code blocks; they record the number of functions, the number of nodes in a function, and the number of edges respectively. Three further variables, is_bb_with_goto, is_bb_with_function, and called_functions, record whether a code block contains a goto statement, whether it contains a function call, and which function calls it contains. yytext is the text matched by the rule. print_node and print_edge are defined to generate or output the control flow graph in whatever format is required, for example an adjacency list in memory or a structured document (XML or JSON) stored on disk.
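The pattern-action procedure described above can be sketched in Python as follows. This is not the algorithm listing of the embodiment; it is a simplified illustration that uses capture-group variants of rules P1-P6, a hypothetical keyword filter (`KEYWORDS`), and a hypothetical hand-written dump (`SAMPLE_DUMP`). Only `is_bb_with_goto` and `called_functions` are taken from the text; the counters fun_num, node_num, and edge_num correspond to the lengths of the lists that are built.

```python
import re
from typing import List

FUNC = re.compile(r";; Function ([_a-zA-Z][_a-zA-Z0-9]*)")   # like P1
BLOCK = re.compile(r"^\s*(<bb \d+>|<L\d+>)\s*:")              # like P2/P3
GOTO = re.compile(r"goto (<bb \d+>)")                          # like P4/P5
CALL = re.compile(r"([_a-zA-Z][_a-zA-Z0-9]*) \(")              # like P6
KEYWORDS = {"if", "switch", "while", "for", "return"}          # assumed filter


def extract_cfg(dump: str) -> List[dict]:
    """Collect nodes, edges and per-node calls for each function in a dump."""
    functions: List[dict] = []
    func = None              # function currently being read
    node = None              # code block currently being read
    is_bb_with_goto = False  # does the current block contain a goto?
    for line in dump.splitlines():
        m = FUNC.search(line)
        if m:  # action for P1: start a new function and reset the state
            func = {"function_name": m.group(1), "nodes": [], "edges": []}
            functions.append(func)
            node, is_bb_with_goto = None, False
            continue
        if func is None:
            continue
        m = BLOCK.match(line)
        if m:  # action for P2/P3: close the previous block, open a new one
            if node is not None and not is_bb_with_goto:
                # No goto in the previous block: sequential-execution edge.
                func["edges"].append((node["node_name"], m.group(1)))
            node = {"node_name": m.group(1), "called_functions": []}
            func["nodes"].append(node)
            is_bb_with_goto = False
            continue
        if node is None:
            continue
        m = GOTO.search(line)
        if m:  # action for P4/P5: record the jump as an edge
            is_bb_with_goto = True
            func["edges"].append((node["node_name"], m.group(1)))
            continue
        for name in CALL.findall(line):  # action for P6: keep calls in order
            if name not in KEYWORDS:
                node["called_functions"].append(name)
    return functions


SAMPLE_DUMP = """
;; Function if_test (if_test)

if_test (int a)
{
  <bb 2>:
  if (a > 0)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  f1 ();

  <bb 4>:
  f2 ();
  return;
}
"""

result = extract_cfg(SAMPLE_DUMP)
print("fun_num =", len(result))
print("node_num =", len(result[0]["nodes"]), "edge_num =", len(result[0]["edges"]))
```

Real gcc dumps vary between versions (for example in how block headers are annotated), so the patterns would need adjusting before being applied to actual compiler output.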
Step 2, function call relationship graph extraction: generate the function call relationship graph from the control flow graph.
In the file that stores the control flow graph information, each basic block or L code block falls into one of three classes according to the number of function calls it contains: no function call, exactly one function call, or more than one function call. Each class is handled differently, and handling every block according to its class converts the control flow graph into a function call relationship graph.
(1) Code blocks with no function call: when converting the control flow graph into a function call relationship graph, a node whose code block contains no function call can generally be deleted. In Fig. 1, for example, code block <bb 2> contains no function call and only one node points to it, so deleting this node, or equivalently "merging it upward", is correct. There is one special case, however: when the node points to multiple nodes and multiple nodes also point to it, the node cannot simply be deleted. To analyze the no-function-call case in more detail, nodes without function calls are divided into the four cases shown in Fig. 5 according to the in-degree and out-degree of the control flow graph node.
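A small sketch of this classification is given below. Because Fig. 5 is not reproduced in the text, the numbering of the four cases is an assumption inferred from the merge rules described in the text (case 1: one predecessor and one successor; case 2: one predecessor, several successors; case 3: several predecessors, one successor; case 4: several of each). The function name `no_call_case` and the return value 0 for start or end nodes are hypothetical.

```python
from typing import Iterable, Tuple

Edge = Tuple[str, str]  # (begin, end)


def no_call_case(node: str, edges: Iterable[Edge]) -> int:
    """Classify a node without function calls by in-degree and out-degree."""
    edges = list(edges)
    in_degree = sum(1 for _, end in edges if end == node)
    out_degree = sum(1 for begin, _ in edges if begin == node)
    if in_degree == 0 or out_degree == 0:
        return 0  # start or end node: not one of the four cases
    if in_degree == 1 and out_degree == 1:
        return 1  # either merge direction works
    if in_degree == 1:
        return 2  # one predecessor, several successors
    if out_degree == 1:
        return 3  # several predecessors, one successor
    return 4      # several predecessors and several successors


print(no_call_case("n", [("a", "n"), ("n", "b"), ("n", "c")]))
```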
For cases 1-3 in Fig. 5, a "merge" strategy can be used. There are two merge operations, "upward merge" and "downward merge". An upward merge combines the node's information with that of its upper-layer node; a downward merge is the opposite. Upward merge: the merged node is N1, the upper-layer node is N0, and the lower-layer nodes, one or more, are denoted N2s; the operation deletes the edge from N0 to N1, makes N0 point to every node in N2s, and finally deletes node N1. Downward merge: the merged node is N1, the upper-layer nodes, one or more, are denoted N0s, and the lower-layer node is N2; the operation deletes the edge from N1 to N2, makes every node in N0s point to N2, and finally deletes node N1. For case 1 either merge can be used and the result is the same; for case 2 only the upward merge can be performed, and for case 3 only the downward merge.
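The two merge operations can be sketched on a graph stored as a set of (begin, end) edges. This is a minimal illustration under that assumed representation; the function names `merge_up` and `merge_down` are hypothetical, and self-loops on the merged node are not handled.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (begin, end)


def _neighbours(edges: Set[Edge], n1: str):
    """Return the upper-layer and lower-layer nodes of n1."""
    uppers = sorted(b for b, e in edges if e == n1)
    lowers = sorted(e for b, e in edges if b == n1)
    return uppers, lowers


def merge_up(edges: Set[Edge], n1: str) -> Set[Edge]:
    """Upward merge: the single upper node N0 takes over N1's successors."""
    uppers, lowers = _neighbours(edges, n1)
    if len(uppers) != 1:
        raise ValueError("upward merge needs exactly one upper-layer node")
    kept = {(b, e) for b, e in edges if n1 not in (b, e)}  # delete N1
    return kept | {(uppers[0], n2) for n2 in lowers}


def merge_down(edges: Set[Edge], n1: str) -> Set[Edge]:
    """Downward merge: all upper nodes N0s point to the single lower node N2."""
    uppers, lowers = _neighbours(edges, n1)
    if len(lowers) != 1:
        raise ValueError("downward merge needs exactly one lower-layer node")
    kept = {(b, e) for b, e in edges if n1 not in (b, e)}  # delete N1
    return kept | {(n0, lowers[0]) for n0 in uppers}


chain = {("a", "n"), ("n", "b")}
print(merge_up(chain, "n") == merge_down(chain, "n"))
```

For a node with one predecessor and one successor both functions return the same edge set, which matches the statement that either merge can be used in case 1.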
For case 4 in Fig. 5, the node can be handled in two ways. Option 1: delete the node and perform both the upward and the downward merge. Option 2: keep the node and give it a special name indicating that it is not a function. The first option suits extracting function call paths; the second suits reading and analysis by programmers because it makes the call graph simpler and clearer.
(2) When a code block contains exactly one function call, the node name is replaced directly, that is, the node is renamed: the original <bb*> or <L*> node name is changed to the function name.
(3) When a code block contains multiple function calls, a "split" operation can be used. Suppose a control flow graph node N1 contains the function calls Funs (f1, f2, ..., fn), the upper-layer nodes (one or more) of N1 are denoted N0s, and the lower-layer nodes (one or more) are denoted N2s. The split operation is: first, create a node for each function in Funs (if a node for that function already exists, it is not created again) and connect these nodes in order; then delete the edges from N0s to N1 and make N0s point to f1; finally, make fn point to the nodes N2s. This generates a path from N0s through all the function calls of N1 to the lower-layer nodes N2s.
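The split operation can be sketched in the same style, again assuming the graph is a set of (begin, end) edges and that function nodes are named after the functions, so that an already existing function node is reused automatically. The name `split_node` is hypothetical.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]  # (begin, end)


def split_node(edges: Set[Edge], n1: str, funs: List[str]) -> Set[Edge]:
    """Replace node n1 by a chain of nodes, one per called function."""
    if not funs:
        raise ValueError("split needs at least one function call")
    uppers = {b for b, e in edges if e == n1}
    lowers = {e for b, e in edges if b == n1}
    kept = {(b, e) for b, e in edges if n1 not in (b, e)}  # delete N1
    chain = set(zip(funs, funs[1:]))                       # f1 -> f2 -> ... -> fn
    into_first = {(n0, funs[0]) for n0 in uppers}          # N0s -> f1
    out_of_last = {(funs[-1], n2) for n2 in lowers}        # fn -> N2s
    return kept | into_first | chain | out_of_last


cfg_edges = {("<bb 2>", "<bb 3>"), ("<bb 3>", "<bb 4>")}
print(sorted(split_node(cfg_edges, "<bb 3>", ["f1", "f2"])))
```

With a single function in `funs` the chain is empty and the operation reduces to renaming the node, which is the handling described for code blocks with exactly one function call.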
The core idea of the algorithm that converts the control flow graph into the function call relationship graph is to process each control flow graph node according to its type, and the detailed conversion algorithm is built on a concrete data structure. The embodiment of the present invention stores the control flow graph information obtained by the analysis in json format.
json is a lightweight data interchange format. Data is written as key:value pairs, where the value can be a number, a string, an array, or a nested object. The embodiment of the present invention requires several separate tools to work together, so the data is persisted as json, which makes it easy for different programs to process the analysis results. In Fig. 6, the code on the left is the initialization routine of a C program that calls a MySQL database, the middle is its intermediate code in GCC-CFG format, and the right is the CFG data in json format. In the embodiment of the present invention, the json data format is:
functions corresponds to the function array; each member of the array represents the control flow graph information of one function and includes function_name and tokens. function_name is the name of the function;
tokens holds the control flow graph information of that function, and each entry has the type node or edge. A node entry contains the node name node_name and the array called_functions of function calls contained in the node; an edge entry contains the start node begin and the end node end of one edge in the control flow graph.
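A document in this format might look like the following sketch. The keys functions, function_name, tokens, node_name, called_functions, begin, and end come from the description above; the key name `type` used to distinguish node entries from edge entries is an assumption, as are the function name and block names in the sample data.

```python
import json

# Hypothetical control flow graph data for one function.
cfg = {
    "functions": [
        {
            "function_name": "if_test",
            "tokens": [
                {"type": "node", "node_name": "<bb 2>", "called_functions": []},
                {"type": "node", "node_name": "<bb 3>", "called_functions": ["f1"]},
                {"type": "node", "node_name": "<bb 4>", "called_functions": ["f2"]},
                {"type": "edge", "begin": "<bb 2>", "end": "<bb 3>"},
                {"type": "edge", "begin": "<bb 2>", "end": "<bb 4>"},
                {"type": "edge", "begin": "<bb 3>", "end": "<bb 4>"},
            ],
        }
    ]
}

# Serialize for storage and read it back, as separate tools would do.
text = json.dumps(cfg, indent=2)
restored = json.loads(text)
print(text)
```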
For the above data structure, the embodiment of the present invention proposes an algorithm that converts the control flow graph into a function call relationship graph, shown in Table 2. Its input is a data file containing control flow graph information and its output is a data file containing function call relationship graph information. The algorithm takes a single function in the control flow graph data file as the basic unit, analyzes each item in tokens, and performs a merge, rename, or split operation according to the conversion rules. It differs from the analysis above in one respect: when a node contains no function call and its in-degree or out-degree is zero, the algorithm leaves it untouched, because such a node is a start node or an end node.
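The decision of which operation to apply to each node can be sketched as below. This is not the algorithm listing of the embodiment, only an illustration of the dispatch rule just described; the `type` key, the function names `choose_operation` and `plan`, and the sample data are assumptions.

```python
from typing import Dict, List


def choose_operation(node: dict, edges: List[dict]) -> str:
    """Pick merge, rename, split or keep for one control flow graph node."""
    calls = node["called_functions"]
    if len(calls) == 1:
        return "rename"
    if len(calls) > 1:
        return "split"
    name = node["node_name"]
    in_degree = sum(1 for e in edges if e["end"] == name)
    out_degree = sum(1 for e in edges if e["begin"] == name)
    if in_degree == 0 or out_degree == 0:
        return "keep"  # start or end node is left untouched
    return "merge"


def plan(function: dict) -> Dict[str, str]:
    """Map every node name of one function to the operation to perform."""
    tokens = function["tokens"]
    nodes = [t for t in tokens if t["type"] == "node"]
    edges = [t for t in tokens if t["type"] == "edge"]
    return {n["node_name"]: choose_operation(n, edges) for n in nodes}


# Hypothetical function with a straight chain of five blocks.
demo = {
    "function_name": "demo",
    "tokens": [
        {"type": "node", "node_name": "<bb 2>", "called_functions": []},
        {"type": "node", "node_name": "<bb 3>", "called_functions": ["f1"]},
        {"type": "node", "node_name": "<bb 4>", "called_functions": []},
        {"type": "node", "node_name": "<bb 5>", "called_functions": ["f1", "f2"]},
        {"type": "node", "node_name": "<bb 6>", "called_functions": []},
        {"type": "edge", "begin": "<bb 2>", "end": "<bb 3>"},
        {"type": "edge", "begin": "<bb 3>", "end": "<bb 4>"},
        {"type": "edge", "begin": "<bb 4>", "end": "<bb 5>"},
        {"type": "edge", "begin": "<bb 5>", "end": "<bb 6>"},
    ],
}

print(plan(demo))
```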
The extraction of function call paths in the embodiment of the present invention is based on the function call relationship graph. Drawing on years of research into converting function call graphs into function call paths, the inventors propose the method of the embodiment of the present invention, which computes the reachable paths from the start node to the end node in a simple way; each path obtained is a function call path.
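One simple way to enumerate reachable paths is a depth-first search. The text does not state how loops in the call graph are bounded, so the sketch below assumes that each edge may be used at most once per path; this is only one possible policy. The function name `call_paths` and the toy graph, which is loosely modelled on a loop that may call f1 repeatedly before calling f2 or exiting, are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def call_paths(graph: Dict[str, List[str]], start: str, end: str) -> List[List[str]]:
    """Enumerate paths from start to end, using each edge at most once."""
    paths: List[List[str]] = []

    def walk(node: str, path: List[str], used: FrozenSet[Tuple[str, str]]) -> None:
        if node == end:
            paths.append(path)
            return
        for nxt in graph.get(node, []):
            edge = (node, nxt)
            if edge not in used:  # assumed bound: no edge is repeated in a path
                walk(nxt, path + [nxt], used | {edge})

    walk(start, [start], frozenset())
    return paths


# Hypothetical function call graph as an adjacency list.
toy_graph = {
    "main": ["end", "f2", "f1"],
    "f1": ["f1", "f2", "end"],
    "f2": ["end"],
}

for p in call_paths(toy_graph, "main", "end"):
    print(" -> ".join(p))
```

For this toy graph the function returns six paths; under the one-use-per-edge assumption, a self-loop such as f1 to f1 appears at most once in any path, standing in for "executed several times".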
To demonstrate the effectiveness of the method of the embodiment of the present invention, an experiment is described here:
In the experiment, the proposed function call path extraction method based on a control flow graph is verified on a program with nested selection and loop statements and on a recursive function containing a ternary operator.
Example 1: nested selection and loop function
The left part of Fig. 7 is the experimental source code, in which a selection statement and a loop statement are nested. The program defines two variables as the conditions for its different branches. After reading the two variables, it enters a while loop depending on their values and then determines which function to execute. Only one function is executed per iteration, and once f2 is executed the loop is exited. Because different variable values make the program execute different functions, the code corresponds to multiple function call paths.
After the source code is processed by gcc, the GCC-CFG intermediate code shown in Fig. 7 is generated. gcc optimizes the source code, declaring several variables to improve the execution efficiency of the code without affecting the control logic of the program. The intermediate code is then analyzed statically: the control flow graph extraction algorithm of Section 2.2 converts it into the json-format control flow graph information in the right part of Fig. 7, with 11 nodes and 12 edges in total. The control flow graph is then drawn with graphviz, as shown on the left of Fig. 8.
The CFG2FCG algorithm converts the control flow graph on the left of Fig. 8 into the function call relationship graph on the right. Of the 9 nodes, only <bb 4> and <bb 5> contain function calls, and each contains only one, so the rename operation is performed on them (the function call scanf contained in <bb 2> is a library function that is declared but not implemented in the experimental code, so the CFG2FCG process ignores it). The other nodes contain no function call and are deleted after the merge operation. Finally, 6 reachable paths from main to end are computed; the details are given in Table 2.
Table 2. Function call path test results
Analyzing these function call paths gives the variable value conditions shown in the right-hand column of the table above. Path 1 is executed when the loop is not entered. Path 2 arises when the else branch is executed after entering the loop. Path 3 arises when, after entering the loop, the if condition is true and f1 is called, and the loop body is then entered again and f2 is executed. Path 4 arises when f1 is executed several times after entering the loop and f2 is then executed. Path 5 arises when f1 is executed only once after entering the loop and the loop then exits. Path 6 arises when f1 is executed several times after entering the loop and the loop then exits.
The experimental results show that the extracted function call paths agree with those expected from manual analysis, which indicates that the function call path extraction method based on a control flow graph can correctly extract the call paths of the functions in this example and obtain the structural information of the program.
Example 2: recursive function containing a ternary operator
Fig. 9(a) is the source code of a Fibonacci sequence function implemented with the ternary operator; the main function calls the fib function repeatedly in a loop. After the source code is processed by gcc, GCC-CFG intermediate code is generated: Fig. 9(b) is the intermediate code of the main function and Fig. 9(c) is the intermediate code of the fib function. Again gcc optimizes the source code, declaring several variables to improve the execution efficiency of the code without affecting the control logic of the program.
The main function calls the fib function inside the loop body, so its function call relationship should be fib pointing to itself, executed several times. The fib function calls itself recursively inside its own body, so its function call relationship should likewise be fib pointing to itself.
Static analysis of the intermediate code yields the control flow graph information in JSON format, as shown in Fig. 9(d). The control flow graphs of the functions are drawn with graphviz: the control flow graph of the main function is shown in Fig. 10(a), and that of the fib function in Fig. 10(b).
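The drawing step can be illustrated with a short Python sketch that turns JSON control flow information into Graphviz DOT text. This is only an illustration under stated assumptions: the field names (`functions`, `name`, `nodes`, `edges`, `calls`), the sample content and the helper `cfg_to_dot` are hypothetical and are not taken from Fig. 9(d).

```python
import json

# Hypothetical JSON layout; the field names are illustrative assumptions.
SAMPLE = """
{"functions": [
  {"name": "main",
   "nodes": [{"name": "bb2", "calls": []},
             {"name": "bb3", "calls": ["fib"]},
             {"name": "bb4", "calls": []}],
   "edges": [["bb2", "bb4"], ["bb3", "bb4"], ["bb4", "bb3"]]}
]}
"""

def cfg_to_dot(function):
    """Return DOT source text for the control flow graph of one function."""
    lines = ['digraph "%s" {' % function["name"]]
    for node in function["nodes"]:
        label = node["name"]
        if node["calls"]:
            # "\\n" emits a literal backslash-n, which DOT renders as a line break
            label += "\\n" + ", ".join(node["calls"])
        lines.append('  "%s" [label="%s"];' % (node["name"], label))
    for src, dst in function["edges"]:
        lines.append('  "%s" -> "%s";' % (src, dst))
    lines.append("}")
    return "\n".join(lines)

cfg = json.loads(SAMPLE)
dot_text = cfg_to_dot(cfg["functions"][0])
print(dot_text)
```

The resulting text can be saved to a file and rendered with the Graphviz `dot` tool.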
The CFG2FCG algorithm is used to convert the control flow graphs into function call graphs. As before, scanf in <bb 3> of the main function is a library function that is declared but not implemented in the experimental code, so this call is ignored during CFG2FCG. The main function and the fib function have the same function call path graph, shown in Fig. 10(c), which is consistent with the earlier manual analysis. The generated global function call graph is shown in Fig. 10(d); this function call graph is fairly simple, so it is not analyzed in a table. There are three function call paths from main to end:
This example shows that static analysis can extract infeasible paths. In this example the value of i is determined in advance, so the function call path is also determined, namely the third path in the analysis above. Because of the nature of static analysis, however, all possible function call paths during function execution can be found. This characteristic can be applied in the security field to find infeasible paths that might be exploited by hackers. Dynamic analysis does not produce such paths: it can only find the function call paths executed by the currently designed test cases.
The embodiment of the present invention proposes a new method for extracting function call paths:
First, the GCC-CFG intermediate code of the source code is obtained with the gcc compiler, and this code is statically analyzed in pattern-action mode to extract the control flow graph of each function;
Then, the nodes of the control flow graph are classified by the number of function calls each contains, completing the conversion of the control flow graph into the function call graph;
Finally, the function call graphs of all functions are merged and the function call paths are extracted. The function call paths can be obtained by analyzing the reachable paths from the start point to the end point of the function call graph.
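The final step, enumerating reachable paths from the start point to the end point, can be sketched as a depth-first search. The sketch below is an assumption-laden illustration rather than the method itself: to keep the enumeration finite in the presence of loops and recursion, it allows each edge to be used at most once per path, which is a choice made here and is not stated in the text. The graph `GRAPH` is a hypothetical example with a self-calling function and is not copied from Fig. 10(d).

```python
def call_paths(graph, start, goal):
    """Enumerate paths from start to goal, using each edge at most once per path."""
    paths = []

    def walk(node, path, used_edges):
        if node == goal:
            paths.append(path)
            return
        for nxt in graph.get(node, []):
            edge = (node, nxt)
            if edge in used_edges:
                continue  # do not unroll the same loop edge a second time
            walk(nxt, path + [nxt], used_edges | {edge})

    walk(start, [start], frozenset())
    return paths

# Hypothetical function call graph: main may call fib, and fib may call itself.
GRAPH = {"main": ["fib", "end"], "fib": ["fib", "end"], "end": []}

paths = call_paths(GRAPH, "main", "end")
for p in paths:
    print(" -> ".join(p))
```

Bounding each edge to one use per path is only one way to cut cycles; a different bound would yield a different, possibly larger, set of paths.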
Experiments confirm that the function call path extraction method based on the control flow graph is effective: it simplifies the analysis of function call paths and makes it easier to obtain accurate function call paths.
The references cited in the embodiments of the present invention are listed below and are incorporated herein by reference in their entirety:
Mu Yongmin, Li Huili. Test case prioritization based on function call paths [J]. Computer Engineering, 2014, 40(7): 242-246
Mu Yongmin, Yang Zhijia. Consistency verification between software implementation and design based on function call paths [J]. Scientia Sinica Informationis, 2014, 10: 1290-1304
Foster J S, Terauchi T, Aiken A. Flow-Sensitive Type Qualifiers [J]. Proc. ACM Conf. Programming Language Design & Implementation, ACM Press, 2002, 37(5): 1-12
Adams S, Ball T, Das M, et al. Speeding Up Dataflow Analysis Using Flow-Insensitive Pointer Analysis [J]. SAS, LNCS, 2002: 117-132
Evans D, Guttag J, Horning J, et al. LCLint: A Tool for Using Specifications to Check Code [J]. FSE, 2002, 19(5): 87-96
Zheng Y H, Mu Y M, Zhang Z H. Research on the static function call path generating automatically. In: Proceedings of Information Management and Engineering, Chengdu, 2010. 405-409
Mu Y M, Zheng Y H, Zhang Z H, et al. The algorithm of infeasible paths extraction oriented the function calling relationship. Chinese J Electron, 2012, 21: 236-240
Mu Yongmin, Liu Mengting. Determining C++ overload uniqueness based on a finite state machine [J]. Application Research of Computers, 2014, 31(4): 1059-1062
Liu D F, Mu Y M, He Y J, et al. Generation of Static Function Calling Paths in C++ Based on Finite-State Machine [C]. Applied Mechanics and Materials, 2014, 568: 1497-1504
Zhang Zhihua, Mu Yongmin. Path coverage generation technology based on function calls [J]. Acta Electronica Sinica, 2010, 138: 1808-1811
Yan M M, Mu Y M, He Y J, et al. The Analysis of Function Calling Path in Java Based on Soot [C]. Applied Mechanics and Materials, 2014, 568: 1479-1487
Huang J C. Program Instrumentation and Software Testing [J]. Computer, 1978, 11(4): 25-32
Mu Yongmin, Jiang Zhiying, Zhang Zhihua. Path extraction oriented to C program instrumentation [J]. Computer Engineering and Applications, 2011, 47(1): 67-69
Mu Y M, Li H L, Jiang B, et al. The Splitting and Matching Algorithm of Dynamic Path Oriented the Function Calling Relationship [C]. Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2013 5th International Conference on. IEEE, 2013, 2: 343-346
Xu A P, Mu Y M, Zhang Z H, et al. The Dynamic Function Calling Path Generation Based on Instrumentation [C]. Applied Mechanics and Materials, 2014, 568: 1469-1478
Zhong Fangting, Liu Chao, Jin Maozhong. Improvement of instrumentation in a program dynamic analysis system [J]. Computer Engineering and Design, 2007(28): 4585-4588
The above is the preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make improvements and modifications without departing from the principle of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.
Claims (7)
1. A function call path extraction method based on a control flow graph, characterized by comprising:
Step 1: obtaining the intermediate code of the source code, analyzing the intermediate code to identify the code blocks in it, and generating a control flow graph accordingly; wherein the nodes in the control flow graph are code blocks, and the connecting lines between nodes are the connection relationships between code blocks;
Step 2: analyzing the number of function calls contained in each node of the control flow graph, so as to convert the control flow graph into a function call graph.
2. The function call path extraction method based on a control flow graph according to claim 1, characterized in that step 1 specifically comprises:
Step 11: obtaining the GCC-CFG intermediate code of the source code with the gcc compiler;
Step 12: obtaining the code blocks in the intermediate code, and identifying function bodies and obtaining function call information by the following preset rules:
Rule P1: ([;][;][ ]Function)[ ][_a-zA-Z]*[a-zA-Z0-9]
Rule P2: <bb[ ][0-9]+\>
Rule P3: <L[0-9]+>
Rule P4: ((goto)[ ]<bb[ ][0-9]+\>)
Rule P5: ((goto)[ ]<bb[ ][0-9]+\>)[ ](<L[0-9]+\>)
Rule P6: ([_a-zA-Z]*[a-zA-Z0-9][ ][(])
When a code block matches rule P1, the code block is a function declaration; an initialization operation is performed for a code block matching rule P1, and the parameter values related to the code block are then updated, the parameter values being at least one of: the number of functions, the number of nodes, the number of edges;
When a code block matches rule P2, the code block is the start of a bb code block; when a code block matches rule P3, the code block is an L code block; for a code block matching rule P2 or rule P3, the information of the preceding code block is processed and its function call information is output, and the parameter values related to the code block are then updated, the parameter values being at least one of: the number of functions, the number of nodes, the number of edges;
When a code block matches rule P4, the code block is a goto statement; when a code block matches rule P5, the code block is a goto statement in a special format; a code block matching rule P4 or rule P5 is marked as containing a jump, and an edge connecting this node to the node pointed to by the goto is then generated according to the jump information;
When a code block matches rule P6, the code block is a function call, and its function call information is obtained;
Step 13: generating the control flow graph from the identified code blocks; wherein each node in the control flow graph is a code segment, and nodes are connected by connecting lines according to the control flow information.
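The pattern-action scan of step 12 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the claimed method: the regular expressions are Python adaptations of rules P1, P2, P4 and P6 (P3 and P5 are omitted), the sample dump is a simplified, hand-written imitation of GCC-CFG output, the fall-through edge handling is an addition made here, and the names `scan`, `SAMPLE_DUMP` and `KEYWORDS` are hypothetical.

```python
import re

# Python adaptations of rules P1, P2, P4 and P6 (illustrative, not the exact rules).
P1 = re.compile(r"^;; Function ([_a-zA-Z][_a-zA-Z0-9]*)")  # function declaration
P2 = re.compile(r"^\s*<bb (\d+)>")                         # start of a bb block
P4 = re.compile(r"goto <bb (\d+)>")                        # goto statement
P6 = re.compile(r"([_a-zA-Z][_a-zA-Z0-9]*) \(")            # function call
KEYWORDS = {"if", "switch", "while", "for", "return"}      # look like calls, are not

# Simplified, hand-written imitation of a GCC-CFG dump (an assumption).
SAMPLE_DUMP = """
;; Function main (main)
<bb 2>:
  scanf ("%d", &i);
  goto <bb 4>;
<bb 3>:
  f1 ();
<bb 4>:
  if (i > 0)
    goto <bb 3>;
  else
    goto <bb 5>;
<bb 5>:
  return;
"""

def scan(dump):
    """Collect nodes, edges and calls per function from a CFG text dump."""
    functions = {}
    func = None   # record of the function currently being scanned
    block = None  # name of the current code block
    for line in dump.splitlines():
        m = P1.match(line)
        if m:  # P1: initialise a new function record
            func = functions[m.group(1)] = {"nodes": {}, "edges": []}
            block = None
            continue
        if func is None:
            continue
        m = P2.match(line)
        if m:  # P2: close the previous block, open a new one
            new_block = "bb" + m.group(1)
            if block is not None and not func["nodes"][block]["has_goto"]:
                func["edges"].append((block, new_block))  # fall-through edge
            block = new_block
            func["nodes"][block] = {"calls": [], "has_goto": False}
            continue
        if block is None:
            continue
        m = P4.search(line)
        if m:  # P4: mark the jump and add an edge to the goto target
            func["nodes"][block]["has_goto"] = True
            func["edges"].append((block, "bb" + m.group(1)))
            continue
        for name in P6.findall(line):  # P6: record function calls
            if name not in KEYWORDS:
                func["nodes"][block]["calls"].append(name)
    return functions

functions = scan(SAMPLE_DUMP)
main = functions["main"]
# Function, node and edge counts, as tracked by the counters of the method.
print(len(functions), len(main["nodes"]), len(main["edges"]))
```

Real GCC dumps differ between compiler versions, so the patterns above would need adjusting before being applied to actual compiler output.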
3. The function call path extraction method based on a control flow graph according to claim 2, characterized in that step 1 further comprises: using three counters to record the number of functions, the number of nodes and the number of edges; and using three more to record whether a code block contains a goto statement, whether a code block contains a function call, and the details of the function calls a code block contains.
4. The function call path extraction method based on a control flow graph according to claim 1, characterized in that step 2 specifically comprises:
If a code block contains no function call, the node corresponding to the code block is merged into its upper-level or lower-level node;
If a code block contains exactly one function call, the code block is taken as one node in the function call graph and connected to other nodes according to the function call relationship;
If a code block contains several function calls, each function corresponds to one node in the control flow graph, these nodes are connected to one another according to the function call relationship, and they are then connected to the other nodes according to the function call relationships with the nodes of other code blocks.
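The three cases above amount to a classification by call count, which can be sketched as follows. The labels "merge" and "rename" follow the operation names used in the description; the label "split", the function name `classify` and the `defined` parameter are assumptions introduced here. Calls to functions that are not implemented in the analyzed code (such as scanf in the examples) are ignored, as in the description.

```python
def classify(calls, defined):
    """Pick the conversion operation for a CFG node from the calls it contains.

    calls:   names of the functions called in the code block, in order
    defined: names of the functions implemented in the analyzed code
    """
    relevant = [name for name in calls if name in defined]
    if len(relevant) == 0:
        return "merge"   # no call: fold the node into a neighbouring node
    if len(relevant) == 1:
        return "rename"  # exactly one call: the node becomes that function
    return "split"       # several calls: one node per call, chained in order

DEFINED = {"f1", "f2"}
print(classify(["scanf"], DEFINED))
print(classify(["f1"], DEFINED))
print(classify(["f1", "scanf", "f2"], DEFINED))
```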
5. The function call path extraction method based on a control flow graph according to claim 4, characterized in that the handling in step 2 of a code block containing several function calls specifically comprises:
Code block N1 contains function calls Funs (f1, f2, ..., fn), and node N1 has one or more upper-level nodes N0s and one or more lower-level nodes N2s;
Each function in the function calls Funs (f1, f2, ..., fn) is made to correspond to one node, and these nodes are connected in sequence according to the order of the function calls;
The connecting lines from the upper-level nodes N0s to code block N1 are deleted, the upper-level nodes N0s are connected to function f1, the first in order in code block N1, and function fn, the last in order, is connected to the lower-level nodes N2s, so as to generate a control flow, made up of nodes and connecting lines, from the upper-level nodes N0s through all function calls of code block N1 to the lower-level nodes N2s.
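The splitting of a multi-call node can be sketched on a graph stored as a set of (source, target) edges. This is an illustrative sketch only: the edge-set representation, the function name `split_node` and the naming scheme for the new nodes (block name, call position, function name) are assumptions made here.

```python
def split_node(edges, node, calls):
    """Replace `node` by a chain of call nodes f1 -> f2 -> ... -> fn.

    edges: set of (source, target) pairs of the control flow graph
    node:  name of the code block that contains several calls (N1)
    calls: function names called in the block, in call order
    """
    # One node per call site; the position keeps repeated calls distinct.
    chain = ["%s.%d:%s" % (node, i, f) for i, f in enumerate(calls, start=1)]
    new_edges = set()
    for src, dst in edges:
        # Edges leaving N1 now leave fn; edges entering N1 now enter f1.
        src2 = chain[-1] if src == node else src
        dst2 = chain[0] if dst == node else dst
        new_edges.add((src2, dst2))
    new_edges.update(zip(chain, chain[1:]))  # f1 -> f2 -> ... -> fn
    return new_edges

cfg_edges = {("N0", "N1"), ("N1", "N2")}
fcg_edges = split_node(cfg_edges, "N1", ["f1", "f2", "f3"])
for edge in sorted(fcg_edges):
    print(edge)
```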
6. The function call path extraction method based on a control flow graph according to claim 4, characterized in that the handling in step 2 of a code block containing no function call specifically comprises:
If a code block in the control flow graph contains no function call, it is determined whether the node corresponding to the code block is connected to only one upper-level node and/or only one lower-level node; if so, the node is merged into the upper-level node or into the lower-level node; if not, the node is merged into the upper-level nodes and the lower-level nodes simultaneously, or the node is marked as a non-function node.
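One way to picture the removal of a call-free node is edge contraction: the node is deleted and each of its upper-level nodes is connected directly to each of its lower-level nodes. The sketch below makes that simplifying assumption for all cases, including the case with several upper-level and several lower-level nodes; it does not model the alternative of marking the node as a non-function node. The edge-set representation and the name `drop_node` are hypothetical.

```python
def drop_node(edges, node):
    """Remove a call-free node and reconnect its neighbours around it.

    edges: set of (source, target) pairs
    node:  name of the code block that contains no function call
    """
    preds = {s for s, d in edges if d == node and s != node}  # upper-level nodes
    succs = {d for s, d in edges if s == node and d != node}  # lower-level nodes
    kept = {(s, d) for s, d in edges if node not in (s, d)}
    # Every path that went through the node now goes straight past it.
    return kept | {(p, s) for p in preds for s in succs}

before = {("A", "B"), ("B", "C"), ("B", "D")}
after = drop_node(before, "B")
print(sorted(after))
```

With a single upper-level node or a single lower-level node, this contraction gives the same edges as merging the node into that neighbour.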
7. The function call path extraction method based on a control flow graph according to claim 4, characterized in that step 2 further comprises:
Storing the control flow graph information obtained by the analysis in JSON format; wherein the JSON data at least comprise: a function array representing the control flow graph information of the functions, the array including parameters for recording the function name and the control flow graph information, the parameters for recording control flow graph information being further divided into a parameter recording the connecting lines between nodes and a parameter recording the node names, with the parameter recording node names containing a function call array; processing is performed with the single function in the control flow graph as the basic unit, according to the function call information contained in the code blocks.
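A possible shape for such JSON data can be sketched as follows. The key names (`functions`, `name`, `edges`, `nodes`, `calls`) and the helper `make_record` are assumptions chosen for illustration; the claim fixes only the roles of the fields, not their names.

```python
import json

def make_record(name, nodes, edges):
    """Build the entry of one function in the function array.

    nodes: dict mapping node name -> list of called function names
    edges: iterable of (source, target) node name pairs
    """
    return {
        "name": name,                                 # function name
        "edges": [list(edge) for edge in edges],      # connecting lines between nodes
        "nodes": [{"name": node, "calls": list(calls)}  # node names with call arrays
                  for node, calls in nodes.items()],
    }

document = {
    "functions": [
        make_record("main",
                    {"bb2": ["scanf"], "bb3": ["fib"], "bb4": []},
                    [("bb2", "bb4"), ("bb3", "bb4"), ("bb4", "bb3")]),
    ]
}
text = json.dumps(document, indent=2)
print(text)
```

Each function is one self-contained entry of the array, so the conversion to a function call graph can process the entries one function at a time.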
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610541747.7A CN106227573A (en) | 2016-07-11 | 2016-07-11 | Function call path extraction method based on controlling stream graph |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106227573A true CN106227573A (en) | 2016-12-14 |
Family
ID=57519540
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610541747.7A Withdrawn CN106227573A (en) | 2016-07-11 | 2016-07-11 | Function call path extraction method based on controlling stream graph |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106227573A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106951744A (en) * | 2017-03-15 | 2017-07-14 | 北京深思数盾科技股份有限公司 | The guard method of executable program and device |
CN108881032A (en) * | 2018-06-19 | 2018-11-23 | 福州大学 | A kind of P4 track performance method for improving based on matching optimization |
CN109189758A (en) * | 2018-07-26 | 2019-01-11 | 新华三技术有限公司 | O&M flow designing method, device and equipment, operation method, device and host |
CN109542942A (en) * | 2018-11-28 | 2019-03-29 | 网易(杭州)网络有限公司 | Querying method and device, the electronic equipment of function call |
CN109656568A (en) * | 2018-12-28 | 2019-04-19 | 黑龙江省工业技术研究院 | On-demand reducible program control flowchart figure accessibility indexing means |
CN110543427A (en) * | 2019-09-06 | 2019-12-06 | 五八有限公司 | Test case storage method and device, electronic equipment and storage medium |
CN112181808A (en) * | 2020-09-08 | 2021-01-05 | 北京邮电大学 | Program concurrency defect detection method, device, equipment and storage medium |
CN113760700A (en) * | 2020-08-06 | 2021-12-07 | 北京京东振世信息技术有限公司 | Program endless loop detection method, device, electronic equipment and storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105138335A (en) * | 2015-08-28 | 2015-12-09 | 牟永敏 | Function call path extracting method and device based on control flow diagram |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106951744B (en) * | 2017-03-15 | 2019-12-13 | 北京深思数盾科技股份有限公司 | protection method and device for executable program |
CN106951744A (en) * | 2017-03-15 | 2017-07-14 | 北京深思数盾科技股份有限公司 | The guard method of executable program and device |
CN108881032A (en) * | 2018-06-19 | 2018-11-23 | 福州大学 | A kind of P4 track performance method for improving based on matching optimization |
CN108881032B (en) * | 2018-06-19 | 2021-01-29 | 福州大学 | P4 pipeline performance improving method based on matching optimization |
CN109189758B (en) * | 2018-07-26 | 2021-02-09 | 新华三技术有限公司 | Operation and maintenance flow design method, device and equipment, operation method, device and host |
CN109189758A (en) * | 2018-07-26 | 2019-01-11 | 新华三技术有限公司 | O&M flow designing method, device and equipment, operation method, device and host |
CN109542942A (en) * | 2018-11-28 | 2019-03-29 | 网易(杭州)网络有限公司 | Querying method and device, the electronic equipment of function call |
CN109542942B (en) * | 2018-11-28 | 2021-09-24 | 网易(杭州)网络有限公司 | Function call query method and device and electronic equipment |
CN109656568A (en) * | 2018-12-28 | 2019-04-19 | 黑龙江省工业技术研究院 | On-demand reducible program control flowchart figure accessibility indexing means |
CN109656568B (en) * | 2018-12-28 | 2022-04-05 | 黑龙江省工业技术研究院 | On-demand contractable program control flow graph reachability indexing method |
CN110543427A (en) * | 2019-09-06 | 2019-12-06 | 五八有限公司 | Test case storage method and device, electronic equipment and storage medium |
CN113760700A (en) * | 2020-08-06 | 2021-12-07 | 北京京东振世信息技术有限公司 | Program endless loop detection method, device, electronic equipment and storage medium |
CN112181808A (en) * | 2020-09-08 | 2021-01-05 | 北京邮电大学 | Program concurrency defect detection method, device, equipment and storage medium |
CN112181808B (en) * | 2020-09-08 | 2022-06-28 | 北京邮电大学 | Program concurrency defect detection method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20161214 |