CN106227573A - Function call path extraction method based on control flow graph - Google Patents

Publication number
CN106227573A
Authority
CN
China
Prior art keywords
node
function
code block
function call
code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201610541747.7A
Other languages
Chinese (zh)
Inventor
牟永敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Information Science and Technology University
Original Assignee
Beijing Information Science and Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Information Science and Technology University
Priority to CN201610541747.7A
Publication of CN106227573A
Legal status: Withdrawn

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/40: Transformation of program code
    • G06F8/41: Compilation
    • G06F8/43: Checking; Contextual analysis
    • G06F8/433: Dependency analysis; Data or control flow analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present invention provides a function call path extraction method based on a control flow graph, comprising: step 1, obtaining the intermediate code of the source code, analyzing the intermediate code to identify the code blocks in it, and generating a control flow graph from them, where the nodes of the control flow graph are code blocks and the connecting lines between nodes represent the function call relationships among the code blocks; step 2, analyzing the number of function calls contained in each node of the control flow graph, so as to convert the control flow graph into a function call relationship graph.

Description

Function call path extraction method based on control flow graph
Technical field
The present invention relates to the technical field of computer software, and in particular to a function call path extraction method based on a control flow graph.
Background
A program can produce multiple function call paths because it contains selection statements with decision conditions and control statements [1]. In programming, source code generally has three statement structures: sequence, selection, and loop. Sequential statements do not increase the number of function call paths; only selection statements and loop statements can produce additional program branches. Taking the C language as an example, the keywords if, for, while, and switch produce multiple execution paths. These branch-producing statements are therefore the focus when extracting a function call relationship graph.
Methods for extracting function call paths fall into two main categories. The first uses static analysis tools [3-5] to analyze the source code, extract the method calls and the interaction information between modules in the program, determine the direction of control flow between modules, build a function call relationship graph using an automaton or other means, and then extract the function call paths. Static analysis means analyzing the source code without executing it, in contrast to dynamic analysis, which analyzes the source code while it executes.
The other, more intuitive method of extracting function call paths is program instrumentation [12]. Predesigned probe functions are inserted into the source code or assembly code; when the program is executed again, instrumentation information on function entry and function exit can be collected. From this information, the control flow of the program can be obtained and the function call paths extracted [13-15]. The Decorate stream split algorithm [15] or other algorithms can be used to carry out the extraction. At present, in the field of automated testing, object code instrumentation and source code instrumentation are the mainstream instrumentation techniques [16]. However, the completeness of the dynamic instrumentation method depends on the choice of test cases: if the chosen test cases are incomplete, the testing process will be insufficient.
Summary of the invention
In view of the problems in the prior art, the technical problem to be solved by the present invention is to provide a method for extracting function call paths based on a control flow graph that can extract function call paths simply and accurately.
To solve the above problem, an embodiment of the present invention proposes a function call path extraction method based on a control flow graph, comprising:
Step 1: obtain the intermediate code of the source code, analyze the intermediate code to identify the code blocks in it, and generate a control flow graph from them; the nodes of the control flow graph are code blocks, and the connecting lines between nodes are the connection relationships between code blocks;
Step 2: analyze the number of function calls contained in each node of the control flow graph, so as to convert the control flow graph into a function call relationship graph.
Wherein, step 1 specifically comprises:
Step 11: use the gcc compiler to obtain the intermediate code of the source code in GCC-CFG format;
Step 12: obtain the code blocks in the intermediate code, and identify function bodies and obtain function call information according to the following preset rules:
Rule P1: ([;][;][ ]Function)[ ][_a-zA-Z]*[a-zA-Z0-9]
Rule P2: \<bb[ ][0-9]+\>
Rule P3: \<L[0-9]+\>
Rule P4: ((goto)[ ]\<bb[ ][0-9]+\>)
Rule P5: ((goto)[ ]\<bb[ ][0-9]+\>)[ ]\(\<L[0-9]+\>\)
Rule P6: ([_a-zA-Z]*[a-zA-Z0-9][ ][\(])
When a code block matches rule P1, the code block is a function declaration; an initialization operation is performed for a code block matching rule P1, and the parameter values related to the code block are then updated, where the parameter values are at least one of: the number of functions, the number of nodes, the number of edges;
When a code block matches rule P2, it is the start of a bb code block; when a code block matches rule P3, it is an L code block. For a code block matching rule P2 or P3, the information of the preceding code block is processed, the function call information is output, and the parameter values related to the code block are then updated, where the parameter values are at least one of: the number of functions, the number of nodes, the number of edges;
When a code block matches rule P4, it is a goto statement; when a code block matches rule P5, it is a special-format goto statement. For a code block matching rule P4 or P5, a flag records whether the block contains a jump, and an edge connecting this node to the node pointed to by the goto is then generated according to the jump information;
When a code block matches rule P6, it is a function call, and its function call information is obtained;
Step 13: generate the control flow graph from the identified code blocks; each node in the control flow graph is a code segment, and nodes are connected by connecting lines according to the control flow information.
Wherein, step 1 further comprises: using three counters to record the number of functions, the number of nodes, and the number of edges; and using three variables to record whether a code block contains a goto statement, whether it contains a function call, and the details of the function calls it contains.
Wherein, step 2 specifically comprises:
If a code block contains no function call, the node corresponding to the code block is merged into its upper-layer or lower-layer node;
If a code block contains exactly one function call, the code block is used as a node in the function call relationship graph and connected to other nodes according to the function call relationships;
If a code block contains multiple function calls, each function is mapped to a node of its own, these nodes are connected to one another according to the function call relationships, and they are then connected to the nodes of other code blocks according to the function call relationships.
Wherein, the handling in step 2 of a code block with multiple function calls specifically comprises:
Code block N1 contains function calls Funs (f1, f2, ..., fn); node N1 has one or more upper-layer nodes, denoted N0s, and one or more lower-layer nodes, denoted N2s;
A node is created for each function in Funs (f1, f2, ..., fn), and these nodes are connected in sequence in the order of the function calls;
The connecting lines from the upper-layer nodes N0s to code block N1 are deleted, the upper-layer nodes N0s are connected to f1, the first function in code block N1, and fn, the last function, is connected to the lower-layer nodes N2s, producing a control flow, made up of nodes and connecting lines, that runs from the upper-layer nodes N0s through all function calls of code block N1 to the lower-layer nodes N2s.
Wherein, the handling in step 2 of a code block without function calls specifically comprises:
If a code block in the control flow graph contains no function call, determine whether the corresponding node is connected to only one upper-layer node and/or only one lower-layer node. If so, merge this node into the upper-layer node or into the lower-layer node; if not, merge this node into both the upper-layer and lower-layer nodes, or mark this node as a non-function node.
Wherein, step 2 further comprises:
Storing the control flow graph information obtained by the analysis in json format. The json data comprises at least a function array representing the control flow graph information of the functions; each entry of the array contains parameters recording the function name and the control flow graph information. The parameters recording the control flow graph information are divided into parameters recording the connecting lines between nodes and parameters recording node names, and the parameters recording node names contain a function call array. Processing takes a single function of the control flow graph as the basic unit and proceeds according to the function call information contained in each code block.
The beneficial effects of the above technical scheme of the present invention are as follows. A function call path is a sequence of function names from the program entry point to the exit point; it combines control logic with function calls and raises the granularity of code analysis from statements to functions. To extract function call paths accurately, the embodiment of the present invention proposes a function call path extraction method based on a control flow graph: first, intermediate code containing control flow information is obtained with the gcc compiler, and the intermediate code is analyzed to extract the control flow graph of each function; then, based on the relationship between the control flow graph and the function call relationship graph, the control flow graph is processed by a specific algorithm to obtain the function call relationship graph, from which all function call paths are extracted. Experimental results show that the proposed method obtains function call paths accurately.
Brief description of the drawings
Fig. 1 is a schematic diagram of the extraction and analysis of the if_test control flow graph in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the extraction and analysis of the if_test2 control flow graph in an embodiment of the present invention;
Fig. 3 is a schematic diagram of the extraction and analysis of the for_test control flow graph in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the extraction and analysis of the while_test control flow graph in an embodiment of the present invention;
Fig. 5 is a schematic diagram of the cases in which a node contains no function call in an embodiment of the present invention;
Fig. 6 is a schematic diagram of the control flow graph storage format in an embodiment of the present invention;
Fig. 7 is a schematic diagram of the code analysis process for the nested selection and loop function;
Fig. 8 shows the control flow graph and the function call graph;
Fig. 9 shows the code analysis process for the recursive function sample containing the ternary operator;
Fig. 10 shows the control flow graphs and function call graphs of the recursive function sample containing the ternary operator.
Detailed description of the invention
To make the technical problem to be solved, the technical scheme, and the advantages of the present invention clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
Step 1, control flow graph extraction: to ensure the accuracy of the function call relationship graph, the control flow graph of each function must be extracted accurately.
The embodiment of the present invention uses the gcc compiler to preprocess the source code and obtain intermediate code containing control flow information, from which the control flow graph is extracted.
gcc is a powerful C compiler with a large number of options for controlling compilation and linking. The "-fdump-tree" option outputs gcc's preprocessing information for the source code; with a suitable sub-option, gcc generates well-formatted, accurate intermediate debugging information. The "cfg" sub-option generates intermediate code resembling a control flow graph. Some simple code is analyzed below as examples. An example containing an if statement is shown in Fig. 1. Fig. 2 is an example in which the function returns directly after the if conditional statement has executed.
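As a minimal sketch of this step, the following Python helper assembles a gcc command line that requests the CFG dump. The helper name cfg_dump_command and the file name if_test.c are illustrative assumptions and do not come from the patent; -fdump-tree-cfg is the gcc spelling of the "-fdump-tree" option with the "cfg" sub-option, and the name of the dump file that gcc writes varies between gcc versions.

```python
def cfg_dump_command(source_file, compiler="gcc"):
    """Build the argument list that asks gcc to dump the CFG of source_file."""
    # -c: compile only, do not link
    # -fdump-tree-cfg: write the control-flow-graph dump next to the output
    return [compiler, "-c", "-fdump-tree-cfg", source_file]


command = cfg_dump_command("if_test.c")  # hypothetical input file
print(" ".join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where gcc is installed.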
As Fig. 1 shows, the intermediate code is divided into two parts, the function declaration and the function body; code in this format is called GCC-CFG intermediate code. The function declaration part contains gcc's internal index information for the function and is of little help in obtaining the control flow graph. The function body part is the result of preprocessing the source code: a partitioned representation that divides the whole code into simple code blocks. Code inside a block executes sequentially or jumps to another code block via a goto statement. The <bb> module denotes a basic code block (basic block); gcc may also merge some code during analysis, so that some code blocks have two or more names. The goto statements in the basic code blocks reflect the execution order between code blocks. By statically analyzing the GCC-CFG intermediate code and matching code blocks and goto statements, the control flow graphs on the right of Fig. 1 and Fig. 2 are obtained.
An example containing a for statement is shown in Fig. 3, and sample code implementing the same function as Fig. 3 with a while loop is shown in Fig. 4. The two loop samples implement the same function with for and while respectively; the GCC-CFG intermediate code obtained through gcc is also identical, and the control flow graphs drawn from it are the same.
Because GCC-CFG intermediate code has distinctive marker statements, the embodiment of the present invention analyzes it statically in a pattern-action manner. Here, pattern is a pattern match or rule match, and action is the operation performed after a code string (or token) matching the specified rule is found. For intermediate code in GCC-CFG format, the main patterns to match are function declarations, basic code blocks <bb*>, and goto jump statements.
The rule patterns are listed below:
Table 1 Rule patterns
No.   Rule                                                Description
P1    ([;][;][ ]Function)[ ][_a-zA-Z]*[a-zA-Z0-9]         Matches a function declaration
P2    \<bb[ ][0-9]+\>                                     Matches the start of a bb code block
P3    \<L[0-9]+\>                                         Matches an L code block
P4    ((goto)[ ]\<bb[ ][0-9]+\>)                          Matches a goto statement (connecting statement)
P5    ((goto)[ ]\<bb[ ][0-9]+\>)[ ]\(\<L[0-9]+\>\)        Matches a special-format goto statement
P6    ([_a-zA-Z]*[a-zA-Z0-9][ ][\(])                      Matches a function call
Table 1 contains 6 rules: the left column is the rule number, the middle column is the regular-expression-like form of the rule, and the right column is its description. Extracting the control flow graph requires finding 3 key items: function declarations, basic code blocks, and jump statements. The control flow graph is extracted with the function as the processing unit. The steps for extracting the control flow graph according to rules P1-P5 are as follows:
Rules P1-P6 are each used to match the GCC-CFG intermediate code:
When rule P1 matches, a function declaration has been found, i.e. a function definition in the GCC-CFG intermediate code. When rule P2 or P3 matches, the start of a bb code block or an L code block has been found. Inside a function, each node of the control flow graph represents one basic code segment, which appears in the GCC-CFG intermediate code in the form <bb*> or <L*>; rules P2 and P3 are used to find each code segment inside the function. When rule P4 or P5 matches, there is a goto statement inside the function. Inside a function, each edge of the control flow graph represents one jump. In GCC-CFG intermediate code, jumps arise in two ways: either a code block contains no jump statement, so execution enters the next adjacent code block or ends, producing a sequential-execution edge; or the code block contains jump statements, i.e. goto statements, and each goto statement produces one jump and hence one new edge.
The above three steps yield the full content of the control flow graph, but generating the function call graph requires more auxiliary information in each node. One difference between the control flow graph and the function call relationship graph is what a node represents: a control flow graph node represents a code segment, while a function call relationship graph node represents a function. To convert the control flow graph into the function call relationship graph, the function call information in each basic code block must be retained in the corresponding node when the control flow graph is extracted. Fig. 1 to Fig. 4 show that, inside a code block, function calls are simple sequential calls with no complex jumps. Therefore, function calls are matched with rule P6 while the control flow graph is being obtained, and the function call information is stored in order in the nodes of the control flow graph.
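The matching described above can be sketched with Python's re module. The patterns below are Python renderings of rules P1-P6; the rule ordering, the classify helper, and the sample lines (written to resemble a gcc CFG dump) are illustrative assumptions rather than part of the patent.

```python
import re

# Python renderings of rules P1-P6. P5 is tried before P4, and the goto rules
# before P2/P3, so that the most specific pattern wins.
RULES = [
    ("P1", re.compile(r";; Function [_a-zA-Z]*[a-zA-Z0-9]")),
    ("P5", re.compile(r"goto <bb [0-9]+> \(<L[0-9]+>\)")),
    ("P4", re.compile(r"goto <bb [0-9]+>")),
    ("P2", re.compile(r"<bb [0-9]+>")),
    ("P3", re.compile(r"<L[0-9]+>")),
    ("P6", re.compile(r"[_a-zA-Z]*[a-zA-Z0-9] \(")),
]


def classify(line):
    """Return the name of the first rule found in line, or None."""
    for name, pattern in RULES:
        if pattern.search(line):
            return name
    return None


# Hypothetical lines in the style of GCC-CFG intermediate code.
SAMPLE_LINES = [
    ";; Function main (main, funcdef_no=0)",
    "  <bb 2>:",
    "<L1>:",
    "    goto <bb 4>;",
    "    goto <bb 5> (<L1>);",
    "  f1 ();",
    "  a = 1;",
]

for line in SAMPLE_LINES:
    print(classify(line), "|", line.strip())
```

Note that in this rendering a condition line such as "if (a > 0)" would also match P6, so a scanner built on these patterns needs to filter out keywords; how the original implementation handles this is not stated in the source.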
Different actions are performed after different rules match, specifically:
After P1 matches, an initialization operation is performed and the counters for the number of functions, nodes, and edges are updated;
After P2 or P3 matches, the information of the previous code block is processed, the node information, including the function call information in the node, is output in the required format, and finally the values related to the code block are updated;
After P4 or P5 matches, is_bb_with_goto is set, i.e. whether this node contains a jump, to help the main program decide whether to connect the current node to the next node. If jump information is present, an edge connecting this node to the node pointed to by the goto is generated; after P6 matches, the corresponding function call information is stored.
The control flow graph extraction algorithm of the embodiment of the present invention is therefore as follows:
In the above algorithm, three counters, fun_num, node_num, and edge_num, record the number of functions, the number of nodes in a function, and the number of edges; three variables, is_bb_with_goto, is_bb_with_function, and called_functions, record whether a code block contains a goto statement, whether it contains a function call, and which function calls it contains. yytext is the text matched by a rule. print_node and print_edge are defined to generate or output the control flow graph in the required format, for example an adjacency list in memory or a structured document (XML or JSON format) stored on disk.
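The pattern-action procedure can be sketched as a line scanner in Python. This is a minimal sketch under stated assumptions, not the patent's algorithm listing: the function extract_cfg, the keyword filter NOT_CALLS, the dictionary layout, and the sample dump text are all hypothetical, and the counters fun_num, node_num, and edge_num are represented implicitly by the lengths of the returned lists.

```python
import re

FUNC = re.compile(r";; Function ([_a-zA-Z]*[a-zA-Z0-9])")   # P1
BLOCK = re.compile(r"^\s*(<bb [0-9]+>|<L[0-9]+>)\s*:")      # P2 / P3 block label
GOTO = re.compile(r"goto (<bb [0-9]+>)")                    # P4 / P5
CALL = re.compile(r"([_a-zA-Z]*[a-zA-Z0-9]) \(")            # P6
NOT_CALLS = {"if", "switch", "while", "for", "return"}      # assumed keyword filter


def extract_cfg(text):
    """Scan GCC-CFG style text and return one record per function."""
    functions = []
    current = None     # function record being filled
    node = None        # code block record being filled
    has_goto = False   # plays the role of is_bb_with_goto
    for line in text.splitlines():
        m = FUNC.search(line)
        if m:                                  # P1: start a new function
            current = {"function_name": m.group(1), "nodes": [], "edges": []}
            functions.append(current)
            node, has_goto = None, False
            continue
        if current is None:
            continue
        m = BLOCK.match(line)
        if m:                                  # P2/P3: start a new code block
            if node is not None and not has_goto:
                # previous block had no jump: add a sequential-execution edge
                current["edges"].append((node["node_name"], m.group(1)))
            node = {"node_name": m.group(1), "called_functions": []}
            current["nodes"].append(node)
            has_goto = False
            continue
        if node is None:
            continue                           # text before the first block
        m = GOTO.search(line)
        if m:                                  # P4/P5: one edge per goto
            current["edges"].append((node["node_name"], m.group(1)))
            has_goto = True
            continue
        m = CALL.search(line)
        if m and m.group(1) not in NOT_CALLS:  # P6: record the call in order
            node["called_functions"].append(m.group(1))
    return functions


SAMPLE = """\
;; Function main (main, funcdef_no=0)

main ()
{
  <bb 2>:
  if (a > 0)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  f1 ();

  <bb 4>:
  f2 ();
  return;
}
"""

result = extract_cfg(SAMPLE)
print(result[0]["edges"])
```

The number of functions, nodes, and edges can be read off as len(result), len(result[0]["nodes"]), and len(result[0]["edges"]).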
Step 2, function call relationship graph extraction: generate the function call relationship graph from the control flow graph.
In the file storing the control flow graph information, each basic code block (basic block) or L code block falls into one of three classes according to the number of function calls it contains: no function call, exactly one function call, or more than one function call. Each class is processed differently, and applying the appropriate processing to each class converts the control flow graph into the function call relationship graph.
(1) Code blocks without function calls: when the control flow graph is converted into the function call relationship graph, a code block without function calls can usually be handled by deleting its node. In Fig. 1, for example, code block <bb 2> contains no function call and only one node points to it, so deleting this node, or equivalently merging it "upward", is correct. In one special case, however, a control flow graph node points to multiple nodes while multiple nodes also point to it; such a node cannot simply be deleted. To analyze the no-call case in more detail, nodes without function calls are divided into the 4 cases shown in Fig. 5 according to the in-degree and out-degree of the control flow graph node.
For cases 1-3 in Fig. 5, a "merge" strategy can be used. Merge operations are of two kinds, "upward merge" and "downward merge". An upward merge combines the information of the node with that of its upper-layer node; a downward merge is the opposite. Upward merge: the merged node is N1, the upper-layer node is N0, and the one or more lower-layer nodes are denoted N2s; the operation deletes the edge from N0 to N1, points N0 to all of N2s, and finally deletes node N1. Downward merge: the merged node is N1, the one or more upper-layer nodes are denoted N0s, and the lower-layer node is N2; the operation deletes the edge from N1 to N2, points all of N0s to N2, and finally deletes node N1. For case 1, either merge operation can be used and the result is the same; for case 2, only the upward merge can be performed; for case 3, only the downward merge can be performed.
For case 4 in Fig. 5, the node can be handled in two ways. Option 1: delete the node and perform both the upward merge and the downward merge. Option 2: keep the node and give it a special name indicating that it is not a function. The first method suits extracting function call paths; the second suits reading and analysis by programmers, making the call graph clearer.
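The merge operations can be sketched on a set of directed edges. The function merge_away and the node names are hypothetical; the sketch assumes that removing a call-free node means connecting each of its upper-layer nodes to each of its lower-layer nodes, which covers the upward merge, the downward merge, and option 1 of case 4 in a single routine.

```python
def merge_away(node, edges):
    """Remove a call-free node, linking every predecessor to every successor."""
    preds = {a for a, b in edges if b == node and a != node}
    succs = {b for a, b in edges if a == node and b != node}
    kept = {(a, b) for a, b in edges if a != node and b != node}
    return kept | {(p, s) for p in preds for s in succs}


# Upward merge: N1 has one upper-layer node and two lower-layer nodes.
upward = merge_away("N1", {("N0", "N1"), ("N1", "N2a"), ("N1", "N2b")})
print(sorted(upward))

# Downward merge: N1 has two upper-layer nodes and one lower-layer node.
downward = merge_away("N1", {("N0a", "N1"), ("N0b", "N1"), ("N1", "N2")})
print(sorted(downward))
```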
(2) When a code block contains exactly one function call, the node name is simply replaced, i.e. the node is renamed from its original <bb*> or <L*> name to the function name.
(3) When a code block contains multiple function calls, a "split" operation can be used. Suppose a node N1 in the control flow graph contains function calls Funs (f1, f2, ..., fn), its upper-layer nodes (one or more) are denoted N0s, and its lower-layer nodes (one or more) are denoted N2s. The split operation is: first, create a node for each function in Funs (if the function node already exists, it need not be created again) and connect these nodes in sequence; then delete the edges from N0s to N1 and point N0s to f1; finally, point fn to the nodes N2s. This produces a path from N0s through all function calls of N1 to the lower-layer nodes N2s.
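The rename and split operations can be sketched together, since renaming is the special case of splitting with a single call. The function expand_node and the node names are hypothetical, and edges are modelled as a set of (begin, end) pairs purely for illustration.

```python
def expand_node(node, calls, edges):
    """Replace a CFG node by the chain f1 -> f2 -> ... -> fn of its calls."""
    if not calls:
        raise ValueError("nodes without calls are handled by a merge operation")
    result = set(zip(calls, calls[1:]))        # connect f1..fn in call order
    for begin, end in edges:
        new_begin = calls[-1] if begin == node else begin  # edges leave from fn
        new_end = calls[0] if end == node else end         # edges enter at f1
        result.add((new_begin, new_end))
    return result


cfg_edges = {("N0", "N1"), ("N1", "N2")}

# Split: N1 calls f1 and then f2.
split = expand_node("N1", ["f1", "f2"], cfg_edges)
print(sorted(split))

# Rename: N1 calls only f1.
renamed = expand_node("N1", ["f1"], cfg_edges)
print(sorted(renamed))
```

Because edges are stored in a set, a function node that already exists is reused automatically instead of being created a second time.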
The core idea of the algorithm that converts the control flow graph into the function call relationship graph is to process each node according to its type, and the detailed conversion algorithm depends on a concrete data structure. The embodiment of the present invention stores the control flow graph information obtained by the analysis in json format.
Json is a lightweight data interchange format whose data is written as key:value pairs, where the value can be, for example, a number, a string, or an array. The embodiment of the present invention requires several separate tools to work together, so json is used to persist the data, making it easy for different programs to process the analysis results. As shown in Fig. 6, the code on the left is an initialization routine from a C-language mysql database calling program, the middle is its intermediate code in GCC-CFG format, and the right is the CFG data in json format. In the embodiment of the present invention, the json data format is:
functions is the function array; each member of the array represents the control flow graph information of one function and includes function_name and tokens; function_name is the name of the function;
tokens represents the control flow graph information of the function, with entries of two types, node and edge; a node contains the node name node_name and the array called_functions of function calls in the node, and an edge contains the start node begin and the end node end of one edge in the control flow graph.
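A small illustration of such a document, built and round-tripped with Python's json module, is shown here. The key names functions, function_name, tokens, node_name, called_functions, begin, and end come from the description above; the "type" key, the exact nesting, and the sample values are assumptions, since Fig. 6 is not reproduced in this text.

```python
import json

# Hypothetical control flow graph data for one function.
cfg = {
    "functions": [
        {
            "function_name": "main",
            "tokens": [
                {"type": "node", "node_name": "<bb 2>", "called_functions": []},
                {"type": "node", "node_name": "<bb 3>", "called_functions": ["f1"]},
                {"type": "edge", "begin": "<bb 2>", "end": "<bb 3>"},
            ],
        }
    ]
}

text = json.dumps(cfg, indent=2)   # persist for another tool to read
restored = json.loads(text)        # what the next tool would load

nodes = [t for t in restored["functions"][0]["tokens"] if t["type"] == "node"]
print([n["node_name"] for n in nodes])
```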
For the above data structure, the embodiment of the present invention proposes an algorithm that converts the control flow graph into the function call relationship graph. The algorithm is shown in Table 2; its input is the data file containing the control flow graph information, and its output is a data file containing the function call relationship graph information. The algorithm takes a single function in the control flow graph data file as the basic unit, analyzes each item in tokens, and performs the merge, rename, or split operation according to the conversion rules. One difference from the analysis above is that, when a node contains no function call and its in-degree or out-degree is zero, the algorithm leaves it untouched, because such a node is a start node or an end node.
The extraction of function call paths in the embodiment of the present invention is based on the function call relationship graph. Drawing on years of research on converting function call graphs into function call paths, the inventor proposes the method of the embodiment of the present invention, in which the reachable paths from the start node to the end node can be computed with a simple method; each path obtained is a function call path.
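The source does not spell out how the reachable paths are computed. One simple possibility, shown here purely as an assumption and not as the inventor's method, is a depth-first search that uses each edge at most once per path, so that a self-loop such as f1 pointing to itself contributes one repeated call. The function call_paths and the sample graph are hypothetical.

```python
def call_paths(edges, start, end):
    """List paths from start to end, using each edge at most once per path."""
    successors = {}
    for begin, target in edges:
        successors.setdefault(begin, []).append(target)
    paths = []

    def walk(node, path, used):
        if node == end:
            paths.append(path)
            return
        for nxt in successors.get(node, []):
            if (node, nxt) not in used:
                walk(nxt, path + [nxt], used | {(node, nxt)})

    walk(start, [start], frozenset())
    return paths


# Hypothetical function call relationship graph with a self-loop on f1.
fcg_edges = [("main", "f1"), ("main", "end"), ("f1", "f1"), ("f1", "end")]
found = call_paths(fcg_edges, "main", "end")
for path in found:
    print(" -> ".join(path))
```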
To demonstrate the effectiveness of the method of the embodiment of the present invention, an experiment is described here:
In the experiment, a program with nested selection and loop statements and a recursive function containing the ternary operator are used to verify the proposed function call path extraction method based on a control flow graph.
Example 1: select loop nesting function
Left part in Fig. 7 is the experiment source code of case statement use nested with Do statement, application definition two Individual variable is as the Rule of judgment of branches different in program, after getting the two variable, enters while according to the value of variable Circulation, it is then determined that function to be performed, in once circulation, only one of which function can perform, and once execution f2 will jump Go out circulation.This program code, therefore can corresponding a plurality of function call path because the difference of variate-value can perform different functions.
Source code, after processing through gcc, generates GCC-CFG intermediate code as shown in Figure 7.Source code is carried out by gcc Optimize, state the execution efficiency of multiple variable Optimized code, and do not affect the control logic of program.Then in this Between code carry out static analysis, by 2.2 joint controlling stream graph extraction algorithms, intermediate code is converted to right part in Fig. 7 The controlling stream graph information of json form, totally 11 nodes, 12 limits.Then controlling stream graph is drawn by graphviz, such as Fig. 8 Shown in left side.
Use CFG2FCG algorithm that the controlling stream graph on the left of Fig. 8 is converted into the function call relationship graph on right side.At 9 joints In point, only<bb 4>with<bb 5>comprise function call, and only comprise a function call, so performing rename behaviour Make (<bb 2>function call scanf comprised is built-in function, and in experimental code, statement does not realizes, during CFG2FCG Ignore this function call);Other nodes do not comprise function call, delete after finishing union operation.Finally, calculate from main to The reachable path of end is 6, and details is shown in Table 2.
Table 2 function call patch test result
By analyzing 5 function call paths, the value condition of variable on the right side of upper table can be obtained.Hold when not entering circulation Row the 1st paths;Perform else statement after entering circulation, generate the 2nd paths;After entering circulation execution, after if is judged as very Call f1, be then again introduced into loop body and perform f2, generate the 3rd paths;F1 is performed a plurality of times after entering circulation, then performs F2, generates the 4th paths;After entering circulation, only carry out a f1, be then log out circulation, generate the 5th paths.Entrance follows After ring, f1 is performed a plurality of times, is then log out circulation, generate the 6th paths.
The experimental results show that the extracted function call paths match those expected from manual analysis. This indicates that the function call path extraction method based on the control flow graph correctly extracts the function call paths in this example and thereby obtains the structural information of the program.
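The path enumeration step can be illustrated with a minimal Python sketch. The graph below is hypothetical and only loosely modelled on Example 1; it is not the graph of Fig. 8 and is not claimed to reproduce Table 2. The rule that each edge is used at most once per path (so a self-loop contributes one "repeated" variant) is an assumption made here to keep the enumeration finite; the source does not state how loops are bounded.

```python
from typing import Dict, FrozenSet, List, Tuple


def enumerate_call_paths(
    graph: Dict[str, List[str]], start: str = "main", end: str = "end"
) -> List[List[str]]:
    """Return every path from start to end that uses each edge at most once.

    The edge-once rule is an assumption of this sketch, not the patent's rule.
    """
    paths: List[List[str]] = []

    def walk(node: str, path: List[str], used: FrozenSet[Tuple[str, str]]) -> None:
        if node == end:
            paths.append(path)
            return
        for nxt in graph.get(node, []):
            edge = (node, nxt)
            if edge in used:
                continue  # do not traverse the same edge twice on one path
            walk(nxt, path + [nxt], used | {edge})

    walk(start, [start], frozenset())
    return paths


# Hypothetical function call graph: f1 may repeat (self-loop), f2 leaves the loop.
graph = {
    "main": ["f1", "f2", "end"],
    "f1": ["f1", "f2", "end"],
    "f2": ["end"],
}

paths = enumerate_call_paths(graph)
for p in paths:
    print(" -> ".join(p))
```

A path such as main -> f1 -> f1 -> end stands for "f1 executed more than once", which is how a self-loop is summarized under the edge-once assumption.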
Example 2: a recursive function containing the ternary operator
Fig. 9(a) is the source code of a Fibonacci sequence function implemented with the ternary operator; the main function calls the fib function repeatedly in a loop. After the source code is processed by gcc, GCC-CFG intermediate code is generated: Fig. 9(b) is the intermediate code of the main function, and Fig. 9(c) is the intermediate code of the fib function. As before, gcc optimizes the source code, declaring several additional variables to improve execution efficiency without affecting the control logic of the program.
The main function calls the fib function in its loop body, so its function call relationship should be fib pointing to itself, executed multiple times. The fib function calls itself recursively inside its own body, so its function call relationship should likewise be fib pointing to itself.
Static analysis of the intermediate code yields the JSON-format control flow graph information shown in Fig. 9(d). The control flow graphs of the functions are drawn with graphviz: the control flow graph of the main function is shown in Fig. 10(a), and that of the fib function in Fig. 10(b).
The CFG2FCG algorithm converts the control flow graphs into function call graphs. As before, the scanf call in <bb 3> of the main function is a library function that is declared but not defined in the experimental code, so it is ignored during CFG2FCG. The main function and the fib function have the same function call path graph, shown in Fig. 10(c), which is consistent with the manual analysis above. The generated global function call graph is shown in Fig. 10(d); this graph is fairly simple, so it is not analyzed in a table. There are three function call paths from main to end:
This example shows that static analysis can extract infeasible paths. Here the value of i is predetermined, so the function call path is also determined, namely the third path in the analysis above. By its nature, static analysis finds all possible function call paths of a function's execution. This property can be applied in the security field to find infeasible paths that might be exploited by attackers. Dynamic analysis does not yield such paths: it can only discover the function call paths that are necessarily executed under the currently designed test cases.
The embodiment of the present invention proposes a new method for extracting function call paths:
First, the GCC-CFG intermediate code of the source code is obtained with the gcc compiler, and this code is statically analyzed in a pattern-action manner to extract the control flow graph of each function;
Then, the nodes of the control flow graph are classified by the number of function calls each node contains, completing the conversion from the control flow graph to the function call relationship graph;
Finally, the function call relationship graphs of all functions are merged and the function call paths are extracted. The function call paths can be obtained by analyzing the reachable paths from the start point to the end point of the function call relationship graph.
Experiments confirm that the function call path extraction method based on the control flow graph is effective: it simplifies the analysis of function call paths and makes accurate function call paths easier to obtain.
The references cited in the embodiment of the present invention are listed below and are incorporated herein by reference in their entirety:
Mu Yongmin, Li Huili. priorities of test cases based on function call path sequence [J]. computer engineering, 2014,40 (7): 242-246
Mu Yongmin, Yang Zhijia. software based on function call path realizes verifying [J] with Design consistency. Chinese science: Information science, 2014,10:1290-1304
Foster J S,Terauchi T,Aiken A.Flow-Sensitive Type Qualifiers[J] .Proc.acm Conf.programming Language Design&Implementation Acm Press,2002,37 (5):1-12.
Adams S,Ball T,Das M,et al.Speeding Up Dataflow Analysis Using Flow- Insensitive Pointer Analysis[J].Sas Lncs,2002:117--132.
Evans D,Guttag J,Horning J,et al.LCLint:A Tool for Using Specifications to Check Code[J].Fse,2002,19(5):87--96.
Zheng Y H,Mu Y M,Zhang Z H.Research on the static function call path generating automatically.In:Proceedings of Information Management and Engineering,Chengdu,2010.405–409
Mu Y M,Zheng Y H,Zhang Z H,et al.The algorithm of infeasible paths extraction oriented the function calling relationship.Chinese J Electron, 2012,21:236–240
Mu Yongmin, Liu Mengting. C++ heavy duty uniqueness based on finite state machine determines [J]. computer utility is studied, 2014,31(4):1059-1062
Liu D F,Mu Y M,He Y J,et al.Generation of Static Function Calling Paths in C++Based on Finite-State Machine[C].Applied Mechanics and Materials.2014,568:1497-1504
Zhang Zhi China, Mu Yongmin. path based on function call covers Generation Technology [J]. electronic letters, vol, 2010, 138:1808-1811
Yan M M,Mu Y M,He Y J,et al.The Analysis of Function Calling Path in Java Based on Soot[C].Applied Mechanics and Materials.2014,568:1479-1487
Huang J C.Program Instrumentation and Software Testing[J].Computer, 1978,11(4):25-32.
Mu Yongmin, Jiang Zhi are glimmering, Zhang Zhihua. towards the path extraction [J] of c program plug-in mounting. and computer engineering and application, 2011,47(1):67-69
Mu Y M,Li H L,Jiang B,et al.The Splitting and Matching Algorithm of Dynamic Path Oriented the Function Calling Relationship[C].Intelligent Human- Machine Systems and Cybernetics(IHMSC),2013 5th International Conference on.IEEE,2013,2:343-346
Xu A P,Mu Y M,Zhang Z H,et al.The Dynamic Function Calling Path Generation Based on Instrumentation[C].Applied Mechanics and Materials.2014, 568:1469-1478
Zhong Fangting, Liu Chao, Jin Maozhong. the improvement [J] of instrumentation in program dynamic analysis system. computer engineering with set Meter, 2007 (28), 4585-4588
The above is the preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make improvements and modifications without departing from the principle of the present invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (7)

1. A function call path extraction method based on a control flow graph, characterized by comprising:
Step 1: obtaining the intermediate code of the source code, analyzing the intermediate code to identify the code blocks in it, and generating a control flow graph accordingly; the nodes of the control flow graph are the code blocks, and the connecting lines between nodes are the connection relationships between code blocks;
Step 2: analyzing the number of function calls contained in each node of the control flow graph, so as to convert the control flow graph into the function call relationship graph.
2. The function call path extraction method based on a control flow graph according to claim 1, characterized in that step 1 specifically comprises:
Step 11: obtaining the GCC-CFG intermediate code of the source code with the gcc compiler;
Step 12: obtaining the code blocks in the intermediate code, and identifying function bodies and obtaining function call information by the following preset rules:
Rule P1: ([;][;][ ]Function)[ ][_a-zA-Z]*[a-zA-Z0-9];
Rule P2: <bb[ ][0-9]+\>;
Rule P3: <L[0-9]+>
Rule P4: ((goto)[ ]<bb[ ][0-9]+\>)
Rule P5: ((goto)[ ]<bb[ ][0-9]+\>)[ ](<L[0-9]+\>)
Rule P6: ([_a-zA-Z]*[a-zA-Z0-9][ ][()]);
When a code block matches rule P1, the code block is a function declaration; an initialization operation is performed for the code block matching rule P1, and the parameter values related to the code block are then updated, the parameter values being at least one of the following: number of functions, number of nodes, number of edges;
When a code block matches rule P2, the code block is the start of a bb code block; when a code block matches rule P3, the code block is an L code block; for a code block matching rule P2 or rule P3, the information of the preceding code block is processed and its function call information is output, and the parameter values related to the code block are then updated, the parameter values being at least one of the following: number of functions, number of nodes, number of edges;
When a code block matches rule P4, the code block is a goto statement; when a code block matches rule P5, the code block is a special-format goto statement; for a code block matching rule P4 or rule P5, it is marked whether the code block contains a jump, and an edge connecting this node to the node targeted by the goto is then generated from the jump information;
When a code block matches rule P6, the code block is a function call, and its function call information is obtained;
Step 13: generating the control flow graph from the identified code blocks; each node in the control flow graph is a code segment, and nodes are connected by connecting lines according to the control flow information.
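The pattern-action matching of rules P1 to P6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the regular expressions are looser restatements written for Python's `re` module, not the patent's rules verbatim; the sample lines only imitate the style of a GCC CFG dump, whose exact format varies between GCC versions; and the names `RULES`, `KEYWORDS`, and `classify` are hypothetical.

```python
import re

# Looser Python restatements of rules P1-P6 (assumed, not verbatim).
# Order matters: the goto patterns must be tried before the plain bb/label ones.
RULES = [
    ("function_declaration", re.compile(r";;\s+Function\s+([_a-zA-Z][_a-zA-Z0-9]*)")),
    ("goto", re.compile(r"goto\s+<bb\s+([0-9]+)>")),           # covers P4 and P5
    ("bb_start", re.compile(r"^\s*<bb\s+([0-9]+)>")),           # P2
    ("label", re.compile(r"^\s*<L([0-9]+)>")),                  # P3
    ("function_call", re.compile(r"([_a-zA-Z][_a-zA-Z0-9]*)\s+\(")),  # P6
]

# Control keywords that look like calls ("if (...)") but are not.
KEYWORDS = {"if", "switch", "while", "return"}


def classify(line):
    """Return (kind, captured_text) for the first matching rule, else None."""
    for kind, pattern in RULES:
        match = pattern.search(line)
        if not match:
            continue
        if kind == "function_call" and match.group(1) in KEYWORDS:
            continue
        return kind, match.group(1)
    return None


# Sample lines written by hand in the style of a CFG dump (assumed format).
sample = [
    ";; Function main (main, funcdef_no=0)",
    "  <bb 4>:",
    "  f1 ();",
    "    goto <bb 6>;",
    "<L2>:",
    "  if (a_1 > 0)",
]
for text in sample:
    print(repr(text), "->", classify(text))
```

In a full extractor, each returned kind would trigger the corresponding action described above, such as incrementing the node counter on a bb start or adding an edge on a goto.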
3. The function call path extraction method based on a control flow graph according to claim 2, characterized in that step 1 further comprises: using three counters to record the number of functions, the number of nodes, and the number of edges; and using three counters to record whether a code block contains a goto statement, whether a code block contains a function call, and the details of the function calls contained in a code block.
4. The function call path extraction method based on a control flow graph according to claim 1, characterized in that step 2 specifically comprises:
If a code block contains no function call, merging the node corresponding to this code block into its upper-layer or lower-layer node;
If a code block contains exactly one function call, taking this code block as a node in the function call relationship graph and connecting it to other nodes according to the function call relationship;
If a code block contains multiple function calls, making each function correspond to one node, connecting these nodes together according to the function call relationship, and then connecting them to other nodes according to the function call relationship with the nodes of other code blocks.
5. The function call path extraction method based on a control flow graph according to claim 4, characterized in that, in step 2, the method used when a code block contains multiple function calls specifically comprises:
Code block N1 contains the function calls Funs(f1, f2, ..., fn), and node N1 has one or more upper-layer nodes N0s and one or more lower-layer nodes N2s;
Each function in the function calls Funs(f1, f2, ..., fn) is made to correspond to one node, and these nodes are connected in sequence according to the order of the function calls;
The connecting lines from the upper-layer nodes N0s to code block N1 are deleted, the upper-layer nodes N0s are connected to the first function f1 in code block N1, and the last function fn is connected to the lower-layer nodes N2s, so as to generate a control flow, composed of nodes and connecting lines, that runs from the upper-layer nodes N0s through all function calls of code block N1 to the lower-layer nodes N2s.
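The expansion of a node with several function calls can be sketched as below. This is a hedged illustration, not the patented implementation: the edge-set representation, the function name `expand_multi_call_node`, and the naming scheme for the new per-call nodes are all assumptions chosen for the example.

```python
def expand_multi_call_node(edges, node, calls):
    """Replace `node` by a chain of one node per call, in call order.

    edges: set of (source, target) pairs of the control flow graph
    node:  identifier of the code block N1 to expand
    calls: ordered list of function names called inside N1 (f1 ... fn)
    """
    # One new node per call; the index keeps repeated calls distinct.
    chain = [f"{node}/{i}:{name}" for i, name in enumerate(calls, start=1)]
    first, last = chain[0], chain[-1]

    new_edges = set()
    for src, dst in edges:
        # Edges leaving N1 now leave fn; edges entering N1 now enter f1.
        new_src = last if src == node else src
        new_dst = first if dst == node else dst
        new_edges.add((new_src, new_dst))

    # Link f1 -> f2 -> ... -> fn in call order.
    new_edges.update(zip(chain, chain[1:]))
    return new_edges


cfg_edges = {("N0", "N1"), ("N1", "N2")}
expanded = expand_multi_call_node(cfg_edges, "N1", ["f1", "f2", "f3"])
for edge in sorted(expanded):
    print(edge)
```

With this representation, a self-loop on N1 would become an edge from fn back to f1, which keeps the loop visible in the resulting graph.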
6. The function call path extraction method based on a control flow graph according to claim 4, characterized in that, in step 2, the method used when a code block contains no function call specifically comprises:
If a code block in the control flow graph contains no function call, determining whether the node corresponding to this code block is connected to only one upper-layer node and/or only one lower-layer node; if so, merging this node into the upper-layer node or into the lower-layer node; if not, merging this node into the upper-layer nodes and the lower-layer nodes simultaneously, or marking this node as non-functional.
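One simple way to picture the handling of a call-free node is to bypass it: every upper-layer node is connected directly to every lower-layer node and the call-free node is dropped. The sketch below uses this bypass as a stand-in for the merge described above; it is an assumption made for illustration and does not distinguish the single-neighbour case from the general case. The name `bypass_call_free_node` is hypothetical.

```python
def bypass_call_free_node(edges, node):
    """Remove a node without function calls, keeping reachability intact.

    edges: set of (source, target) pairs
    node:  identifier of the call-free code block
    """
    preds = {src for src, dst in edges if dst == node and src != node}
    succs = {dst for src, dst in edges if src == node and dst != node}

    # Keep every edge that does not touch the removed node.
    kept = {(src, dst) for src, dst in edges if node not in (src, dst)}

    # Connect each predecessor straight to each successor.
    kept.update((p, s) for p in preds for s in succs)
    return kept


edges = {("main", "bb3"), ("bb3", "f1"), ("bb3", "end"), ("f1", "bb3")}
print(sorted(bypass_call_free_node(edges, "bb3")))
```

Note that when a node such as f1 is both a predecessor and a successor of the removed block, the bypass produces a self-loop on f1, which corresponds to f1 being called repeatedly inside a loop.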
7. The function call path extraction method based on a control flow graph according to claim 4, characterized in that step 2 further comprises:
Storing the control flow graph information obtained by the analysis in JSON format; the JSON data comprises at least a function array representing the control flow graph information of the functions, and this array includes parameters recording the function name and the control flow graph information; the parameters recording the control flow graph information are further divided into a parameter recording the connecting lines between nodes and a parameter recording the node names, and the parameter recording the node names contains a function call array; taking a single function in the control flow graph as the basic unit, processing is performed according to the function call information contained in the code blocks.
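A possible shape for such JSON data is sketched below in Python. The key names (`functions`, `name`, `edges`, `nodes`, `calls`) and the node contents are hypothetical; the source only states which kinds of information are recorded, not the field names or the actual data of Fig. 7.

```python
import json

# Hypothetical layout: one entry per function, each with edges and nodes,
# and each node carrying its own array of function calls.
cfg_info = {
    "functions": [
        {
            "name": "main",
            "edges": [["bb2", "bb3"], ["bb3", "bb4"], ["bb3", "bb5"]],
            "nodes": [
                {"name": "bb2", "calls": ["scanf"]},
                {"name": "bb3", "calls": []},
                {"name": "bb4", "calls": ["f1"]},
                {"name": "bb5", "calls": ["f2"]},
            ],
        }
    ]
}


def call_counts(info):
    """Map (function name, node name) to the number of calls in that node."""
    return {
        (fn["name"], node["name"]): len(node["calls"])
        for fn in info["functions"]
        for node in fn["nodes"]
    }


text = json.dumps(cfg_info, indent=2)   # serialize for storage
restored = json.loads(text)             # read it back
counts = call_counts(restored)
print(counts)
```

The per-node call count is the quantity that step 2 classifies on: zero calls, exactly one call, or several calls.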
CN201610541747.7A 2016-07-11 2016-07-11 Function call path extraction method based on controlling stream graph Withdrawn CN106227573A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610541747.7A CN106227573A (en) 2016-07-11 2016-07-11 Function call path extraction method based on controlling stream graph

Publications (1)

Publication Number Publication Date
CN106227573A true CN106227573A (en) 2016-12-14

Family

ID=57519540

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610541747.7A Withdrawn CN106227573A (en) 2016-07-11 2016-07-11 Function call path extraction method based on controlling stream graph

Country Status (1)

Country Link
CN (1) CN106227573A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105138335A (en) * 2015-08-28 2015-12-09 牟永敏 Function call path extracting method and device based on control flow diagram

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951744B (en) * 2017-03-15 2019-12-13 北京深思数盾科技股份有限公司 protection method and device for executable program
CN106951744A (en) * 2017-03-15 2017-07-14 北京深思数盾科技股份有限公司 The guard method of executable program and device
CN108881032A (en) * 2018-06-19 2018-11-23 福州大学 A kind of P4 track performance method for improving based on matching optimization
CN108881032B (en) * 2018-06-19 2021-01-29 福州大学 P4 pipeline performance improving method based on matching optimization
CN109189758B (en) * 2018-07-26 2021-02-09 新华三技术有限公司 Operation and maintenance flow design method, device and equipment, operation method, device and host
CN109189758A (en) * 2018-07-26 2019-01-11 新华三技术有限公司 O&M flow designing method, device and equipment, operation method, device and host
CN109542942A (en) * 2018-11-28 2019-03-29 网易(杭州)网络有限公司 Querying method and device, the electronic equipment of function call
CN109542942B (en) * 2018-11-28 2021-09-24 网易(杭州)网络有限公司 Function call query method and device and electronic equipment
CN109656568A (en) * 2018-12-28 2019-04-19 黑龙江省工业技术研究院 On-demand reducible program control flowchart figure accessibility indexing means
CN109656568B (en) * 2018-12-28 2022-04-05 黑龙江省工业技术研究院 On-demand contractable program control flow graph reachability indexing method
CN110543427A (en) * 2019-09-06 2019-12-06 五八有限公司 Test case storage method and device, electronic equipment and storage medium
CN113760700A (en) * 2020-08-06 2021-12-07 北京京东振世信息技术有限公司 Program endless loop detection method, device, electronic equipment and storage medium
CN112181808A (en) * 2020-09-08 2021-01-05 北京邮电大学 Program concurrency defect detection method, device, equipment and storage medium
CN112181808B (en) * 2020-09-08 2022-06-28 北京邮电大学 Program concurrency defect detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20161214