CN110187988A - Static function call graph construction method suitable for virtual functions and function pointers - Google Patents
- Publication number
- CN110187988A (application CN201910492850.0A)
- Authority
- CN
- China
- Prior art keywords
- function
- basic block
- instruction
- emulation
- pointer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/547—Remote procedure calls [RPC]; Web services
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a static function call graph construction method suitable for virtual functions and function pointers, comprising the steps of: 1) obtaining the intermediate code of a source program; 2) obtaining key information from the intermediate code, the key information including basic block order queues and virtual function related information; 3) based on the key information, performing simulated execution of the intermediate code, analyzing the function actually called by each function call instruction in the intermediate code, and recording the function call relationships; 4) constructing the static function call graph from the function call relationships. The invention can comprehensively analyze virtual function calls, function pointer calls, and thread creation relationships, and can accurately analyze complex function pointer calls, thereby better helping program developers understand programs and improving the accuracy of static analysis methods that depend on function call graphs.
Description
Technical field
The present invention relates to the technical field of computer software, and more particularly to a static function call graph construction method suitable for virtual functions and function pointers.
Background
The scale and complexity of modern software systems keep increasing, and function call graphs are therefore receiving more and more attention. On the one hand, a function call graph helps software developers analyze code structure and clarify code logic. On the other hand, function call graphs are widely used in static program analysis. For example, a function call graph can be used to detect and eliminate dead code in a program. It can also be combined with control flow graphs to form an interprocedural control flow graph, which is the basis of flow-sensitive algorithms for vulnerability detection, security analysis, and similar tasks in static program analysis. A complete and accurate function call graph better helps developers understand a program and improves the accuracy of static analysis methods that depend on it.
Current methods for constructing function call graphs fall mainly into two kinds: dynamic analysis and static analysis. Dynamic analysis has high precision, but building a complete function call graph with it is difficult, so its recall is low. Static analysis has high recall while maintaining fairly high precision, and is therefore more widely favored.
Many methods for constructing static function call graphs have emerged, but existing static analysis methods still have a number of problems, mainly: 1. No existing static analysis method can comprehensively extract function call relationships involving virtual functions, function pointers, and the like. 2. Existing methods cannot accurately analyze function pointer call relationships. 3. For programs that use multithreading, no existing method statically extracts the parent-child relationships between threads, and the context information of thread functions is lost entirely in static analysis, which leads to low accuracy in the static detection of multithreading problems such as deadlock, data races, and false sharing.
Summary of the invention
To overcome the above shortcomings of the prior art, the present invention provides a static function call graph construction method suitable for virtual functions and function pointers, which can not only comprehensively analyze virtual function calls, function pointer calls, and inter-thread parent-child relationships, but also accurately analyze complex function pointer calls. It thereby better helps program developers understand code and improves the accuracy of static analysis methods that depend on function call graphs, and has good research significance and practical value.
To achieve the above object, the present invention adopts the following technical solution:
The static function call graph construction method suitable for virtual functions and function pointers according to the present invention is characterized in that it proceeds by the following steps:
Step 1, obtain the intermediate code of the source program;
Step 2, obtain key information from the intermediate code, the key information including basic block order queues and virtual function related information;
Step 3, specify a static analysis entry function and, based on the key information, perform simulated execution of the intermediate code, analyze the function actually called by each function call instruction in the intermediate code, and record the function call relationships;
Step 4, construct the static function call graph from the function call relationships.
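As an illustration of step 4 only, the following minimal Python sketch stores recorded caller/callee pairs and renders them as Graphviz DOT text, the file format mentioned for Fig. 6. The class name `CallGraph`, its methods, and the function names in the example are assumptions made for this sketch and do not come from the patent.

```python
from collections import defaultdict


class CallGraph:
    """Caller -> callee edges recorded while the intermediate code is analyzed."""

    def __init__(self):
        self.edges = defaultdict(set)

    def record(self, caller, callee):
        # A set keeps repeated calls between the same pair as one edge.
        self.edges[caller].add(callee)

    def to_dot(self):
        # Emit Graphviz DOT text with edges in sorted order.
        lines = ["digraph callgraph {"]
        for caller in sorted(self.edges):
            for callee in sorted(self.edges[caller]):
                lines.append(f'  "{caller}" -> "{callee}";')
        lines.append("}")
        return "\n".join(lines)


# Hypothetical relationships, as step 3 might record them.
graph = CallGraph()
graph.record("main", "Derived::show")
graph.record("main", "on_event")
graph.record("main", "on_event")
dot_text = graph.to_dot()
```

Keeping the edges in a set per caller means that a call site visited several times during simulated execution still yields a single edge in the graph.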
The static function call graph construction method suitable for virtual functions and function pointers according to the present invention is further characterized in that:
In step 2, the basic block order queue is obtained by the following procedure:
Step 2A_1, obtain the control flow graph CFG_i corresponding to the i-th function F_i in the intermediate code, and obtain from CFG_i its entry basic block EntryBB_i; i = 1, 2, ..., FN, where FN is the total number of functions in the intermediate code;
Set the entry basic block EntryBB_i as the current basic block, and set the identifiers of all loops in the control flow graph CFG_i to "False", indicating that the corresponding loop has not been traversed;
Step 2A_2, if the current basic block is the header basic block of a loop L, execute step 2A_3; otherwise execute step 2A_6;
Step 2A_3, if the identifier of the loop L containing the current basic block is "False", execute step 2A_4; otherwise execute step 2A_5;
Step 2A_4, append the current basic block to the i-th basic block order queue BBOrderQ_i, set the successor of the current basic block that lies inside loop L as the current basic block, push the successors that do not lie inside loop L onto the i-th unprocessed stack UnProcessS_i, set the identifier of loop L to "True", and return to step 2A_2;
Step 2A_5, if the i-th unprocessed stack UnProcessS_i is not empty, pop the basic block at the top of UnProcessS_i, set that basic block as the current basic block, and execute step 2A_2;
Step 2A_6, append the current basic block to the i-th basic block order queue BBOrderQ_i; if the current basic block has successors and a successor satisfies the precedence condition, set any one successor that satisfies the precedence condition as the current basic block, push the remaining successors onto the i-th unprocessed stack UnProcessS_i in turn, and return to step 2A_2; otherwise execute step 2A_5;
The precedence condition is: the predecessor basic blocks of the successor basic block are in the i-th basic block order queue BBOrderQ_i.
In step 2, the virtual function related information is obtained by the following procedure:
Step 2B_1, define the j-th vtable structure variable in the intermediate code as V_j; V_j[k] denotes the k-th member array of V_j, each member array corresponds to a class, and V_j[k][t] denotes the t-th element of the k-th member array of V_j, where j = 1, 2, ..., N; N is the total number of vtable structure variables; k = 1, 2, ..., M_j; M_j is the total number of member arrays in V_j; t = 1, 2, ..., M_jk; M_jk is the total number of elements in the k-th member array of V_j;
Step 2B_2, convert the first element V_j[k][1] of the k-th member array of V_j into an integer variable VPtrOffset_jk; using the class corresponding to V_j as the key and the integer variable VPtrOffset_jk as the value, construct the key-value pair Pair_1 and add Pair_1 to the vtable pointer offset table VirtualPtr_j corresponding to V_j, where VPtrOffset_jk is the byte offset of the k-th vtable pointer in the memory layout of an object created from the class corresponding to V_j;
Step 2B_3, convert the t-th element V_j[k][t] of the k-th member array of V_j into a function pointer variable Vfun_jkt; using the offset byte count V_j[k][1] as the key and the pointer variable Vfun_jkt as the value, construct the key-value pair Pair_2 and add Pair_2 to the virtual function related information table VirtualTab_jk of the class corresponding to the k-th member array of V_j, where Vfun_jkt is the (t-1)-th virtual function in the vtable of the class corresponding to the k-th member array of V_j, t = 2, ..., M_jk.
The simulated execution in step 3 specifically comprises the steps of:
Step 3_1, according to the basic block order queue of the static analysis entry function, set the basic block at the head of the queue as the current basic block;
Step 3_2, set the first instruction in the current basic block as the current instruction;
Step 3_3, perform simulated address analysis on the current instruction to obtain the variable pointers, function pointers, and virtual function information stored at simulated addresses as the simulated execution of the intermediate code reaches the current instruction, and store them in the variable pointer relation table R_1, the function pointer relation table R_2, and the virtual function relation table R_3 respectively;
Step 3_4, if the current instruction is a function call instruction, analyze the function actually called by the function call instruction according to the variable pointer relation table R_1, the function pointer relation table R_2, and the virtual function relation table R_3, and record the function call relationship; then set the actually called function as the static analysis entry function and recursively execute the simulated execution of steps 3_1 to 3_4; otherwise execute step 3_5;
Step 3_5, if the current basic block still contains unprocessed instructions, set the instruction following the current instruction as the current instruction and execute step 3_3; otherwise execute step 3_6;
Step 3_6, if the basic block order queue of the static analysis entry function still contains unprocessed basic blocks, set the basic block following the current basic block as the current basic block and execute step 3_1.
In step 3_3, when simulated execution starts, simulated memory is first allocated for the unprocessed instructions, and simulated address analysis is then performed as follows:
If the current instruction is an element address instruction, then for each dimension of the instruction, obtain the operand value in that dimension and multiply it by the byte count of the corresponding operand after memory alignment; then sum the products over all dimensions to obtain the total simulated address offset, and store it in the simulated memory access list;
If the current instruction is a store instruction, determine whether the first operand of the store instruction is an ordinary variable pointer; if so, combine each simulated address in the simulated memory access list of the first operand pairwise with each simulated address in the simulated memory access list of the second operand, and add the pairs to the variable pointer relation table R_1; otherwise, determine whether the first operand of the store instruction is a function pointer; if so, combine each simulated address in the simulated memory access list of the second operand with the function pointer and add the result to the function pointer relation table R_2; otherwise do nothing;
If the current instruction is a load instruction, obtain the simulated addresses that correspond in the variable pointer relation table R_1 to the simulated addresses of the operand of the load instruction, and add them to the simulated memory access list of the load instruction;
If the current instruction is a type conversion instruction, set the simulated memory access list of the variable after conversion to the simulated memory access list of the variable before conversion;
If the current instruction is a PHI instruction, determine whether the basic block containing the current instruction is a loop header basic block; if so, set the simulated memory access list of the PHI instruction to the simulated memory access list of the variable corresponding to the basic block that enters the loop; otherwise, set the simulated memory access list of the PHI instruction to the combination of the simulated memory access lists of the variables corresponding to the basic blocks of all operands of the PHI instruction;
If the current instruction is a return instruction, determine whether the return value is a pointer; if so, set the simulated memory access list of the function call instruction to the simulated memory access list of the return value; otherwise do nothing;
If the current instruction is a function call instruction, determine whether the call has the form of a virtual function call; if so, obtain the virtual function from the virtual function related information obtained in step 2, combine the simulated addresses of the operand of the function call instruction with that virtual function, and add the result to the virtual function relation table R_3; otherwise do nothing;
If the current instruction is a function call instruction, determine whether the parameter of the function call instruction is of pointer type; if so, set the simulated memory access list of the actual argument variable of the function call instruction to the simulated memory access list of the formal parameter variable; otherwise do nothing.
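The store, load, and type conversion rules above can be pictured with the following Python sketch. It is a simplified model under stated assumptions, not the patented implementation: the class `PointerState`, its attribute names, and the helper `functions_at` (which guesses how an indirect call might consult R_2) are all hypothetical, and simulated addresses are written as plain (I, N) tuples.

```python
from collections import defaultdict


class PointerState:
    """Simulated memory access lists plus relation tables R_1 and R_2."""

    def __init__(self, functions=()):
        self.functions = set(functions)   # names treated as function pointers
        self.access = defaultdict(set)    # value -> simulated addresses (I, N)
        self.r1 = defaultdict(set)        # R_1: address -> addresses stored there
        self.r2 = defaultdict(set)        # R_2: address -> functions stored there

    def store(self, value, pointer):
        # Store instruction: record what is written at each address of `pointer`.
        for target in self.access[pointer]:
            if value in self.functions:
                self.r2[target].add(value)
            else:
                self.r1[target] |= self.access[value]

    def load(self, result, pointer):
        # Load instruction: the result may hold anything recorded in R_1.
        for target in self.access[pointer]:
            self.access[result] |= self.r1[target]

    def cast(self, result, source):
        # Type conversion: the result takes over the source's access list.
        self.access[result] = set(self.access[source])

    def functions_at(self, pointer):
        # Functions an indirect call through `pointer` could reach (assumption).
        found = set()
        for target in self.access[pointer]:
            found |= self.r2[target]
        return found


# Hypothetical values: %slot holds a function pointer, %p points at %x's data.
state = PointerState(functions={"@on_event"})
state.access["%slot"] = {(1, 0)}
state.access["%p"] = {(2, 0)}
state.access["%x"] = {(3, 0)}
state.store("@on_event", "%slot")
state.store("%x", "%p")
state.load("%y", "%p")
state.cast("%z", "%y")
```

Sets are used for the access lists so that the pairwise combinations described for the store rule accumulate without duplicates.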
Compared with the prior art, the beneficial effects of the present invention are:
1. The proposed method comprehensively analyzes function call relationships involving virtual functions, function pointers, and the like, supports the pthread library, and can extract the parent-child relationships between threads. For complex function pointer usages such as function pointers returned by functions, callback functions, and function pointer arrays, the analysis results of the method are more accurate.
2. By analyzing the vtable variables in the intermediate code, the method extracts the corresponding virtual function related information and can accurately analyze virtual function calls.
3. By performing simulated execution of the intermediate code, and allocating, calculating, and propagating simulated addresses during simulated execution, the method analyzes the function actually called by each function call instruction. This solves the problems that existing methods cannot accurately analyze function pointers and cannot analyze inter-thread parent-child relationships, and yields higher recall and precision than existing methods.
Brief description of the drawings
Fig. 1 is a flow chart of SExecCG in an embodiment of the present invention;
Fig. 2 is a schematic diagram of an example of obtaining the basic block order queue in an embodiment of the present invention;
Fig. 3 is a schematic diagram of an example vtable in LLVM IR in an embodiment of the present invention;
Fig. 4 is a schematic diagram of an example simulated memory access list in an embodiment of the present invention;
Fig. 5 is the C++ program designed for the embodiment of the present invention;
Fig. 6 is a schematic diagram of the dot file generated by the embodiment of the present invention from analyzing the program of Fig. 5;
Fig. 7 is the function call graph constructed by the embodiment of the present invention for the program of Fig. 5.
Detailed description of the embodiments
To make the technical problem to be solved, the technical solution, and the advantages of the present invention clearer, a detailed description is given below with reference to the drawings and specific embodiments.
As shown in Fig. 1, the present invention relates to a static function call graph construction method suitable for virtual functions and function pointers, and a tool for constructing static call graphs, SExecCG (call graph based on simulation execution), has been implemented according to the method. It belongs to the field of software engineering and also to the field of static program analysis.
Step 1, obtain the intermediate code of the source program
The embodiment of the present invention implements the method under the LLVM compiler framework, using the Clang/Clang++ compiler of that framework, with appropriate modifications to the Clang/Clang++ source code. The purpose of modifying the compiler source is that, when a large program is analyzed, the intermediate code of the entire project can be produced simply by adding the corresponding compile option to the original build scripts, without compiling files manually. The specific modifications are:
Add a custom Pass. This Pass does not change the result of compilation; by means of instrumentation, it dumps the results produced during compilation of the source program to LLVM IR files in .bc format.
Add a custom compile option that corresponds to the above custom Pass.
When a C/C++ source program is compiled with the modified Clang/Clang++ compiler, the LLVM IR files of the entire project are generated through the instrumentation after all the relevant compile options of the source program have been processed.
Step 2, obtain key information from the intermediate code, the key information including basic block order queues and virtual function related information;
The key information obtained from the intermediate code in the embodiment consists mainly of two parts: the basic block order queues and the virtual function related information. Step 2A processes the control flow graph of each function in the LLVM IR file, sorts the basic blocks in each control flow graph, and generates the basic block order queue for each function. Step 2B obtains the virtual function related information, including the vtable pointer offset byte counts in the object memory layout and the virtual functions corresponding to each class.
What relates to execution paths in a program is mainly branches and loops. Under static single assignment (SSA) form, the same variable has different representations on different branch paths, and a PHI node represents the variable uniformly where the branches merge. The PHI node then already contains the state information of the variable on the multiple branch paths, so simulated execution does not need to traverse all program paths; it only needs to sort the basic blocks in the control flow graph and merge states during simulated execution.
Specifically, the basic block order queue is obtained by the following procedure:
Step 2A_1, obtain the control flow graph CFG_i corresponding to the i-th function F_i in the intermediate code, and obtain from CFG_i its entry basic block EntryBB_i; i = 1, 2, ..., FN, where FN is the total number of functions in the intermediate code;
Set the entry basic block EntryBB_i as the current basic block, and set the identifiers of all loops in the control flow graph CFG_i to "False", indicating that the corresponding loop has not been traversed;
Step 2A_2, if the current basic block is the header basic block of a loop L, execute step 2A_3; otherwise execute step 2A_6;
Step 2A_3, if the identifier of the loop L containing the current basic block is "False", execute step 2A_4; otherwise execute step 2A_5;
Step 2A_4, append the current basic block to the i-th basic block order queue BBOrderQ_i, set the successor of the current basic block that lies inside loop L as the current basic block, push the successors that do not lie inside loop L onto the i-th unprocessed stack UnProcessS_i, set the identifier of loop L to "True", and return to step 2A_2;
Step 2A_5, if the i-th unprocessed stack UnProcessS_i is not empty, pop the basic block at the top of UnProcessS_i, set that basic block as the current basic block, and execute step 2A_2;
Step 2A_6, append the current basic block to the i-th basic block order queue BBOrderQ_i; if the current basic block has successors and a successor satisfies the precedence condition, set any one successor that satisfies the precedence condition as the current basic block, push the remaining successors onto the i-th unprocessed stack UnProcessS_i in turn, and return to step 2A_2; otherwise execute step 2A_5;
The precedence condition is: the predecessor basic blocks of the successor basic block are in the i-th basic block order queue BBOrderQ_i.
Fig. 2 is a schematic diagram of control flow graph preprocessing. Each node in the control flow graph is a basic block, and each basic block has a label. After the control flow graph preprocessing algorithm has run, the order queue of the basic blocks is obtained.
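Steps 2A_1 to 2A_6 can be sketched in Python as follows. This is an illustrative reading rather than the embodiment's code: the function name, the dictionary-based CFG encoding, and the block labels are hypothetical. Two assumptions are added to make the sketch well defined: predecessors that reach a loop header through a back edge are ignored when the precedence condition is checked, and a block already in the queue is not appended a second time.

```python
def order_basic_blocks(entry, succs, loops):
    """Return a basic block order queue (a sketch of steps 2A_1 to 2A_6).

    succs: block -> list of successor blocks
    loops: loop header -> set of blocks in that loop (header included)
    """
    preds = {}
    for block, targets in succs.items():
        for target in targets:
            preds.setdefault(target, set()).add(block)

    queue, stack = [], []                      # BBOrderQ_i and UnProcessS_i
    traversed = {header: False for header in loops}

    def ready(block):
        # Precedence condition. Assumption: predecessors inside the loop
        # that `block` heads (back edges) are ignored.
        body = loops.get(block, set())
        return all(p in queue for p in preds.get(block, ()) if p not in body)

    current = entry
    while current is not None:
        following = None
        targets = succs.get(current, [])
        if current in loops:                   # step 2A_2: loop header
            if not traversed[current]:         # steps 2A_3 and 2A_4
                queue.append(current)
                inside = [s for s in targets if s in loops[current]]
                stack.extend(s for s in targets if s not in loops[current])
                stack.extend(inside[1:])
                traversed[current] = True
                following = inside[0] if inside else None
        elif current not in queue:             # step 2A_6 (guard is an assumption)
            queue.append(current)
            candidates = [s for s in targets if ready(s)]
            if candidates:
                following = candidates[0]
                stack.extend(s for s in targets if s != following)
        if following is None and stack:        # step 2A_5
            following = stack.pop()
        current = following
    return queue


# Hypothetical CFG: a loop whose body contains an if/else diamond.
cfg = {
    "entry": ["header"], "header": ["body", "exit"], "body": ["then", "else"],
    "then": ["latch"], "else": ["latch"], "latch": ["header"], "exit": [],
}
order = order_basic_blocks(
    "entry", cfg, {"header": {"header", "body", "then", "else", "latch"}}
)
```

In this example the merge block "latch" is queued only after both "then" and "else", which is the property the precedence condition is meant to guarantee before PHI states are merged.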
The vtable of a class is determined at compile time. To obtain virtual functions accurately in the simulated execution stage, the virtual function related information must be obtained in advance. In an LLVM IR file, a vtable is a global structure variable. The structure variable consists of member arrays, each member array corresponds to a class, and the elements of a member array are variables of type i8*. The value of the first element of a member array, converted to an integer, is the vtable pointer offset byte count of the class corresponding to that member array; the elements after the first, converted to function pointers, are the virtual functions of the class corresponding to that member array. Fig. 3 shows the vtable structure variable in LLVM IR corresponding to the derived class in the program of Fig. 5.
The global structure variables are parsed, and the parsed information such as offset byte counts and virtual functions is saved. When a virtual function call is analyzed, the class that actually created the object is found first; then, from the correspondence between classes and vtable pointer offset byte counts, the offset byte count of the vtable pointer in the object is determined; finally, from the correspondence between vtable pointers and virtual functions, the specific virtual function in the vtable that is called is determined.
Specifically, the virtual function related information is obtained by the following procedure:
Step 2B_1, define the j-th vtable structure variable in the intermediate code as V_j; V_j[k] denotes the k-th member array of V_j, each member array corresponds to a class, and V_j[k][t] denotes the t-th element of the k-th member array of V_j, where j = 1, 2, ..., N; N is the total number of vtable structure variables; k = 1, 2, ..., M_j; M_j is the total number of member arrays in V_j; t = 1, 2, ..., M_jk; M_jk is the total number of elements in the k-th member array of V_j;
Step 2B_2, convert the first element V_j[k][1] of the k-th member array of V_j into an integer variable VPtrOffset_jk; using the class corresponding to V_j as the key and the integer variable VPtrOffset_jk as the value, construct the key-value pair Pair_1 and add Pair_1 to the vtable pointer offset table VirtualPtr_j corresponding to V_j, where VPtrOffset_jk is the byte offset of the k-th vtable pointer in the memory layout of an object created from the class corresponding to V_j;
Step 2B_3, convert the t-th element V_j[k][t] of the k-th member array of V_j into a function pointer variable Vfun_jkt; using the offset byte count V_j[k][1] as the key and the pointer variable Vfun_jkt as the value, construct the key-value pair Pair_2 and add Pair_2 to the virtual function related information table VirtualTab_jk of the class corresponding to the k-th member array of V_j, where Vfun_jkt is the (t-1)-th virtual function in the vtable of the class corresponding to the k-th member array of V_j, t = 2, ..., M_jk.
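The two tables built in steps 2B_2 and 2B_3, and the lookup they enable, might be modeled as in the Python sketch below. It follows the simplified layout described in this section (first element is the vtable pointer offset, remaining elements are virtual functions). The function names, the merged offset-keyed dictionary, the 0-based slot index, and the example class are assumptions of the sketch; strings stand in for function pointers.

```python
def parse_vtable(class_name, member_arrays):
    """Build the lookup tables of steps 2B_2 and 2B_3 for one vtable variable.

    member_arrays models V_j: each inner list is one member array V_j[k],
    whose first element is the vtable pointer offset and whose remaining
    elements are virtual functions (plain strings stand in for pointers).
    """
    vptr_offsets = []   # VirtualPtr_j as (class, offset) pairs, i.e. Pair_1
    virtual_tab = {}    # VirtualTab_jk keyed by offset: offset -> functions (Pair_2)
    for array in member_arrays:
        offset = int(array[0])
        vptr_offsets.append((class_name, offset))
        virtual_tab[offset] = list(array[1:])
    return vptr_offsets, virtual_tab


def resolve_virtual_call(virtual_tab, vptr_offset, slot):
    # slot is the 0-based position of the virtual function after the offset entry.
    return virtual_tab[vptr_offset][slot]


# Hypothetical class with two vtable pointers (for example, two polymorphic bases).
offsets, tab = parse_vtable("Derived", [
    [0, "Derived::show", "Derived::area"],
    [8, "Derived::print"],
])
```

With these tables, a call through the vtable pointer at offset 8 and slot 0 resolves to `resolve_virtual_call(tab, 8, 0)`, mirroring the class, then offset, then virtual function lookup order described above.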
Step 3, specify a static analysis entry function and, based on the key information, perform simulated execution of the intermediate code, analyze the function actually called by each function call instruction in the intermediate code, and record the function call relationships;
Simulated execution of the program makes it possible to analyze function calls that can only be determined at dynamic execution time, such as function pointer calls and virtual function calls.
To simulate the real physical addresses of variables during dynamic execution, the embodiment of the present invention designs a two-dimensional simulated memory. By calculating and propagating simulated addresses, it can accurately simulate changes in what pointers point to, function parameter passing, type conversion, and so on.
Each simulated address in the simulated memory has the form (I, N). "I" corresponds one-to-one with variables, meaning that each variable has one block of one-dimensional simulated memory labeled "I". "N" indicates that a member of the variable is located at the N-th byte of that one-dimensional memory. Simulated addresses are designed in two-dimensional form so that the size of a compound-type variable need not be known in advance when simulated memory is allocated for it, and so that the simulated addresses of different variables cannot overlap.
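A minimal Python sketch of the (I, N) address form follows. The names `SimAddr`, `SimMemory`, and `offset_by` are hypothetical and serve only to show why a per-variable index removes the need to know object sizes in advance.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class SimAddr:
    index: int    # "I": which variable's one-dimensional simulated memory
    offset: int   # "N": byte position inside that memory


class SimMemory:
    """Hands out a fresh index per variable; no object size is required."""

    def __init__(self):
        self._next_index = count(1)

    def allocate(self):
        return SimAddr(next(self._next_index), 0)


def offset_by(addr, nbytes):
    # Moving inside a variable changes only N, so it can never reach
    # an address that belongs to a different variable.
    return SimAddr(addr.index, addr.offset + nbytes)


memory = SimMemory()
first = memory.allocate()
second = memory.allocate()
member = offset_by(first, 8)
```

Because `member` keeps index 1 whatever its offset, it cannot collide with `second`, which is the overlap-free property the two-dimensional design is meant to provide.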
The simulated execution specifically comprises the steps of:
Step 3_1, according to the basic block order queue of the static analysis entry function, set the basic block at the head of the queue as the current basic block;
Step 3_2, set the first instruction in the current basic block as the current instruction;
Step 3_3, perform simulated address analysis on the current instruction to obtain the variable pointers, function pointers, and virtual function information stored at simulated addresses as the simulated execution of the intermediate code reaches the current instruction, and store them in the variable pointer relation table R_1, the function pointer relation table R_2, and the virtual function relation table R_3 respectively;
Step 3_4, if the current instruction is a function call instruction, analyze the function actually called by the function call instruction according to the variable pointer relation table R_1, the function pointer relation table R_2, and the virtual function relation table R_3, and record the function call relationship; then set the actually called function as the static analysis entry function and recursively execute the simulated execution of steps 3_1 to 3_4; otherwise execute step 3_5;
Step 3_5, if the current basic block still contains unprocessed instructions, set the instruction following the current instruction as the current instruction and execute step 3_3; otherwise execute step 3_6;
Step 3_6, if the basic block order queue of the static analysis entry function still contains unprocessed basic blocks, set the basic block following the current basic block as the current basic block and execute step 3_1.
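The control structure of steps 3_1 to 3_6 can be sketched as a recursive walk over the ordered basic blocks. The Python below is a hedged illustration: the program encoding, the `fptr_table` stand-in for the relation tables, and the `active` set that stops infinite recursion on recursive programs are assumptions of the sketch and are not described in the patent.

```python
def simulate(func, program, fptr_table, edges, active=None):
    """Walk `func` block by block and recurse into every resolved callee.

    program:    function name -> list of basic blocks in queue order,
                each block being a list of instruction tuples
    fptr_table: call operand -> functions it may refer to (stands in for
                the lookups in R_1, R_2 and R_3)
    edges:      set collecting (caller, callee) pairs
    """
    active = set() if active is None else active
    if func in active or func not in program:
        return                                  # recursion guard / external function
    active.add(func)
    for block in program[func]:                 # steps 3_1 and 3_6
        for instr in block:                     # steps 3_2 and 3_5
            if instr[0] != "call":
                continue                        # step 3_3 would update the tables here
            # Step 3_4: an operand missing from the table is a direct call.
            for callee in fptr_table.get(instr[1], [instr[1]]):
                edges.add((func, callee))
                simulate(callee, program, fptr_table, edges, active)
    active.discard(func)


# Hypothetical program: main calls through a pointer and calls helper directly.
program = {
    "main": [[("store",), ("call", "%fp")], [("call", "helper")]],
    "helper": [[("call", "main")]],
    "on_event": [[("ret",)]],
}
edges = set()
simulate("main", program, {"%fp": ["on_event"]}, edges)
```

The guard on `active` is a design choice of this sketch: without it, mutually recursive functions such as `main` and `helper` above would never terminate.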
The simulated address analysis algorithm in step 3_3 mainly performs simulated execution of a single instruction. By allocating simulated memory, calculating the simulated addresses of compound-type variables, and propagating simulated addresses, it maintains the information stored at simulated addresses, such as variable pointers, function pointers, and virtual function pointers.
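The address calculation for compound-type variables amounts to a sum of index-times-size products, one per dimension. The short Python sketch below shows that arithmetic for nested arrays; the function name `gep_offset` and the example type are hypothetical, and struct members, whose offsets depend on the layout of the preceding fields, are outside its scope.

```python
def gep_offset(indices, element_sizes):
    """Sum index * aligned element size over all dimensions."""
    if len(indices) != len(element_sizes):
        raise ValueError("one element size is needed per index")
    return sum(i * size for i, size in zip(indices, element_sizes))


# Hypothetical access into a [4 x [3 x i32]] array with indices (0, 2, 1):
# the whole array is 48 bytes, each row is 12 bytes, each i32 is 4 bytes.
offset = gep_offset((0, 2, 1), (48, 12, 4))
```

Here the three products are 0, 24, and 4, so the total simulated address offset is 28 bytes from the start of the array.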
When the simulated execution process starts, step 3_3 first allocates emulation memory for each unprocessed instruction and then performs emulation address analysis as follows:
If the current instruction is a GetElementPtr instruction, then for each dimension of the GetElementPtr instruction, obtain the operand value in that dimension and multiply it by the number of bytes occupied by the corresponding operand after memory alignment; then sum the products over all dimensions to obtain the total emulation address offset, and store it in the emulation memory access list. Figure 4 is a schematic diagram of one possible emulation memory access list: %2 accesses byte 2 of the emulation memory with index 1, and %3 accesses byte 4 and byte 8 of the emulation memory with index 2;
If the current instruction is a Store instruction, determine whether the first operand of the Store instruction is an ordinary variable pointer. If so, pair each emulation address in the emulation memory access list of the first operand with each emulation address in the emulation memory access list of the second operand, and add the pairs to the variable pointer relation table R1. Otherwise, determine whether the first operand of the Store instruction is a function pointer; if so, pair each emulation address in the emulation memory access list of the second operand with the function pointer and add the pairs to the function pointer relation table R2; otherwise, do nothing;
If the current instruction is a Load instruction, look up in the variable pointer relation table R1 the emulation addresses that correspond to the emulation addresses of the operand of the Load instruction, and add them to the emulation memory access list of the Load instruction;
If the current instruction is a Cast instruction, set the emulation memory access list of the variable after type conversion to the emulation memory access list of the variable before type conversion;
If the current instruction is a PHI instruction, determine whether the basic block containing the current instruction is a loop header basic block. If so, set the emulation memory access list of the PHI instruction to the emulation memory access list of the variable corresponding to the basic block that enters the loop; otherwise, set it to the merged emulation memory access lists of the variables corresponding to the basic blocks of all operands of the PHI instruction;
If the current instruction is a Ret instruction, determine whether the return value is a pointer. If so, set the emulation memory access list of the function call instruction to the emulation memory access list of the return value; otherwise, do nothing;
If the current instruction is a Call/Invoke instruction, determine whether the call takes the form of a virtual function call. If so, obtain the virtual function from the virtual function information collected in step 2, pair the emulation address of the operand of the Call/Invoke instruction with the virtual function, and add the pair to the virtual function relation table R3; otherwise, do nothing;
If the current instruction is a Call/Invoke instruction, determine whether a parameter of the Call/Invoke instruction is of pointer type. If so, set the emulation memory access list of the formal parameter to the emulation memory access list of the actual argument; otherwise, do nothing.
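Several of the cases above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the class `Tables`, its method names, and the `strides` argument (the aligned byte size assumed for each GetElementPtr dimension) are hypothetical, emulation addresses are modeled as (memory index, byte offset) tuples, and the loop header case of PHI, the Ret case, and the Call/Invoke cases are omitted.

```python
from itertools import product


def gep_offset(indices, strides):
    """Total byte offset: sum over all dimensions of index * aligned byte size."""
    return sum(i * s for i, s in zip(indices, strides))


class Tables:
    def __init__(self):
        self.access = {}   # value name -> set of (memory index, byte offset)
        self.r1 = {}       # R1: destination address -> set of addresses stored there
        self.r2 = {}       # R2: destination address -> set of function names stored there

    def gep(self, result, base, indices, strides):
        delta = gep_offset(indices, strides)
        self.access[result] = {(m, off + delta) for m, off in self.access[base]}

    def store_pointer(self, value, dest):
        # Store of an ordinary pointer: pair every address of the stored value
        # with every address of the destination and record the pair in R1.
        for src, dst in product(self.access[value], self.access[dest]):
            self.r1.setdefault(dst, set()).add(src)

    def store_function(self, func_name, dest):
        # Store of a function pointer: record the function in R2 for each destination.
        for dst in self.access[dest]:
            self.r2.setdefault(dst, set()).add(func_name)

    def load(self, result, source):
        # Load: collect whatever R1 records at each address of the operand.
        found = set()
        for addr in self.access[source]:
            found |= self.r1.get(addr, set())
        self.access[result] = found

    def cast(self, result, source):
        self.access[result] = set(self.access[source])

    def phi(self, result, incoming):
        # Non-loop-header case: merge the lists of all incoming values.
        self.access[result] = set().union(*(self.access[v] for v in incoming))


t = Tables()
t.access["%s"] = {(1, 0)}      # hypothetical object in emulation memory 1
t.access["%x"] = {(2, 0)}      # hypothetical object in emulation memory 2
t.gep("%p", "%s", indices=[0, 1], strides=[16, 8])   # 0*16 + 1*8 = 8 bytes
t.store_pointer("%x", "%p")    # corresponds to: store %x, %p
t.load("%y", "%p")             # corresponds to: %y = load %p
print(t.access["%p"], t.access["%y"])
# {(1, 8)} {(2, 0)}
```

The worked offset follows the multiplication rule stated above: index 0 in the first dimension contributes 0 bytes and index 1 in the second contributes 8 bytes, so %p refers to byte 8 of emulation memory 1.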
Step 4: Construct the static function call graph according to the function call relationships.
The embodiment of the present invention supports visualization of the static function call graph. The function call relationships recorded during simulated execution are first organized into the dot file format supported by GraphViz; the dot drawing tool in GraphViz is then invoked to visualize the function call graph and generate output files in formats such as eps, png, and pdf.
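As a minimal sketch of this export step, the following Python function turns recorded (caller, callee) pairs into dot text. The function name `to_dot`, the graph name, and the sample edges are hypothetical and are not taken from the figures of this document.

```python
def to_dot(edges, name="callgraph"):
    """Serialize (caller, callee) pairs as a GraphViz digraph."""
    lines = [f"digraph {name} {{"]
    for caller, callee in sorted(edges):
        lines.append(f'    "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical call relationships, for illustration only.
edges = {("main", "helper"), ("main", "worker"), ("worker", "helper")}
print(to_dot(edges))
```

If the printed text is saved as callgraph.dot, a command such as `dot -Tpng callgraph.dot -o callgraph.png` renders it; `-Teps` and `-Tpdf` select the other formats mentioned above.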
To demonstrate the effectiveness of the method of the embodiment of the present invention, an experiment is described here:
In the experiment, a simple C++ program was designed for the embodiment of the present invention, as shown in Figure 5. The program uses virtual functions, function pointers (including more complex usages), and thread-related functions from the pthread library. Figure 6 is the dot file obtained by analyzing the program of Figure 5 with the embodiment of the present invention, and Figure 7 is the function call graph that the embodiment constructed for the program of Figure 5. As can be seen from Figure 7, the embodiment identified the constructor calls from the base class to the derived class, the call to the virtual function vfunc_1, the calls to func_3 and func_4 through function pointers returned by functions, and the parent-child relationships between threads. The embodiment found all possible function calls in the program of Figure 5, and the analysis results are correct.
The embodiment of the present invention proposes a static function call graph construction method suitable for virtual functions and function pointers. The method performs static analysis on the LLVM IR file generated by compiling the source program and, by allocating, computing, and propagating the emulation addresses designed in the present invention, can accurately simulate the execution of the program. During simulated execution, information such as virtual functions and pointer targets is maintained, so that the static function call graph of the program can finally be constructed.
The experimental results show that the method is feasible. It can not only comprehensively analyze function call relationships involving virtual functions and function pointers, but also accurately analyze complex function pointer call patterns, such as function pointers returned by functions, callback functions, and calls through pointers in function pointer arrays. Finally, the method of the present invention also supports the pthread library and can extract the parent-child relationships between threads. The method of the present invention therefore has practical and research value.
Claims (5)
1. A static function call graph construction method suitable for virtual functions and function pointers, characterized in that it is carried out as follows:
Step 1: Obtain the intermediate code of the source program;
Step 2: Obtain key information from the intermediate code, the key information including basic block order queues and virtual function information;
Step 3: Specify a static analysis entry function and, based on the key information, perform simulated execution of the intermediate code to analyze the function actually called by each function call instruction in the intermediate code, while recording the function call relationships;
Step 4: Construct the static function call graph according to the function call relationships.
2. The static function call graph construction method suitable for virtual functions and function pointers according to claim 1, characterized in that, in step 2, the basic block order queue is obtained by the following process:
Step 2A_1: Obtain the control flow graph CFG_i corresponding to the i-th function F_i in the intermediate code, and obtain from the control flow graph CFG_i its entry basic block EntryBB_i; i = 1, 2, ..., FN, where FN is the total number of functions in the intermediate code;
Set the entry basic block EntryBB_i as the current basic block, and set the flag of every loop in the control flow graph CFG_i to "False", indicating that the corresponding loop has not been traversed;
Step 2A_2: If the current basic block is the header basic block of a loop L, execute step 2A_3; otherwise, execute step 2A_6;
Step 2A_3: If the flag of the loop L containing the current basic block is "False", execute step 2A_4; otherwise, execute step 2A_5;
Step 2A_4: Add the current basic block to the i-th basic block order queue BBOrderQ_i; among the successor basic blocks of the current basic block, set the successor inside loop L as the current basic block and push the successors outside loop L onto the i-th unprocessed stack UnProcessS_i; then set the flag of loop L to "True" and return to step 2A_2;
Step 2A_5: If the i-th unprocessed stack UnProcessS_i is not empty, pop the basic block at the top of UnProcessS_i, set it as the current basic block, and execute step 2A_2;
Step 2A_6: Add the current basic block to the i-th basic block order queue BBOrderQ_i; if the current basic block has successor basic blocks and a successor satisfies the preorder condition, set any one successor that satisfies the preorder condition as the current basic block, push the remaining successors onto the i-th unprocessed stack UnProcessS_i in turn, and return to step 2A_2; otherwise, execute step 2A_5;
The preorder condition is: the predecessor basic blocks of the successor basic block are in the i-th basic block order queue BBOrderQ_i.
3. The static function call graph construction method suitable for virtual functions and function pointers according to claim 1, characterized in that, in step 2, the virtual function information is obtained by the following process:
Step 2B_1: Define the j-th virtual table structure variable in the intermediate code as V_j; V_j[k] denotes the k-th member array of V_j, each member array corresponding to one class; V_j[k][t] denotes the t-th element of the k-th member array of V_j, where j = 1, 2, ..., N, N is the total number of virtual table structure variables, k = 1, 2, ..., M_j, M_j is the total number of member arrays in V_j, t = 1, 2, ..., M_jk, and M_jk is the total number of elements in the k-th member array of V_j;
Step 2B_2: Convert the first element V_j[k][1] of the k-th member array of V_j into an integer variable VPtrOffset_jk; with the class corresponding to V_j as the key and the integer variable VPtrOffset_jk as the value, construct the key-value pair Pair_1 and add it to the virtual table pointer offset table VirtualPtr_j corresponding to V_j, where VPtrOffset_jk denotes the byte offset of the k-th virtual table pointer in the memory image of an object created from the class corresponding to V_j;
Step 2B_3: Convert the t-th element V_j[k][t] of the k-th member array of V_j into a function pointer type variable Vfun_jkt; with the byte offset V_j[k][1] as the key and the pointer type variable Vfun_jkt as the value, construct the key-value pair Pair_2 and add it to the virtual function information table VirtualTab_jk of the class corresponding to the k-th member array of V_j, where Vfun_jkt denotes the (t-1)-th virtual function in the virtual table of the class corresponding to the k-th member array of V_j, t = 2, ..., M_jk.
4. The static function call graph construction method suitable for virtual functions and function pointers according to claim 1, characterized in that the simulated execution in step 3 specifically comprises the following steps:
Step 3_1: According to the basic block order queue of the static analysis entry function, set the basic block at the head of the queue as the current basic block;
Step 3_2: Set the first instruction in the current basic block as the current instruction;
Step 3_3: Perform emulation address analysis on the current instruction to obtain the variable pointer, function pointer, and virtual function information stored at emulation addresses once the simulated execution of the intermediate code has reached the current instruction, and store this information in the variable pointer relation table R1, the function pointer relation table R2, and the virtual function relation table R3, respectively;
Step 3_4: If the current instruction is a function call instruction, analyze the function actually called by the function call instruction according to the variable pointer relation table R1, the function pointer relation table R2, and the virtual function relation table R3, and record the call relationship of that function; then set the function actually called as the static analysis entry function and recursively execute the simulated execution process of steps 3_1 to 3_4; otherwise, execute step 3_5;
Step 3_5: If unprocessed instructions remain in the current basic block, set the instruction following the current instruction as the current instruction and execute step 3_3; otherwise, execute step 3_6;
Step 3_6: If unprocessed basic blocks remain in the basic block order queue of the static analysis entry function, set the basic block following the current basic block as the current basic block and execute step 3_1.
5. The static function call graph construction method suitable for virtual functions and function pointers according to claim 4, characterized in that, when the simulated execution process starts, step 3_3 first allocates emulation memory for each unprocessed instruction and then performs emulation address analysis as follows:
If the current instruction is a get-element-address instruction, then for each dimension of the get-element-address instruction, obtain the operand value in that dimension and multiply it by the number of bytes occupied by the corresponding operand after memory alignment; then sum the products over all dimensions to obtain the total emulation address offset, and store it in the emulation memory access list;
If the current instruction is a store instruction, determine whether the first operand of the store instruction is an ordinary variable pointer; if so, pair each emulation address in the emulation memory access list of the first operand with each emulation address in the emulation memory access list of the second operand, and add the pairs to the variable pointer relation table R1; otherwise, determine whether the first operand of the store instruction is a function pointer; if so, pair each emulation address in the emulation memory access list of the second operand with the function pointer and add the pairs to the function pointer relation table R2; otherwise, do nothing;
If the current instruction is a load instruction, look up in the variable pointer relation table R1 the emulation addresses that correspond to the emulation addresses of the operand of the load instruction, and add them to the emulation memory access list of the load instruction;
If the current instruction is a type conversion instruction, set the emulation memory access list of the variable after type conversion to the emulation memory access list of the variable before type conversion;
If the current instruction is a PHI instruction, determine whether the basic block containing the current instruction is a loop header basic block; if so, set the emulation memory access list of the PHI instruction to the emulation memory access list of the variable corresponding to the basic block that enters the loop; otherwise, set it to the merged emulation memory access lists of the variables corresponding to the basic blocks of all operands of the PHI instruction;
If the current instruction is a return instruction, determine whether the return value is a pointer; if so, set the emulation memory access list of the function call instruction to the emulation memory access list of the return value; otherwise, do nothing;
If the current instruction is a function call instruction, determine whether the function call takes the form of a virtual function call; if so, obtain the virtual function from the virtual function information collected in step 2, pair the emulation address of the operand of the function call instruction with the virtual function, and add the pair to the virtual function relation table R3; otherwise, do nothing;
If the current instruction is a function call instruction, determine whether a parameter of the function call instruction is of pointer type; if so, set the emulation memory access list of the formal parameter to the emulation memory access list of the actual argument; otherwise, do nothing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910492850.0A CN110187988B (en) | 2019-06-06 | 2019-06-06 | Static function call graph construction method suitable for virtual function and function pointer |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910492850.0A CN110187988B (en) | 2019-06-06 | 2019-06-06 | Static function call graph construction method suitable for virtual function and function pointer |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110187988A true CN110187988A (en) | 2019-08-30 |
CN110187988B CN110187988B (en) | 2021-08-13 |
Family
ID=67720848
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910492850.0A Active CN110187988B (en) | 2019-06-06 | 2019-06-06 | Static function call graph construction method suitable for virtual function and function pointer |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110187988B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101968766A (en) * | 2010-10-21 | 2011-02-09 | 上海交通大学 | System for detecting software bug triggered during practical running of computer program |
US8539450B2 (en) * | 2009-03-11 | 2013-09-17 | Nec Laboratories America, Inc. | Fast and accurate data race detection for concurrent programs with asynchronous calls |
CN103631573A (en) * | 2012-08-24 | 2014-03-12 | 中兴通讯股份有限公司 | Method and system for obtaining execution time of transferable functions |
CN104317969A (en) * | 2014-11-18 | 2015-01-28 | 合肥康捷信息科技有限公司 | Processing method based on type conversion of cfg file and application of processing method based on type conversion of cfg file |
CN104331368A (en) * | 2014-11-18 | 2015-02-04 | 合肥康捷信息科技有限公司 | Method for performing static analysis on C++ virtual function call upon cfg (configuration) files |
CN104881610A (en) * | 2015-06-16 | 2015-09-02 | 北京理工大学 | Method for defending hijacking attacks of virtual function tables |
CN104965788A (en) * | 2015-07-03 | 2015-10-07 | 电子科技大学 | Code static detection method |
CN105242929A (en) * | 2015-10-13 | 2016-01-13 | 西安交通大学 | Design method aiming at binary program automatic parallelization of multi-core platform |
Non-Patent Citations (2)
Title |
---|
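Huang Shuangling, Huang Zhangjin, Gu Naijie: "CFG-based static analysis method for function call relationships" (基于CFG的函数调用关系静态分析方法), Computer Systems & Applications (《计算机系统应用》) * |
Huang Shuangling: "Research on static analysis methods for function call relationships in C/C++ programs" (面向C/C++程序行数调用关系的静态分析方法研究), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》) * |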
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111124527A (en) * | 2019-10-24 | 2020-05-08 | 成都无糖信息技术有限公司 | Method for extracting virtual table function list in dynamic link library |
CN112130848A (en) * | 2020-09-24 | 2020-12-25 | 中国科学院计算技术研究所 | Band width sensing circulation blocking optimization technology facing scratch pad memory |
CN112130848B (en) * | 2020-09-24 | 2022-06-14 | 中国科学院计算技术研究所 | Band-width sensing circulation block optimization method, compiling system, equipment and storage medium for scratch-pad memory |
CN114527963A (en) * | 2020-11-23 | 2022-05-24 | 中国科学院信息工程研究所 | Class inheritance relationship identification method in C + + binary file and electronic device |
CN112395614A (en) * | 2020-11-27 | 2021-02-23 | 南京理工大学 | Android application program virtualization protection method based on LLVM |
CN112395614B (en) * | 2020-11-27 | 2023-07-28 | 南京理工大学 | LLVM-based Android application program virtualization protection method |
CN112487438A (en) * | 2020-12-12 | 2021-03-12 | 南京理工大学 | Heap object Use-After-Free vulnerability detection method based on identifier consistency |
CN112487438B (en) * | 2020-12-12 | 2022-11-04 | 南京理工大学 | Heap object Use-After-Free vulnerability detection method based on identifier consistency |
CN114968417A (en) * | 2021-02-25 | 2022-08-30 | 中移物联网有限公司 | Function calling method, device and equipment |
CN114968417B (en) * | 2021-02-25 | 2024-05-24 | 中移物联网有限公司 | Function calling method, device and equipment |
CN114610320A (en) * | 2022-03-21 | 2022-06-10 | 浙江大学 | LLVM-based variable type information repairing and comparing method and system |
CN114741131A (en) * | 2022-04-02 | 2022-07-12 | 深圳软牛科技有限公司 | Hiding method, device and equipment of dynamic library derived symbols and storage medium |
CN114741131B (en) * | 2022-04-02 | 2023-08-15 | 深圳软牛科技有限公司 | Hiding method, device, equipment and storage medium for dynamic library derived symbol |
CN116340942A (en) * | 2023-03-01 | 2023-06-27 | 软安科技有限公司 | Function call graph construction method based on object propagation graph and pointer analysis |
CN116340942B (en) * | 2023-03-01 | 2024-04-30 | 软安科技有限公司 | Function call graph construction method based on object propagation graph and pointer analysis |
CN116629353A (en) * | 2023-07-24 | 2023-08-22 | 北京邮电大学 | FPGA-oriented coarse-granularity FIFO hardware channel automatic fitting method |
CN116629353B (en) * | 2023-07-24 | 2023-11-07 | 北京邮电大学 | FPGA-oriented coarse-granularity FIFO hardware channel automatic fitting method |
Also Published As
Publication number | Publication date |
---|---|
CN110187988B (en) | 2021-08-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||