CN110187988A - Static function call graph construction method suitable for virtual functions and function pointers - Google Patents

Static function call graph construction method suitable for virtual functions and function pointers

Info

Publication number
CN110187988A
Authority
CN
China
Prior art keywords
function
basic block
instruction
emulation
pointer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910492850.0A
Other languages
Chinese (zh)
Other versions
CN110187988B (en)
Inventor
顾乃杰
张帆
苏俊杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC
Priority to CN201910492850.0A
Publication of CN110187988A
Application granted
Publication of CN110187988B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/54 - Interprogram communication
    • G06F9/546 - Message passing systems or structures, e.g. queues
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/54 - Interprogram communication
    • G06F9/547 - Remote procedure calls [RPC]; Web services

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a static function call graph construction method suitable for virtual functions and function pointers. The method comprises: (1) obtaining the intermediate code of a source program; (2) obtaining key information from the intermediate code, the key information including basic block order queues and virtual-function-related information; (3) based on the key information, performing simulated execution of the intermediate code, analyzing which functions the function call instructions in the intermediate code actually call, and recording the function call relations; (4) constructing a static function call graph from the function call relations. The invention can comprehensively analyze virtual function calls, function pointer calls, and thread creation relations, and can accurately analyze complex function pointer calls, thereby better helping program developers understand programs and improving the accuracy of static analysis methods that depend on function call graphs.

Description

Static function call graph construction method suitable for virtual functions and function pointers
Technical field
The present invention relates to the technical field of computer software, and in particular to a static function call graph construction method suitable for virtual functions and function pointers.
Background art
As the scale and complexity of modern software systems keep increasing, function call graphs are receiving growing attention. On the one hand, a function call graph helps software developers analyze code structure and clarify code logic. On the other hand, function call graphs are widely used in static program analysis. For example, a function call graph can be used to detect and eliminate dead code in a program. It can also be combined with control flow graphs to form an interprocedural control flow graph, which is the basis of flow-sensitive algorithms in static program analysis for vulnerability detection, security analysis, and so on. A complete and accurate function call graph better helps developers understand a program and improves the accuracy of static analysis methods that depend on it.
Current methods for constructing function call graphs fall into two categories: dynamic analysis and static analysis. Dynamic analysis has high precision, but it is difficult to build a complete function call graph with it, so its recall is low. Static analysis has high recall while maintaining relatively high precision, and is therefore more widely favored.
Many methods for constructing static function call graphs have emerged, but existing static analysis methods still have several problems, mainly: 1. no existing static analysis method comprehensively extracts function call relations involving virtual functions, function pointers, and the like; 2. existing methods cannot accurately analyze function pointer call relations; 3. for programs that use multithreading, no existing method statically extracts the parent-child relationships between threads, and the context information of thread functions is lost entirely during static analysis, which leads to low accuracy in the static detection of multithreading problems such as deadlocks, data races, and false sharing.
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention provides a static function call graph construction method suitable for virtual functions and function pointers. The method can comprehensively analyze virtual function calls, function pointer calls, and parent-child relationships between threads, and can accurately analyze complex function pointer calls, so that it better helps program developers understand code and improves the accuracy of static analysis methods that depend on function call graphs. It therefore has good research significance and practical value.
To achieve the above object, the present invention adopts the following technical solution:
The static function call graph construction method suitable for virtual functions and function pointers according to the present invention is characterized in that it proceeds as follows:
Step 1: obtain the intermediate code of the source program;
Step 2: obtain key information from the intermediate code, the key information including basic block order queues and virtual-function-related information;
Step 3: specify a static analysis entry function and, based on the key information, perform simulated execution of the intermediate code, analyze which functions the function call instructions in the intermediate code actually call, and record the function call relations;
Step 4: construct the static function call graph from the function call relations.
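For illustration only, step 4 can be sketched as turning the recorded call relations into a Graphviz DOT description. The function name `to_dot`, the graph name `callgraph`, and the representation of call relations as (caller, callee) pairs are assumptions made for this sketch and are not taken from the patent.

```python
def to_dot(call_relations):
    """Render (caller, callee) pairs as a Graphviz DOT digraph (hypothetical helper)."""
    lines = ["digraph callgraph {"]
    # Sort the edges so that the output is deterministic.
    for caller, callee in sorted(call_relations):
        lines.append(f'  "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical call relations recorded during simulated execution.
    print(to_dot({("main", "init"), ("init", "log")}))
```

Representing the relations as a set of pairs keeps duplicate calls from producing duplicate edges in the graph.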
The static function call graph construction method suitable for virtual functions and function pointers according to the present invention is further characterized in that:
In step 2, the basic block order queue is obtained as follows:
Step 2A_1: obtain the control flow graph CFG_i corresponding to the i-th function F_i in the intermediate code, and obtain from CFG_i its entry basic block EntryBB_i, where i = 1, 2, ..., FN and FN is the total number of functions in the intermediate code;
Set the entry basic block EntryBB_i as the current basic block, and mark the identifiers of all loops in the control flow graph CFG_i as "False", indicating that the corresponding loop has not been traversed;
Step 2A_2: if the current basic block is the header basic block of a loop L, execute step 2A_3; otherwise, execute step 2A_6;
Step 2A_3: if the identifier of the loop L containing the current basic block is "False", execute step 2A_4; otherwise, execute step 2A_5;
Step 2A_4: append the current basic block to the i-th basic block order queue BBOrderQ_i, set the successor of the current basic block that lies inside loop L as the current basic block, push the successors that lie outside loop L onto the i-th unprocessed stack UnProcessS_i, set the identifier of loop L to "True", and return to step 2A_2;
Step 2A_5: if the i-th unprocessed stack UnProcessS_i is non-empty, pop the basic block at the top of UnProcessS_i, set that basic block as the current basic block, and execute step 2A_2;
Step 2A_6: append the current basic block to the i-th basic block order queue BBOrderQ_i; if the current basic block has successors and some successor satisfies the preorder condition, set any one successor that satisfies the preorder condition as the current basic block, push the remaining successors onto the i-th unprocessed stack UnProcessS_i in turn, and return to step 2A_2; otherwise, execute step 2A_5;
The preorder condition is: the predecessor basic blocks of the successor basic block are in the i-th basic block order queue BBOrderQ_i.
In step 2, the virtual-function-related information is obtained as follows:
Step 2B_1: let the j-th virtual table structure variable in the intermediate code be V_j; V_j[k] denotes the k-th member array of V_j, each member array corresponding to one class, and V_j[k][t] denotes the t-th element of the k-th member array of V_j, where j = 1, 2, ..., N, N is the total number of virtual table structure variables, k = 1, 2, ..., M_j, M_j is the total number of member arrays in V_j, t = 1, 2, ..., M_jk, and M_jk is the total number of elements in the k-th member array of V_j;
Step 2B_2: convert the first element V_j[k][1] of the k-th member array of V_j into an integer variable VPtrOffset_jk; with the class corresponding to V_j as the key and the integer variable VPtrOffset_jk as the value, construct a key-value pair Pair_1 and add it to the virtual table pointer offset table VirtualPtr_j corresponding to V_j, where VPtrOffset_jk is the byte offset of the k-th virtual table pointer within the memory layout of objects created from the class corresponding to V_j;
Step 2B_3: convert the t-th element V_j[k][t] of the k-th member array of V_j into a function-pointer-typed variable Vfun_jkt; with the byte offset V_j[k][1] as the key and the pointer-typed variable Vfun_jkt as the value, construct a key-value pair Pair_2 and add it to the virtual function information table VirtualTab_jk of the class corresponding to the k-th member array of V_j, where Vfun_jkt denotes the (t-1)-th virtual function in the virtual table of the class corresponding to the k-th member array of V_j, t = 2, ..., M_jk.
The simulated execution in step 3 specifically comprises the following steps:
Step 3_1: from the basic block order queue of the static analysis entry function, set the basic block at the head of the queue as the current basic block;
Step 3_2: set the first instruction in the current basic block as the current instruction;
Step 3_3: perform emulation address analysis on the current instruction to obtain the variable pointers, function pointers, and virtual function information stored at emulation addresses as the simulated execution of the intermediate code reaches the current instruction, and store them in the variable pointer relation table R_1, the function pointer relation table R_2, and the virtual function relation table R_3, respectively;
Step 3_4: if the current instruction is a function call instruction, analyze the function it actually calls according to the variable pointer relation table R_1, the function pointer relation table R_2, and the virtual function relation table R_3, and record the function call relation; then take the actually called function as the static analysis entry function and recursively execute the simulated execution process of steps 3_1 to 3_4; otherwise, execute step 3_5;
Step 3_5: if the current basic block still contains unprocessed instructions, set the instruction following the current instruction as the current instruction and execute step 3_3; otherwise, execute step 3_6;
Step 3_6: if the basic block order queue of the static analysis entry function still contains unprocessed basic blocks, set the basic block following the current basic block as the current basic block and execute step 3_1.
In step 3_3, emulation memory is allocated to the unprocessed instructions when the simulated execution starts, and emulation address analysis then proceeds as follows:
If the current instruction is a get-element-address instruction, then for each dimension of the instruction, take the operand value of that dimension and multiply it by the memory-aligned byte size of the corresponding operand; sum the products over all dimensions to obtain the total emulation address offset, and store it in the emulation memory access list;
If the current instruction is a store instruction, determine whether the first operand of the store instruction is an ordinary variable pointer; if so, combine each emulation address in the emulation memory access list of the first operand pairwise with each emulation address in the emulation memory access list of the second operand, and add the combinations to the variable pointer relation table R_1; otherwise, determine whether the first operand of the store instruction is a function pointer; if so, combine each emulation address in the emulation memory access list of the second operand with the function pointer and add the combinations to the function pointer relation table R_2; otherwise, do nothing;
If the current instruction is a load instruction, obtain the emulation addresses that correspond, in the variable pointer relation table R_1, to the emulation address of the load instruction's operand, and add them to the emulation memory access list of the load instruction;
If the current instruction is a type conversion instruction, set the emulation memory access list of the converted variable to the emulation memory access list of the variable before conversion;
If the current instruction is a PHI instruction, determine whether the basic block containing the current instruction is a loop header basic block; if so, set the emulation memory access list of the PHI instruction to that of the variable corresponding to the basic block that enters the loop; otherwise, set the emulation memory access list of the PHI instruction to the combination of the emulation memory access lists of the variables corresponding to the basic blocks of all operands of the PHI instruction;
If the current instruction is a return instruction, determine whether the return value is a pointer; if so, set the emulation memory access list of the function call instruction to the emulation memory access list of the return value; otherwise, do nothing;
If the current instruction is a function call instruction, determine whether the call takes the form of a virtual function call; if so, obtain the virtual function from the virtual-function-related information obtained in step 2, combine the emulation address of the call instruction's operand with that virtual function, and add the combination to the virtual function relation table R_3; otherwise, do nothing;
If the current instruction is a function call instruction, determine whether the parameter of the function call instruction is of pointer type; if so, set the emulation memory access list of the actual argument variable of the function call instruction to the emulation memory access list of the formal parameter variable; otherwise, do nothing.
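As a rough illustration of the get-element-address, store, and load cases above, the following sketch models emulation addresses as (I, N) tuples and the relation tables R_1 and R_2 as dictionaries keyed by the emulation address of the memory slot written to. The function names, the dictionary representation, and the (index, stride) encoding of each dimension are assumptions made for this sketch; the patent does not prescribe these data structures.

```python
from itertools import product


def gep_offset(dimensions):
    """Total byte offset: sum of (operand value * aligned byte size) per dimension."""
    return sum(index * stride for index, stride in dimensions)


def store_pointer(r1, value_addrs, slot_addrs):
    """Store of an ordinary pointer: pair every value address with every slot address."""
    for value, slot in product(value_addrs, slot_addrs):
        r1.setdefault(slot, set()).add(value)


def store_function(r2, function, slot_addrs):
    """Store of a function pointer: associate the function with every slot address."""
    for slot in slot_addrs:
        r2.setdefault(slot, set()).add(function)


def load(r1, slot_addrs):
    """Load: collect the addresses that R_1 records for the operand's slot addresses."""
    result = set()
    for slot in slot_addrs:
        result |= r1.get(slot, set())
    return result


if __name__ == "__main__":
    # Hypothetical example: index 1 in an array of 16-byte elements, then
    # index 1 of a dimension whose aligned size is 8 bytes.
    print(gep_offset([(1, 16), (1, 8)]))
    r1 = {}
    store_pointer(r1, [(1, 0)], [(2, 0)])   # the slot (2, 0) now points to (1, 0)
    print(load(r1, [(2, 0)]))
```

Because every access list may contain several emulation addresses, the store case records all pairwise combinations, which is how the analysis keeps more than one possible pointer target alive.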
Compared with the prior art, the beneficial effects of the present invention are as follows:
1. The proposed method comprehensively analyzes function call relations involving virtual functions, function pointers, and the like, and supports the pthread library, so that parent-child relationships between threads can be extracted. For complex function pointer usage such as function pointers returned by functions, callback functions, and arrays of function pointers, the analysis results of the method are more accurate.
2. The method extracts virtual-function-related information by analyzing the virtual table variables in the intermediate code, so that virtual function calls can be analyzed accurately.
3. By simulating execution of the intermediate code and allocating, computing, and propagating emulation addresses during the simulated execution, the method determines the functions actually called by function call instructions. This solves the problems that existing methods cannot accurately analyze function pointers and cannot analyze parent-child relationships between threads; compared with existing methods, the method achieves higher recall and precision.
Brief description of the drawings
Fig. 1 is a flow chart of SExecCG in the embodiment of the present invention;
Fig. 2 is a schematic diagram of an example of obtaining a basic block order queue in the embodiment of the present invention;
Fig. 3 is a schematic diagram of an example virtual table in LLVM IR in the embodiment of the present invention;
Fig. 4 is a schematic diagram of an example emulation memory access list in the embodiment of the present invention;
Fig. 5 shows a C++ program designed for the embodiment of the present invention;
Fig. 6 is a schematic diagram of the dot file generated by the embodiment of the present invention when analyzing the program of Fig. 5;
Fig. 7 is the function call graph constructed by the embodiment of the present invention for the program of Fig. 5.
Detailed description of the embodiments
To make the technical problem to be solved, the technical solution, and the advantages of the present invention clearer, a detailed description is given below with reference to the drawings and specific embodiments.
As shown in Fig. 1, the present invention relates to a static function call graph construction method suitable for virtual functions and function pointers. A tool for constructing static call graphs, SExecCG (call graph based on simulation execution), has been implemented according to the method. The invention belongs to the field of software engineering and also to the field of static program analysis.
Step 1: obtain the intermediate code of the source program
The embodiment implements the method under the LLVM compiler framework, using the framework's Clang/Clang++ compilers, with appropriate modifications made to the Clang/Clang++ source code. The purpose of modifying the compiler source is that, when a large program is analyzed, no manual compilation is needed: adding the corresponding compile option to the original build script is enough to produce the intermediate code of the whole project. The specific modifications are:
A custom Pass is added. The Pass does not modify the result of compilation; by means of instrumentation, it dumps the results produced while compiling the source program into LLVM IR files in .bc format.
A custom compile option is added that corresponds to the custom Pass above.
When a C/C++ source program is compiled with the modified Clang/Clang++ compiler, the LLVM IR files of the whole project are generated through instrumentation once the relevant compile options of the source program have all been processed.
Step 2: obtain key information from the intermediate code, the key information including basic block order queues and virtual-function-related information;
The key information obtained from the intermediate code in the embodiment consists mainly of two parts: basic block order queues and virtual-function-related information. Step 2A processes the control flow graph of each function in the LLVM IR file, orders the basic blocks in each control flow graph, and generates the basic block order queue of each function. Step 2B obtains the virtual-function-related information, including the byte offsets of virtual table pointers in the object memory layout and the virtual functions corresponding to each class.
Branches and loops are the main program constructs related to execution paths. Under static single assignment (SSA) form, the same variable has different representations on different branch paths, and a PHI node finally represents the variable uniformly where the branches merge. The PHI node then already contains the state information of the variable on the multiple branch paths, so simulated execution does not need to traverse all program paths: it only needs to order the basic blocks in the control flow graph and combine states during simulated execution.
Specifically, the basic block order queue is obtained as follows:
Step 2A_1: obtain the control flow graph CFG_i corresponding to the i-th function F_i in the intermediate code, and obtain from CFG_i its entry basic block EntryBB_i, where i = 1, 2, ..., FN and FN is the total number of functions in the intermediate code;
Set the entry basic block EntryBB_i as the current basic block, and mark the identifiers of all loops in the control flow graph CFG_i as "False", indicating that the corresponding loop has not been traversed;
Step 2A_2: if the current basic block is the header basic block of a loop L, execute step 2A_3; otherwise, execute step 2A_6;
Step 2A_3: if the identifier of the loop L containing the current basic block is "False", execute step 2A_4; otherwise, execute step 2A_5;
Step 2A_4: append the current basic block to the i-th basic block order queue BBOrderQ_i, set the successor of the current basic block that lies inside loop L as the current basic block, push the successors that lie outside loop L onto the i-th unprocessed stack UnProcessS_i, set the identifier of loop L to "True", and return to step 2A_2;
Step 2A_5: if the i-th unprocessed stack UnProcessS_i is non-empty, pop the basic block at the top of UnProcessS_i, set that basic block as the current basic block, and execute step 2A_2;
Step 2A_6: append the current basic block to the i-th basic block order queue BBOrderQ_i; if the current basic block has successors and some successor satisfies the preorder condition, set any one successor that satisfies the preorder condition as the current basic block, push the remaining successors onto the i-th unprocessed stack UnProcessS_i in turn, and return to step 2A_2; otherwise, execute step 2A_5;
The preorder condition is: the predecessor basic blocks of the successor basic block are in the i-th basic block order queue BBOrderQ_i.
Fig. 2 is a schematic diagram of the control flow graph preprocessing. Each node in the control flow graph is a basic block, and each basic block has a label. After the control flow graph preprocessing algorithm has run, the order queue of the basic blocks is obtained.
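The ordering of steps 2A_1 to 2A_6 can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the dictionary-based graph encoding and all names are hypothetical, and three points that the text leaves open are filled in by assumption. First, predecessors that reach a loop header through a back edge are ignored when the preorder condition is checked. Second, blocks popped from the unprocessed stack are skipped if they are already in the queue. Third, if a loop header has several successors inside the loop, the first one becomes the current basic block and the others are pushed onto the stack.

```python
def order_basic_blocks(entry, succs, loops):
    """Return a basic block order queue for one control flow graph.

    succs: dict mapping a block label to the list of its successor labels.
    loops: dict mapping a loop header label to the set of labels in that loop.
    """
    preds = {}
    for block, targets in succs.items():
        for target in targets:
            preds.setdefault(target, set()).add(block)

    def meets_preorder(block, queued):
        # Assumption: predecessors inside the block's own loop (back edges
        # into a loop header) are ignored when checking the preorder condition.
        own_loop = loops.get(block, set())
        return all(p in queued for p in preds.get(block, set()) if p not in own_loop)

    order, queued, stack = [], set(), []
    traversed = {header: False for header in loops}    # loop identifiers
    current = entry
    while current is not None:
        nxt = None
        revisit = current in queued or traversed.get(current, False)
        if not revisit:
            order.append(current)
            queued.add(current)
            targets = succs.get(current, [])
            if current in loops:                        # steps 2A_3 and 2A_4
                traversed[current] = True
                inside = [s for s in targets if s in loops[current]]
                nxt = inside[0] if inside else None
                stack.extend(s for s in targets if s != nxt)
            else:                                       # step 2A_6
                ready = [s for s in targets if meets_preorder(s, queued)]
                if ready:
                    nxt = ready[0]
                    stack.extend(s for s in targets if s != nxt)
        while nxt is None and stack:                    # step 2A_5
            candidate = stack.pop()
            if candidate not in queued:                 # assumption: skip queued blocks
                nxt = candidate
        current = nxt
    return order


if __name__ == "__main__":
    # Hypothetical graph: entry -> H; H -> body, exit; body -> H (back edge).
    succs = {"entry": ["H"], "H": ["body", "exit"], "body": ["H"]}
    loops = {"H": {"H", "body"}}
    print(order_basic_blocks("entry", succs, loops))  # ['entry', 'H', 'body', 'exit']
```

In the example, the loop body is queued before the loop exit, and the second arrival at the header H only triggers a pop from the unprocessed stack, so each loop is traversed once.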
The virtual table of a class is determined at compile time. To obtain virtual functions accurately during the simulated execution stage, virtual-function-related information must be collected in advance. In an LLVM IR file, a virtual table is a global structure variable. The structure variable consists of member arrays, each corresponding to a class, and the elements of a member array are variables of type i8*. The value obtained by converting the first element of a member array to an integer is the virtual table pointer byte offset of the class corresponding to that member array; the elements after the first, converted to function pointers, are the virtual functions of that class. Fig. 3 shows the virtual table structure variable in LLVM IR corresponding to the Derived class of the program in Fig. 5.
The global structure variables are parsed, and the parsed information such as byte offsets and virtual functions is saved. When a virtual function call is analyzed, the class that actually created the object is identified first; then, from the correspondence between classes and virtual table pointer byte offsets, the byte offset of the virtual table pointer within the object is determined; finally, from the correspondence between virtual table pointers and virtual functions, the specific virtual function in that virtual table being called is determined.
Specifically, the virtual-function-related information is obtained as follows:
Step 2B_1: let the j-th virtual table structure variable in the intermediate code be V_j; V_j[k] denotes the k-th member array of V_j, each member array corresponding to one class, and V_j[k][t] denotes the t-th element of the k-th member array of V_j, where j = 1, 2, ..., N, N is the total number of virtual table structure variables, k = 1, 2, ..., M_j, M_j is the total number of member arrays in V_j, t = 1, 2, ..., M_jk, and M_jk is the total number of elements in the k-th member array of V_j;
Step 2B_2: convert the first element V_j[k][1] of the k-th member array of V_j into an integer variable VPtrOffset_jk; with the class corresponding to V_j as the key and the integer variable VPtrOffset_jk as the value, construct a key-value pair Pair_1 and add it to the virtual table pointer offset table VirtualPtr_j corresponding to V_j, where VPtrOffset_jk is the byte offset of the k-th virtual table pointer within the memory layout of objects created from the class corresponding to V_j;
Step 2B_3: convert the t-th element V_j[k][t] of the k-th member array of V_j into a function-pointer-typed variable Vfun_jkt; with the byte offset V_j[k][1] as the key and the pointer-typed variable Vfun_jkt as the value, construct a key-value pair Pair_2 and add it to the virtual function information table VirtualTab_jk of the class corresponding to the k-th member array of V_j, where Vfun_jkt denotes the (t-1)-th virtual function in the virtual table of the class corresponding to the k-th member array of V_j, t = 2, ..., M_jk.
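A simplified sketch of steps 2B_1 to 2B_3 and of the later lookup is given below. It models each virtual table structure variable as a list of member arrays whose first element is the byte offset and whose remaining elements are virtual function names, following the description above. The class and function names in the example, the helper names, and the use of a (class, offset) tuple as the key of the virtual function table are hypothetical choices for this sketch.

```python
def collect_virtual_info(vtables):
    """vtables: dict mapping a class name to its list of member arrays.

    Each member array is [byte_offset, function, function, ...].
    Returns (vptr_offsets, virtual_tabs):
      vptr_offsets: class name -> list of virtual table pointer byte offsets
      virtual_tabs: (class name, byte offset) -> list of virtual functions
    """
    vptr_offsets = {}
    virtual_tabs = {}
    for cls, member_arrays in vtables.items():
        for array in member_arrays:
            offset = int(array[0])                          # step 2B_2
            vptr_offsets.setdefault(cls, []).append(offset)
            virtual_tabs[(cls, offset)] = list(array[1:])   # step 2B_3
    return vptr_offsets, virtual_tabs


def resolve_virtual_call(cls, offset, slot, virtual_tabs):
    """Look up the virtual function at position `slot` (0-based) of the table
    selected by the creating class and the virtual table pointer offset."""
    return virtual_tabs[(cls, offset)][slot]


if __name__ == "__main__":
    # Hypothetical classes: Derived overrides f and inherits g from Base.
    vtables = {
        "Base": [[0, "Base::f", "Base::g"]],
        "Derived": [[0, "Derived::f", "Base::g"]],
    }
    offsets, tabs = collect_virtual_info(vtables)
    print(resolve_virtual_call("Derived", 0, 0, tabs))  # Derived::f
```

The lookup mirrors the three-stage resolution described earlier: the creating class is identified, the virtual table pointer offset selects the table, and the position in that table selects the function.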
Step 3: specify a static analysis entry function and, based on the key information, perform simulated execution of the intermediate code, analyze which functions the function call instructions in the intermediate code actually call, and record the function call relations;
Simulated execution of the program makes it possible to resolve function calls that can otherwise only be determined at run time, such as function pointer calls and virtual function calls.
To simulate the real physical addresses of variables during dynamic execution of the program, the embodiment designs a two-dimensional emulation memory. By computing and propagating emulation addresses, changes in what a pointer points to, function parameter passing, type conversion, and so on can be simulated accurately.
Each emulation address in the emulation memory has the form (I, N). "I" corresponds one-to-one with a variable, meaning that each variable has a block of one-dimensional emulation memory labeled "I". "N" indicates that a member of the variable is located at the N-th byte of that one-dimensional memory. The emulation address is designed in two-dimensional form so that the size of a compound-type variable need not be known in advance when emulation memory is allocated for it, and so that the emulation addresses of different variables cannot overlap.
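The (I, N) address form can be illustrated with a small allocator. The class name `EmulatedMemory`, its method names, and the choice to number variables from 1 are assumptions made for this sketch; only the two-component address shape comes from the description above.

```python
class EmulatedMemory:
    """Hands out two-dimensional emulation addresses of the form (I, N)."""

    def __init__(self):
        self._index = {}   # variable name -> label I of its one-dimensional memory

    def base(self, variable):
        """Return (I, 0); a fresh label I is created on first use of the variable."""
        if variable not in self._index:
            self._index[variable] = len(self._index) + 1
        return (self._index[variable], 0)

    def member(self, variable, byte_offset):
        """Return the address of the member at `byte_offset` within the variable."""
        return (self.base(variable)[0], byte_offset)


if __name__ == "__main__":
    memory = EmulatedMemory()
    print(memory.base("obj"))         # (1, 0)
    print(memory.member("table", 8))  # (2, 8)
    print(memory.member("obj", 4))    # (1, 4)
```

Since the label I alone distinguishes variables, no size needs to be reserved for "obj" before "table" is allocated, and an arbitrarily large offset into one variable can never collide with an address of another.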
The simulated execution specifically comprises the following steps:
Step 3_1, according to the basic block order queue of the static analysis entry function, set the basic block at the head of the queue as the current basic block;
Step 3_2, set the first instruction in the current basic block as the current instruction;
Step 3_3, perform emulated address analysis on the current instruction to obtain the variable pointer, function pointer, and virtual function information stored at emulated addresses when the simulated execution of the intermediate code reaches the current instruction, and store this information in the variable pointer relation table R1, the function pointer relation table R2, and the virtual function relation table R3, respectively;
Step 3_4, if the current instruction is a function call instruction, analyze the function actually called by the function call instruction according to the variable pointer relation table R1, the function pointer relation table R2, and the virtual function relation table R3, and record the call relation between the calling function and the called function; then set the actually called function as the static analysis entry function and recursively execute the simulated execution process of steps 3_1 to 3_4; otherwise, execute step 3_5;
Step 3_5, if unprocessed instructions remain in the current basic block, set the instruction following the current instruction as the current instruction and execute step 3_3; otherwise, execute step 3_6;
Step 3_6, if unprocessed basic blocks remain in the basic block order queue of the static analysis entry function, set the basic block following the current basic block as the current basic block and execute step 3_1.
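The control flow of steps 3_1 to 3_6 can be sketched as a recursive walk. This is a simplified illustration under several assumptions: the program is a hypothetical dictionary from function names to already-ordered basic blocks, instructions are tuples, `resolve` stands in for the lookup in R1, R2, and R3, and the `active` set is a recursion guard added here that the text does not describe.

```python
def simulate(entry, program, resolve, edges=None, active=None):
    """Walk `entry` block by block and instruction by instruction.

    program: function name -> list of basic blocks (the order queue),
             each block being a list of instruction tuples.
    resolve: maps a call operand to the functions it may actually call.
    """
    edges = set() if edges is None else edges
    active = set() if active is None else active
    if entry not in program or entry in active:
        return edges                      # external function or recursion
    active.add(entry)
    for block in program[entry]:          # steps 3_1 and 3_6
        for instr in block:               # steps 3_2 and 3_5
            # Step 3_3 (emulated address analysis) would update R1, R2, R3 here.
            if instr[0] == "call":        # step 3_4
                for callee in resolve(instr[1]):
                    edges.add((entry, callee))
                    simulate(callee, program, resolve, edges, active)
    active.discard(entry)
    return edges


# Hypothetical toy program: main calls init directly and work via a pointer.
program = {
    "main": [[("call", "init")], [("call", "%fp")]],
    "init": [[("other",)]],
    "work": [[("call", "init")]],
}
fp_targets = {"%fp": ["work"]}            # stand-in for the relation tables


def resolve(operand):
    return fp_targets.get(operand, [operand])


edges = simulate("main", program, resolve)
```

The recorded pairs are exactly the edges that step 4 later turns into the call graph.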
The emulated address analysis algorithm in step 3_3 mainly performs simulated execution of a single instruction. By allocating emulated memory, computing the emulated addresses of compound-type variables, and propagating emulated addresses, it maintains the information stored at emulated addresses, such as variable pointers, function pointers, and virtual function pointers.
In step 3_3, when the simulated execution process starts, emulated memory is first allocated for the unprocessed instructions, and emulated address analysis is then carried out as follows:
If the current instruction is a GetElementPtr instruction, then for each dimension of the GetElementPtr instruction, obtain the operand value in that dimension and multiply it by the memory-aligned byte count of the corresponding operand; then sum the products over all dimensions to obtain the total emulated address offset, and store it in the emulated memory access list. Fig. 4 is a schematic diagram of one possible emulated memory access list: %2 accesses the 2nd byte of the emulated memory with index 1, and %3 accesses the 4th byte and the 8th byte of the emulated memory with index 2;
If the current instruction is a Store instruction, judge whether the first operand of the Store instruction is an ordinary variable pointer. If so, pair each emulated address in the emulated memory access list of the first operand with each emulated address in the emulated memory access list of the second operand, and add the pairs to the variable pointer relation table R1. Otherwise, judge whether the first operand of the Store instruction is a function pointer; if so, combine each emulated address in the emulated memory access list of the second operand with the function pointer, and add the combinations to the function pointer relation table R2; otherwise, do nothing;
If the current instruction is a Load instruction, look up the emulated addresses that correspond, in the variable pointer relation table R1, to the emulated addresses of the Load instruction's operand, and add those corresponding emulated addresses to the emulated memory access list of the Load instruction;
If the current instruction is a Cast instruction, set the emulated memory access list of the variable after the type conversion to the emulated memory access list of the variable before the type conversion;
If the current instruction is a PHI instruction, judge whether the basic block containing the current instruction is a loop header basic block. If so, set the emulated memory access list of the PHI instruction to the emulated memory access list of the variable corresponding to the basic block that enters the loop; otherwise, set the emulated memory access list of the PHI instruction to the combination of the emulated memory access lists of the variables corresponding to the basic blocks of all operands of the PHI instruction;
If the current instruction is a Ret instruction, judge whether the return value is a pointer; if so, set the emulated memory access list of the function call instruction to the emulated memory access list of the return value; otherwise, do nothing;
If the current instruction is a Call/Invoke instruction, judge whether the function call has the form of a virtual function call; if so, obtain the virtual function according to the virtual function related information obtained in step 2, combine the emulated address of the operand of the Call/Invoke instruction with the virtual function, and add the combination to the virtual function relation table R3; otherwise, do nothing;
If the current instruction is a Call/Invoke instruction, judge whether a parameter of the Call/Invoke instruction is of pointer type; if so, set the emulated memory access list of the formal parameter variable to the emulated memory access list of the corresponding actual argument variable; otherwise, do nothing.
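Several of the per-instruction rules above can be illustrated together in one small Python sketch. It is a toy model under stated assumptions, not the patented implementation: the `Analyzer` class, its method names, the tuple encoding of addresses, and the `strides` argument (the aligned byte count per dimension) are all hypothetical. The Ret rule and the virtual call rule (table R3) are omitted for brevity.

```python
class Analyzer:
    """Toy emulated address analysis; addresses are (I, N) tuples."""

    def __init__(self, functions):
        self.functions = set(functions)  # known function names
        self.access = {}                 # value name -> emulated memory access list
        self.r1 = {}                     # R1: slot address -> pointed-to addresses
        self.r2 = {}                     # R2: slot address -> function names

    def _extend(self, name, addrs):
        target = self.access.setdefault(name, [])
        for addr in addrs:
            if addr not in target:
                target.append(addr)

    def gep(self, result, base, indices, strides):
        # Total offset = sum over dimensions of index * aligned byte count.
        delta = sum(i * s for i, s in zip(indices, strides))
        self._extend(result, [(blk, off + delta)
                              for blk, off in self.access.get(base, [])])

    def store(self, value, dest):
        slots = self.access.get(dest, [])
        if value in self.access:          # ordinary variable pointer -> R1
            for slot in slots:
                self.r1.setdefault(slot, set()).update(self.access[value])
        elif value in self.functions:     # function pointer -> R2
            for slot in slots:
                self.r2.setdefault(slot, set()).add(value)

    def load(self, result, operand):
        for slot in self.access.get(operand, []):
            self._extend(result, sorted(self.r1.get(slot, ())))

    def cast(self, result, operand):
        self.access[result] = list(self.access.get(operand, []))

    def phi(self, result, incoming, loop_blocks=None):
        # In a loop header, keep only values arriving from outside the loop.
        for value, pred in incoming:
            if loop_blocks is not None and pred in loop_blocks:
                continue
            self._extend(result, self.access.get(value, []))

    def bind_argument(self, formal, actual):
        self.access[formal] = list(self.access.get(actual, []))


an = Analyzer(functions={"func_3"})
an.access["%arr"] = [(1, 0)]   # block 1: an array of pointers
an.access["%x"] = [(2, 0)]     # block 2: some variable
an.access["%fp"] = [(3, 0)]    # block 3: a function pointer slot
an.gep("%slot", "%arr", indices=[0, 1], strides=[16, 8])
an.store("%x", "%slot")        # the slot now points at block 2
an.load("%p", "%slot")         # %p receives what the slot points at
an.store("func_3", "%fp")      # the slot in block 3 now holds func_3
an.phi("%m", [("%p", "then"), ("%slot", "else")])
an.phi("%i", [("%x", "entry"), ("%slot", "latch")], loop_blocks={"head", "latch"})
an.bind_argument("%param", "%p")
```

Keeping every value's state as a list of (I, N) pairs means each rule reduces to copying, merging, or shifting such lists, which is why the tables R1 and R2 can be plain dictionaries in this sketch.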
Step 4, construct the static function call graph according to the function call relations.
The embodiment of the present invention supports visualization of the static function call graph. The function call relations recorded during simulated execution are first organized into the dot file format supported by GraphViz; the dot drawing tool in GraphViz is then invoked to visualize the function call graph and generate output files in formats such as eps, png, and pdf.
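As an illustration of the first half of this step, the following sketch turns recorded caller and callee pairs into dot text. The function name `to_dot`, the graph name, and the caller `main` are assumptions made for the example; `func_3` and `func_4` are borrowed from the experiment described in this document.

```python
def to_dot(edges, name="callgraph"):
    """Render (caller, callee) pairs as a GraphViz digraph in dot syntax."""
    lines = [f"digraph {name} {{"]
    for caller, callee in sorted(edges):
        # Quote node names so that mangled C++ names remain valid identifiers.
        lines.append(f'    "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines) + "\n"


dot_text = to_dot({("main", "func_3"), ("main", "func_4")})
```

Saving `dot_text` to a file such as callgraph.dot and running `dot -Tpng callgraph.dot -o callgraph.png` would then produce the image; the file names here are placeholders.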
To demonstrate the validity of the method of the embodiment of the present invention, an experiment is described here:
For the experiment, a simple C++ program was designed, as shown in Fig. 5. The program uses virtual functions, function pointers (including more complex usages), and thread-related functions from the pthread library. Fig. 6 is the dot file obtained by analyzing the program of Fig. 5 with the embodiment of the present invention, and Fig. 7 is the function call graph that the embodiment constructs for the program of Fig. 5. As can be seen from Fig. 7, the embodiment identifies the constructor calls from the base class to the derived class, the call to the virtual function vfunc_1, the calls to func_3 and func_4 through function pointers returned by functions, and the parent-child relationships between threads. The embodiment identifies every function that the program of Fig. 5 may call, and the analysis results are correct.
The embodiment of the present invention proposes a static function call graph construction method suitable for virtual functions and function pointers. The method performs static analysis on the LLVM IR file generated by compiling the source program. By allocating, computing, and propagating the emulated addresses designed in the present invention, it can accurately simulate the execution of the program. During simulated execution, information such as virtual functions and pointer targets is maintained, so that the static function call graph of the program can finally be constructed.
The experimental results show that the method is feasible. It can not only comprehensively analyze function call relations involving virtual functions and function pointers, but also accurately analyze complex uses of function pointers, such as function pointers returned by functions, callback functions, and calls through pointers stored in function pointer arrays. Finally, the method also supports the pthread library and can extract the parent-child relationships between threads. The method of the present invention therefore has a certain practical and research value.

Claims (5)

1. A static function call graph construction method suitable for virtual functions and function pointers, characterized in that it is carried out as follows:
Step 1, obtain the intermediate code of a source program;
Step 2, obtain key information from the intermediate code, the key information comprising basic block order queues and virtual function related information;
Step 3, specify a static analysis entry function and, based on the key information, perform simulated execution of the intermediate code to analyze the actual functions called by the function call instructions in the intermediate code, while recording function call relations;
Step 4, construct the static function call graph according to the function call relations.
2. The static function call graph construction method suitable for virtual functions and function pointers according to claim 1, characterized in that, in step 2, the basic block order queue is obtained by the following process:
Step 2A_1, obtain the control flow graph CFG_i corresponding to the i-th function F_i in the intermediate code, and obtain from the control flow graph CFG_i its entry basic block EntryBB_i; i = 1, 2, ..., FN, where FN is the total number of functions in the intermediate code;
Set the entry basic block EntryBB_i as the current basic block, and set the flags of all loops in the control flow graph CFG_i to "False", indicating that the corresponding loop has not been traversed;
Step 2A_2, if the current basic block is the header basic block of a loop L, execute step 2A_3; otherwise, execute step 2A_6;
Step 2A_3, if the flag of the loop L containing the current basic block is "False", execute step 2A_4; otherwise, execute step 2A_5;
Step 2A_4, add the current basic block to the i-th basic block order queue BBOrderQ_i; among the successor basic blocks of the current basic block, set the successor basic block inside loop L as the current basic block and push the successor basic blocks outside loop L onto the i-th unprocessed stack UnProcessS_i; then set the flag of loop L to "True" and return to step 2A_2;
Step 2A_5, if the i-th unprocessed stack UnProcessS_i is non-empty, pop the basic block at the top of the i-th unprocessed stack UnProcessS_i, set that basic block as the current basic block, and execute step 2A_2;
Step 2A_6, add the current basic block to the i-th basic block order queue BBOrderQ_i; if the current basic block has successor basic blocks and a successor basic block satisfies the pre-order condition, set any one successor basic block that satisfies the pre-order condition as the current basic block, push the remaining successor basic blocks onto the i-th unprocessed stack UnProcessS_i in turn, and return to step 2A_2; otherwise, execute step 2A_5;
The pre-order condition is: the predecessor basic blocks of the successor basic block are in the i-th basic block order queue BBOrderQ_i.
3. The static function call graph construction method suitable for virtual functions and function pointers according to claim 1, characterized in that, in step 2, the virtual function related information is obtained by the following process:
Step 2B_1, define the j-th virtual table struct variable in the intermediate code as V_j; V_j[k] denotes the k-th member array of the j-th virtual table struct variable V_j, each member array corresponding to one class; V_j[k][t] denotes the t-th element of the k-th member array of V_j; where j = 1, 2, ..., N, N is the total number of virtual table struct variables; k = 1, 2, ..., M_j, M_j is the total number of member arrays in V_j; t = 1, 2, ..., M_jk, M_jk is the total number of elements in the k-th member array of V_j;
Step 2B_2, convert the 1st element V_j[k][1] of the k-th member array of V_j into an integer variable VPtrOffset_jk; with the class corresponding to V_j as the key and the integer variable VPtrOffset_jk as the value, construct a key-value pair Pair1, and add Pair1 to the virtual table pointer offset table VirtualPtr_j corresponding to V_j; where the integer variable VPtrOffset_jk denotes the byte offset of the k-th virtual table pointer within the memory image of an object created from the class corresponding to V_j;
Step 2B_3, convert the t-th element V_j[k][t] of the k-th member array of V_j into a function pointer type variable Vfun_jkt; with the byte offset V_j[k][1] as the key and the pointer type variable Vfun_jkt as the value, construct a key-value pair Pair2, and add Pair2 to the virtual function related information table VirtualTab_jk of the class corresponding to the k-th member array of V_j; where the pointer type variable Vfun_jkt denotes the (t-1)-th virtual function in the virtual table of the class corresponding to the k-th member array of V_j, t = 2, ..., M_jk.
4. The static function call graph construction method suitable for virtual functions and function pointers according to claim 1, characterized in that the simulated execution in step 3 specifically comprises the following steps:
Step 3_1, according to the basic block order queue of the static analysis entry function, set the basic block at the head of the queue as the current basic block;
Step 3_2, set the first instruction in the current basic block as the current instruction;
Step 3_3, perform emulated address analysis on the current instruction to obtain the variable pointer, function pointer, and virtual function information stored at emulated addresses when the simulated execution of the intermediate code reaches the current instruction, and store this information in the variable pointer relation table R1, the function pointer relation table R2, and the virtual function relation table R3, respectively;
Step 3_4, if the current instruction is a function call instruction, analyze the function actually called by the function call instruction according to the variable pointer relation table R1, the function pointer relation table R2, and the virtual function relation table R3, and record the call relation between the calling function and the called function; then set the actually called function as the static analysis entry function and recursively execute the simulated execution process of steps 3_1 to 3_4; otherwise, execute step 3_5;
Step 3_5, if unprocessed instructions remain in the current basic block, set the instruction following the current instruction as the current instruction and execute step 3_3; otherwise, execute step 3_6;
Step 3_6, if unprocessed basic blocks remain in the basic block order queue of the static analysis entry function, set the basic block following the current basic block as the current basic block and execute step 3_1.
5. The static function call graph construction method suitable for virtual functions and function pointers according to claim 4, characterized in that, in step 3_3, when the simulated execution process starts, emulated memory is first allocated for the unprocessed instructions, and emulated address analysis is then carried out as follows:
If the current instruction is a get-element-address instruction, then for each dimension of the get-element-address instruction, obtain the operand value in that dimension and multiply it by the memory-aligned byte count of the corresponding operand; then sum the products over all dimensions to obtain the total emulated address offset, and store it in the emulated memory access list;
If the current instruction is a store instruction, judge whether the first operand of the store instruction is an ordinary variable pointer; if so, pair each emulated address in the emulated memory access list of the first operand of the store instruction with each emulated address in the emulated memory access list of the second operand, and add the pairs to the variable pointer relation table R1; otherwise, judge whether the first operand of the store instruction is a function pointer; if so, combine each emulated address in the emulated memory access list of the second operand of the store instruction with the function pointer, and add the combinations to the function pointer relation table R2; otherwise, do nothing;
If the current instruction is a load instruction, look up the emulated addresses that correspond, in the variable pointer relation table R1, to the emulated addresses of the operand of the load instruction, and add those corresponding emulated addresses to the emulated memory access list of the load instruction;
If the current instruction is a type conversion instruction, set the emulated memory access list of the variable after the type conversion to the emulated memory access list of the variable before the type conversion;
If the current instruction is a PHI instruction, judge whether the basic block containing the current instruction is a loop header basic block; if so, set the emulated memory access list of the PHI instruction to the emulated memory access list of the variable corresponding to the basic block that enters the loop; otherwise, set the emulated memory access list of the PHI instruction to the combination of the emulated memory access lists of the variables corresponding to the basic blocks of all operands of the PHI instruction;
If the current instruction is a return instruction, judge whether the return value is a pointer; if so, set the emulated memory access list of the function call instruction to the emulated memory access list of the return value; otherwise, do nothing;
If the current instruction is a function call instruction, judge whether the function call has the form of a virtual function call; if so, obtain the virtual function according to the virtual function related information obtained in step 2, combine the emulated address of the operand of the function call instruction with the virtual function, and add the combination to the virtual function relation table R3; otherwise, do nothing;
If the current instruction is a function call instruction, judge whether a parameter of the function call instruction is of pointer type; if so, set the emulated memory access list of the formal parameter variable to the emulated memory access list of the corresponding actual argument variable; otherwise, do nothing.
CN201910492850.0A 2019-06-06 2019-06-06 Static function call graph construction method suitable for virtual function and function pointer Active CN110187988B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910492850.0A CN110187988B (en) 2019-06-06 2019-06-06 Static function call graph construction method suitable for virtual function and function pointer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910492850.0A CN110187988B (en) 2019-06-06 2019-06-06 Static function call graph construction method suitable for virtual function and function pointer

Publications (2)

Publication Number Publication Date
CN110187988A true CN110187988A (en) 2019-08-30
CN110187988B CN110187988B (en) 2021-08-13

Family

ID=67720848

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910492850.0A Active CN110187988B (en) 2019-06-06 2019-06-06 Static function call graph construction method suitable for virtual function and function pointer

Country Status (1)

Country Link
CN (1) CN110187988B (en)

Also Published As

Publication number Publication date
CN110187988B (en) 2021-08-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant