CN108874470B - Information processing method, server and computer storage medium - Google Patents

Information processing method, server and computer storage medium

Publication number
CN108874470B
Authority
CN
China
Prior art keywords
function
address
tested
code
call
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710331113.3A
Other languages
Chinese (zh)
Other versions
CN108874470A (en)
Inventor
熊彪
尚鸿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201710331113.3A
Publication of CN108874470A
Application granted
Publication of CN108874470B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/547 Remote procedure calls [RPC]; Web services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an information processing method, a server and a computer storage medium. The method comprises: statically traversing, in reverse, the function code in a binary file; parsing the function code to obtain a global function topological relation in the function code; dynamically acquiring the actual address at which a function is called while the function code runs; obtaining a local function topological relation according to that actual address; and, according to the local function topological relation, filling in the missing data in the global function topological relation to obtain a function call relation chain.

Description

Information processing method, server and computer storage medium
Technical Field
The present invention relates to communications technologies, and in particular, to an information processing method, a server, and a computer storage medium.
Background
Various applications used by the user, various internet services, various life services, and the like can be implemented by computer programming. In programming, a plurality of functions are sometimes called in series to achieve a specific purpose, and such a calling mode is referred to as a function chain call. Some functions in the function chain are independent, and some functions are only used in function combination and are not called independently. The continuous configuration and combination of the objects are the common occasions of function chain calling, and a corresponding assembly class library is required to be called in the compiling process of programming.
The conventional way to obtain a function call relation chain uses the doxygen tool. This approach statically parses the source code and generates the function call relation chain from the calls found in it; that is, the chain is obtained by analyzing source code.
This approach has the following problem: when a dynamic language is analyzed by static source parsing, the function call relation chain has gaps, because the actual function address can be determined only while the program is actually running.
However, in the related art, there is no effective solution to this problem.
Disclosure of Invention
In view of this, embodiments of the present invention provide an information processing method, a server, and a computer storage medium, which at least solve the problems in the prior art.
The embodiment of the invention provides an information processing method, which comprises the following steps:
reversely traversing function codes in the binary file in a static mode;
analyzing the function codes to obtain a global function topological relation in the function codes;
acquiring the called actual address of the function in the running process of the function code in a dynamic mode;
obtaining a local function topological relation according to the called actual address of the function;
and according to the local function topological relation, completing missing data in the global function topological relation to obtain a function call relation chain.
In the foregoing scheme, the analyzing the function code to obtain a global function topology relationship in the function code includes:
analyzing the function code in a static analysis mode to obtain all instructions in the function code;
identifying a function address calling instruction from all instructions in the function code;
and extracting a first function call relation chain from the function code according to the position indicated by the function address call instruction, wherein the first function call relation chain is used for representing the global function topological relation.
In the above scheme, the dynamically obtaining the called actual address of the function when the function code is executed includes:
and switching from the static analysis mode to a dynamic analysis mode, and analyzing the function virtual address contained in the function code to obtain the called actual address of the function corresponding to the function virtual address.
In the foregoing solution, the analyzing the function virtual address included in the function code to obtain an actual address to which the function corresponding to the function virtual address is called includes:
determining a process to be tested needing to dynamically acquire a function call relation chain from the function code, wherein the process to be tested comprises a function virtual address;
setting a breakpoint in the process to be tested before the process to be tested runs, obtaining running characteristic data when execution reaches the breakpoint, and obtaining the actual address at which the function is called according to the characteristic data.
In the foregoing solution, the analyzing the function virtual address included in the function code to obtain an actual address to which the function corresponding to the function virtual address is called includes:
determining a process to be tested needing to dynamically acquire a function call relation chain from the function code, wherein the process to be tested comprises a function virtual address;
suspending the process to be tested;
a breakpoint is set at the position of the function address calling instruction in the process to be tested;
and running the process to be tested, if the breakpoint is hit, triggering extraction processing of the called actual address of the function, and extracting the called actual address of the function.
In the above scheme, the method further comprises:
obtaining a second function call relation chain according to the called actual address of the function, wherein the second function call relation chain is used for representing the local function topological relation;
the completing of missing data in the global function topological relation according to the local function topological relation to obtain a function call relation chain further includes:
and obtaining the function calling relation chain according to the first function calling relation chain and the second function calling relation chain.
An embodiment of the present invention further provides a server, where the server includes:
the traversing unit is used for reversely traversing the function codes in the binary file in a static mode;
the first analysis unit is used for analyzing the function codes and acquiring global function topological relation in the function codes;
the second analysis unit is used for acquiring the called actual address of the function in the running process of the function code in a dynamic mode;
the first processing unit is used for obtaining a local function topological relation according to the called actual address of the function;
and the second processing unit is used for supplementing missing data in the global function topological relation according to the local function topological relation to obtain a function call relation chain.
In the foregoing solution, the first parsing unit is further configured to:
analyzing the function code in a static analysis mode to obtain all instructions in the function code;
identifying a function address calling instruction from all instructions in the function code;
and extracting a first function call relation chain from the function code according to the position indicated by the function address call instruction, wherein the first function call relation chain is used for representing the global function topological relation.
In the foregoing solution, the second parsing unit is further configured to:
and switching from the static analysis mode to a dynamic analysis mode, and analyzing the function virtual address contained in the function code to obtain the called actual address of the function corresponding to the function virtual address.
In the foregoing solution, the second parsing unit is further configured to:
determining a process to be tested needing to dynamically acquire a function call relation chain from the function code, wherein the process to be tested comprises a function virtual address;
setting a breakpoint in the process to be tested before the process to be tested runs, obtaining running characteristic data when execution reaches the breakpoint, and obtaining the actual address at which the function is called according to the characteristic data.
In the foregoing solution, the second parsing unit is further configured to:
determining a process to be tested needing to dynamically acquire a function call relation chain from the function code, wherein the process to be tested comprises a function virtual address;
suspending the process to be tested;
a breakpoint is set at the position of the function address calling instruction in the process to be tested;
and running the process to be tested, if the breakpoint is hit, triggering extraction processing of the called actual address of the function, and extracting the called actual address of the function.
In the above solution, the server further includes: a relationship chain acquiring unit for:
obtaining a second function call relation chain according to the called actual address of the function, wherein the second function call relation chain is used for representing the local function topological relation;
the second processing unit is further configured to:
and obtaining the function calling relation chain according to the first function calling relation chain and the second function calling relation chain.
An embodiment of the present invention further provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program, when executed by a processor, implements the steps of any one of the above information processing methods.
An embodiment of the present invention further provides a server, where the server includes:
a memory for storing a computer program capable of running on the processor;
a processor for executing the steps of any of the above information processing methods when the computer program is run.
The information processing method of the embodiment of the invention comprises the following steps: reversely traversing function codes in the binary file in a static mode; analyzing the function codes to obtain a global function topological relation in the function codes; acquiring the called actual address of the function in the running process of the function code in a dynamic mode; obtaining a local function topological relation according to the called actual address of the function; and according to the local function topological relation, supplementing missing data in the global function topological relation to obtain a function call relation chain.
With the embodiment of the invention, gaps in the function call relation chain are addressed by combining a static mode and a dynamic mode. First, the function code in the binary file is traversed in reverse in the static mode and parsed to obtain the global function topological relation. Then the local function topological relation is obtained by parsing in the dynamic mode, and the missing data in the global function topological relation is filled in according to the local function topological relation to obtain the function call relation chain.
Drawings
FIG. 1 is a diagram of hardware entities for performing information interaction in an embodiment of the present invention;
FIG. 2 is a schematic flow chart illustrating a method according to an embodiment of the present invention;
FIG. 3 is a diagram of a system architecture according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a hardware architecture of a server according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a server module according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating identification of a function call instruction according to an embodiment of the present invention;
FIG. 7 is a flowchart of static parsing for an application scenario in which embodiments of the present invention are applied;
FIG. 8 is a flow chart illustrating dynamic parsing of an application scenario in which embodiments of the present invention are applied;
FIG. 9 is a schematic view of a dynamic and static combination process flow of an application scenario to which an embodiment of the present invention is applied.
Detailed Description
The following describes the embodiments in further detail with reference to the accompanying drawings.
A mobile terminal implementing various embodiments of the present invention will now be described with reference to the accompanying drawings. In the following description, suffixes such as "module", "component", or "unit" used to indicate elements are used only for facilitating the description of the embodiments of the present invention, and do not have a specific meaning per se. Thus, "module" and "component" may be used in a mixture.
In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks disclosed have not been described in detail as not to unnecessarily obscure aspects of the embodiments.
In addition, although the terms "first", "second", etc. are used herein several times to describe various elements (or various thresholds or various applications or various instructions or various operations) etc., these elements (or thresholds or applications or instructions or operations) should not be limited by these terms. These terms are only used to distinguish one element (or threshold or application or instruction or operation) from another element (or threshold or application or instruction or operation). For example, a first operation may be referred to as a second operation, and a second operation may be referred to as a first operation, without departing from the scope of the invention, the first operation and the second operation being operations, except that they are not the same operation.
The steps in the embodiment of the present invention need not be performed in the order described; they may be reordered, and steps may be deleted or added as required.
The term "and/or" in embodiments of the present invention refers to any and all possible combinations including one or more of the associated listed items. It is also noted that: when used in this specification, the term "comprises/comprising" specifies the presence of stated features, integers, steps, operations, elements and/or components but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements and/or components and/or groups thereof.
The intelligent terminal (e.g., mobile terminal) of the embodiments of the present invention may be implemented in various forms. For example, the terminal described in the embodiments of the present invention may include mobile terminals such as a mobile phone, a smart phone, a notebook computer, a digital broadcast receiver, a Personal Digital Assistant (PDA), a tablet computer (PAD), a Portable Multimedia Player (PMP), a navigation device, and the like, and fixed terminals such as a digital TV, a desktop computer, and the like. In the following, it is assumed that the terminal is a mobile terminal. However, it will be understood by those skilled in the art that, except for elements specifically intended for mobile use, the configuration according to the embodiment of the present invention can also be applied to a fixed terminal.
Fig. 1 is a schematic diagram of hardware entities performing information interaction in an embodiment of the present invention, where fig. 1 includes: terminal 1, server 2. The terminal 1 may be composed of a plurality of terminals 11-13, and performs information interaction with the server 2 in a wireless or wired manner. The number of servers in fig. 1 is merely for reference and does not limit the number of servers.
In the related art, the function call relation chain is obtained by statically parsing the source code and generating the chain from the calls found in it; that is, the chain is obtained by analyzing source code. This approach has the following defects: 1) the source code must be available before function call relation chain data can be obtained; 2) when a dynamic language is analyzed by static source parsing, the function call relation chain has gaps, because the actual function address can be determined only while the program is actually running; 3) boundary coupling calls between interfaces across modules cannot be analyzed, because the static parsing performed by the doxygen tool works within a module, cannot handle call relations between modules, and has a high character-matching error rate; 4) once the source code reaches a certain size, parsing takes too long, and the function call relation chain data is hard to keep up to date. The function call relation chain obtained in the related art is therefore incomplete, which leads to the unsolved problem described above. With the embodiment of the present invention, the server executes the processing logic 10 shown in fig. 1, which combines a static mode and a dynamic mode: the server statically traverses, in reverse, the function code in the binary file and parses it to obtain the global function topological relation, then parses dynamically to obtain the local function topological relation, and, according to the local function topological relation, fills in the missing data in the global function topological relation to obtain the function call relation chain. The resulting chain is complete, which avoids the problems caused by a function call relation chain with gaps.
Processing logic 10 comprises: S1, statically traversing, in reverse, the function code in a binary file, and parsing it to obtain a global function topological relation in the function code; S2, parsing dynamically to obtain a local function topological relation in the function code; and S3, according to the local function topological relation, filling in the missing data in the global function topological relation to obtain a function call relation chain.
The above example of fig. 1 is only an example of a system architecture for implementing the embodiment of the present invention, and the embodiment of the present invention is not limited to the system architecture described in the above fig. 1, and various embodiments of the method of the present invention are proposed based on the system architecture described in the above fig. 1. Herein, the function call relation chain may also be referred to simply as a function call chain.
As shown in fig. 2, an information processing method according to an embodiment of the present invention includes: statically traversing, in reverse, the function code in a binary file (101). In the embodiment of the invention, the traversal works on the disassembly. For example, if the function code is C/C++ code, it is compiled into a binary, and the function calls in it are then converted back into assembly language. Two types of instructions call functions: call instructions and unconditional jump instructions. Both types call a function by its actual address; that is, the address targeted by either type of instruction is the actual address of the function. Both types must be identified in the function code. The function code is then parsed to obtain the global function topological relation in it (102). For static parsing, it is enough to traverse the functions in each binary file in reverse and to traverse every instruction in each function, identifying the call and jump instructions. Call and jump instructions are both function address call instructions, and extracting them completes the parsing of the static function call relation chain. The static reverse binary analysis of the embodiment thus yields a global graph of the function call relation chain.
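The static pass described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the listing format is a simplified, objdump-like text invented for the example, and the function names, addresses, and the helper `build_global_graph` are all hypothetical. Direct call and jmp targets become edges of the global graph, while call sites whose target is held in a register are recorded as unresolved.

```python
import re
from collections import defaultdict

# Hypothetical, simplified disassembly listing; every address and name is invented.
LISTING = """\
0000000000001000 <main>:
  1000: call 1040 <parse_args>
  1005: call 1080 <run>
  100a: ret
0000000000001040 <parse_args>:
  1040: jmp 10c0 <log_msg>
0000000000001080 <run>:
  1080: call *%rax
  1082: call 10c0 <log_msg>
  1087: ret
00000000000010c0 <log_msg>:
  10c0: ret
"""

FUNC_RE = re.compile(r"^[0-9a-f]+ <(\w+)>:$")                     # function header line
CALL_RE = re.compile(r"^\s*([0-9a-f]+):\s+(call|jmp)\s+(.+)$")    # call / jmp instruction
DIRECT_RE = re.compile(r"^[0-9a-f]+ <(\w+)>$")                    # fixed target with symbol


def build_global_graph(listing):
    """Return (graph, unresolved): direct call edges and indirect call sites."""
    graph = defaultdict(set)
    unresolved = []
    current = None
    for line in listing.splitlines():
        header = FUNC_RE.match(line)
        if header:
            current = header.group(1)
            graph[current]  # touch the key so leaf functions appear in the graph
            continue
        instr = CALL_RE.match(line)
        if not instr or current is None:
            continue
        site, _mnemonic, operand = instr.groups()
        operand = operand.strip()
        target = DIRECT_RE.match(operand)
        if target:
            graph[current].add(target.group(1))
        elif operand.startswith("*"):
            # Target lives in a register or memory: static parsing cannot resolve it.
            unresolved.append((current, int(site, 16)))
    return dict(graph), unresolved


graph, unresolved = build_global_graph(LISTING)
```

In this sketch `unresolved` holds the call site in `run` at 0x1080, which is exactly the kind of gap the dynamic step is meant to fill.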
The information processing method of the embodiment further includes: dynamically acquiring the actual address at which a function is called while the function code runs (103). The actual address of a function is distinct from a virtual address. A function called through its actual address may be called a real function, and a function called through a virtual address may be called a virtual function. Specifically, viewed from the reverse-engineered binary, function calls fall into two categories: 1) Calls to a function's actual address, made either by a call instruction or by an unconditional jump instruction; the addresses these instructions target are all actual addresses. 2) Calls to a function's virtual address, which arise when a function is passed as a parameter and called through a parameter or variable, and in virtual function calls. Calls to actual function addresses can be obtained by static parsing. A call to a virtual function address, however, appears in static parsing only as a call through a register holding a virtual address, so the function actually called can be known only by extracting the actual function address from the register during dynamic running.
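The two categories can be told apart from the operand of the call instruction. The Python sketch below is an assumption-laden illustration: `classify_call` and its labels are hypothetical, and the operand spellings follow common x86 assembler conventions (AT&T `*%rax`, Intel `[rax+0x10]`) rather than anything stated in the text.

```python
import re

# A fixed hexadecimal target, optionally followed by a symbol such as "<log_msg>".
_IMMEDIATE = re.compile(r"^(0x)?[0-9a-f]+( <[\w@.+-]+>)?$")


def classify_call(operand):
    """Label a call/jmp operand as 'actual' (fixed address) or 'virtual' (run-time target)."""
    operand = operand.strip()
    # Register or memory operands: the target is only known while the program runs.
    if operand.startswith("*") or operand.startswith("[") or "%" in operand:
        return "virtual"
    if _IMMEDIATE.match(operand):
        return "actual"
    # Anything else (for example a bare register name in Intel syntax) is treated as virtual.
    return "virtual"


examples = ["1040 <parse_args>", "0x4010c0", "*%rax", "*0x8(%rdi)", "[rax+0x10]", "rax"]
labels = {op: classify_call(op) for op in examples}
```

Here `*0x8(%rdi)` stands for a call through a table slot, the shape a C++ virtual call commonly takes after compilation; that mapping is an illustrative assumption.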
The actual address at which a function is called while the function code runs can be acquired by dynamic instrumentation. Dynamic instrumentation yields the real call address of a dynamic (or polymorphic) function at run time. It works as follows: probes are inserted into the program under test (such as the function code) while preserving the integrity of its original logic; as the probes execute, they emit characteristic data about the running program, and analyzing that data yields the program's function call relation chain.
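The probe idea can be illustrated at a much higher level than assembly. In the Python sketch below, the interpreter's `sys.setprofile` hook stands in for the inserted probes; this is an analogy chosen for the example, not the binary-level instrumentation the text describes, and `record_calls`, `dispatch`, `on_ok`, and `on_error` are hypothetical names. The probe leaves the program's logic untouched and only records which function was actually entered.

```python
import sys


def record_calls(entry, *args):
    """Run entry(*args) under a probe and return the observed (caller, callee) edges."""
    edges = set()

    def probe(frame, event, arg):
        # The 'call' event fires when a Python-level function is entered.
        if event == "call" and frame.f_back is not None:
            edges.add((frame.f_back.f_code.co_name, frame.f_code.co_name))

    sys.setprofile(probe)
    try:
        entry(*args)
    finally:
        sys.setprofile(None)  # always remove the probe again
    return edges


def on_ok():
    return "ok"


def on_error():
    return "error"


def dispatch(handler):
    # The callee is passed as a parameter, so it is only known at run time.
    return handler()


def run(flag):
    return dispatch(on_ok if flag else on_error)


edges_ok = record_calls(run, True)
edges_err = record_calls(run, False)
```

A static reading of `dispatch` sees only a call through the parameter `handler`; the recorded edges show which function was really invoked on each run.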
The information processing method of the embodiment further includes: obtaining a local function topological relation according to the actual address at which the function is called (104), and, according to the local function topological relation, filling in the missing data in the global function topological relation to obtain a function call relation chain (105).
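Steps 104 and 105 amount to merging two graphs. The following Python sketch assumes both relations are stored as adjacency sets keyed by function name; that representation and the helpers `complete_call_graph` and `call_chains` are assumptions made for illustration.

```python
def complete_call_graph(global_graph, local_graph):
    """Fill gaps in the static (global) graph with edges observed dynamically (local)."""
    merged = {caller: set(callees) for caller, callees in global_graph.items()}
    for caller, callees in local_graph.items():
        merged.setdefault(caller, set()).update(callees)
        for callee in callees:
            merged.setdefault(callee, set())  # make sure new callees are nodes too
    return merged


def call_chains(graph, root, path=()):
    """List every call chain starting at root, skipping functions already on the path."""
    path = path + (root,)
    callees = sorted(c for c in graph.get(root, ()) if c not in path)
    if not callees:
        return [path]
    chains = []
    for callee in callees:
        chains.extend(call_chains(graph, callee, path))
    return chains


# Static result: the indirect call inside "run" is missing.
global_graph = {"main": {"run"}, "run": set(), "on_ok": set()}
# Dynamic result: "run" was observed calling "on_ok".
local_graph = {"run": {"on_ok"}}

merged = complete_call_graph(global_graph, local_graph)
```

Before the merge the only chain from `main` stops at `run`; after it, the chain continues to `on_ok`.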
With the embodiment of the invention, function call relation chains can be analyzed and the call relations among functions acquired completely, which assists test analysis; the test scope can also be delimited accurately and automatically in response to code changes.
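The text does not say how the test scope is derived from a code change. One plausible reading, sketched below in Python purely as an assumption, is to walk the completed call graph backwards from the changed function and collect every direct and indirect caller; `affected_callers` and the sample graph are hypothetical.

```python
from collections import deque


def affected_callers(graph, changed):
    """Return every function that can reach `changed` through the call graph."""
    # Invert the caller -> callees mapping into callee -> callers.
    reverse = {}
    for caller, callees in graph.items():
        for callee in callees:
            reverse.setdefault(callee, set()).add(caller)

    seen = set()
    queue = deque([changed])
    while queue:
        fn = queue.popleft()
        for caller in reverse.get(fn, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


graph = {
    "main": {"parse_args", "run"},
    "parse_args": {"log_msg"},
    "run": {"log_msg", "on_ok"},
    "on_ok": set(),
    "log_msg": set(),
}
scope = affected_callers(graph, "on_ok")
```

With an incomplete static graph, the edge from `run` to `on_ok` would be absent and a change to `on_ok` would appear to affect nothing, which shows why the completed chain matters for this use.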
In the embodiment of the invention, the function code is analyzed in a static analysis mode to obtain all instructions in the function code. From all instructions in the function code, function address call instructions (including call and jump instructions) are identified. And extracting a first function call relation chain from the function codes according to the position indicated by the function address call instruction, wherein the first function call relation chain is used for representing the global function topology relation, so that the global function topology relation in the function codes is finally obtained through analyzing the function codes in a static analysis mode.
For static parsing, it is enough to traverse the functions in each binary file and every instruction in each function, identify the function address call instructions (call instructions and jump instructions), and extract them; this completes the parsing of the static function call relation chain. That is, the call and jump instructions are extracted from all instructions of a function by statically traversing the disassembled function instructions in the binary, and the function call relation chain is then extracted from those instructions. The function call relation chain may be saved to a static version call chain database. However, static parsing does not work for calls to a function's virtual address: the actual address of the function corresponding to the virtual address can be determined only at run time, so the function code must be run dynamically.
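Saving the static chain to a database could look like the Python sketch below, which uses the standard `sqlite3` module. The table name, the columns, the `version` key, and the helper functions are assumptions for illustration; the text only says that a static version call chain database may be used.

```python
import sqlite3


def save_call_chain(conn, version, edges):
    """Store (caller, callee) edges for one build version; duplicates are ignored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS static_call_chain ("
        "version TEXT NOT NULL, caller TEXT NOT NULL, callee TEXT NOT NULL, "
        "PRIMARY KEY (version, caller, callee))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO static_call_chain VALUES (?, ?, ?)",
        [(version, caller, callee) for caller, callee in edges],
    )
    conn.commit()


def load_callees(conn, version, caller):
    """Return the callees recorded for `caller` in the given version, sorted by name."""
    rows = conn.execute(
        "SELECT callee FROM static_call_chain "
        "WHERE version = ? AND caller = ? ORDER BY callee",
        (version, caller),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
static_edges = [("main", "run"), ("main", "parse_args"), ("run", "log_msg")]
save_call_chain(conn, "build-1", static_edges)
callees_of_main = load_callees(conn, "build-1", "main")
```

Keying rows by version is one way to keep chains from different builds apart; dynamically observed edges could be written to the same table or a separate one.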
In the embodiment of the present invention, parsing switches from the static mode to the dynamic mode, and the function virtual address contained in the function code is analyzed to obtain the actual address at which the corresponding function is called. In this way, the actual address at which a function is called while the function code runs is obtained by parsing the function code dynamically.
In the above embodiment, the first function call relation chain has missing data, and the missing data is caused by calls to virtual function addresses. During static parsing, such a call appears only as a call through a register, and the address is a virtual address; the function actually called can be known only by extracting the actual function address from the register while the function code runs dynamically. A call to a function's virtual address arises when a function is passed as a parameter and called through a parameter or variable, and in virtual function calls. With this embodiment, the actual address of a virtual function call can be obtained by dynamic parsing.
In the embodiment of the invention, a process to be tested, for which the function call relation chain needs to be acquired dynamically, is determined from the function code; the process to be tested contains a function virtual address. Before the process to be tested runs, a breakpoint is set in it; when execution reaches the breakpoint, running characteristic data is obtained, and the actual address at which the function is called is obtained from the characteristic data. One example: a breakpoint is set before the process to be tested runs, running characteristic data is obtained when execution reaches the breakpoint, and the call information is captured from the characteristic data to obtain a second function call relation chain. The second function call relation chain is built from the actual called addresses and represents the local function topological relation.
In the embodiment of the invention, the process to be tested, for which the function call relation chain needs to be acquired dynamically, is determined from the function code; the process to be tested contains a function virtual address. The process to be tested is suspended, and a breakpoint is set at the position of the function address call instruction in it. The process to be tested is then run; if the breakpoint is hit, extraction of the actual address at which the function is called is triggered, and that address is extracted. One example: it is determined that the function code contains a process to be tested for which the function call relation chain needs to be acquired dynamically; the process is suspended, a breakpoint is set at the function address call instruction, and the process is run; if the breakpoint is hit, the actual function address is extracted. A second function call relation chain is obtained from the actual function address; it is built from the actual called addresses and represents the local function topological relation.
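The suspend, set breakpoint, run, and extract sequence can be mimicked with a toy interpreter. The Python sketch below does not attach to a real process or use any debugger API: `ToyProcess`, its instruction set, the register name, and every address are invented for illustration. It only shows the control flow, namely that a breakpoint at an indirect call site lets the tracer read the target address from a register just before the call executes.

```python
# Hypothetical function table: address -> name (all values invented).
FUNCTIONS = {0x2000: "on_ok", 0x2040: "on_error"}

# Toy program of the process under test: (address, opcode, operand).
PROGRAM = [
    (0x1000, "mov_rax", 0x2000),  # load a function address into the register
    (0x1005, "call_reg", "rax"),  # indirect call: target known only at run time
    (0x1007, "mov_rax", 0x2040),
    (0x100C, "call_reg", "rax"),
    (0x100E, "ret", None),
]


class ToyProcess:
    """A stand-in for a debuggee: starts suspended and reports breakpoint hits."""

    def __init__(self, program):
        self.program = program
        self.registers = {"rax": 0}
        self.breakpoints = set()
        self.suspended = True

    def set_breakpoint(self, address):
        if not self.suspended:
            raise RuntimeError("breakpoints must be set while the process is suspended")
        self.breakpoints.add(address)

    def run(self, on_hit):
        self.suspended = False
        for address, opcode, operand in self.program:
            if address in self.breakpoints:
                # Breakpoint hit: hand a snapshot of the registers to the tracer.
                on_hit(address, dict(self.registers))
            if opcode == "mov_rax":
                self.registers["rax"] = operand
            elif opcode == "ret":
                break


def trace_indirect_calls(program, caller="run"):
    """Return (caller, call site, callee) for each indirect call that was executed."""
    proc = ToyProcess(program)  # the process is suspended at this point
    for address, opcode, _operand in program:
        if opcode == "call_reg":
            proc.set_breakpoint(address)

    edges = []

    def on_hit(address, registers):
        target = registers["rax"]
        edges.append((caller, address, FUNCTIONS.get(target, hex(target))))

    proc.run(on_hit)
    return edges


local_edges = trace_indirect_calls(PROGRAM)
```

The resulting edges are the raw material for the second function call relation chain; only call sites that are actually executed produce an edge, so the dynamic result covers a local part of the graph.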
In an embodiment of the present invention, a second function call relation chain is obtained according to the actual address at which the function is called, and the second function call relation chain is used to represent the local function topological relation. When missing data in the global function topological relation is supplemented according to the local function topological relation to obtain the function call relation chain, the function call relation chain is obtained from the first function call relation chain and the second function call relation chain.
In this embodiment, the function call relation chain is obtained by combining static disassembly of the binary code with dynamic instrumentation of the assembly code; that is, the static function call relation chain data and the dynamic function call relation chain data are merged so that the dynamic data supplements the statically analyzed data. Specifically, static disassembly is used to obtain the global function topological relation. For the parts that are missing, dynamic instrumentation is used to resolve the function addresses actually executed at runtime, thereby obtaining the real call addresses of dynamic (polymorphic) functions. Here, instrumentation means inserting probes into the program under test without breaking its original logic; the probes emit characteristic data as the program runs, and analyzing this data yields the program's function call relation chain. In the embodiment of the invention, analyzing the function call relation chain makes it possible to obtain the complete call relations among functions, which assists test analysis and allows the test scope to be delimited accurately according to code changes.
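The idea of a probe that emits runtime data can be illustrated with a minimal Python sketch. This is only an analogy and rests on assumptions: it uses the interpreter's `sys.settrace` hook as the probe instead of breakpoints in native code, and the names `trace_calls`, `apply`, `double`, `square` and `main` are hypothetical. It shows that a call made through a parameter, whose target is unknown statically, is resolved once the program actually runs.

```python
import sys


def trace_calls(entry, *args):
    """Run entry(*args) and return the (caller, callee) pairs seen at runtime."""
    edges = set()

    def probe(frame, event, arg):
        # A "call" event fires whenever a new Python frame is entered.
        if event == "call" and frame.f_back is not None:
            edges.add((frame.f_back.f_code.co_name, frame.f_code.co_name))
        return None  # no per-line tracing is needed inside the frame

    previous = sys.gettrace()
    sys.settrace(probe)  # insert the probe
    try:
        entry(*args)
    finally:
        sys.settrace(previous)  # always remove the probe again
    return edges


def double(x):
    return 2 * x


def square(x):
    return x * x


def apply(callback, x):
    # The callee is passed as a parameter, so it is only known at runtime.
    return callback(x)


def main():
    apply(double, 21)


edges = trace_calls(main)
print(sorted(edges))
```

Only the branch that was executed is recorded: `apply` calling `double` appears in the result, while `apply` calling `square` does not, which mirrors the local nature of the dynamically acquired chain.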
As shown in fig. 3, the information processing system according to the embodiment of the present invention includes a terminal 41 and a server 42. When the server 42 parses function code collected from the terminal 41, it reversely traverses the function code in a binary file in a static manner and parses it to obtain the global function topological relation in the function code, parses the function code in a dynamic manner to obtain the local function topological relation in the function code, and supplements missing data in the global function topological relation according to the local function topological relation to obtain a function call relation chain. The server 42 may be a test platform for testing test cases (or test data). The server 42 includes: a traversal unit 421, configured to reversely traverse the function code in the binary file in a static manner; a first analyzing unit 422, configured to parse the function code to obtain the global function topological relation in the function code; a second parsing unit 423, configured to obtain, in a dynamic manner, the actual address at which a function is called when the function code runs; a first processing unit 424, configured to obtain the local function topological relation according to the actual address at which the function is called; and a second processing unit 425, configured to supplement missing data in the global function topological relation according to the local function topological relation to obtain the function call relation chain.
In an embodiment of the present invention, the first analyzing unit is further configured to: analyzing the function code in a static analysis mode to obtain all instructions in the function code; identifying a function address calling instruction from all instructions in the function code; and extracting a first function call relation chain from the function code according to the position indicated by the function address call instruction, wherein the first function call relation chain is used for representing the global function topological relation.
In an implementation manner of the embodiment of the present invention, the second parsing unit is further configured to: switch from the static analysis mode to a dynamic analysis mode, and parse the function virtual address contained in the function code to obtain the actual address at which the function corresponding to the function virtual address is called.
In an implementation manner of the embodiment of the present invention, the second parsing unit is further configured to: determine, from the function code, a process to be tested for which the function call relation chain needs to be acquired dynamically, wherein the process to be tested contains a function virtual address; set a breakpoint in the process to be tested before it runs; obtain runtime characteristic data when execution reaches the breakpoint; and obtain the actual address at which the function is called according to the characteristic data.
In an implementation manner of the embodiment of the present invention, the second parsing unit is further configured to: determining a process to be tested needing to dynamically acquire a function call relation chain from the function code, wherein the process to be tested comprises a function virtual address; suspending the process to be tested; a breakpoint is set at the position of the function address calling instruction in the process to be tested; and running the process to be tested, if the breakpoint is hit, triggering extraction processing of the called actual address of the function, and extracting the called actual address of the function.
In an implementation manner of the embodiment of the present invention, the server further includes a relationship chain acquiring unit, configured to obtain a second function call relation chain according to the actual address at which the function is called, wherein the second function call relation chain is used to represent the local function topological relation. The second processing unit is further configured to obtain the function call relation chain according to the first function call relation chain and the second function call relation chain.
A computer-readable storage medium according to an embodiment of the present invention stores a computer program, and the computer program, when executed by a processor, implements the steps of the information processing method according to the above embodiments.
As shown in fig. 4, a server according to an embodiment of the present invention includes: a memory 61 for storing a computer program capable of running on the processor; the processor 62 is configured to execute the steps of the information processing method in the above embodiments when the computer program is executed. The server may further include: the external communication interface 63 is used for performing information interaction with peripheral devices such as a terminal, and specifically, when the server analyzes the function codes collected from the terminal, the function codes in the binary file are reversely traversed in a static manner, a global function topological relation in the function codes is obtained through analysis, a local function topological relation in the function codes is obtained through dynamic analysis, and missing data in the global function topological relation is supplemented according to the local function topological relation to obtain a function call relation chain. The server may further include: an internal communication interface 64, wherein the internal communication interface 64 may be a bus interface such as a PCI bus.
The embodiment of the invention is explained below using a real application scenario as an example:
The embodiment of the invention extracts the function call relation chain using a combination of dynamic and static analysis: static disassembly of the binary code together with dynamic instrumentation of the assembly code. Static reverse analysis of the binary yields the global graph of the function call relation chain, and dynamic instrumentation yields the real function call addresses when dynamic (polymorphic) functions run.
As shown in fig. 5, the modules used in the embodiment of the present invention include a static analysis module 51, a dynamic analysis module 52 and a relation chain statistics module 53. The static analysis module 51 is configured to analyze the function code by static disassembly and obtain the first function call chain, which is the global graph of the function call relation chain. The dynamic analysis module 52 is configured to obtain, by dynamic instrumentation, the real function call addresses when dynamic (polymorphic) functions run, and to obtain the second function call chain from these addresses; the second function call chain is a local graph of the function call relation chain. The relation chain statistics module 53 is configured to obtain the complete function call relation chain from the second function call chain and the first function call chain; specifically, it supplements the missing data in the first function call chain with the second function call chain, and thereby compiles statistics on the function call chain and completes it.
1. The static analysis module 51 obtains the call relations of functions from the perspective of binary reverse analysis. The function call relation chain is obtained by static binary reverse analysis as follows:
Viewed from the reversed binary, function calls can be divided into the following two types:
1) Calls to the actual address of a function. These take two forms: calls made by a call instruction and calls made by an unconditional jump instruction. In both cases the address in the instruction is an actual address.
2) Calls to a function virtual address. These arise when a function is passed as a parameter and called through a parameter or variable, and when a virtual function is called. Statically, such a call appears as a call through a register, that is, a virtual address; the real address must be read from the register at runtime to determine which function is actually called.
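The two call types can be told apart from the operand of the call or jump instruction. The sketch below is a hedged illustration only: it assumes x86 disassembly in Intel syntax with a limited set of general-purpose register names, and the function name `classify_target` is hypothetical.

```python
import re

# General-purpose x86/x86-64 register names (an assumption; not exhaustive).
REGISTER = re.compile(
    r"^(?:[er]?[abcd]x|[er]?[sd]i|[er]?[sb]p|r(?:[89]|1[0-5])d?)$",
    re.IGNORECASE,
)
# A literal target such as 0x401000 or 401000h.
ADDRESS = re.compile(r"(?:0x)?[0-9a-f]+h?", re.IGNORECASE)


def classify_target(operand):
    """Classify the operand of a call/jmp as 'actual', 'virtual' or 'unknown'."""
    operand = operand.strip()
    if "[" in operand or REGISTER.match(operand):
        # Register or memory operand: the target is only known at runtime.
        return "virtual"
    if ADDRESS.fullmatch(operand):
        # Literal address: the target is known statically.
        return "actual"
    return "unknown"


for operand in ("0x401000", "eax", "dword ptr [eax+8]"):
    print(operand, "->", classify_target(operand))
```

Operands classified as `virtual` are the ones whose targets are missing from the static chain and have to be resolved dynamically.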
An ordinary function call is a call that can be made directly through the actual address of the function. From the perspective of the C/C++ language, such a function may be a plain C function or a non-virtual member function of a C++ class. Analysis of disassembled code shows that, after a function call in C/C++ code is compiled into a binary and the binary is disassembled into assembly language, an ordinary function is called by one of two kinds of instruction: a call instruction or an unconditional jump instruction, as shown in FIG. 6.
For static analysis, the static function call relation chain can be obtained simply by traversing the functions in each binary file, traversing every instruction in each function, identifying call instructions and jump-based calls, and extracting the instructions that call a function address. As shown in fig. 7, the analysis process includes: reversely traversing the functions in the binary (401), traversing the instructions in each function (402), parsing the assembly instruction (403), determining whether it is a function call instruction (404), extracting the function call relation (405), and storing the extracted function call relation in a database (406). Thus, during the static reverse traversal of the function instructions in the binary, the instructions related to function calls are extracted and stored in the static version call chain database. However, this method does not work for calls through a function virtual address, because the target of such a call can only be determined at runtime; these calls are handled by the dynamic instrumentation method described below.
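Steps 401 to 406 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the disassembly is assumed to be already available as `(mnemonic, operand)` pairs per function, the symbol table maps addresses to function names, the database is an in-memory SQLite table, and the names `extract_static_chain`, `store_chain` and `static_chain` are hypothetical.

```python
import sqlite3

CALL_MNEMONICS = {"call", "jmp"}


def extract_static_chain(functions, symbols):
    """Return (caller, callee, operand) rows; callee is None if unresolved."""
    rows = []
    for caller, instructions in functions.items():   # 401: each function
        for mnemonic, operand in instructions:        # 402/403: each instruction
            if mnemonic not in CALL_MNEMONICS:        # 404: call instruction?
                continue
            try:
                target = int(operand, 16)
            except ValueError:
                # Register or memory operand: a virtual address that static
                # analysis cannot resolve.
                rows.append((caller, None, operand))
                continue
            if target in symbols:                     # 405: extract relation
                rows.append((caller, symbols[target], operand))
            # A literal that is not a function entry is treated as a
            # jump inside the current function and ignored.
    return rows


def store_chain(rows, version):
    """406: store the extracted relations, keyed by version."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE static_chain "
        "(version TEXT, caller TEXT, callee TEXT, operand TEXT)"
    )
    db.executemany(
        "INSERT INTO static_chain VALUES (?, ?, ?, ?)",
        [(version, caller, callee, operand) for caller, callee, operand in rows],
    )
    db.commit()
    return db


functions = {
    "main": [("push", "ebp"), ("call", "0x2000"), ("call", "eax"), ("jmp", "0x1010")],
    "helper": [("ret", "")],
}
symbols = {0x1000: "main", 0x2000: "helper"}
rows = extract_static_chain(functions, symbols)
db = store_chain(rows, "1.0.0")
print(rows)
```

The row whose callee is `None` corresponds to the call through `eax`; it is exactly the kind of gap that the dynamic analysis is meant to fill.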
2. The dynamic analysis module 52 supplements the call chain data that is missing when the function call relation chain is analyzed by static reverse analysis of the binary file. The specific implementation is as follows:
The purpose of the use case logging module is to associate use cases with functions; the specific implementation is shown in fig. 8. In fig. 8, the parts filled with hatching are operations performed by the user, and the other parts are operations performed by the system. The process includes: setting a version number (601), suspending the process to be tested (602), parsing the assembly instructions (603), determining whether an instruction is a function call instruction (604), inserting a breakpoint (605), executing the test program (606), recording the function address (607), recording the function call relation (608), and storing the information in the database (609). That is, the user sets the test version number in the system, and the system obtains from the compiling platform, according to the version number, the pdb (Program Data Base) file of the program of the product under test. The pdb file is generated during compilation and linking in VS. The relation between functions and line numbers is obtained from the pdb file, the assembly instructions of each line of code are parsed, each instruction is checked to determine whether it is a call or jump instruction, and a breakpoint is set at each such instruction. The suspended process is then resumed. When the program under test hits a breakpoint, the current value of the register at the breakpoint is read; this value is the function address called at runtime. The function call relation chain acquired at runtime is then stored in the dynamic call chain database. Dynamic instrumentation accumulates data continuously as the program is run, and this data makes up for the call chain data that is missing from the static reverse analysis of the binary file. In this way, a complete function call relation chain is captured.
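The breakpoint flow of fig. 8 can be imitated with a toy simulation. The sketch below makes strong simplifying assumptions: the "process" is a straight-line list of instructions executed by a few lines of Python rather than a real debuggee, no debugger API or pdb parsing is involved, and the names `Instr`, `run_with_breakpoints`, `render`, `Derived::draw` and `log_message` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    addr: int
    op: str       # "mov", "call", "jmp", ...
    args: tuple   # operands as strings


def run_with_breakpoints(code, symbols, caller):
    """Simulate steps 603-608 for one function of the process under test."""
    # 603-605: set a breakpoint at every call/jump instruction.
    breakpoints = {i.addr for i in code if i.op in ("call", "jmp")}
    registers = {}
    edges = []
    for instr in code:                     # 606: run the (toy) program
        if instr.addr in breakpoints:      # breakpoint hit
            target = instr.args[0]
            if target.startswith("0x"):
                real_addr = int(target, 16)
            else:
                real_addr = registers[target]   # 607: read the register
            callee = symbols.get(real_addr, hex(real_addr))
            edges.append((caller, callee))      # 608: record the relation
        elif instr.op == "mov":
            reg, value = instr.args
            registers[reg] = int(value, 16)
    return edges


code = [
    Instr(0x1000, "mov", ("rax", "0x2000")),
    Instr(0x1005, "call", ("rax",)),      # target known only at runtime
    Instr(0x1007, "call", ("0x3000",)),   # target known statically
]
symbols = {0x2000: "Derived::draw", 0x3000: "log_message"}
edges = run_with_breakpoints(code, symbols, "render")
print(edges)
```

In a real setting the register read would go through the platform's debugging interface; the point here is only that the value read at the breakpoint supplies the callee that the static pass could not determine.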
3. The relation chain statistics module 53 compiles statistics on the function call relation chain and completes it. Static reverse analysis of the binary file can resolve all function call relations, except that for the calls made through function virtual addresses the specific called function address cannot be determined. Because static analysis is performed automatically by the system without manual participation, the statically analyzed function call relation chain data can be stored per version, which ensures that the call chain data always corresponds to the latest version. Dynamic analysis of the function call relation chain yields the local function call relation graph of a module. Its advantage is that the collected call relation data is real-time and accurate; its limitation is that it covers only the call branches that were actually executed. Moreover, this data has to be collected by manually running the program under dynamic collection. To avoid the heavy workload of collecting it manually for every version, the dynamically collected function call relation chain data can be stored per large version, which preserves the completeness of the data while keeping it reasonably up to date.
In order to obtain complete function call relation chain data, a static analysis mode and a dynamic analysis mode can be combined, and under the condition that a global graph is obtained through the static analysis, a missing point in the global graph is supplemented through the dynamic analysis mode, and a required function call relation chain is gradually completed.
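A minimal sketch of this supplementing step is shown below. The data layout is an assumption made for illustration: each static entry is a `(caller, call_site, callee)` tuple whose callee is `None` when the call goes through a virtual address, and the dynamic data maps `(caller, call_site)` to the set of callees observed at runtime. The name `merge_chains` is hypothetical.

```python
from collections import defaultdict


def merge_chains(static_sites, dynamic_sites):
    """Fill unresolved static call sites with dynamically observed callees."""
    graph = defaultdict(set)
    unresolved = []
    for caller, site, callee in static_sites:
        if callee is not None:
            graph[caller].add(callee)            # already known statically
        elif (caller, site) in dynamic_sites:
            graph[caller].update(dynamic_sites[(caller, site)])
        else:
            unresolved.append((caller, site))    # still missing
    return dict(graph), unresolved


static_sites = [
    ("main", 0x1005, "helper"),
    ("main", 0x100A, None),
    ("helper", 0x2003, None),
]
dynamic_sites = {("main", 0x100A): {"Derived::draw"}}
graph, unresolved = merge_chains(static_sites, dynamic_sites)
print(graph, unresolved)
```

The `unresolved` list shrinks as more dynamic runs are accumulated, which matches the gradual completion described above.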
The process of parsing the function call relation chain and performing statistics and refinement on it is shown in fig. 9. In fig. 9, the left side is a first processing branch of static analysis, and the right side is a second processing branch of dynamic analysis. The process comprises the following steps:
Step 701: after the test performs the version comparison, processing continues along separate branches.
In the first processing branch:
Step 7021: analyze the version changes;
Step 7022: obtain the static function call relation chain;
Step 7023: compute the function call chain completeness of each module, to serve as baseline data;
Step 7024: supplement the function call chain data with the dynamic function call chain data corresponding to the large version.
In the second processing branch:
Step 7031: acquire the product information;
Step 7032: obtain the large version;
Step 7033: convert to the corresponding product ID;
Step 7034: acquire the dynamic function call chain data corresponding to the large version, then execute step 7024.
Step 704: after the function call chain data has been supplemented, compute the function call chain completeness of each module.
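The completeness statistics of steps 7023 and 704 can be sketched as below. The source does not define how completeness is computed, so the ratio used here (resolved call sites divided by all call sites, per module) is an assumption, as are the dotted version format, the store keyed by large version, and the names `major_version` and `completeness`.

```python
def major_version(version):
    """Return the large-version part of a dotted version string (assumed format)."""
    return version.split(".", 1)[0]


def completeness(static_sites, dynamic=None):
    """Fraction of resolved call sites per module.

    static_sites: (module, site_id, callee) tuples, callee None if unresolved.
    dynamic: optional mapping site_id -> callee observed at runtime.
    """
    dynamic = dynamic or {}
    totals, resolved = {}, {}
    for module, site, callee in static_sites:
        totals[module] = totals.get(module, 0) + 1
        if callee is not None or site in dynamic:
            resolved[module] = resolved.get(module, 0) + 1
    return {m: resolved.get(m, 0) / totals[m] for m in totals}


static_sites = [
    ("ui", "ui:0x10", "draw"),
    ("ui", "ui:0x20", None),
    ("net", "net:0x08", None),
    ("net", "net:0x30", "send"),
]
# Dynamic data is stored per large version, not per build.
dynamic_store = {"3": {"ui:0x20": "Button::draw"}}

before = completeness(static_sites)                          # step 7023
dynamic = dynamic_store.get(major_version("3.2.1"), {})      # steps 7032-7034
after = completeness(static_sites, dynamic)                  # steps 7024 and 704
print(before, after)
```

Comparing `before` and `after` per module shows where dynamic collection has already filled the gaps and where further runs are still needed.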
The flow of fig. 9 shows how the static and dynamic function call relation chain data are merged so that the dynamic data supplements the statically analyzed data. The global function topological relation is obtained by static reverse analysis of the binary; for the parts that are missing, dynamic instrumentation is used to obtain the function addresses actually executed at runtime, which fills in the missing call chains, so that the data of the whole function call relation chain is gradually completed. With the embodiment of the invention, a complete function call relation chain can be obtained, including calls across module interfaces. The functional relations within a project can be established, which helps project members understand the development implementation and makes it easier for testers to analyze it. Changes in the call relations of functions can be monitored, avoiding problems such as unnoticed coupling and omissions in test analysis. At the same time, the coupling relations in the call chain make clear which test scope is affected by a code change.
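One way to derive the affected test scope from a completed call chain is to walk the chain backwards from the changed functions. The source does not specify an algorithm, so the breadth-first traversal below is an assumption offered as a sketch; the function names in the example and the name `affected_functions` are hypothetical.

```python
from collections import defaultdict, deque


def affected_functions(edges, changed):
    """Return the changed functions plus all their direct and indirect callers."""
    callers = defaultdict(set)
    for caller, callee in edges:
        callers[callee].add(caller)      # reverse the call relation

    seen = set(changed)
    queue = deque(changed)
    while queue:
        function = queue.popleft()
        for caller in callers[function]:
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


edges = [
    ("main", "render"),
    ("render", "Derived::draw"),
    ("main", "load_config"),
    ("load_config", "parse_file"),
]
scope = affected_functions(edges, ["Derived::draw"])
print(sorted(scope))
```

Here a change to `Derived::draw` puts `render` and `main` in scope but leaves `load_config` and `parse_file` out, which is the kind of delimitation the call chain is intended to support.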
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional unit and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (12)

1. An information processing method, characterized in that the method comprises:
reversely traversing function codes in the binary file in a static mode to reversely convert the function codes into a form of assembly language;
analyzing the function code in the assembly language form in a static analysis mode to obtain all instructions in the function code, and identifying function address call instructions from all the instructions in the function code;
extracting a first function call relation chain from the function code according to the position indicated by the function address call instruction, wherein the first function call relation chain is used for representing the global function topological relation in the function code;
acquiring the called actual address of the function when the function code runs in a dynamic analysis mode;
obtaining a local function topological relation according to the called actual address of the function;
and according to the local function topological relation, supplementing missing data in the global function topological relation to obtain a function call relation chain.
2. The method according to claim 1, wherein the acquiring, in a dynamic analysis mode, the called actual address of the function when the function code runs comprises:
and switching from the static analysis mode to a dynamic analysis mode, and analyzing the function virtual address contained in the function code to obtain the called actual address of the function corresponding to the function virtual address.
3. The method according to claim 2, wherein the parsing the function virtual address included in the function code to obtain a real address of a function called corresponding to the function virtual address includes:
determining a process to be tested needing to dynamically acquire a function call relation chain from the function code, wherein the process to be tested comprises a function virtual address;
before the process to be tested runs, setting a breakpoint in the process to be tested, obtaining runtime characteristic data when execution reaches the breakpoint, and obtaining the called actual address of the function according to the characteristic data.
4. The method according to claim 2, wherein the parsing the function virtual address included in the function code to obtain a real address of a function called corresponding to the function virtual address includes:
determining a process to be tested needing to dynamically acquire a function call relation chain from the function code, wherein the process to be tested comprises a function virtual address;
suspending the process to be tested;
a breakpoint is set at the position of the function address calling instruction in the process to be tested;
and running the process to be tested, if the breakpoint is hit, triggering extraction processing of the called actual address of the function, and extracting the called actual address of the function.
5. The method according to claim 3 or 4, characterized in that the method further comprises:
obtaining a second function call relation chain according to the called actual address of the function, wherein the second function call relation chain is used for representing the local function topological relation;
the supplementing missing data in the global function topological relation according to the local function topological relation to obtain a function call relation chain comprises:
and obtaining the function calling relation chain according to the first function calling relation chain and the second function calling relation chain.
6. A server, characterized in that the server comprises:
the traversing unit is used for reversely traversing the function codes in the binary file in a static mode so as to reversely convert the function codes into the form of assembly language;
the first analysis unit is used for analyzing the function codes in the assembly language form in a static analysis mode to obtain all instructions in the function codes, and identifying function address calling instructions from all the instructions in the function codes;
the first analysis unit is further configured to extract a first function call relation chain from the function code according to the position indicated by the function address call instruction, where the first function call relation chain is used to represent a global function topology relation in the function code;
the second analysis unit is used for acquiring the called actual address of the function when the function code runs in a dynamic analysis mode;
the first processing unit is used for obtaining a local function topological relation according to the called actual address of the function;
and the second processing unit is used for supplementing missing data in the global function topological relation according to the local function topological relation to obtain a function call relation chain.
7. The server according to claim 6, wherein the second parsing unit is further configured to:
and switching to a dynamic analysis mode from the static analysis mode, analyzing the function virtual address contained in the function code, and obtaining the actual address of the function corresponding to the function virtual address, which is called.
8. The server according to claim 7, wherein the second parsing unit is further configured to:
determining a process to be tested needing to dynamically acquire a function call relation chain from the function code, wherein the process to be tested comprises a function virtual address;
before the process to be tested runs, setting a breakpoint in the process to be tested, obtaining runtime characteristic data when execution reaches the breakpoint, and obtaining the called actual address of the function according to the characteristic data.
9. The server according to claim 7, wherein the second parsing unit is further configured to:
determining a process to be tested needing to dynamically acquire a function call relation chain from the function code, wherein the process to be tested comprises a function virtual address;
suspending the process to be tested;
a breakpoint is set at the position of the function address calling instruction in the process to be tested;
and running the process to be tested, if the breakpoint is hit, triggering extraction processing of the called actual address of the function, and extracting the called actual address of the function.
10. The server according to claim 8 or 9, wherein the server further comprises: a relationship chain acquisition unit for:
obtaining a second function call relation chain according to the called actual address of the function, wherein the second function call relation chain is used for representing the local function topological relation;
the second processing unit is further configured to:
and obtaining the function calling relation chain according to the first function calling relation chain and the second function calling relation chain.
11. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
12. A server, characterized in that the server comprises:
a memory for storing a computer program capable of running on the processor;
a processor for performing the steps of the method according to any one of claims 1 to 5 when running the computer program.
CN201710331113.3A 2017-05-11 2017-05-11 Information processing method, server and computer storage medium Active CN108874470B (en)

Publications (2)

Publication Number Publication Date
CN108874470A CN108874470A (en) 2018-11-23
CN108874470B true CN108874470B (en) 2023-04-07

Family

ID=64319569

Country Status (1)

Country Link
CN (1) CN108874470B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110032394B (en) * 2019-04-12 2022-05-31 深圳市腾讯信息技术有限公司 Analysis method and device for passive code file and storage medium
CN110417574B (en) * 2019-05-21 2022-01-07 腾讯科技(深圳)有限公司 Topology analysis method and device and storage medium
CN111078559B (en) * 2019-12-18 2023-10-13 广州品唯软件有限公司 Method, device, medium and computer equipment for extracting function call in java code
CN111290950B (en) * 2020-01-22 2022-03-01 腾讯科技(深圳)有限公司 Test point obtaining method and device in program test, storage medium and equipment
CN111443902B (en) * 2020-03-20 2023-09-08 杭州有赞科技有限公司 Function call tree generation method, system, computer device and readable storage medium
CN113742252B (en) * 2020-05-28 2024-03-29 华为技术有限公司 Method and device for detecting memory disorder
CN117453280A (en) * 2023-09-12 2024-01-26 湖南长银五八消费金融股份有限公司 Code topology and service topology generation method, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101714118A (en) * 2009-11-20 2010-05-26 北京邮电大学 Detector for binary-code buffer-zone overflow bugs, and detection method thereof
CN101814053A (en) * 2010-03-29 2010-08-25 中国人民解放军信息工程大学 Method for discovering binary code vulnerability based on function model

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103761089B (en) * 2014-01-14 2017-09-15 清华大学 Method for determining dynamic function call relations based on register transfer language
CN104035773B (en) * 2014-06-11 2017-04-12 清华大学 Extension call graph based software system node importance evaluation method
KR101620931B1 (en) * 2014-09-04 2016-05-13 한국전자통신연구원 Similar malicious code retrieval apparatus and method based on malicious code feature information
CN104331368B (en) * 2014-11-18 2017-04-05 合肥康捷信息科技有限公司 Method for static analysis of C++ virtual function calls based on cfg files
CN106547520B (en) * 2015-09-16 2021-05-28 腾讯科技(深圳)有限公司 Code path analysis method and device
CN105095092A (en) * 2015-09-25 2015-11-25 南京大学 Static analysis and dynamic operation based detection of atomic violation of JS (JavaScript) code in Web application
KR102063966B1 (en) * 2015-10-21 2020-01-09 엘에스산전 주식회사 Optimization method for compiling programmable logic controller command

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
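G. Kaliora et al. Nonlinear control of feedforward systems with bounded signals. IEEE Transactions on Automatic Control. 2004, Vol. 49 (No. 11), pp. 1975-1987. *
孙贺 et al. A function call graph extraction method combining dynamic and static analysis. Computer Engineering (《计算机工程》). 2017, Vol. 43 (No. 3), pp. 154-162. *
熊彪. Obtaining function call relationship chains through static reverse disassembly. https://cloud.tencent.com/developer/article/1005583. 2017, pp. 1-6. *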

Also Published As

Publication number Publication date
CN108874470A (en) 2018-11-23

Similar Documents

Publication Publication Date Title
CN108874470B (en) Information processing method, server and computer storage medium
US10481964B2 (en) Monitoring activity of software development kits using stack trace analysis
US9734263B2 (en) Method and apparatus for efficient pre-silicon debug
US8776029B2 (en) System and method of software execution path identification
CN110580226B (en) Object code coverage rate testing method, system and medium for operating system level program
US10037265B2 (en) Enhancing the debugger stack with recently reported errors under debug
US20090222646A1 (en) Method and apparatus for detecting processor behavior using instruction trace data
CN113268243B (en) Memory prediction method and device, storage medium and electronic equipment
CN109634822B (en) Function time consumption statistical method and device, storage medium and terminal equipment
CN112671878B (en) Block chain information subscription method, device, server and storage medium
CN106294132B Method and device for managing logs
CN112463518A (en) Page full-life-cycle monitoring method, device, equipment and storage medium based on Flutter
CN109582574A Code coverage statistical method, device, storage medium and terminal device
Shea et al. Scoped identifiers for efficient bit aligned logging
CN110908869B (en) Application program data monitoring method, device, equipment and storage medium
CN106940775B (en) Vulnerability detection method and device for application program
CN113806231A (en) Code coverage rate analysis method, device, equipment and medium
CN106940772B (en) Variable object tracking method and device
US10289540B2 (en) Performing entropy-based dataflow analysis
Marburger et al. Tools for understanding the behavior of telecommunication systems
CN107391358B (en) Abnormal data processing method and system
CN113434417B (en) Regression testing method and device for loopholes, storage medium and electronic device
CN116185882A (en) Software debugging method, device, equipment and storage medium
CN112528291B (en) Code auditing method and device based on knowledge graph
CN111400147B (en) Service quality testing method, device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant