CN111723345B - Callback function-based control flow obfuscation method and system - Google Patents

Info

Publication number
CN111723345B
Authority
CN
China
Prior art keywords
program
callback function
loop
original
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010388650.3A
Other languages
Chinese (zh)
Other versions
CN111723345A (en)
Inventor
舒辉
沙子涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Engineering University of PLA Strategic Support Force
Priority to CN202010388650.3A
Publication of CN111723345A
Application granted
Publication of CN111723345B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12 Protecting executable software
    • G06F21/14 Protecting executable software against software analysis or reverse engineering, e.g. by obfuscation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/43 Checking; Contextual analysis
    • G06F8/433 Dependency analysis; Data or control flow analysis
    • G06F8/434 Pointers; Aliasing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Technology Law (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention belongs to the technical field of network security, and particularly relates to a callback function-based control flow obfuscation method and system. For a loop structure in a program, the loop jumps between basic blocks are converted, by means of a callback function, into repeated calls between functions, so that the original execution logic of the program is hidden and the original control flow of the program is merged into a uniform sequential structure; through program analysis, data dependencies are reconstructed inside the callback function so as to maintain functional consistency before and after the loop jump transformation. By hiding the key algorithm logic of the program in repeated callback function calls, the invention makes that logic difficult for automated analysis techniques to detect, and effectively addresses the difficulty traditional obfuscation methods have in resisting automated analysis. Because only a small number of function calls and the necessary parameter-passing instructions are introduced into the original program, the impact on program overhead is limited, and resistance to analysis is greatly enhanced while program execution efficiency is preserved. Tests in practical applications show no obvious change in program size or execution efficiency before and after obfuscation, and the method has good applicability.

Description

Callback function-based control flow obfuscation method and system
Technical Field
The invention belongs to the technical field of network security, and particularly relates to a callback function-based control flow obfuscation method and system.
Background
Reverse analysis is a technique commonly used by malicious code writers. By reverse-analyzing an existing program and mining the algorithm logic and key data it contains, an attacker can effectively carry out software piracy or even vulnerability-based attacks. According to a software survey published by the Software Alliance (BSA) in 2018, up to 36% of the software installed worldwide is not legally licensed, which poses a great threat to the software industry and to the information security of users. Code obfuscation is a common software protection technique: an effective obfuscation algorithm is constructed to obscure the data flow and control flow of a program, thereby hiding the code logic. Control flow obfuscation algorithms have been studied relatively extensively, for example opaque predicate obfuscation and control flow flattening. Opaque predicate obfuscation adds a bogus branch to the program by constructing an opaque predicate that is always true or always false, or constructs an opaque predicate that may be either true or false and adds equivalent basic blocks to the program, making the control flow structure more complex. Control flow flattening flattens the original nested loops and conditional branch statements in the program and connects them through a switch statement, so that each basic block no longer carries the information that explicitly records the direction of control flow, thereby resisting reverse analysis.
However, the purpose of such obfuscation is to increase the difficulty of manual reverse engineering and to delay the cracking of the program. The development of automated software reverse-engineering techniques has therefore severely undermined the effectiveness of these obfuscation algorithms. Studies have shown that data flow analysis and dynamic symbolic execution are widely used to remove control flow obfuscation: data flow analysis uses constant propagation and available-expression analysis to determine whether the value at a given program point can be replaced by a known value, while dynamic symbolic execution uses unreachable-path analysis to counter the effect of preventive obfuscation on static program analysis. These methods effectively remove the obstacles that obfuscation algorithms place in the way of program cracking, making it difficult for the software protection industry to achieve further breakthroughs. Clearly, traditional control flow obfuscation algorithms can hardly meet the needs of software protection, and innovation in the algorithms is particularly important; a code protection method with high obfuscation strength and high resilience is therefore the problem that most urgently needs to be solved.
Disclosure of Invention
Therefore, the invention provides a callback function-based control flow obfuscation method and system, which overcome the low obfuscation strength, high runtime overhead and other shortcomings of traditional obfuscation algorithms and enhance the resistance of network programs to reverse analysis.
According to the design scheme provided by the invention, the callback function-based control flow obfuscation method comprises the following:
for a loop structure in a program, converting the loop jumps between basic blocks into repeated calls between functions by means of a callback function, thereby hiding the original execution logic of the program and merging the original control flow into a uniform sequential structure;
through program analysis, reconstructing data dependencies inside the callback function so as to maintain functional consistency before and after the loop jump transformation.
In the callback function-based control flow obfuscation method of the invention, further, in the jump transformation the program is first converted into an intermediate language; the loop code basic blocks in the program and their dependencies are identified through control flow and data flow analysis; the loop code basic blocks are stripped from the original program and moved into the callback function; and a call instruction to the callback function is added to the original program.
In the callback function-based control flow obfuscation method of the invention, further, a corresponding callback boundary is set from the loop termination condition according to the original program logic, so that the program executes as expected.
In the callback function-based control flow obfuscation method of the invention, further, for the loop code basic blocks in the program, the loop-related variables are pushed onto the stack, the parameters are passed into the callback function by means of the stack pointer, and the data dependencies are reconstructed inside the callback function so as to maintain functional consistency before and after the loop jump transformation.
In the callback function-based control flow obfuscation method of the invention, further, for the intermediate language of the program, the jump relations between basic blocks are identified from the branch instructions, and a control flow graph describing program execution is generated from those jump relations; the closed-loop structures in the program are found from the control flow graph to obtain the loop code basic blocks; and loop stripping is completed by moving the loop code basic blocks into the callback function.
In the callback function-based control flow obfuscation method of the invention, further, the loop logic type is determined from the comparison instructions in LLVM and from the in-degree and out-degree of the loop entry point; the comparison instruction is located and converted to a Boolean value, and a return instruction is added at the end of the callback function to return that Boolean value, which completes the setting of the callback boundary condition.
In the callback function-based control flow obfuscation method of the invention, further, the loop logic types include a judgment-priority loop type and an execution-priority loop type; for the judgment-priority loop type, a jump condition code block is added so that, when the callback function is triggered for the first time, this basic block is executed to perform the entry judgment, and the original program execution logic is kept unchanged through code redundancy.
In the callback function-based control flow obfuscation method of the invention, further, data is passed through the stack pointer to reconstruct the data dependencies inside the callback function and keep the dependency relations unchanged, which specifically includes: scanning the loop-related instructions, adding the scan results to a data set, de-duplicating the instructions in the data set, and pushing the instruction addresses onto the stack; passing the stack-top pointer into the callback function as a parameter and re-declaring the variables inside the callback function, so that the dependency relation between the original instructions and the data on the stack is equivalent to the dependency relation between the original instructions and the original data; and passing the remaining data in turn by means of stack pointer offsets.
In the callback function-based control flow obfuscation method of the invention, further, after the callback function has finished executing, the stack is popped in order and stack balance is restored.
Further, the present invention also provides a callback function-based control flow obfuscation system, comprising a loop transformation module and a loop reconstruction module, wherein
the loop transformation module is used for converting, for a loop structure in the program, the loop jumps between basic blocks into repeated calls between functions by means of a callback function, so that the original execution logic of the program is hidden and the original control flow is merged into a uniform sequential structure;
and the loop reconstruction module is used for reconstructing data dependencies inside the callback function through program analysis so as to maintain functional consistency before and after the loop jump transformation.
The invention has the following beneficial effects:
Through control flow and data flow analysis, and starting from the working mechanism of callback functions, the method replaces the repeated jumps within a traditional loop structure with repeated calls of a callback function. The key algorithm logic of the program is hidden inside the callback function, where it is not easily detected by automated analysis techniques, which effectively addresses the difficulty traditional obfuscation methods have in resisting automated analysis. In addition, because loops are obfuscated using callback functions, only a small number of function calls and the necessary parameter-passing instructions are introduced into the original program, so the impact on program overhead is limited and resistance to analysis can be greatly enhanced while program execution efficiency is preserved. Tests in practical applications show no obvious change in program size or execution efficiency before and after obfuscation, and the method has good applicability.
Description of the drawings:
FIG. 1 is a schematic diagram of control flow obfuscation in an embodiment;
FIG. 2 is a schematic diagram of the equivalent loop execution flows in an embodiment;
FIG. 3 is a schematic diagram of loop callback boundary setting in an embodiment;
FIG. 4 is a schematic diagram of a control flow of a conventional loop structure in an embodiment;
FIG. 5 is a flow diagram illustrating callback-type loop control in an embodiment.
Detailed description:
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and technical solutions.
Traditional obfuscation methods mainly increase the complexity of a program by adding code redundancy in order to resist reverse analysis. Their obfuscation strength depends on the degree of code redundancy, which makes it difficult for them to cope with increasingly efficient automated analysis techniques; moreover, introducing redundant code increases code size and reduces runtime efficiency, which does not meet the development needs of the software industry. To this end, an embodiment of the present invention, as shown in FIG. 1, provides a callback function-based control flow obfuscation method, which includes the following steps:
S101, for a loop structure in a program, converting the loop jumps between basic blocks into repeated calls between functions by means of a callback function, thereby hiding the original execution logic of the program and merging the original control flow into a uniform sequential structure;
S102, reconstructing data dependencies inside the callback function through program analysis, so as to maintain functional consistency before and after the loop jump transformation.
For a loop structure in a program, the loop logic is hidden by means of a callback function, which disrupts the program's control flow graph and enhances the program's resistance to reverse analysis. This overcomes the low obfuscation strength, high runtime overhead and other shortcomings of traditional obfuscation algorithms: the repeated jumps within a traditional loop structure are replaced by repeated calls of the callback function, and the key algorithm logic is hidden inside the callback function, where it is not easily detected by automated analysis techniques, so the analysis resistance of the network program is enhanced.
In the callback function-based control flow obfuscation method of the embodiment of the present invention, further, in the jump transformation the program is first converted into an intermediate language; the loop code basic blocks in the program and their dependencies are identified through control flow and data flow analysis; the loop code basic blocks are stripped from the original program and moved into the callback function; and a call instruction to the callback function is added to the original program.
The program is converted into an intermediate language, and at this level control flow and data flow analysis is performed on the program to identify the loop code blocks and their dependencies. On this basis, each loop code block is stripped from the original program and moved into a callback function, and a call to the callback function is added. Using the response-execute mechanism of callback functions, the loop jumps between basic blocks are converted into repeated calls between functions, and functional consistency before and after the conversion is maintained through program analysis techniques. This achieves the goals of hiding control flow logic, resisting reverse analysis, and protecting the code.
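The transformation described above can be illustrated at source level with a minimal sketch. It is not the patent's implementation, which works on an intermediate language; the names `invoke_until_false`, `loop_body_callback` and the `[total, i, n]` frame layout are assumptions made for illustration only.

```python
from typing import Callable, List


def sum_with_loop(n: int) -> int:
    """Original form: a do-while style loop with an explicit back edge."""
    total, i = 0, 0
    while True:
        total += i
        i += 1
        if not i < n:  # loop boundary condition
            break
    return total


def invoke_until_false(callback: Callable[[List[int]], bool],
                       frame: List[int]) -> None:
    """Hypothetical stand-in for a callback-driven API: it re-invokes the
    callback for as long as the callback returns True."""
    while callback(frame):
        pass


def loop_body_callback(frame: List[int]) -> bool:
    """Loop body stripped out of the original function.

    Assumed frame layout for this sketch: [total, i, n].
    """
    frame[0] += frame[1]
    frame[1] += 1
    return frame[1] < frame[2]  # Boolean value used as the callback boundary


def sum_with_callback(n: int) -> int:
    """Transformed form: the caller contains only a straight-line call."""
    frame = [0, 0, n]
    invoke_until_false(loop_body_callback, frame)
    return frame[0]
```

In this sketch the caller `sum_with_callback` has no branch of its own; the repetition lives in whatever routine drives the callback, which is the effect the text attributes to the transformation.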
In the callback function-based control flow obfuscation method of the embodiment of the present invention, further, a corresponding callback boundary is set from the loop termination condition according to the original program logic, so that the program executes as expected.
In the callback function-based control flow obfuscation method of the embodiment of the invention, further, for the loop code basic blocks in the program, the loop-related variables are pushed onto the stack, the parameters are passed into the callback function by means of the stack pointer, and the data dependencies are reconstructed inside the callback function so as to maintain functional consistency before and after the loop jump transformation.
To ensure that the data dependencies remain unchanged, the loop-related variables are pushed onto the stack, the parameters are passed into the callback function by means of the stack pointer, and the data dependencies are rebuilt inside the callback function, so that the functional consistency of the program is maintained.
In the callback function-based control flow obfuscation method of the embodiment of the present invention, further, for the intermediate language of the program, the jump relations between basic blocks are identified from the branch instructions, and a control flow graph describing program execution is generated from those jump relations; the closed-loop structures in the program are found from the control flow graph to obtain the loop code basic blocks; and loop stripping is completed by moving the loop code basic blocks into the callback function.
The callback function is used as an equivalent replacement for the loop structure, and the functional consistency of the program before and after the replacement must be ensured. The callback function selected in the embodiment of the invention therefore has the following characteristics: 1) Immediate response: once the caller executes, the callback condition is triggered immediately, and the entire loop function finishes executing before the next instruction is executed; 2) Repeatable callback: the number of loop iterations cannot be determined before the program runs, so the caller invokes the callback function repeatedly as long as the exit condition is not triggered; 3) A specific exit condition: corresponding to the loop boundary condition, an exit mechanism is provided to end the callback process so that the rest of the program can continue to execute. At the intermediate language level, the program marks the jump relations between basic blocks through branch instructions, and a control flow graph describing program execution can be generated from those jump relations, from which the closed-loop structures in the program, i.e. the corresponding loop code blocks, are found. On this basis, the loop code blocks are moved into the callback function to complete the loop stripping.
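One common way to find closed-loop structures in a control flow graph is to detect back edges with a depth-first search and then collect the natural loop of each back edge. The sketch below shows that technique; it is an assumption about how the mining step could be done, not the algorithm the patent specifies. The CFG is modelled as a plain dictionary from block name to successor names, and the sketch assumes a reducible graph.

```python
from typing import Dict, List, Set, Tuple

CFG = Dict[str, List[str]]  # basic block name -> successor block names


def find_back_edges(cfg: CFG, entry: str) -> List[Tuple[str, str]]:
    """Return edges (tail, head) whose head is still on the DFS stack."""
    back_edges: List[Tuple[str, str]] = []
    visited: Set[str] = set()
    on_stack: Set[str] = set()

    def dfs(block: str) -> None:
        visited.add(block)
        on_stack.add(block)
        for succ in cfg.get(block, []):
            if succ in on_stack:
                back_edges.append((block, succ))
            elif succ not in visited:
                dfs(succ)
        on_stack.discard(block)

    dfs(entry)
    return back_edges


def loop_blocks(cfg: CFG, tail: str, head: str) -> Set[str]:
    """Collect the natural loop of back edge tail -> head: the head plus
    every block that can reach the tail without passing through the head."""
    preds: Dict[str, List[str]] = {}
    for src, succs in cfg.items():
        for dst in succs:
            preds.setdefault(dst, []).append(src)
    body = {head, tail}
    worklist = [tail]
    while worklist:
        block = worklist.pop()
        if block == head:
            continue  # do not walk past the loop entry point
        for pred in preds.get(block, []):
            if pred not in body:
                body.add(pred)
                worklist.append(pred)
    return body


# A small while-style CFG used as an example input.
EXAMPLE_CFG: CFG = {
    "entry": ["cond"],
    "cond": ["body", "exit"],
    "body": ["cond"],
    "exit": [],
}
```

For `EXAMPLE_CFG`, the only back edge runs from `body` to `cond`, and the blocks returned by `loop_blocks` are the candidates to be moved into the callback function.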
In the callback function-based control flow obfuscation method of the embodiment of the present invention, further, the loop logic type is determined from the comparison instructions in LLVM and from the in-degree and out-degree of the loop entry point; the comparison instruction is located and converted to a Boolean value, and a return instruction is added at the end of the callback function to return that Boolean value, which completes the setting of the callback boundary condition. Further, the loop logic types include a judgment-priority loop type and an execution-priority loop type; for the judgment-priority loop type, a jump condition code block is added so that, when the callback function is triggered for the first time, this basic block is executed to perform the entry judgment, and the original program execution logic is kept unchanged through code redundancy.
After code stripping is completed, the corresponding callback boundary is set according to the original program logic. In terms of C language code style, programs generally have two types of loop logic: execution-priority loops (do-while) and judgment-priority loops (while, for). The two loop types can be distinguished effectively by the in-degree and out-degree of the loop entry point: an execution-priority loop has an out-degree of 1 and a judgment-priority loop has an out-degree of 2, while both have an in-degree greater than or equal to 2. Referring to the equivalent execution flows of the two loop types shown in FIG. 2, because judgment and execution have different priorities in the two types, the position of the comparison instruction that marks the jump condition also differs: in a judgment-priority loop the jump condition is located at the loop entry point, whereas in an execution-priority loop it is located at the loop exit point, i.e. the last code block. LLVM is a framework for building compilers, written in C++, which is used to optimize programs written in arbitrary programming languages at compile time, link time, run time and idle time. Referring to FIG. 3, LLVM implements the judgment function through the comparison instruction CmpInst, which appears as the jump condition in a loop structure. The callback function uses whether its return value is 0 as the callback boundary condition; therefore, after the CmpInst is located, its result is converted to a Boolean value and a RetInst is added at the end of the callback function to return that value, which completes the setting of the boundary condition.
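The degree-based rule stated above can be sketched as a small classifier over a dictionary-based control flow graph. The function name `classify_loop`, the label strings and the two example graphs are assumptions for illustration. The sketch also assumes that an execution-priority loop spans more than one basic block; a do-while loop whose body and comparison share a single block would have out-degree 2 at its entry point and would need separate handling.

```python
from typing import Dict, List

CFG = Dict[str, List[str]]  # basic block name -> successor block names


def classify_loop(cfg: CFG, entry_point: str) -> str:
    """Classify a loop from the in-degree and out-degree of its entry point."""
    in_degree = sum(succs.count(entry_point) for succs in cfg.values())
    out_degree = len(cfg.get(entry_point, []))
    if in_degree < 2:
        # A loop entry point is reached from outside the loop and via the back edge.
        return "not-a-loop-entry"
    if out_degree == 1:
        return "execution-priority"  # do-while: body first, comparison at the end
    if out_degree == 2:
        return "judgment-priority"   # while/for: comparison at the entry point
    return "unknown"


# while (c) { body }: the comparison sits in the entry block "cond".
WHILE_CFG: CFG = {
    "pre": ["cond"],
    "cond": ["body", "exit"],
    "body": ["cond"],
    "exit": [],
}

# do { head; latch } while (c): the comparison sits in the last block "latch".
DO_WHILE_CFG: CFG = {
    "pre": ["head"],
    "head": ["latch"],
    "latch": ["head", "exit"],
    "exit": [],
}
```

The classification decides where the comparison to be turned into the callback's Boolean return value should be looked for: at the entry block for the judgment-priority case, at the last block for the execution-priority case.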
In a specific application, the equivalent conversions of the two loop types differ: an execution-priority loop enters the loop structure immediately when it is reached, which matches function call logic; a judgment-priority loop first checks the loop boundary and only then decides whether to enter the loop structure. For the latter, therefore, an additional jump condition code block is added: when the callback function is triggered for the first time, this basic block is executed to perform the entry judgment, and the execution logic of the original program is kept unchanged through code redundancy.
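The extra entry judgment for a judgment-priority loop can be sketched as follows. The names and the `[n, steps, first_call]` frame layout are assumptions made for illustration; the duplicated comparison inside the first-call guard plays the role of the added jump condition code block.

```python
from typing import Callable, List


def count_down_with_loop(n: int) -> int:
    """Original judgment-priority loop: the condition is tested before the body."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


def invoke_until_false(callback: Callable[[List[int]], bool],
                       frame: List[int]) -> None:
    """Hypothetical callback driver: re-invokes while the callback returns True."""
    while callback(frame):
        pass


def guarded_callback(frame: List[int]) -> bool:
    """Loop body plus an entry check that runs only on the first trigger.

    Assumed frame layout for this sketch: [n, steps, first_call].
    """
    if frame[2]:
        frame[2] = 0
        if not frame[0] > 0:  # redundant copy of the loop condition
            return False      # zero iterations, as in the original loop
    frame[0] -= 1
    frame[1] += 1
    return frame[0] > 0       # callback boundary for later iterations


def count_down_with_callback(n: int) -> int:
    frame = [n, 0, 1]
    invoke_until_false(guarded_callback, frame)
    return frame[1]
```

Without the first-call guard the body would always run at least once, which would change the behaviour for inputs such as `n = 0`.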
However, code stripping destroys the data flow dependencies of the original program, so reconstructing the data dependencies is a key problem that must be solved in practice. By analyzing the data flow in the program with the relevant algorithms, the dependencies of a target instruction can be identified effectively. To ensure that the dependencies are the same before and after the transformation is applied, further, data is passed through the stack pointer to reconstruct the data dependencies inside the callback function and keep the dependency relations unchanged, which specifically includes: scanning the loop-related instructions, adding the scan results to a data set, de-duplicating the instructions in the data set, and pushing the instruction addresses onto the stack; passing the stack-top pointer into the callback function as a parameter and re-declaring the variables inside the callback function, so that the dependency relation between the original instructions and the data on the stack is equivalent to the dependency relation between the original instructions and the original data; and passing the remaining data in turn by means of stack pointer offsets. Further, after the callback function has finished executing, the stack is popped in order and stack balance is restored. The specific implementation algorithm can be designed as follows:
[Figure BDA0002485006820000051: algorithm listing, image not reproduced in this text]
First, the loop-related instructions are scanned, a data dependency discovery algorithm is applied to each of them in turn, and the results are added to a data set. After scanning is finished, the instructions in the data set are de-duplicated and the instruction addresses are pushed onto the stack. Then the stack-top pointer (ESP) is passed into the callback function as a parameter and the variables are re-declared inside the callback function; the dependency relation between the original instructions and the data on the stack is then equivalent to the dependency relation between the original instructions and the target data. The other dependencies are handled in the same way by means of stack pointer offsets. Finally, after the callback function has finished executing, the stack is popped in order and stack balance is restored.
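The steps above can be simulated with a Python list standing in for the machine stack. This is only a sketch under stated assumptions: the patent pushes instruction addresses and passes ESP, whereas the sketch pushes values and writes them back when popping; the simulated stack pointer is the list index of the first pushed slot; and `run_loop_via_stack`, `sum_callback` and the slot order are hypothetical names and choices.

```python
from typing import Callable, Dict, List

Callback = Callable[[List[int], int], bool]


def run_loop_via_stack(stack: List[int], env: Dict[str, int],
                       used_names: List[str], callback: Callback) -> None:
    """Push loop-related variables, run the callback loop, then pop them."""
    # De-duplicate the scanned variable references, keeping first-seen order.
    slots = list(dict.fromkeys(used_names))
    sp = len(stack)  # simulated stack pointer: index of the first pushed slot
    for name in slots:
        stack.append(env[name])
    # The callback only receives the stack and the pointer, not the variables.
    while callback(stack, sp):
        pass
    # Pop in reverse order and write results back, restoring stack balance.
    for name in reversed(slots):
        env[name] = stack.pop()


def sum_callback(stack: List[int], sp: int) -> bool:
    """Loop body that re-declares its variables as offsets from the pointer."""
    TOTAL, I, N = 0, 1, 2  # assumed offsets, matching the push order below
    stack[sp + TOTAL] += stack[sp + I]
    stack[sp + I] += 1
    return stack[sp + I] < stack[sp + N]
```

A possible use: with `env = {"total": 0, "i": 0, "n": 5}` and the scanned references `["total", "i", "n", "i", "total"]`, calling `run_loop_via_stack(stack, env, refs, sum_callback)` leaves the sum in `env["total"]` and returns `stack` to the depth it had before the call.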
In terms of static analysis, the original program control flow graph is disrupted and the loop boundary condition is replaced by the callback boundary condition. In terms of dynamic analysis, the Windows operating system contains a large number of library functions that take callbacks; with a loop model built on these functions, the loop functionality is hidden inside system calls, the executor changes from the original program to a system function, the short jumps between basic blocks are replaced by the long jumps of function calls, and the program's execution addresses are scattered across the memory space, which better resists dynamic analysis techniques. In addition, in the embodiment of the invention, because loops are obfuscated using callback functions, only a small number of function calls and the necessary parameter-passing instructions are introduced into the original program, so the impact on program overhead is limited and resistance to analysis can be greatly enhanced while program execution efficiency is preserved.
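The idea of letting a library routine, rather than the program itself, drive the repetition can be shown by analogy in Python. This is not the Windows mechanism the text refers to, and the text does not name specific functions; the sketch only uses two standard-library facilities: the two-argument form `iter(callable, sentinel)`, which calls the callable until it returns the sentinel, and `collections.deque(..., maxlen=0)`, which drains an iterator inside the library. The function names and the `[total, i, n]` frame layout are assumptions.

```python
from collections import deque
from functools import partial
from typing import List


def loop_body(frame: List[int]) -> bool:
    """Stripped loop body. Assumed frame layout: [total, i, n]."""
    frame[0] += frame[1]
    frame[1] += 1
    return frame[1] < frame[2]  # False ends the callback sequence


def sum_via_library_driver(n: int) -> int:
    """The caller has no loop statement; the standard library repeats the call."""
    frame = [0, 0, n]
    # iter(callable, sentinel) keeps calling until the result equals False;
    # deque(..., maxlen=0) consumes that iterator without storing anything.
    deque(iter(partial(loop_body, frame), False), maxlen=0)
    return frame[0]
```

As in the scheme described above, the code that repeats the call belongs to the runtime library, so the caller's own control flow is a straight line.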
After the data dependencies are restored, the embodiment of the invention can effectively obfuscate code and protect algorithm logic while ensuring functional consistency. The control flow graphs before and after the method is applied are shown in FIG. 4 (control flow of a traditional loop structure) and FIG. 5 (control flow of a callback-type loop). After the traditional loop structure is converted into a callback-type loop, the originally complex control flow of the program is merged into a uniform sequential structure, the original execution logic is completely hidden, and the repeated calls between functions implicitly carry out the algorithm functions of the network program. Building on traditional obfuscation algorithms, the method draws on the idea of deepening the control flow and uses the execution characteristics of callback functions to hide the loop logic inside function calls, which enhances resistance to reverse engineering.
Further, based on the above method, an embodiment of the present invention also provides a callback function-based control flow obfuscation system, comprising a loop transformation module and a loop reconstruction module, wherein
the loop transformation module is used for converting, for a loop structure in the program, the loop jumps between basic blocks into repeated calls between functions by means of a callback function, so that the original execution logic of the program is hidden and the original control flow is merged into a uniform sequential structure;
and the loop reconstruction module is used for reconstructing data dependencies inside the callback function through program analysis so as to maintain functional consistency before and after the loop jump transformation.
Unless specifically stated otherwise, the relative steps, numerical expressions and values of the components and steps set forth in these embodiments do not limit the scope of the present invention.
Based on the foregoing system, an embodiment of the present invention further provides a server, including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the above-described systems or methods.
Based on the above system, the embodiment of the present invention further provides a computer readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the above system or method.
The device provided by the embodiment of the present invention has the same implementation principle and technical effect as the system embodiment, and for the sake of brief description, reference may be made to the corresponding content in the system embodiment for the part where the device embodiment is not mentioned.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process of the system and the apparatus described above may refer to the corresponding process in the foregoing system embodiment, and details are not described herein again.
In all examples shown and described herein, any particular value should be construed as merely exemplary, and not as a limitation, and thus other examples of example embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in an actual implementation; likewise, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some communication interfaces, and may be electrical, mechanical, or of another form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
If the functions are implemented as software functional units and sold or used as a stand-alone product, they may be stored in a processor-executable, non-volatile computer-readable storage medium. On this understanding, the technical solution of the present invention may be embodied as a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the scope of the disclosure, they can still modify or change the technical solutions described in the foregoing embodiments, or make equivalent substitutions for some of their features. Such modifications, changes, or substitutions do not depart from the spirit and scope of the embodiments of the present invention and should be construed as falling within its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. A callback function-based control flow obfuscation method, characterized by comprising the following:
for a loop structure in a program, converting the loop jumps between basic blocks into repeated function calls by means of a callback function, thereby hiding the original execution logic of the program and uniformly merging the original control flow into a sequential structure;
reconstructing, through program analysis, the data dependences inside the callback function so as to keep the program functionally consistent before and after the loop jump transformation;
in the jump transformation, the program is first converted into an intermediate language; the loop code basic blocks in the program and their corresponding dependency relationships are identified through control flow and data flow analysis; the loop code basic blocks are stripped from the original program and moved into the callback function; and a call instruction to the callback function is added to the original program;
a corresponding callback boundary is set from the loop termination condition according to the original program logic, so that the program executes as expected;
for the loop code basic blocks in the program, the loop-related variables are pushed onto the stack, parameters are passed into the callback function by means of a stack pointer, and the data dependences are rebuilt inside the callback function so as to maintain functional consistency before and after the loop jump transformation;
the loop logic type is determined from the comparison instruction in LLVM according to the in-degree and out-degree of the loop entry point; the comparison instruction is located and converted to a Boolean value, and a return instruction is added at the end of the callback function to return that Boolean value, thereby completing the setting of the callback boundary condition;
passing data through the stack pointer and rebuilding the data dependences inside the callback function so that the dependency relationships remain unchanged specifically comprises: scanning the loop-related instructions, adding the scan results to a data set, deduplicating the instructions in the data set, and pushing the instruction addresses onto the stack; passing the stack-top pointer into the callback function as a parameter and re-declaring the variables inside the callback function, so that the dependency relationship formed by the original instructions and the data on the stack is equivalent to the dependency relationship formed by the original instructions and the original data; and passing the data in sequence by using stack pointer offsets.
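A minimal sketch of the transformation recited in claim 1, written in Python rather than LLVM IR and offered only as one possible reading. The names `dispatch`, `loop_callback`, `LOOP_OPERANDS`, and the list `stack` are hypothetical: `dispatch` stands in for whatever routine repeatedly invokes a callback until it returns false (the claims do not name one), and `stack` models the machine stack. The loop body is moved into the callback, the loop variables are deduplicated and pushed, the callback reaches them by offset from the stack-top pointer, and the callback ends by returning the loop comparison as a Boolean. The stack is popped afterwards to restore balance.

```python
stack = []  # models the machine stack (assumption for illustration)

# Operands seen while scanning the loop body, with repeats (hypothetical scan result).
LOOP_OPERANDS = ["total", "i", "i", "i", "n"]
SLOTS = list(dict.fromkeys(LOOP_OPERANDS))           # deduplicate, keep first-seen order
OFFSET = {name: len(SLOTS) - 1 - k for k, name in enumerate(SLOTS)}  # distance below stack top


def original_sum(n):
    """Execute-first loop: the body runs once before the comparison is checked."""
    total, i = 0, 1
    while True:
        total += i
        i += 1
        if not (i <= n):
            break
    return total


def dispatch(callback, sp):
    """Hypothetical stand-in for a routine that re-invokes a callback until it returns False."""
    while callback(sp):
        pass


def loop_callback(sp):
    """Loop body moved into a callback; variables are reached by offset from the stack top."""
    stack[sp - OFFSET["total"]] += stack[sp - OFFSET["i"]]            # total += i
    stack[sp - OFFSET["i"]] += 1                                      # i += 1
    # Callback boundary: the comparison is returned as a Boolean at the tail.
    return stack[sp - OFFSET["i"]] <= stack[sp - OFFSET["n"]]


def obfuscated_sum(n):
    base = len(stack)
    for value in (0, 1, n):          # push total, i, n in slot order
        stack.append(value)
    sp = len(stack) - 1              # stack-top pointer passed as the only parameter
    dispatch(loop_callback, sp)      # the loop is replaced by a single call site
    stack.pop()                      # n
    stack.pop()                      # i
    total = stack.pop()              # total
    assert len(stack) == base        # stack balance restored
    return total
```

In this sketch the caller contains no loop of its own; the repetition lives entirely in `dispatch`, so the caller's control flow reads as a straight sequence of push, call, and pop.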
2. The callback function-based control flow obfuscation method of claim 1, wherein, for the intermediate language of the program, the jump relationships between basic blocks are identified from the branch instructions, and a control flow graph describing the execution of the program is generated from those jump relationships; closed-loop structures in the program are found from the control flow graph so as to obtain the loop code basic blocks; and loop stripping in the program is completed by moving the loop code basic blocks into the callback function.
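The loop discovery step of claim 2 can be sketched as follows, assuming the control flow graph has already been recovered from branch instructions and is given as a dictionary from block name to successor names. The approach shown (depth-first back-edge detection followed by natural-loop collection) is a common textbook technique swapped in for illustration; the claims do not say which algorithm is used, and the function and block names are hypothetical.

```python
def find_back_edges(cfg, entry):
    """Return edges (tail, header) whose target is still on the DFS path."""
    back_edges, state = [], {}

    def visit(block):
        state[block] = "active"
        for succ in cfg.get(block, []):
            if state.get(succ) == "active":
                back_edges.append((block, succ))   # jump back to an open block: a loop
            elif succ not in state:
                visit(succ)
        state[block] = "done"

    visit(entry)
    return back_edges


def natural_loop(cfg, tail, header):
    """Collect the blocks that can reach `tail` without passing through `header`."""
    preds = {}
    for block, succs in cfg.items():
        for succ in succs:
            preds.setdefault(succ, []).append(block)
    body, work = {header}, [tail]
    while work:
        block = work.pop()
        if block not in body:
            body.add(block)
            work.extend(preds.get(block, []))
    return body


# Hypothetical CFG of a simple while loop: entry -> cond -> (body -> cond | exit)
cfg = {"entry": ["cond"], "cond": ["body", "exit"], "body": ["cond"], "exit": []}
edges = find_back_edges(cfg, "entry")
loop_blocks = natural_loop(cfg, *edges[0])
```

The blocks in `loop_blocks` are the candidates to strip from the original function and move into the callback.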
3. The callback function-based control flow obfuscation method of claim 1, wherein the loop logic types comprise a judge-first (test-before-execute) loop type and an execute-first (execute-before-test) loop type; for the judge-first loop type, a jump-condition code block is added so that, when the callback function is triggered for the first time, the basic block that performs the entry judgment is executed, the execution logic of the original program being kept unchanged through code redundancy.
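One way to read the judge-first handling in claim 3 is sketched below, under stated assumptions: the callback carries a first-call flag, and on the first invocation it runs a duplicated copy of the loop comparison before the body, so that a loop whose condition is false on entry executes zero times. The frame layout, the flag, and the names `dispatch` and `while_callback` are hypothetical and are not taken from the patent text.

```python
def original_sum_below(n):
    """Judge-first loop: the comparison is checked before the body ever runs."""
    total, i = 0, 0
    while i < n:
        total += i
        i += 1
    return total


def dispatch(callback, frame):
    """Hypothetical stand-in for a routine that re-invokes a callback until it returns False."""
    while callback(frame):
        pass


def while_callback(frame):
    # Frame slots (assumed layout): 0 = total, 1 = i, 2 = n, 3 = first-call flag.
    if frame[3]:
        frame[3] = False
        # Added jump-condition block: a redundant copy of the entry comparison,
        # executed only on the first trigger of the callback.
        if not (frame[1] < frame[2]):
            return False
    frame[0] += frame[1]             # total += i
    frame[1] += 1                    # i += 1
    return frame[1] < frame[2]       # tail comparison returned as the callback boundary


def obfuscated_sum_below(n):
    frame = [0, 0, n, True]
    dispatch(while_callback, frame)
    return frame[0]
```

An execute-first loop would need no flag, because its body is meant to run once before any comparison; only the tail comparison is returned.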
4. The control flow obfuscation method based on the callback function as claimed in claim 1, wherein after the callback function is executed, the stack is popped in sequence to restore stack balance.
5. A callback function-based control flow obfuscation system, comprising a loop transformation module and a loop reconstruction module, wherein:
the loop transformation module is used for converting, for a loop structure in a program, the loop jumps between basic blocks into repeated function calls by means of a callback function, thereby hiding the original execution logic of the program and uniformly merging the original control flow into a sequential structure;
the loop reconstruction module is used for reconstructing, through program analysis, the data dependences inside the callback function so as to keep the program functionally consistent before and after the loop jump transformation;
in the jump transformation, the program is first converted into an intermediate language; the loop code basic blocks in the program and their corresponding dependency relationships are identified through control flow and data flow analysis; the loop code basic blocks are stripped from the original program and moved into the callback function; and a call instruction to the callback function is added to the original program;
a corresponding callback boundary is set from the loop termination condition according to the original program logic, so that the program executes as expected;
for the loop code basic blocks in the program, the loop-related variables are pushed onto the stack, parameters are passed into the callback function by means of a stack pointer, and the data dependences are rebuilt inside the callback function so as to maintain functional consistency before and after the loop jump transformation;
the loop logic type is determined from the comparison instruction in LLVM according to the in-degree and out-degree of the loop entry point; the comparison instruction is located and converted to a Boolean value, and a return instruction is added at the end of the callback function to return that Boolean value, thereby completing the setting of the callback boundary condition;
passing data through the stack pointer and rebuilding the data dependences inside the callback function so that the dependency relationships remain unchanged specifically comprises: scanning the loop-related instructions, adding the scan results to a data set, deduplicating the instructions in the data set, and pushing the instruction addresses onto the stack; passing the stack-top pointer into the callback function as a parameter and re-declaring the variables inside the callback function, so that the dependency relationship formed by the original instructions and the data on the stack is equivalent to the dependency relationship formed by the original instructions and the original data; and passing the data in sequence by using stack pointer offsets.
CN202010388650.3A 2020-05-09 2020-05-09 Callback function-based control flow obfuscation method and system Active CN111723345B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010388650.3A CN111723345B (en) 2020-05-09 2020-05-09 Callback function-based control flow obfuscation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010388650.3A CN111723345B (en) 2020-05-09 2020-05-09 Callback function-based control flow obfuscation method and system

Publications (2)

Publication Number Publication Date
CN111723345A CN111723345A (en) 2020-09-29
CN111723345B true CN111723345B (en) 2022-11-22

Family

ID=72565056

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010388650.3A Active CN111723345B (en) 2020-05-09 2020-05-09 Callback function-based control flow obfuscation method and system

Country Status (1)

Country Link
CN (1) CN111723345B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112527307B (en) * 2020-11-18 2023-06-20 西安电子科技大学 Program control flow hiding method, system and application
CN113158147B (en) * 2021-03-24 2022-12-09 中国人民解放军战略支援部队信息工程大学 Code obfuscation method based on parent fusion
CN114357389B (en) * 2021-12-31 2024-04-16 北京大学 LLVM-based junk-instruction insertion obfuscation method and device
CN117407876A (en) * 2023-12-11 2024-01-16 常熟理工学院 Opaque predicate detection method, system and storage medium in malicious software

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682460A (en) * 2016-11-25 2017-05-17 西北大学 Code obfuscation method based on two transformations
CN109784010A (en) * 2018-12-18 2019-05-21 武汉极意网络科技有限公司 LLVM-based program control flow obfuscation method and device
CN110309629A (en) * 2019-06-18 2019-10-08 阿里巴巴集团控股有限公司 Web page code hardening method, apparatus and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7430670B1 (en) * 1999-07-29 2008-09-30 Intertrust Technologies Corp. Software self-defense systems and methods
EP1947584B1 (en) * 2006-12-21 2009-05-27 Telefonaktiebolaget LM Ericsson (publ) Obfuscating computer program code

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
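A survey of code obfuscation techniques; Li Lulu et al.; Software (《软件》); 2020-02-15 (No. 02); full text *
Research on secret-sharing (split-storage) techniques in code obfuscation; Yang Qiuxiang et al.; Computer Engineering and Design (《计算机工程与设计》); 2015-01-16 (No. 01); full text *
Research on software anti-disassembly techniques; Shang Tao et al.; Application Research of Computers (《计算机应用研究》); 2009-12-15 (No. 12); full text *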

Also Published As

Publication number Publication date
CN111723345A (en) 2020-09-29

Similar Documents

Publication Publication Date Title
CN111723345B (en) Callback function-based control flow obfuscation method and system
CN106096338B (en) Virtualization-based software protection method with data flow obfuscation
JP5458184B2 (en) System and method for aggressive automatic correction in a dynamic function call system
Kalysch et al. VMAttack: Deobfuscating virtualization-based packed binaries
TW200837604A (en) Obfuscating computer program code
US8775826B2 (en) Counteracting memory tracing on computing systems by code obfuscation
Wang et al. Translingual obfuscation
US20160171213A1 (en) Apparatus and method for controlling instruction execution to prevent illegal accesses to a computer
CN112115427A (en) Code obfuscation method, device, electronic device and storage medium
CN113366474A (en) System, method and storage medium for obfuscating a computer program by representing control flow of the computer program as data
CN114611074A (en) Method, system, equipment and storage medium for obfuscating source code of solid language
US8887140B2 (en) System and method for annotation-driven function inlining
CN111814119B (en) Anti-debugging method
CN109858204B (en) Program code protection method and device based on LLVM
EP3380974A1 (en) Method to generate a secure code
KR102429641B1 (en) Method and device for generating input values for fuzzing by analysis of comparison statements within binaries
CN114003868A (en) Method for processing software code and electronic equipment
CN111488558B (en) Script protection method and device, computer readable storage medium and computer equipment
CN114880665A (en) Intelligent detection method and device for return programming attack
CN114637988A (en) Binary-oriented function level software randomization method
CN110147238B (en) Program compiling method, device and system
Wang et al. An efficient control-flow based obfuscator for micropython bytecode
Kumar et al. A thorough investigation of code obfuscation techniques for software protection
CN113158147B (en) Code obfuscation method based on parent fusion
CN109918872B (en) Android application reinforcing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant