CN111723345A - Callback function-based control flow obfuscation method and system - Google Patents

Info

Publication number
CN111723345A
Authority
CN
China
Prior art keywords
program
callback function
loop
control flow
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010388650.3A
Other languages
Chinese (zh)
Other versions
CN111723345B (en
Inventor
舒辉
沙子涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Engineering University of PLA Strategic Support Force
Original Assignee
Information Engineering University of PLA Strategic Support Force
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Engineering University of PLA Strategic Support Force filed Critical Information Engineering University of PLA Strategic Support Force
Priority to CN202010388650.3A priority Critical patent/CN111723345B/en
Publication of CN111723345A publication Critical patent/CN111723345A/en
Application granted granted Critical
Publication of CN111723345B publication Critical patent/CN111723345B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12 Protecting executable software
    • G06F21/14 Protecting executable software against software analysis or reverse engineering, e.g. by obfuscation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/43 Checking; Contextual analysis
    • G06F8/433 Dependency analysis; Data or control flow analysis
    • G06F8/434 Pointers; Aliasing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Technology Law (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention belongs to the technical field of network security and, in particular, relates to a callback function-based control flow obfuscation method and system. For a loop structure in a program, the loop jumps between basic blocks are converted, by means of a callback function, into repeated calls between functions, so that the original execution logic of the program is hidden and the original control flow of the program is unified and merged into a sequential structure. Through program analysis, the data dependencies are reconstructed inside the callback function so that the program remains functionally consistent before and after the loop-jump transformation. By hiding the key algorithm logic of the program behind repeated callback invocation, the invention makes that logic difficult for automated analysis techniques to detect, and thereby addresses the problem that traditional obfuscation methods struggle to resist automated analysis. Because only a small number of function calls and the necessary parameter-passing instructions are introduced into the original program, the impact on program overhead is limited, and resistance to analysis is greatly enhanced while execution efficiency is preserved. In tests on practical applications, program size and execution efficiency showed no obvious change before and after obfuscation, so the method has good applicability.

Description

Callback function-based control flow obfuscation method and system
Technical Field
The invention belongs to the technical field of network security and, in particular, relates to a callback function-based control flow obfuscation method and system.
Background
Reverse analysis is a technique commonly used by malicious code writers. By reverse-analyzing an existing program and extracting the algorithm logic and key data it contains, an attacker can effectively carry out software piracy and even vulnerability-based attacks. According to a software survey published by the Software Alliance (BSA) in 2018, up to 36% of installed software worldwide is not legally licensed, which poses a great threat to the software industry and to the information security of users. Code obfuscation is a common software protection technique. An effective obfuscation algorithm obscures the data flow and control flow of a program and thereby hides the code logic. Control flow obfuscation algorithms have been studied comparatively extensively, for example opaque predicate obfuscation and control flow flattening. Opaque predicate obfuscation either constructs a predicate that is always true or always false, which adds a bogus branch to the program, or constructs a predicate that may be either true or false and adds equivalent basic blocks to the program, which makes the control flow structure more complex. Control flow flattening unrolls the nested loops and conditional branch statements of the original program into a flat layout and connects them through switch statements, so that the individual basic blocks no longer carry the information that explicitly records the direction of control flow, which hinders reverse analysis.
However, the purpose of obfuscation is to increase the difficulty of manual reverse engineering and to delay the cracking of a program. The development of automated software reverse engineering techniques has therefore severely weakened the effectiveness of these obfuscation algorithms. Studies have shown that dynamic symbolic execution and data flow analysis are widely used to remove control flow obfuscation: data flow analysis uses constant propagation and available expressions to judge whether a given program point can be replaced by a known value, and dynamic symbolic execution uses unreachable-path analysis to remove the effect of preventive obfuscation on static program analysis. These methods effectively remove the obstacles that obfuscation algorithms place in the way of cracking a program, which makes further breakthroughs in the software protection industry difficult. Traditional control flow obfuscation algorithms clearly no longer meet the needs of software protection, and algorithmic innovation is especially important. A code protection method with high obfuscation strength and high resilience is therefore the problem that most urgently needs to be solved.
Disclosure of Invention
To this end, the invention provides a callback function-based control flow obfuscation method and system, which overcome shortcomings of traditional obfuscation algorithms such as low obfuscation strength and high runtime overhead, and which strengthen the resistance of network programs to reverse analysis.
According to the design scheme provided by the invention, the callback function-based control flow obfuscation method comprises the following:
for a loop structure in a program, converting the loop jumps between basic blocks into repeated calls between functions by means of a callback function, thereby hiding the original execution logic of the program and unifying and merging the original control flow into a sequential structure;
through program analysis, reconstructing the data dependencies inside the callback function so as to maintain functional consistency before and after the loop-jump transformation.
Further, in the callback function-based control flow obfuscation method of the invention, in the jump transformation the program is first converted into an intermediate language; the basic blocks of the loop code in the program and their dependencies are identified through control flow and data flow analysis; the basic blocks of the loop code are stripped from the original program and moved into the callback function; and a call instruction to the callback function is added to the original program.
Further, in the callback function-based control flow obfuscation method of the invention, a corresponding callback boundary is set according to the loop termination condition in the original program logic, so that the program executes as expected.
Further, in the callback function-based control flow obfuscation method of the invention, for the basic blocks of the loop code in the program, the loop-related variables are pushed onto the stack, the parameters are passed into the callback function by means of a stack pointer, and the data dependencies are reconstructed inside the callback function so as to maintain functional consistency before and after the loop-jump transformation.
Further, in the callback function-based control flow obfuscation method of the invention, for the intermediate language of the program, the jump relations between basic blocks are identified from the branch instructions, and a control flow graph describing program execution is generated from those jump relations; the closed-loop structures in the program are found from the control flow graph, which yields the basic blocks of the loop code; and loop stripping is completed by moving the basic blocks of the loop code into the callback function.
Further, in the callback function-based control flow obfuscation method of the invention, the loop logic type is determined from the comparison instructions in LLVM and from the in-degree and out-degree of the loop entry point; the comparison instruction is located and converted to a Boolean value, and a return instruction added at the end of the callback function returns that Boolean value, which completes the setting of the callback boundary condition.
Further, in the callback function-based control flow obfuscation method of the invention, the loop logic types comprise a judgment-priority loop type and an execution-priority loop type; for the judgment-priority loop type, a jump-condition code block is added so that, when the callback function is triggered for the first time, the basic block is executed to perform the entry check, and the execution logic of the original program is kept unchanged through code redundancy.
Further, in the callback function-based control flow obfuscation method of the invention, data is passed through a stack pointer to reconstruct the data dependencies inside the callback function so that the dependencies remain unchanged, which specifically comprises: scanning the loop-related instructions, adding the scan results to a data set, de-duplicating the instructions in the data set, and pushing the instruction addresses onto the stack; passing the stack-top pointer into the callback function as a parameter and re-declaring the variables inside the callback function, so that the dependency formed by the original instruction and the data on the stack is equivalent to the dependency formed by the original instruction and the original data; and passing the remaining data in turn by means of stack-pointer offsets.
Further, in the callback function-based control flow obfuscation method of the invention, after the callback function has finished executing, the stack is popped in turn and stack balance is restored.
Further, the invention also provides a callback function-based control flow obfuscation system, comprising a loop transformation module and a loop reconstruction module, wherein
the loop transformation module is configured, for a loop structure in a program, to convert the loop jumps between basic blocks into repeated calls between functions by means of a callback function, so that the original execution logic of the program is hidden and the original control flow is unified and merged into a sequential structure;
and the loop reconstruction module is configured to reconstruct, through program analysis, the data dependencies inside the callback function so as to maintain functional consistency before and after the loop-jump transformation.
The invention has the beneficial effects that:
Starting from control flow and data flow analysis and from the working mechanism of callback functions, the invention replaces the repeated jumps within a traditional loop structure with repeated invocation of a callback function. The key algorithm logic of the program is hidden inside the callback function, where it is not easily detected by automated analysis techniques, which addresses the problem that traditional obfuscation methods struggle to resist automated analysis. Moreover, obfuscating loops with callback functions introduces only a small number of function calls and the necessary parameter-passing instructions into the original program, so the impact on program overhead is limited and resistance to analysis is greatly enhanced while execution efficiency is preserved. In tests on practical applications, program size and execution efficiency showed no obvious change, so the method has good applicability.
Description of the drawings:
FIG. 1 is a schematic flowchart of the control flow obfuscation method in an embodiment;
FIG. 2 is a schematic diagram of the equivalent execution flows of loops in an embodiment;
FIG. 3 is a schematic diagram of loop callback boundary setting in an embodiment;
FIG. 4 is a schematic diagram of the control flow of a traditional loop structure in an embodiment;
FIG. 5 is a schematic diagram of the control flow of a callback-type loop in an embodiment.
Detailed description:
In order to make the objects, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings and technical solutions.
Traditional obfuscation methods increase program complexity mainly by adding redundant code in order to resist reverse analysis. Their obfuscation strength depends on the degree of code redundancy, they cope poorly with increasingly efficient automated analysis techniques, and the redundant code increases code size and reduces runtime efficiency, which does not meet the development needs of the software industry. To this end, an embodiment of the invention, as shown in FIG. 1, provides a callback function-based control flow obfuscation method comprising the following steps:
S101: for a loop structure in a program, converting the loop jumps between basic blocks into repeated calls between functions by means of a callback function, thereby hiding the original execution logic of the program and unifying and merging the original control flow into a sequential structure;
S102: through program analysis, reconstructing the data dependencies inside the callback function so as to maintain functional consistency before and after the loop-jump transformation.
For a loop structure in a program, the loop logic is hidden by a callback function and the control flow graph of the program is broken up, which strengthens the program's resistance to reverse analysis. Shortcomings of traditional obfuscation algorithms such as low obfuscation strength and high runtime overhead are overcome: the repeated jumps within a traditional loop structure are replaced by repeated invocation of the callback function, the key algorithm logic is hidden inside the callback function where automated analysis techniques cannot easily detect it, and the resistance of the network program to analysis is enhanced.
Further, in the callback function-based control flow obfuscation method of the embodiment of the invention, in the jump transformation the program is first converted into an intermediate language; the basic blocks of the loop code in the program and their dependencies are identified through control flow and data flow analysis; the basic blocks of the loop code are stripped from the original program and moved into the callback function; and a call instruction to the callback function is added to the original program.
The program is first converted into an intermediate language. At this level, control flow and data flow analysis is performed on the program to identify the loop code blocks it contains and their dependencies. On this basis, the loop code blocks are stripped from the original program and moved into the callback function, and a call to the callback function is added. Using the response-and-execute mechanism of callback functions, the loop jumps between basic blocks are converted into repeated calls between functions, and program analysis techniques maintain functional consistency before and after the conversion. This achieves the goals of hiding the control flow logic, resisting reverse analysis and protecting the code.
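The transformation described above can be illustrated at source level with a minimal Python sketch. The patent operates on an intermediate language, so this is only an analogy; the function names and the `invoke_repeatedly` driver are hypothetical stand-ins for a callback-invoking library routine and are not taken from the patent.

```python
def sum_with_loop(n):
    """Original form: a do-while style loop whose body runs at least once."""
    total, i = 0, 0
    while True:
        total += i
        i += 1
        if not (i < n):
            break
    return total


def invoke_repeatedly(callback, state):
    """Hypothetical stand-in for a library routine that drives a callback.

    It keeps calling `callback` until the callback returns False.
    """
    while callback(state):
        pass


def loop_body_callback(state):
    """Loop body stripped out of the original function.

    The return value plays the role of the callback boundary:
    True means "call again", False means "stop".
    """
    state["total"] += state["i"]
    state["i"] += 1
    return state["i"] < state["n"]


def sum_with_callback(n):
    """Transformed form: the caller contains a call but no visible loop."""
    state = {"total": 0, "i": 0, "n": n}
    invoke_repeatedly(loop_body_callback, state)
    return state["total"]
```

In this sketch the caller `sum_with_callback` reads as straight-line code, while the iteration is carried by whichever routine drives the callback.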
Further, in the callback function-based control flow obfuscation method of the embodiment of the invention, a corresponding callback boundary is set according to the loop termination condition in the original program logic, so that the program executes as expected.
Further, in the callback function-based control flow obfuscation method of the embodiment of the invention, for the basic blocks of the loop code in the program, the loop-related variables are pushed onto the stack, the parameters are passed into the callback function by means of a stack pointer, and the data dependencies are reconstructed inside the callback function so as to maintain functional consistency before and after the loop-jump transformation.
To keep the data dependencies unchanged, the loop-related variables are pushed onto the stack, the parameters are passed into the callback function by means of a stack pointer, and the data dependencies are rebuilt inside the callback function, which maintains the functional consistency of the program.
Further, in the callback function-based control flow obfuscation method of the embodiment of the invention, for the intermediate language of the program, the jump relations between basic blocks are identified from the branch instructions, and a control flow graph describing program execution is generated from those jump relations; the closed-loop structures in the program are found from the control flow graph, which yields the basic blocks of the loop code; and loop stripping is completed by moving the basic blocks of the loop code into the callback function.
A callback function is used as an equivalent replacement for the loop structure, and the program must remain functionally consistent before and after the replacement. The callback function selected in the embodiment of the invention therefore has the following characteristics: 1) immediate response: once the caller executes, the callback condition is triggered immediately, and the entire loop functionality is completed before the next instruction is executed; 2) repeatable invocation: the number of iterations is not known before the program runs, so the caller keeps invoking the callback function as long as the exit condition has not been triggered; 3) a definite exit condition: corresponding to the loop boundary condition, an exit mechanism is provided that ends the callback process so that the rest of the program can continue to execute. At the intermediate-language level, the program marks the jump relations between basic blocks with branch instructions, and a control flow graph describing program execution can be generated from those jump relations, from which the closed-loop structures in the program, that is, the corresponding loop code blocks, are found. On this basis, the loop code blocks are moved into the callback function, which completes the loop stripping.
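The loop-mining step can be sketched as follows. The patent does not say which loop-detection algorithm is used; this sketch assumes the standard dominator-based approach (a back edge is an edge whose target dominates its source), and the block names and the `branches` input format are hypothetical.

```python
def build_cfg(branches):
    """Build successor and predecessor maps from {block: [branch targets]}."""
    succ = {block: list(targets) for block, targets in branches.items()}
    pred = {block: [] for block in branches}
    for block, targets in succ.items():
        for target in targets:
            pred[target].append(block)
    return succ, pred


def dominators(succ, pred, entry):
    """Iteratively compute dom[b]: the blocks on every path from entry to b."""
    blocks = set(succ)
    dom = {block: set(blocks) for block in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for block in blocks - {entry}:
            pred_doms = [dom[p] for p in pred[block]]
            common = set.intersection(*pred_doms) if pred_doms else set()
            new = {block} | common
            if new != dom[block]:
                dom[block] = new
                changed = True
    return dom


def natural_loops(succ, pred, entry):
    """Return {header: loop blocks} for every back edge tail -> header."""
    dom = dominators(succ, pred, entry)
    loops = {}
    for tail in succ:
        for header in succ[tail]:
            if header in dom[tail]:  # back edge: the header dominates the tail
                body, work = {header}, [tail]
                while work:
                    block = work.pop()
                    if block not in body:
                        body.add(block)
                        work.extend(pred[block])
                loops.setdefault(header, set()).update(body)
    return loops


# A while-style loop: entry -> cond; cond -> body or exit; body -> cond.
branches = {"entry": ["cond"], "cond": ["body", "exit"],
            "body": ["cond"], "exit": []}
succ, pred = build_cfg(branches)
loop_blocks = natural_loops(succ, pred, "entry")
```

The blocks collected for each header are the candidates that would be stripped from the original function and moved into the callback function.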
Further, in the callback function-based control flow obfuscation method of the embodiment of the invention, the loop logic type is determined from the comparison instructions in LLVM and from the in-degree and out-degree of the loop entry point; the comparison instruction is located and converted to a Boolean value, and a return instruction added at the end of the callback function returns that Boolean value, which completes the setting of the callback boundary condition. Further, the loop logic types comprise a judgment-priority loop type and an execution-priority loop type; for the judgment-priority loop type, a jump-condition code block is added so that, when the callback function is triggered for the first time, the basic block is executed to perform the entry check, and the execution logic of the original program is kept unchanged through code redundancy.
After the code has been stripped, the corresponding callback boundary is set according to the original program logic. In terms of C-language code style, a program generally contains two kinds of loop logic: execution-priority loops (do-while) and judgment-priority loops (while-do, for). The two kinds can be distinguished effectively by the in-degree and out-degree of the loop entry point: an execution-priority loop has an out-degree of 1 and a judgment-priority loop has an out-degree of 2, while both have an in-degree greater than or equal to 2. Referring to the equivalent execution flows of the two kinds of loop shown in FIG. 2, because judgment and execution take priority in a different order, the position of the comparison instruction that identifies the jump condition also differs: in a judgment-priority loop the jump condition is located at the loop entry point, whereas in an execution-priority loop it is located at the loop exit point, that is, in the last code block. LLVM is a framework for building compilers, written in C++, that is used to optimize the compile-time, link-time, run-time and idle-time of programs written in arbitrary programming languages. Referring to FIG. 3, LLVM performs the judgment through the comparison instruction CmpInst, which appears in the loop structure as the jump condition. The callback function uses whether its return value is 0 as the callback boundary condition. Therefore, after the CmpInst has been located, its result is converted to a Boolean value and a RetInst is added at the end of the callback function to return that value, which completes the setting of the boundary condition.
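The degree-based classification can be sketched as below. This is an illustrative reading of the rule stated above, not the patent's implementation; it assumes that in an execution-priority loop the entry block and the block holding the condition are separate basic blocks, and the CFG dictionaries and block names are hypothetical.

```python
def predecessors(succ):
    """Derive the predecessor map from a successor map {block: [targets]}."""
    pred = {block: [] for block in succ}
    for block, targets in succ.items():
        for target in targets:
            pred[target].append(block)
    return pred


def classify_loop(succ, entry_point):
    """Classify a loop by the in-degree and out-degree of its entry point."""
    pred = predecessors(succ)
    in_degree = len(pred[entry_point])
    out_degree = len(succ[entry_point])
    if in_degree < 2:
        raise ValueError("a loop entry point needs in-degree >= 2")
    if out_degree == 1:
        return "execution-priority"   # do-while: the body runs before the check
    if out_degree == 2:
        return "judgment-priority"    # while / for: the check runs first
    raise ValueError("unexpected out-degree for a loop entry point")


# while (cond) { body }: the entry point is the condition block.
WHILE_CFG = {"pre": ["cond"], "cond": ["body", "exit"],
             "body": ["cond"], "exit": []}

# do { body } while (cond): the entry point is the body, the check comes last.
DO_WHILE_CFG = {"pre": ["body"], "body": ["latch"],
                "latch": ["body", "exit"], "exit": []}
```

Calling `classify_loop(WHILE_CFG, "cond")` and `classify_loop(DO_WHILE_CFG, "body")` distinguishes the two shapes; in both cases the entry point is reached from outside the loop and from the back edge, which gives the in-degree of at least 2.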
In practice, the two kinds of loop are transformed equivalently in different ways. An execution-priority loop enters the loop structure immediately when the program runs, which matches function-call logic. A judgment-priority loop first checks the loop boundary and only then decides whether to enter the loop structure. For the latter, therefore, an additional jump-condition code block is added: when the callback function is triggered for the first time, this basic block is executed to perform the entry check, and the execution logic of the original program is kept unchanged through code redundancy.
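One possible reading of the extra entry check for judgment-priority loops is sketched below in Python. The `first_call` flag, the state dictionary and the `invoke_repeatedly` driver are hypothetical names introduced for illustration; the patent only states that a redundant condition block is executed on the first trigger of the callback.

```python
def count_down_loop(n):
    """Original judgment-priority loop: the condition is checked before the body."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


def invoke_repeatedly(callback, state):
    """Hypothetical driver: always calls the callback at least once."""
    while callback(state):
        pass


def while_callback(state):
    """Callback holding the stripped loop body plus a duplicated entry check."""
    if state["first_call"]:
        state["first_call"] = False
        # Redundant copy of the loop condition, executed only on the first
        # trigger so that a loop with zero iterations is still handled.
        if not (state["n"] > 0):
            return False
    state["n"] -= 1
    state["steps"] += 1
    return state["n"] > 0  # callback boundary: False ends the callbacks


def count_down_callback(n):
    """Transformed form of count_down_loop."""
    state = {"n": n, "steps": 0, "first_call": True}
    invoke_repeatedly(while_callback, state)
    return state["steps"]
```

Without the duplicated check, the driver's call-at-least-once behaviour would execute the body once even when the original loop would have been skipped.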
However, stripping the code breaks the data flow dependencies of the original program, so reconstructing the data dependencies is a problem that requires particular attention in engineering practice. By analyzing the data flow in the program with the relevant algorithms, the dependencies of a target instruction can be identified effectively. To ensure that the dependencies are the same before and after the model is applied, data is passed through a stack pointer to reconstruct the data dependencies inside the callback function, which specifically comprises: scanning the loop-related instructions, adding the scan results to a data set, de-duplicating the instructions in the data set, and pushing the instruction addresses onto the stack; passing the stack-top pointer into the callback function as a parameter and re-declaring the variables inside the callback function, so that the dependency formed by the original instruction and the data on the stack is equivalent to the dependency formed by the original instruction and the original data; and passing the remaining data in turn by means of stack-pointer offsets. Further, after the callback function has finished executing, the stack is popped in turn and stack balance is restored. The specific implementation algorithm can be designed as follows:
[Figure BDA0002485006820000051: algorithm listing for data dependency reconstruction; image not reproduced]
First, the loop-related instructions are scanned, the data dependency discovery algorithm is applied to them one by one, and the results are added to a data set. After the scan is complete, the instructions in the data set are de-duplicated and the instruction addresses are pushed onto the stack. Next, the stack-top pointer is passed into the callback function as a parameter and a variable (ESP) is re-declared inside the callback function; the dependency formed by the original instruction and the data on the stack is then equivalent to the dependency formed with the target data. The other dependencies are handled in the same way through stack-pointer offsets. Finally, after the callback function has finished executing, the stack is popped in turn and stack balance is restored.
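The sequence of steps above can be simulated in Python as a rough sketch. A Python list stands in for the machine stack and an integer index for the stack-top pointer; the instruction tuples, the `memory` cells and the `addresses` symbol table are all hypothetical, since the patent's own listing is only available as an image.

```python
def collect_loop_variables(loop_instructions):
    """Scan (opcode, operand, ...) tuples; return operands de-duplicated in order."""
    seen, ordered = set(), []
    for _opcode, *operands in loop_instructions:
        for name in operands:
            if name not in seen:
                seen.add(name)
                ordered.append(name)
    return ordered


def callback(memory, stack, sp):
    """Stripped loop body; variables are re-declared via offsets from sp."""
    n_addr = stack[sp]          # pushed last, so it sits at the stack top
    i_addr = stack[sp - 1]      # one slot below the top
    total_addr = stack[sp - 2]  # two slots below the top
    memory[total_addr] += memory[i_addr]
    memory[i_addr] += 1
    return memory[i_addr] < memory[n_addr]


def run_obfuscated(memory, addresses, loop_instructions):
    """Push dependencies, drive the callback, then pop to restore balance."""
    stack = []
    names = collect_loop_variables(loop_instructions)
    for name in names:
        stack.append(addresses[name])  # push the address of each variable
    sp = len(stack) - 1                # stack-top pointer passed as parameter
    while callback(memory, stack, sp): # stand-in for a callback-driving routine
        pass
    for _ in names:
        stack.pop()                    # pop in turn to restore stack balance
    return stack


memory = [0, 0, 5]  # cells holding total, i and n
addresses = {"total": 0, "i": 1, "n": 2}
loop_instructions = [("add", "total", "i"), ("inc", "i"), ("cmp", "i", "n")]
remaining_stack = run_obfuscated(memory, addresses, loop_instructions)
```

Because the callback reaches every variable through an address found on the stack, its writes land in the caller's original storage, which is what keeps the dependencies equivalent in this sketch.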
At the static analysis level, the control flow graph of the original program is broken up and the loop boundary condition is replaced by the callback boundary condition. At the dynamic analysis level, the Windows operating system provides a large number of callback library functions. With a loop model built on these functions, the loop functionality is hidden inside system calls, the executor changes from the original program to system functions, the short jumps between basic blocks are replaced by the long jumps of function calls, and the program's execution addresses are scattered across the memory space, which better resists dynamic analysis techniques. In addition, because the embodiment of the invention obfuscates loops with callback functions, only a small number of function calls and the necessary parameter-passing instructions are introduced into the original program, so the impact on program overhead is limited and resistance to analysis is greatly enhanced while execution efficiency is preserved.
Once the data dependencies have been restored, the embodiment of the invention can effectively obfuscate the code and protect the algorithm logic while guaranteeing functional consistency. The control flow graphs before and after the method is applied are shown in FIG. 4 (traditional loop structure) and FIG. 5 (callback-type loop). After the traditional loop structure is converted into a callback-type loop, the originally complex control flow of the program is unified and merged into a sequential structure, the original execution logic is completely hidden, and the repeated calls between functions implicitly carry out the algorithm functionality of the network program. Building on traditional obfuscation algorithms and on the idea of deepening the control flow, the method uses the execution characteristics of callback functions to hide the loop logic inside function calls, which strengthens resistance to reverse engineering.
Further, based on the above method, an embodiment of the invention also provides a callback function-based control flow obfuscation system, comprising a loop transformation module and a loop reconstruction module, wherein
the loop transformation module is configured, for a loop structure in a program, to convert the loop jumps between basic blocks into repeated calls between functions by means of a callback function, so that the original execution logic of the program is hidden and the original control flow is unified and merged into a sequential structure;
and the loop reconstruction module is configured to reconstruct, through program analysis, the data dependencies inside the callback function so as to maintain functional consistency before and after the loop-jump transformation.
Unless specifically stated otherwise, the relative arrangement of the components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the invention.
Based on the foregoing system, an embodiment of the present invention further provides a server, including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the above-described systems or methods.
Based on the above system, the embodiment of the present invention further provides a computer readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the above system or method.
The device provided by the embodiment of the present invention has the same implementation principle and technical effect as the system embodiment, and for the sake of brief description, reference may be made to the corresponding content in the system embodiment for the part where the device embodiment is not mentioned.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing system embodiments, and are not described herein again.
In all examples shown and described herein, any particular value should be construed as merely exemplary, and not as a limitation, and thus other examples of example embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in an actual implementation; as a further example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some communication interfaces, and may be electrical, mechanical, or in another form.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile, processor-executable, computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
Finally, it should be noted that the above-mentioned embodiments are only specific embodiments of the present invention. They are used to illustrate the technical solutions of the present invention, not to limit them, and the protection scope of the present invention is not limited to them. Although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the technical scope of the present disclosure, any person skilled in the art can still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or substitute equivalents for some of their technical features. Such modifications, changes, or substitutions do not depart from the spirit and scope of the embodiments of the present invention, and they should be construed as being included therein. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A control flow obfuscation method based on a callback function, characterized by comprising:
for a loop structure in a program, converting the loop jumps of basic blocks into repeated calls between functions through a callback function, thereby hiding the original execution logic of the program and uniformly merging the original control flow into a sequential structure;
through program analysis, reconstructing data dependencies in the callback function so as to maintain functional consistency before and after the loop jump transformation.
2. The callback function-based control flow obfuscation method of claim 1, wherein in the jump transformation, the program is first converted into an intermediate language; through control flow and data flow analysis, the loop code basic blocks existing in the program and their corresponding dependencies are identified; the loop code basic blocks are stripped from the original program and transferred to the callback function; and a call instruction for the callback function is added to the original program.
3. The callback function-based control flow obfuscation method of claim 1 or 2, wherein the respective callback boundaries are set according to the corresponding loop-ending conditions in the original program logic, so that the program executes as expected.
4. The callback function-based control flow obfuscation method of claim 1, wherein, for the loop code basic blocks in the program, loop-dependent variables are pushed onto the stack and passed into the callback function by using a stack pointer as a parameter, and data dependencies are reconstructed in the callback function to maintain functional consistency before and after the loop jump transformation.
5. The callback function-based control flow obfuscation method of claim 2, wherein, for the intermediate language of the program, the jump relationships between basic blocks are identified from branch instructions, and a control flow graph describing the execution of the program is generated from the jump relationships; closed-loop structures existing in the program are found from the control flow graph so as to obtain the loop code basic blocks; and loop stripping is completed by moving the loop code basic blocks into the callback function.
6. The callback function-based control flow obfuscation method of claim 3, wherein the loop logic type is distinguished by means of the comparison instructions in LLVM and according to the in-degree and out-degree of the loop entry point; and the comparison instruction is located and converted to a Boolean value, and the Boolean value is returned by adding a return instruction at the end of the callback function, so as to complete the setting of the callback boundary condition.
7. The callback function-based control flow obfuscation method of claim 6, wherein the loop logic types comprise a judgment-first loop type and an execution-first loop type; and for the judgment-first loop type, a jump condition code block is added so that, when the callback function is triggered for the first time, the basic block that performs the entry judgment is executed, and the execution logic of the original program is kept unchanged through code redundancy.
8. The callback function-based control flow obfuscation method of claim 1, wherein performing data transfer through a stack pointer to reconstruct data dependencies in the callback function, so that the dependency relationships remain unchanged, specifically comprises: scanning loop-related instructions, adding the scan results to a data set, de-duplicating the instructions in the data set, and pushing the instruction addresses onto the stack; passing the stack top pointer into the callback function as a parameter, and re-declaring the variables in the callback function, so that the dependency relationship formed by the original instructions and the data in the stack is equivalent to the dependency relationship formed by the original instructions and the original data; and carrying out data transfer sequentially by using stack pointer offsets.
9. The callback function-based control flow obfuscation method of claim 8, wherein after the callback function has executed, the stack is popped in sequence to restore stack balance.
10. A callback function-based control flow obfuscation system, comprising a loop transformation module and a loop reconstruction module, wherein
the loop transformation module is used for converting, for a loop structure in a program, the loop jumps of basic blocks into repeated calls between functions through a callback function, thereby hiding the original execution logic of the program and uniformly merging the original control flow of the program into a sequential structure;
and the loop reconstruction module is used for reconstructing data dependencies in the callback function through program analysis so as to maintain functional consistency before and after the loop jump transformation.
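
For illustration only, and not as part of the claims, the transformation recited in claims 1, 3, and 6 to 9 can be sketched in Python as follows. The claimed method operates on a compiler intermediate language (claim 6 names LLVM); Python is used here only as a readable stand-in. All names (`invoke_until_false`, `loop_callback`, `obfuscated_sum`), the slot layout, and the use of a list as the stack are assumptions made for this sketch. In particular, the driver that re-invokes the callback is hypothetical: the claims do not state what mechanism issues the repeated calls.

```python
from typing import Callable, List

Frame = List[int]  # a plain list standing in for the machine stack


def original_sum(n: int) -> int:
    """Reference program: a judgment-first (test-before-body) loop."""
    total, i = 0, 0
    while i < n:
        total += i
        i += 1
    return total


def invoke_until_false(callback: Callable[[Frame, int], bool],
                       stack: Frame, sp: int) -> None:
    """Hypothetical driver: re-invokes the callback while it returns True."""
    while callback(stack, sp):
        pass


def loop_callback(stack: Frame, sp: int) -> bool:
    """Loop body stripped out of the original function.

    Loop-dependent data is reached only through offsets from `sp`
    (the layout is an assumption of this sketch):
      sp+0: n   sp+1: i   sp+2: total   sp+3: first-call flag
    """
    # Extra condition block for the judgment-first loop type: on the
    # first call only, evaluate the loop condition before the body.
    if stack[sp + 3]:
        stack[sp + 3] = 0
        if not (stack[sp + 1] < stack[sp + 0]):
            return False
    # Original loop body, operating on the stack slots.
    stack[sp + 2] += stack[sp + 1]
    stack[sp + 1] += 1
    # The loop comparison becomes the Boolean returned at the tail:
    # True requests another call, False marks the callback boundary.
    return stack[sp + 1] < stack[sp + 0]


def obfuscated_sum(n: int) -> int:
    """Original function after stripping: push, call, pop, in sequence."""
    stack: Frame = []
    sp = len(stack)               # index that plays the stack-pointer role
    stack.extend([n, 0, 0, 1])    # push n, i, total, first-call flag
    invoke_until_false(loop_callback, stack, sp)
    stack.pop()                   # first-call flag
    total = stack.pop()           # total, read back from the stack
    stack.pop()                   # i
    stack.pop()                   # n; the stack is balanced again
    return total
```

In this sketch `obfuscated_sum` contains no loop of its own, only a straight-line sequence of pushes, one call, and pops. An execution-first loop would need no first-call check, because its body always runs once before the condition is evaluated.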
CN202010388650.3A 2020-05-09 2020-05-09 Callback function-based control flow obfuscation method and system Active CN111723345B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010388650.3A CN111723345B (en) 2020-05-09 2020-05-09 Callback function-based control flow obfuscation method and system

Publications (2)

Publication Number Publication Date
CN111723345A (en) 2020-09-29
CN111723345B (en) 2022-11-22

Family

ID=72565056

Country Status (1)

Country Link
CN (1) CN111723345B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070234070A1 (en) * 1999-07-29 2007-10-04 Intertrust Technologies Corp. Software self-defense systems and methods
US20100199354A1 (en) * 2006-12-21 2010-08-05 Johan Eker Obfuscating Computer Program Code
CN106682460A (en) * 2016-11-25 2017-05-17 西北大学 Code obfuscation method based on two transformations
CN109784010A (en) * 2018-12-18 2019-05-21 武汉极意网络科技有限公司 A kind of program control flow based on LLVM obscures method and device
CN110309629A (en) * 2019-06-18 2019-10-08 阿里巴巴集团控股有限公司 A kind of web page code reinforcement means, device and equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
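Shang Tao et al.: "Research on software anti-disassembly techniques", Application Research of Computers (《计算机应用研究》) *
Li Lulu et al.: "A survey of code obfuscation techniques", Software (《软件》) *
Yang Qiuxiang et al.: "Research on split-storage techniques in code obfuscation", Computer Engineering and Design (《计算机工程与设计》) *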

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112527307A (en) * 2020-11-18 2021-03-19 西安电子科技大学 Program control flow hiding method, system and application
CN112527307B (en) * 2020-11-18 2023-06-20 西安电子科技大学 Program control flow hiding method, system and application
CN113158147A (en) * 2021-03-24 2021-07-23 中国人民解放军战略支援部队信息工程大学 Code obfuscation method based on parent fusion
CN113158147B (en) * 2021-03-24 2022-12-09 中国人民解放军战略支援部队信息工程大学 Code obfuscation method based on parent fusion
CN114357389A (en) * 2021-12-31 2022-04-15 北京大学 Instruction flower adding confusion method and device based on LLVM
CN114357389B (en) * 2021-12-31 2024-04-16 北京大学 LLVM (logical Low level virtual machine) -based instruction flower adding confusion method and device
CN117407876A (en) * 2023-12-11 2024-01-16 常熟理工学院 Opaque predicate detection method, system and storage medium in malicious software

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant