CN112115427A - Code obfuscation method, device, electronic device and storage medium - Google Patents

Info

Publication number
CN112115427A
Authority
CN
China
Prior art keywords
code block
code
address
instruction
basic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010819524.9A
Other languages
Chinese (zh)
Other versions
CN112115427B (en)
Inventor
兰丽
蒲志明
夏冰
于大鹏
高迪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and MIGU Culture Technology Co Ltd
Priority to CN202010819524.9A
Publication of CN112115427A
Application granted
Publication of CN112115427B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12 Protecting executable software
    • G06F21/14 Protecting executable software against software analysis or reverse engineering, e.g. by obfuscation

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Technology Law (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

Embodiments of the invention provide a code obfuscation method, a code obfuscation device, an electronic device and a storage medium. The method comprises: determining basic code blocks in a function according to the control flow of the function in a target program; dividing the basic code blocks to obtain sub-code blocks; and converting the target address of an unconditional jump instruction in a code block into an address determined when the target program runs, wherein the code blocks include the sub-code blocks, or include the sub-code blocks and any basic code blocks that were not divided. Because the target address of the unconditional jump instruction is determined only at run time, the direct jump relation between the code block containing the unconditional jump instruction and the code block it jumps to is cut off, which increases the difficulty of reverse analysis.

Description

Code obfuscation method, device, electronic device and storage medium
Technical Field
The present invention relates to the field of computer security technologies, and in particular, to a code obfuscation method and apparatus, an electronic device, and a storage medium.
Background
With the development of information science and technology, software systems bring convenience to users but also face serious security threats. Through reverse-engineering techniques such as decompilation and dynamic debugging, an attacker can easily obtain users' private information, core algorithms, key business processes and even the source code of the software. This causes enterprises heavy losses in the protection of their software intellectual property.
To effectively resist reverse analysis of software by attackers, software developers have proposed protection technologies such as software encryption, code obfuscation, software watermarking and tamper resistance. Code obfuscation is one of the key technologies for ensuring software security: it converts the source code and internal structural logic of a program into a form that is harder to analyze and modify without changing the semantics of the original program, which greatly increases an attacker's reverse-analysis cost.
Control flow obfuscation is a relatively mature and important code obfuscation technique. It hides the true execution logic of a program by changing or complicating the control flow of the original program, making it harder for a cracker to analyze and reconstruct the control flow, and thereby protects the source code. Currently studied implementations of control flow obfuscation include:
(1) Opaque predicates
Opaque predicates, whose values (always true, always false, or sometimes true and sometimes false) are difficult to deduce from the expression itself, are added to the basic blocks in the control flow graph. This obscures the real execution flow of the basic blocks and thereby complicates the control flow.
(2) Control flow flattening
By destroying the easily recognized conditional and loop structures in the function's control flow graph, the easily read code flow is recombined into a hard-to-understand execution flow in switch-case form.
(3) Inserting false control flows
Redundant control flows are inserted into the original control flow by means of opaque predicates, which increases the complexity of the original control flow and makes it harder for an attacker to reconstruct it.
Programs obfuscated with the above control flow obfuscation methods have obvious characteristics, so an attacker can easily detect them and, using conventional reverse-engineering techniques, defeat the obfuscation and restore the code's control flow. For example, the commonly used opaque predicates are a limited set of relatively complex mathematical expressions, which can be collected and catalogued and then filtered out directly during reverse analysis; and a program whose control flow has been flattened has an obvious switch-case structure, so the execution order of the code blocks can be recorded during dynamic debugging and the control flow rebuilt.
Disclosure of Invention
Embodiments of the invention provide a code obfuscation method, a code obfuscation device, an electronic device and a storage medium, to overcome the defect of prior-art code obfuscation methods that the obfuscated program has obvious characteristics and is easily detected and cracked by an attacker.
An embodiment of a first aspect of the present invention provides a code obfuscating method, including:
determining a basic code block in a function according to the control flow of the function in a target program;
dividing the basic code block to obtain a sub code block;
converting a target address of an unconditional jump instruction in a code block into an address determined when the target program runs; wherein the code block includes a sub-code block, or includes a sub-code block and a basic code block that is not divided.
In the above technical solution, after the step of dividing the basic code block to obtain the sub-code blocks, the method further includes:
shuffling the arrangement order of the code blocks of each function in the target program within that function.
In the above technical solution, the converting a target address of an unconditional jump instruction in a code block to an address determined when the target program runs specifically includes:
inserting an address calculation code block between a first code block having an unconditional jump instruction and a second code block to be jumped according to the unconditional jump instruction, and changing a jump target of the unconditional jump instruction into the address calculation code block; wherein the address calculation code block is to dynamically calculate an address of the second code block at runtime.
In the above technical solution, the inserting an address calculation code block between a first code block having an unconditional jump instruction and a second code block to be jumped according to the unconditional jump instruction specifically includes:
inserting an address calculation code block between the first code block and the second code block;
calculating an address offset between the address calculation code block and the second code block according to the address information of the address calculation code block and the address information of the second code block;
modifying the address to be jumped to by the unconditional jump instruction within the address calculation code block into an address calculation formula, the address calculation formula comprising: the address information of the address calculation code block itself, and the address offset between the address calculation code block and the second code block.
In the above technical solution, the method further includes:
replacing a direct call instruction to a system function in the target program with an indirect call instruction to the system function.
In the above technical solution, the replacing the direct call instruction to the system function in the target program with the indirect call instruction to the system function specifically includes:
generating an indirect call instruction according to the real address of the system function and the dynamic link library, wherein the indirect call instruction obtains the real address of the system function to be executed by resolving the function address in the dynamic link library;
and replacing the direct call instruction to the system function in the target program with the indirect call instruction to the system function.
In the foregoing technical solution, the dividing the basic code block according to a second preset rule to obtain a sub-code block specifically includes:
judging whether the basic code block meets the second preset rule, and when the basic code block meets the second preset rule, segmenting the basic code block to obtain a first segmentation result;
judging whether a jump instruction exists at the tail of the first segmentation result, and when no jump instruction exists, adding a jump instruction at the tail of the first segmentation result to obtain a sub-code block; wherein the added jump instruction jumps to the instruction that, in the basic code block, immediately follows the last instruction of the first segmentation result.
An embodiment of a second aspect of the present invention provides a code obfuscating apparatus, including:
a basic code block determining module, configured to determine a basic code block in a function according to the control flow of the function in the target program;
a sub-code block generation module, configured to divide the basic code block to obtain sub-code blocks;
the instruction conversion module is used for converting a target address of the unconditional jump instruction in the code block into an address determined when the target program runs; wherein the code block includes a sub-code block, or includes a sub-code block and a basic code block that is not divided.
An embodiment of a third aspect of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, it implements the steps of the code obfuscation method according to the embodiment of the first aspect of the present invention.
An embodiment of a fourth aspect of the present invention provides a non-transitory computer readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the code obfuscation method according to the embodiment of the first aspect of the present invention.
According to the code obfuscation method, the code obfuscation device, the electronic device and the storage medium, the target address of the unconditional jump instruction in the code block is converted into the address determined when the target program runs, so that the direct jump relation between the code block with the unconditional jump instruction and the code block to be jumped is cut off, and the difficulty of reverse analysis is increased.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a code obfuscation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of inserting an address calculation code block between a code block having an unconditional jump instruction and a code block to be jumped;
FIG. 3 is a control flow graph of a function, viewed with the reverse-engineering tool IDA, before the function is obfuscated;
FIG. 4 is a control flow graph of the function of FIG. 3, viewed with the reverse-engineering tool IDA, after the function is obfuscated;
FIG. 5 is a diagram of a code obfuscation device according to an embodiment of the present invention;
FIG. 6 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a code obfuscation method according to an embodiment of the present invention, and as shown in fig. 1, the code obfuscation method according to the embodiment of the present invention includes:
Step 101, determining basic code blocks in a function according to the control flow of the function in the target program.
In the embodiment of the present invention, the target program is the program to be obfuscated using the code obfuscation method provided by the embodiment of the present invention. Specifically, the functions in the target program are the objects of the obfuscation operation; that is, the code in the functions contained in the target program is what needs to be obfuscated.
As is known to those skilled in the art, a function in the computer field is a fixed program segment that independently performs a specific task. The basic unit of a function is the instruction; executing an instruction performs a specific operation at a particular step. A basic code block lies between the function and the instruction in granularity: a basic code block includes a plurality of instructions, and a function may include one or more basic code blocks.
The execution logic inside the target program is called the control flow. The control flow reflects the call relations between functions and the execution flow of the instructions within each function. As an alternative implementation, the control flow of a function of the target program may be represented as a control flow graph.
In other embodiments of the present invention, the control flow of a function of the target program may also be obtained by analyzing the target program with the tools and libraries provided by LLVM. LLVM compiles the source code of the target program into an intermediate code file, which is then analyzed and whose functions are traversed to obtain the control flow of each function.
As an optional implementation, the control flow of the function can be analyzed using the tools and libraries provided by LLVM, and all switch-case structures in the function replaced with if-else structures, to obtain the basic code blocks of the function's original control flow. Because a switch-case structure is easily exploited by attackers for reverse analysis of the target program, all switch-case structures in the function need to be replaced with if-else structures.
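As an illustration of the switch-case replacement described above, the following minimal Python sketch lowers a switch into a chain of compare-and-branch steps, printed in an LLVM-IR-like textual form. It is not the patent's implementation: the function name lower_switch, the label names check1, case1 and so on, and the textual output format are assumptions made for this example only.

```python
from typing import Dict, List


def lower_switch(var: str, cases: Dict[int, str], default: str) -> List[str]:
    """Rewrite `switch var { value -> label }` as an if-else chain (hypothetical helper)."""
    if not cases:
        # No cases: control simply goes to the default label.
        return [f"  br label %{default}"]
    lines: List[str] = []
    items = list(cases.items())
    for k, (value, label) in enumerate(items):
        is_last = k == len(items) - 1
        # When the comparison fails, fall to the next check, or to default at the end.
        else_label = default if is_last else f"check{k + 1}"
        if k > 0:
            lines.append(f"check{k}:")
        lines.append(f"  %cmp{k} = icmp eq i32 {var}, {value}")
        lines.append(f"  br i1 %cmp{k}, label %{label}, label %{else_label}")
    return lines


LOWERED = lower_switch("%x", {1: "case1", 2: "case2"}, "default")
print("\n".join(LOWERED))
```

Each case becomes one comparison followed by a two-way conditional branch, so no switch instruction remains in the output.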
The function is then further split according to the control flow of the target program to obtain basic code blocks. In this further splitting, the basic code blocks can be cut from the function according to the control flow of the function in the target program and a first preset rule.
The first preset rule determines the start instruction of a basic code block. An instruction is a start instruction if it is any one of the following three items:
a) an entry instruction of a function;
b) a target instruction of a jump instruction;
c) the instruction immediately following a jump instruction, where that instruction is not itself the target of a jump instruction.
After the start instructions are determined, all instructions from one start instruction up to, but not including, the next start instruction in the same function form one basic code block.
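The first preset rule can be sketched in a few lines of Python. This is a simplified model rather than the patent's LLVM-based implementation: the Instr class, the jump_target field (an index into the instruction list) and the sample program are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instr:
    text: str                          # human-readable instruction
    jump_target: Optional[int] = None  # index of the target instruction, if this is a jump


def find_start_instructions(instrs: List[Instr]) -> List[int]:
    """Return the indices of the start instructions under rules a), b) and c)."""
    starts = set()
    if instrs:
        starts.add(0)                      # a) entry instruction of the function
    for i, ins in enumerate(instrs):
        if ins.jump_target is not None:
            starts.add(ins.jump_target)    # b) target of a jump instruction
            if i + 1 < len(instrs):
                starts.add(i + 1)          # c) instruction following a jump instruction
    return sorted(starts)


def split_basic_blocks(instrs: List[Instr]) -> List[List[Instr]]:
    """Group instructions from each start instruction up to (not including) the next."""
    starts = find_start_instructions(instrs)
    bounds = starts + [len(instrs)]
    return [instrs[bounds[k]:bounds[k + 1]] for k in range(len(starts))]


# Hypothetical function body: a small counting loop.
PROGRAM = [
    Instr("x = 0"),                               # 0
    Instr("if x >= 3 goto 5", jump_target=5),     # 1
    Instr("x = x + 1"),                           # 2
    Instr("print x"),                             # 3
    Instr("goto 1", jump_target=1),               # 4
    Instr("return"),                              # 5
]

for block in split_basic_blocks(PROGRAM):
    print([ins.text for ins in block])
```

Using a set means an instruction that satisfies both rule b) and rule c) is recorded only once, so the three rules can be applied independently.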
It should be noted that, as is readily understood by those skilled in the art, if a switch-case structure is included in a function, after the switch-case structure is replaced, the switch-case structure will no longer be present in the basic code block obtained by further splitting the function.
Step 102, dividing the basic code blocks to obtain sub-code blocks.
To further increase the difficulty of reverse analysis, in the embodiment of the present invention the basic code blocks may be further divided to obtain sub-code blocks.
A sub-code block is a part of a basic code block and includes a plurality of instructions. Before a basic code block is divided, it is judged whether the basic code block meets a second preset rule. The second preset rule describes the condition under which a basic code block is divided. For example, one implementation of the second preset rule is a preset number of instructions, such as 5. In that case, judging whether a basic code block meets the second preset rule means judging whether the number of instructions in the basic code block is greater than or equal to 5: if it is, the basic code block meets the second preset rule and is divided; if the number of instructions is less than 5, the basic code block is not divided. The content of the second preset rule is not limited to a preset number of instructions and can be determined according to actual needs.
The second preset rule may be used not only to judge whether a basic code block can be divided, but also to guide the division. A basic code block may be divided according to the second preset rule (e.g., a preset number of instructions) to obtain sub-code blocks. For example, if it is preset that every 5 instructions constitute one sub-code block, then each run of 5 consecutive instructions in the basic code block is split off as one sub-code block. If fewer than 5 instructions remain after the division, the remaining instructions form a sub-code block of their own.
It should be noted that basic code blocks are generated using jump instructions as boundaries, so the logical relationship between the resulting basic code blocks is not interrupted. When sub-code blocks are generated, however, most sub-code blocks (all except a few, such as the one at the tail of the basic code block) do not end with an instruction, such as a jump instruction, that explicitly relates them to other sub-code blocks. Therefore, when a basic code block is divided into sub-code blocks, a jump instruction must be added at the tail of each such sub-code block to explicitly identify the next sub-code block to be executed after the current one finishes.
For example, the code of a basic code block in a certain function is as follows:
label%3:
%4 = alloca i8*, align 8
%5 = alloca i32*, align 8
%6 = alloca i32*, align 4
%7 = alloca i32*, align 4
%8 = alloca i32*, align 4
store i8* %0, i8** %4, align 8
store i32* %1, i32** %5, align 8
store i32* %2, i32** %6, align 4
store i32* %0, i32** %7, align 4
store i32* %0, i32** %8, align 4
br label %9
The basic code block is segmented according to a certain rule, and the following sub-code blocks can be obtained:
Sub-code block 1
label%3:
%4 = alloca i8*, align 8
%5 = alloca i32*, align 8
%6 = alloca i32*, align 4
br label %.split
Sub-code block 2
label.split:
%7 = alloca i32*, align 4
%8 = alloca i32*, align 4
br label %.split.split
Sub-code block 3
label.split.split:
store i8* %0, i8** %4, align 8
br label %.split.split.split
Sub-code block 4
label.split.split.split:
store i32* %1, i32** %5, align 8
store i32* %2, i32** %6, align 4
store i32* %0, i32** %7, align 4
store i32* %0, i32** %8, align 4
br label %9
As can be seen from the above example, the jump instructions are newly added at the tail of each of the sub-code blocks 1 to 3 newly divided from the basic code block, and the jump to the next sub-code block can be performed according to these newly added jump instructions.
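The division into sub-code blocks can be sketched as follows, assuming the second preset rule is a fixed instruction count of 5 (the value used as an example in the text). The helper names, the label scheme bb3.split1 and the simplified IR strings are assumptions for illustration; the sketch also assumes, as in LLVM IR, that the basic code block already ends with a jump instruction.

```python
from typing import List, Tuple


def is_jump(instr: str) -> bool:
    """Treat LLVM-style `br` instructions as jump instructions."""
    return instr.startswith("br ")


def split_block(label: str, instrs: List[str], chunk: int = 5) -> List[Tuple[str, List[str]]]:
    """Split one basic code block into sub-code blocks of at most `chunk` instructions."""
    # Second preset rule (example from the text): only blocks with >= chunk instructions are split.
    if len(instrs) < chunk:
        return [(label, list(instrs))]
    pieces = [instrs[i:i + chunk] for i in range(0, len(instrs), chunk)]
    labels = [label] + [f"{label}.split{k}" for k in range(1, len(pieces))]
    result = []
    for k, piece in enumerate(pieces):
        body = list(piece)
        # Add a jump to the next sub-code block when the piece does not already end in one.
        if not is_jump(body[-1]) and k + 1 < len(pieces):
            body.append(f"br label %{labels[k + 1]}")
        result.append((labels[k], body))
    return result


# Simplified stand-in for the basic code block shown above (hypothetical label bb3).
BLOCK = (
    [f"%{n} = alloca i32, align 4" for n in range(4, 9)]
    + [f"store i32 0, i32* %{n}, align 4" for n in range(4, 9)]
    + ["br label %9"]
)

SUB_BLOCKS = split_block("bb3", BLOCK)
for name, body in SUB_BLOCKS:
    print(name, body)
```

With 11 instructions and a chunk size of 5, the block is divided into three sub-code blocks; the first two receive a newly added jump, and the last one keeps the original `br label %9`.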
Step 103, converting the target address of an unconditional jump instruction in a code block into an address determined when the target program runs.
After the previous step, each function in the target program includes sub-code blocks. In some cases, some basic code blocks in the function do not meet the division condition, so the function also includes undivided basic code blocks. The undivided basic code blocks (if any) and the sub-code blocks are collectively referred to as code blocks.
The converting a target address of an unconditional jump instruction in a code block into an address determined when the target program runs specifically includes:
inserting an address calculation code block between a first code block having an unconditional jump instruction and a second code block to be jumped according to the unconditional jump instruction, and changing a jump target of the unconditional jump instruction into the address calculation code block; the address calculation code block is used to dynamically calculate an address of the second code block at runtime.
Jump instructions include conditional jump instructions and unconditional jump instructions. An unconditional jump instruction, also known as a direct jump instruction, performs the jump without evaluating any condition. Because jump instructions clearly reflect the execution order of the target program, they need to be hidden to increase the difficulty of reverse analysis.
Specifically, an address calculation code block is inserted between the first code block, which has the unconditional jump instruction, and the second code block, which is the target of that instruction. At the same time, the target address of the unconditional jump instruction in the first code block is changed from the second code block to the address calculation code block, which cuts off the direct control flow relation between the first code block and the second code block. The address calculation code block contains an unconditional jump instruction whose jump address is an address calculation formula comprising: the address information of the address calculation code block itself, and the address offset between the address calculation code block and the second code block. This address offset may be calculated in advance from the address information of the address calculation code block and the address information of the second code block.
When the target program runs, the concrete value of the address of the address calculation code block can be determined; adding the address offset between the address calculation code block and the second code block to it yields the address of the second code block, and the jump to the second code block is performed.
This operation cuts off the direct jump relation between the first code block and the second code block, so that the address of the second code block can be obtained only by dynamic calculation when the target program runs, which increases the difficulty of reverse analysis.
For example, FIG. 2 is a schematic diagram of inserting an address calculation code block between a code block having an unconditional jump instruction and the code block to be jumped to. In the embodiment shown in FIG. 2, the target program is assumed to contain a code block A and a code block B, and the leftmost part of FIG. 2 shows their original execution flow: code block A jumps directly to code block B, so code block A contains the pseudo-instruction Jump B (in practice this may be any of various instructions, such as jump, mov, bx or jne). According to the embodiment of the invention, a code block T is first added between code block A and code block B; code block T contains only an instruction that jumps to code block B, and the instruction Jump B in code block A is changed to Jump T, which jumps to code block T. The middle part of FIG. 2 shows code block T added between code block A and code block B. Next, the address offset between code block B and code block T is calculated and recorded as offset. Code block T is then modified to obtain its own address and to replace the direct jump with an indirect jump; that is, the instruction Jump B in code block T is replaced with Jump T + offset. The direct jump relation between code block A and code block B is thereby cut off, and the address of code block B can be obtained only by dynamic calculation at run time. The rightmost part of FIG. 2 shows the modified address content in code block T.
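The arithmetic behind Jump T + offset can be checked with a small Python simulation. It only models addresses as integers; the layout values, the load bases and the function names are hypothetical and chosen for illustration. The point it shows is that the offset is fixed at obfuscation time, while the absolute target is known only once the run-time address of code block T is known.

```python
from typing import Dict, Tuple

# Hypothetical static layout: position of each code block inside the program image.
STATIC_LAYOUT: Dict[str, int] = {"A": 0x1000, "T": 0x1010, "B": 0x1040}


def compute_offset(layout: Dict[str, int], calc_block: str, target_block: str) -> int:
    """Offset between the address calculation block and the block to be jumped to."""
    return layout[target_block] - layout[calc_block]


def resolve_at_runtime(calc_block_runtime_addr: int, offset: int) -> int:
    """What code block T does at run time: its own address plus the stored offset."""
    return calc_block_runtime_addr + offset


def simulate(load_base: int) -> Tuple[int, int]:
    """Return (address computed by T, actual run-time address of B) for a given load base."""
    offset = compute_offset(STATIC_LAYOUT, "T", "B")  # fixed when the program is obfuscated
    runtime = {name: load_base + addr for name, addr in STATIC_LAYOUT.items()}
    return resolve_at_runtime(runtime["T"], offset), runtime["B"]


for base in (0x0, 0x400000, 0x7F0000000000):
    computed, actual = simulate(base)
    print(hex(base), hex(computed), computed == actual)
```

Whatever base address the image is loaded at, the computed target equals the real address of code block B, while a static view of code block T shows only its own address and a constant.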
According to the code obfuscation method provided by the embodiment of the invention, the target address of the unconditional jump instruction in the code block is converted into the address determined when the target program runs, so that the direct jump relation between the code block with the unconditional jump instruction and the code block to be jumped is cut off, and the difficulty of reverse analysis is increased.
Based on any of the above embodiments, in an embodiment of the present invention, after step 102, the method further includes:
shuffling the order of the code blocks inside each function of the target program within the function to which they belong.
The code blocks inside each function have a linear order. To increase the difficulty of reverse analysis, in this embodiment the order of the code blocks inside each function is shuffled.
Specifically, in the embodiment of the present invention, a number may be assigned to each code block in the function, and a random array of the code block numbers generated. For example, if a function has N code blocks in total, the code blocks are numbered from 1 to N in order. After the code blocks are numbered, a random array of size N containing the values 1 to N without repetition is generated. The numbers in this random array are arranged randomly rather than in ascending or descending order.
Once the random array is available, the code blocks are rearranged in the order given by the random array, which destroys the original layout of the function. For example, suppose the numbers in a random array of size 10 are ordered as 1, 3, 8, 5, 7, 4, 6, 2, 9, 10. When the code blocks are arranged, the code block numbered 1 is placed in the first position in the function, the code block numbered 3 in the 2nd position, the code block numbered 8 in the 3rd position, and so on, until the code block numbered 10 is placed in the 10th position.
It should be noted that, in the embodiment of the present invention, shuffling the order of the code blocks inside a function means shuffling the static layout order of the code blocks in the function. Because each code block includes a jump instruction, the dynamic execution logic of the whole function is unchanged and the function still executes normally.
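The claim that shuffling changes only the static layout can be illustrated with a short Python sketch. Code blocks are modelled as a list of instructions plus the label they jump to at the end; the block contents and names are hypothetical. Keeping the entry block in the first position is an assumption of this sketch (LLVM IR requires the entry block to come first) and is not stated in the source.

```python
import random
from typing import Dict, List, Optional, Tuple

Block = Tuple[List[str], Optional[str]]  # (instructions, label jumped to at the end)

# Hypothetical function made of four code blocks, each ending in a jump.
BLOCKS: Dict[str, Block] = {
    "b1": (["x = 1"], "b2"),
    "b2": (["x = x + 1"], "b3"),
    "b3": (["y = x * 2"], "b4"),
    "b4": (["return y"], None),
}


def shuffle_layout(labels: List[str], seed: Optional[int] = None) -> List[str]:
    """Random permutation of the block order; the entry block stays first (assumption)."""
    entry, rest = labels[0], labels[1:]
    random.Random(seed).shuffle(rest)
    return [entry] + rest


def execute(blocks: Dict[str, Block], layout: List[str], entry: str) -> List[str]:
    """Run the function by following jumps through blocks placed in `layout` order."""
    position = {label: i for i, label in enumerate(layout)}
    placed = [blocks[label] for label in layout]
    trace: List[str] = []
    pc: Optional[int] = position[entry]
    while pc is not None:
        body, target = placed[pc]
        trace.extend(body)
        pc = position[target] if target is not None else None
    return trace


ORIGINAL = list(BLOCKS)
SHUFFLED = shuffle_layout(ORIGINAL, seed=7)
print(SHUFFLED)
print(execute(BLOCKS, SHUFFLED, "b1") == execute(BLOCKS, ORIGINAL, "b1"))
```

The execution trace is the same for both layouts because control always leaves a block through its explicit jump, never by falling through to whatever block happens to be placed next.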
The target program comprises a plurality of functions, and the order of the code blocks in each of them can be shuffled as described above.
The code obfuscation method provided by the embodiment of the invention obtains basic code blocks from the functions contained in the target program, divides the basic code blocks to obtain sub-code blocks, and then arranges the sub-code blocks and the undivided basic code blocks of each function out of order, thereby changing the content layout of each function in the target program and increasing the difficulty of reverse analysis.
Based on any of the above embodiments, in an embodiment of the present invention, the method further includes:
replacing the direct call instruction to a system function in the target program with an indirect call instruction to the system function.
In the embodiment of the present invention, a system function refers to a library function specified by the C language standard, such as a function in glibc. Calls to system functions easily expose the location of critical code, so the system function calls in the target program need to be hidden.
In a specific implementation, the direct call instruction to a system function in the target program is replaced with an indirect call instruction to the system function, and the indirect call instruction obtains the real address of the system function to be executed by resolving the function address in the dynamic link library. For example, under Linux the address of the system function to be executed is resolved with the dlsym function; under Windows it is obtained with the GetProcAddress function.
For example, in one embodiment, the functions in the target program are first traversed, and the function calls are analyzed to find the system functions called, each denoted sysFunc. The system function name is then encrypted to obtain an encrypted system function name encStr. Next, all system function call instructions in the target program are converted into indirect call instructions, which specifically includes: loading the system function to be called with dlsym, denoting the resulting function indirectCall, and decrypting the system function name at run time. Finally, the function indirectCall replaces the original system function call in the target program.
The relevant code for this embodiment is as follows.
Before the system function is replaced:
sysFunc(…)
After the system function is replaced:
encStr = encode("sysFunc");
indirectCall = dlsym((void*)0, decode(encStr));
indirectCall(…)
After the above operations, the calls to system functions in the target program are hidden, so that an attacker cannot readily identify the system functions used by the target program through reverse analysis.
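The encode, resolve and call pattern in the listing above can be imitated in a few lines. This is only a stand-in sketch. Python's `getattr` on the `math` module plays the role that dlsym or GetProcAddress plays in a native program, `sqrt` stands in for the system function, and the single-byte XOR with the hypothetical key `KEY` is a placeholder for whatever encryption an implementation would actually use (XOR with a fixed byte is not real encryption).

```python
import math

KEY = 0x5A  # hypothetical single-byte key, for illustration only


def encode(name: str) -> bytes:
    """Hide a function name by XOR-ing every byte with KEY."""
    return bytes(b ^ KEY for b in name.encode("ascii"))


def decode(blob: bytes) -> str:
    """Recover the function name at run time."""
    return bytes(b ^ KEY for b in blob).decode("ascii")


def resolve(library, enc_name: bytes):
    """Stand-in for dlsym: look a function up by its decoded name."""
    return getattr(library, decode(enc_name))


# Obfuscation time: only the encoded name is stored with the program.
enc_str = encode("sqrt")

# Run time: decode the name, resolve the function, call it indirectly.
indirect_call = resolve(math, enc_str)
print(indirect_call(16.0))
```

The point of the pattern is that neither the plain function name nor a direct reference to the function appears at the call site; both are recovered only while the program runs.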
In the above embodiment, as a preferred implementation, the system function name is encrypted at the same time as the direct call instruction to the system function is replaced with the indirect call instruction. This further strengthens the obfuscation of the system function. In other embodiments, the encryption of the system function name may be omitted according to actual needs.
By hiding the calls to system functions in the target program, the code obfuscation method provided by the embodiment of the invention prevents an attacker from identifying those system functions through reverse analysis, effectively hinders static analysis, and delays the copying of or tampering with the software.
To illustrate the technical effect of the code obfuscation method provided by the embodiment of the invention, the control flow graphs of a target program before and after code obfuscation are compared with reference to the drawings.
Fig. 3 shows the control flow graph of a function, as viewed in the reverse-engineering tool IDA, before the function is obfuscated. From this control flow graph an attacker can clearly see the layout and the execution flow of the program. After the code obfuscation method provided by the embodiment of the invention is applied, the control flow of the function as viewed in IDA is shown in Fig. 4. The figure shows that the number of code blocks in the program has increased and that many apparently unrelated code blocks are present, so that an attacker can hardly read the code or directly analyze the execution flow of the program.
Fig. 5 is a schematic diagram of a code obfuscation apparatus according to an embodiment of the present invention, and as shown in fig. 5, the code obfuscation apparatus according to the embodiment of the present invention includes:
a basic code block determining module 501, configured to determine a basic code block in a function according to a control flow direction of the function in a target program;
a sub-code block generating module 502, configured to divide the basic code block to obtain a sub-code block;
an instruction conversion module 503, configured to convert a target address of an unconditional jump instruction in a code block into an address determined when the target program runs; wherein the code block includes a sub-code block, or includes a sub-code block and a basic code block that is not divided.
The code obfuscation device provided by the embodiment of the invention converts the target address of the unconditional jump instruction in a code block into an address determined when the target program runs, thereby cutting off the direct jump relation between the code block containing the unconditional jump instruction and the code block that is its jump target, and increasing the difficulty of reverse analysis.
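As an illustration of what the basic code block determining module 501 has to produce, the sketch below uses the textbook "leader" partition: a new block starts at the first instruction, at every jump target, and after every jump or return. This section does not state which algorithm the embodiment uses, so the algorithm choice, the mnemonics in `JUMPS`, and the instruction format (an opcode paired with a target index) are all assumptions.

```python
JUMPS = {"jmp", "jz", "jnz"}  # hypothetical jump mnemonics
RETURNS = {"ret"}              # hypothetical return mnemonic


def find_basic_blocks(instructions):
    """Split a list of (opcode, operand) pairs into basic blocks.

    For jump opcodes the operand is the index of the target instruction.
    """
    leaders = {0} if instructions else set()
    for index, (opcode, operand) in enumerate(instructions):
        if opcode in JUMPS:
            leaders.add(operand)  # a jump target starts a new block
        if opcode in JUMPS or opcode in RETURNS:
            if index + 1 < len(instructions):
                leaders.add(index + 1)  # so does the next instruction
    starts = sorted(leaders)
    ends = starts[1:] + [len(instructions)]
    return [instructions[s:e] for s, e in zip(starts, ends)]


program = [
    ("mov", None),  # 0
    ("cmp", None),  # 1
    ("jz", 5),      # 2: conditional jump to instruction 5
    ("add", None),  # 3
    ("jmp", 6),     # 4: unconditional jump to instruction 6
    ("sub", None),  # 5
    ("ret", None),  # 6
]
blocks = find_basic_blocks(program)
for block in blocks:
    print(block)
```

The resulting blocks are the units that module 502 would then divide further and whose unconditional jumps module 503 would rewrite.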
Fig. 6 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention. As shown in Fig. 6, the electronic device may include a processor 610, a communication interface 620, a memory 630 and a communication bus 640, wherein the processor 610, the communication interface 620 and the memory 630 communicate with each other via the communication bus 640. The processor 610 may call logic instructions in the memory 630 to perform the following method: determining a basic code block in a function according to the control flow trend of the function in a target program; dividing the basic code block to obtain a sub-code block; converting a target address of an unconditional jump instruction in a code block into an address determined when the target program runs; wherein the code block includes a sub-code block, or includes a sub-code block and a basic code block that is not divided.
It should be noted that, when being implemented specifically, the electronic device in this embodiment may be a server, a PC, or other devices, as long as the structure includes the processor 610, the communication interface 620, the memory 630, and the communication bus 640 shown in fig. 6, where the processor 610, the communication interface 620, and the memory 630 complete mutual communication through the communication bus 640, and the processor 610 may call the logic instruction in the memory 630 to execute the above method. The embodiment does not limit the specific implementation form of the electronic device.
In addition, the logic instructions in the memory 630 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
Further, embodiments of the present invention disclose a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the methods provided by the above method embodiments, for example: determining a basic code block in a function according to the control flow trend of the function in a target program; dividing the basic code block to obtain a sub-code block; converting a target address of an unconditional jump instruction in a code block into an address determined when the target program runs; wherein the code block includes a sub-code block, or includes a sub-code block and a basic code block that is not divided.
In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the method provided by the foregoing embodiments, for example: determining a basic code block in a function according to the control flow trend of the function in a target program; dividing the basic code block to obtain a sub-code block; converting a target address of an unconditional jump instruction in a code block into an address determined when the target program runs; wherein the code block includes a sub-code block, or includes a sub-code block and a basic code block that is not divided.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A code obfuscation method, comprising:
determining a basic code block in a function according to the control flow trend of the function in a target program;
dividing the basic code block to obtain a sub code block;
converting a target address of an unconditional jump instruction in a code block into an address determined when the target program runs; wherein the code block includes a sub-code block, or includes a sub-code block and a basic code block that is not divided.
2. A code obfuscation method as in claim 1, wherein after the step of partitioning the basic code block into sub-code blocks, the method further comprises:
shuffling the arrangement order of the code blocks within each function in the target program.
3. The code obfuscation method according to claim 1 or 2, wherein the converting a target address of an unconditional jump instruction in a code block to an address determined when the target program runs includes:
inserting an address calculation code block between a first code block having an unconditional jump instruction and a second code block that is the jump target of the unconditional jump instruction, and changing the jump target of the unconditional jump instruction into the address calculation code block; wherein the address calculation code block is used to dynamically calculate the address of the second code block at run time.
4. A code obfuscation method as claimed in claim 3, wherein inserting an address calculation code block between a first code block having an unconditional jump instruction and a second code block to be jumped according to the unconditional jump instruction comprises:
inserting an address calculation code block between the first code block and the second code block;
calculating an address offset between the address calculation code block and the second code block according to the address information of the address calculation code block and the address information of the second code block;
modifying the address to be jumped to by the unconditional jump instruction within the address calculation code block into an address calculation formula, the address calculation formula comprising: the address information of the address calculation code block itself, and the address offset between the address calculation code block and the second code block.
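The arithmetic behind claims 3 and 4 can be sketched with plain integers. The addresses `FIRST`, `CALC`, `SECOND` and `load_base` are made-up values chosen for illustration, and the helper names are hypothetical. The sketch only shows that the jump target is reconstructed from the address calculation code block's own address plus a stored offset, rather than being stored as a literal target.

```python
def build_address_calc_block(calc_addr, second_addr):
    """Obfuscation time: record only the offset to the second code block."""
    return {"offset": second_addr - calc_addr}


def runtime_target(calc_block, own_runtime_addr):
    """Run time: own address plus the stored offset gives the jump target."""
    return own_runtime_addr + calc_block["offset"]


# Hypothetical static addresses of the first code block, the inserted
# address calculation code block, and the second code block.
FIRST, CALC, SECOND = 0x1000, 0x1040, 0x1200

calc_block = build_address_calc_block(CALC, SECOND)

# The unconditional jump in the first block now targets CALC, not SECOND.
first_block_jump = CALC

# If the program is loaded at some base address, the address calculation
# code block sees its own run-time address and derives the real target.
load_base = 0x400000
target = runtime_target(calc_block, load_base + first_block_jump)
print(hex(target))
```

A static disassembler that looks only at the first block sees a jump to the address calculation code block, so the edge from the first block to the second block does not appear directly in the recovered control flow graph.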
5. A code obfuscation method as claimed in claim 1 or 2, further comprising:
and replacing the direct call instruction of the system function in the target program with an indirect call instruction of the system function.
6. The code obfuscation method according to claim 5, wherein replacing the direct call instruction to the system function in the target program with an indirect call instruction to the system function includes:
generating an indirect call instruction according to the real address of the system function and the dynamic link library, wherein the indirect call instruction acquires the real address of the system function to be executed in a mode of analyzing the function address in the dynamic link library;
and replacing the direct call instruction of the system function in the target program with the indirect call instruction of the system function.
7. A code obfuscation method as claimed in claim 1, wherein the dividing the basic code block according to a second preset rule to obtain sub-code blocks specifically includes:
judging whether the basic code block meets a preset rule or not, and when the basic code block meets the preset rule, segmenting the basic code block to obtain a first segmentation result;
judging whether a jump instruction exists at the tail of the first segmentation result, and, when no jump instruction exists, adding a jump instruction at the tail of the first segmentation result to obtain a sub-code block; wherein the added jump instruction jumps to the instruction that, in the basic code block, immediately follows the last instruction of the first segmentation result.
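The splitting step of claim 7 can be sketched as follows. The preset rule is not specified in this section, so the sketch assumes a simple one: a basic code block is split when it holds more than `max_len` instructions. The mnemonics, the label scheme `B_0`, `B_1`, ... and the `fallthrough` parameter are likewise hypothetical, and the fall-through path of a trailing conditional jump is not handled.

```python
JUMP_OPCODES = {"jmp", "jz", "jnz", "ret"}  # hypothetical mnemonics


def split_basic_block(label, instructions, max_len=2, fallthrough=None):
    """Split one basic block into sub-code blocks of at most max_len
    instructions and make every piece end with a jump to its successor.

    `fallthrough` is the label of the block that originally followed.
    """
    if len(instructions) <= max_len:  # assumed preset rule not met
        return [(label, list(instructions))]
    chunks = [instructions[i:i + max_len]
              for i in range(0, len(instructions), max_len)]
    sub_blocks = []
    for index, chunk in enumerate(chunks):
        body = list(chunk)
        is_last = index == len(chunks) - 1
        successor = fallthrough if is_last else f"{label}_{index + 1}"
        # Add a jump only when the piece does not already end with one.
        if body[-1][0] not in JUMP_OPCODES and successor is not None:
            body.append(("jmp", successor))
        sub_blocks.append((f"{label}_{index}", body))
    return sub_blocks


basic_block = [("mov", "a"), ("add", "b"), ("mul", "c"),
               ("sub", "d"), ("jmp", "exit")]
sub_blocks = split_basic_block("B", basic_block)
for name, body in sub_blocks:
    print(name, body)
```

Because every sub-code block ends with an explicit jump, the pieces can later be placed anywhere in the function without changing the order in which they execute.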
8. A code obfuscation apparatus, comprising:
the basic code block determining module is used for determining a basic code block in a function according to the control flow direction of the function in the target program;
a sub-code block generation module, configured to divide the basic code block to obtain sub-code blocks;
the instruction conversion module is used for converting a target address of the unconditional jump instruction in the code block into an address determined when the target program runs; wherein the code block includes a sub-code block, or includes a sub-code block and a basic code block that is not divided.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the code obfuscation method according to any one of claims 1 to 7 are implemented when the program is executed by the processor.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the code obfuscation method as defined in any one of claims 1 to 7.
CN202010819524.9A 2020-08-14 2020-08-14 Code confusion method, device, electronic equipment and storage medium Active CN112115427B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010819524.9A CN112115427B (en) 2020-08-14 2020-08-14 Code confusion method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010819524.9A CN112115427B (en) 2020-08-14 2020-08-14 Code confusion method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112115427A true CN112115427A (en) 2020-12-22
CN112115427B CN112115427B (en) 2024-05-31

Family

ID=73804123

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010819524.9A Active CN112115427B (en) 2020-08-14 2020-08-14 Code confusion method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112115427B (en)

Also Published As

Publication number Publication date
CN112115427B (en) 2024-05-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant