CN111680271A

CN111680271A - Contract code obfuscation platform and method based on smart contract bytecode features

Info

Publication number: CN111680271A
Application number: CN202010489637.7A
Authority: CN
Inventors: 程镇; 周亚金; 吴磊; 任奎
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2020-06-02
Filing date: 2020-06-02
Publication date: 2020-09-18
Also published as: WO2021244054A1

Abstract

The platform converts the original bytecode into an instruction sequence, extracts the instruction positions to be rewritten and the original jump target addresses according to the selected obfuscation method, generates insertion instructions and inserts them at the corresponding positions of the instruction sequence, corrects the jump addresses of the instruction sequence so that they point to the correct jump targets, and finally converts the corrected instruction sequence back into bytecode, namely the obfuscated bytecode, and outputs it. By obfuscating the contract bytecode, the invention prevents tools from easily analyzing a contract developer's contract information, thereby reducing the risk that the code of on-chain contracts is copied at will.

Description

Contract code obfuscation platform and method based on smart contract bytecode features

Technical Field

The invention relates to the field of smart contracts, and in particular to a contract code obfuscation platform and a contract code obfuscation method based on smart contract bytecode features.

Background

The smart contract is an idea proposed by Nick Szabo in the 1990s, almost as old as the Internet. For lack of a trusted execution environment, smart contracts were not applied in practice until the birth of Bitcoin, when people recognized that its underlying technology, the blockchain, can naturally provide a trusted execution environment for smart contracts. A smart contract is a program on a blockchain expressed in an assembly-like bytecode. Typically one does not write the bytecode by hand, but compiles it from a higher-level language.

Since the blockchain is an open distributed ledger, all information on the chain is publicly visible to everyone, and code is often reused. Because contracts are stored on the chain in bytecode form, which is difficult for people to read, those who want to understand a contract whose source code has not been published usually apply various analysis techniques to work out how the contract operates, for a variety of purposes. Such an environment is very hostile to a contract developer who does not want others to easily copy the contract, or even to find vulnerabilities with which to attack it.

Disclosure of Invention

In view of the fact that existing on-chain contract code can be easily analyzed by various analysis tools, the invention aims to provide a smart contract code obfuscation platform with which a contract developer can rewrite the contract code before deploying the contract, so as to avoid this situation.

The purpose of the invention is realized by the following technical scheme:

A contract code obfuscation platform based on smart contract bytecode features, the platform comprising:

a bytecode/instruction converter for receiving original bytecode and converting said original bytecode into an instruction sequence according to a target obfuscation method to represent an executable segment;

an information extractor for extracting, according to the obfuscation method, the information required for injection into the instruction sequence and for jump target re-resolution, the information comprising the instruction positions to be rewritten and the original jump target addresses, wherein the instruction positions to be rewritten are stored and sent to the bytecode injector, and the original jump target addresses are sent to the jump target re-resolver;

a bytecode injector for generating insertion instructions according to the obfuscation method, inserting them at the corresponding positions of the instruction sequence to form a new instruction sequence, and sending the new instruction sequence to the jump target re-resolver;

a jump target re-resolver for correcting the jump address of the new instruction sequence to make it correspond to the correct jump target;

and an instruction/bytecode converter for converting the corrected instruction sequence into bytecode, namely the obfuscated bytecode, and outputting it.
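The two converters at either end of the platform can be illustrated with a minimal sketch. It is not the patented implementation: the function names `disassemble` and `assemble` and the `(opcode, immediate)` tuple representation are assumptions made for illustration. The only EVM fact relied on is that PUSH1 through PUSH32 (opcodes 0x60 to 0x7f) are followed by 1 to 32 immediate bytes.

```python
from typing import List, Tuple

# Hypothetical representation: one instruction = (opcode, immediate bytes).
Instruction = Tuple[int, bytes]


def disassemble(code: bytes) -> List[Instruction]:
    """Bytecode/instruction converter: linear sweep over the bytecode."""
    instructions = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        # PUSH1..PUSH32 (0x60..0x7f) carry 1..32 immediate bytes.
        width = op - 0x5F if 0x60 <= op <= 0x7F else 0
        instructions.append((op, code[pc + 1 : pc + 1 + width]))
        pc += 1 + width
    return instructions


def assemble(instructions: List[Instruction]) -> bytes:
    """Instruction/bytecode converter: concatenate opcodes and immediates."""
    return b"".join(bytes([op]) + imm for op, imm in instructions)


# PUSH1 0x03; JUMP; JUMPDEST; STOP
original = bytes.fromhex("6003565b00")
sequence = disassemble(original)
roundtrip = assemble(sequence)
```

With no obfuscation applied, assembling the disassembled sequence reproduces the input byte for byte, which is a convenient sanity check before any instructions are injected.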

A contract code obfuscation method based on smart contract bytecode features, specifically comprising the following steps:

S1: the contract developer generates the original bytecode with a smart contract compiler;

S2: the original bytecode is input into the contract code obfuscation platform, and the obfuscation method to be used is selected;

S3: the contract code obfuscation platform converts the original bytecode into an instruction sequence, extracts the instruction positions to be rewritten and the original jump target addresses according to the obfuscation method, generates insertion instructions and inserts them at the corresponding positions of the instruction sequence, corrects the jump addresses of the instruction sequence so that they correspond to the correct jump targets, and finally converts the corrected instruction sequence into bytecode, namely the obfuscated bytecode, and outputs it.

Further, step S3 specifically includes the following sub-steps:

S3.1: linearly scanning the original bytecode and, in the process, identifying the contract initialization code segment and the Swarm hash segment by means of the default contract initialization code segment and the Swarm hash start and end signatures emitted by the contract compiler;

S3.2: disassembling the original bytecode into Ethereum Virtual Machine instructions and immediate data, and creating a contract copy from this information;

S3.3: by maintaining a simulated stack, executing the contract code step by step and traversing all reachable branches, identifying in the process the function selector segment, the contract function segment, and the data segment of the contract, and marking in the contract copy the jump instructions and the values they use;

S3.4: generating and inserting the instruction sequence corresponding to the obfuscation method according to the marked instructions;

S3.5: correcting the jump addresses that have become misaligned in the instructions, thereby completing the obfuscation.
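Sub-step S3.3 can be sketched as a small worklist traversal over a simulated stack. This is an illustrative simplification, not the platform's actual analysis: the names `decode` and `mark_jump_pushes` are hypothetical, only PUSH, POP, JUMP, JUMPI, JUMPDEST and STOP are modeled, and every stack slot remembers the offset of the PUSH that produced it so that the value consumed by a jump can be traced back and marked.

```python
from typing import Dict, List, Set, Tuple

STOP, POP, JUMP, JUMPI, JUMPDEST = 0x00, 0x50, 0x56, 0x57, 0x5B
Slot = Tuple[int, int]  # (value, offset of the PUSH that produced it)


def decode(code: bytes) -> Dict[int, Tuple[int, bytes]]:
    """Map each instruction offset to (opcode, immediate bytes)."""
    table, pc = {}, 0
    while pc < len(code):
        op = code[pc]
        width = op - 0x5F if 0x60 <= op <= 0x7F else 0
        table[pc] = (op, code[pc + 1 : pc + 1 + width])
        pc += 1 + width
    return table


def mark_jump_pushes(code: bytes) -> Set[int]:
    """Offsets of PUSH instructions whose value is consumed as a jump target."""
    table = decode(code)
    marked: Set[int] = set()
    seen: Set[Tuple[int, Tuple[Slot, ...]]] = set()
    work: List[Tuple[int, Tuple[Slot, ...]]] = [(0, ())]
    while work:
        pc, stack = work.pop()
        # Skip repeated states, invalid offsets, and stacks past the EVM limit.
        if (pc, stack) in seen or pc not in table or len(stack) > 1024:
            continue
        seen.add((pc, stack))
        op, imm = table[pc]
        nxt = pc + 1 + len(imm)
        st = list(stack)
        if 0x60 <= op <= 0x7F:
            st.append((int.from_bytes(imm, "big"), pc))
            work.append((nxt, tuple(st)))
        elif op == POP:
            st.pop()
            work.append((nxt, tuple(st)))
        elif op == JUMPDEST:
            work.append((nxt, tuple(st)))
        elif op == JUMP:
            target, origin = st.pop()
            marked.add(origin)
            work.append((target, tuple(st)))
        elif op == JUMPI:
            target, origin = st.pop()  # destination is on top of the stack
            st.pop()                   # condition; both branches are explored
            marked.add(origin)
            work.append((target, tuple(st)))
            work.append((nxt, tuple(st)))
        elif op == STOP:
            continue
        else:
            raise NotImplementedError(hex(op))  # opcode outside this sketch
    return marked


# PUSH1 1; PUSH1 6; JUMPI; STOP; JUMPDEST; STOP
marked = mark_jump_pushes(bytes.fromhex("6001600657005b00"))
```

In this example only the PUSH at offset 2 is marked, because its value is the one consumed by JUMPI as the destination; the PUSH at offset 0 supplies the condition and is left alone. A full implementation would need the stack effects of every EVM opcode.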

Further, the obfuscation method selects any one of the following methods:

(1) by adding a PUSH instruction, a tool that linearly scans the bytecode mistakenly finds two contract initialization code start signatures, so that it identifies the wrong contract body code;

(2) by rewriting the bytecode in the Swarm hash segment and rewriting the instructions near the jump instructions in the contract, all jump instructions of the original contract reach their respective target addresses through the Swarm hash segment, and the control flow graph of the contract is flattened;

(3) by breaking the feature sequence of the function selector, contract analysis tools are prevented from obtaining the function signatures stored in the contract;

(4) by inserting a large number of JUMPDEST instructions, analysis tools that use symbolic execution and simulated execution are forced to maintain a large number of basic block entry states, which makes them run slowly or even crash;

(5) by replacing the immediate used for a jump with the result of a series of operations on immediates, tools that assume by default that a jump target address must be an immediate cannot resolve the jump addresses in the contract;

(6) by placing the target address used for a jump into memory and retrieving it by calling a precompiled contract, a static analysis tool is misled into assuming that an on-chain contract is called, and thus loses the jump target address information.
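Method (5) can be illustrated with a short sketch under stated assumptions. The source says only that the immediate is replaced by "a series of immediate operation results"; the use of a single ADD, the helper names `push`, `split_jump_immediate` and `evaluate`, and the choice of addend are all hypothetical. The sketch replaces one PUSH of a jump target with two PUSH instructions and an ADD that compute the same value at run time.

```python
from typing import List, Tuple

Instruction = Tuple[int, bytes]  # (opcode, immediate bytes)
ADD = 0x01
WORD = 2 ** 256  # EVM arithmetic wraps modulo 2**256


def push(value: int, width: int = 2) -> Instruction:
    """PUSH<width> with a big-endian immediate (PUSH1 is opcode 0x60)."""
    return (0x60 + width - 1, value.to_bytes(width, "big"))


def split_jump_immediate(target: int, addend: int) -> List[Instruction]:
    """Replace `PUSH target` by `PUSH (target - addend); PUSH addend; ADD`."""
    if not 0 <= addend <= target:
        raise ValueError("addend must lie between 0 and target")
    return [push(target - addend), push(addend), (ADD, b"")]


def evaluate(code: List[Instruction]) -> int:
    """Tiny evaluator for PUSH and ADD, used only to check the rewrite."""
    stack: List[int] = []
    for op, imm in code:
        if 0x60 <= op <= 0x7F:
            stack.append(int.from_bytes(imm, "big"))
        elif op == ADD:
            stack.append((stack.pop() + stack.pop()) % WORD)
        else:
            raise NotImplementedError(hex(op))
    return stack[-1]


replacement = split_jump_immediate(0x1234, 0x0ABC)
result = evaluate(replacement)
```

The replacement leaves the same value on the stack as the original PUSH, but no single immediate in the bytecode equals the jump target any more. Because the rewrite lengthens the code, the target value itself must be the corrected address produced by jump target re-resolution.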

The invention has the following beneficial effects:

Given that smart contract bytecode can be analyzed with existing analysis tools, a contract developer can use the obfuscation platform and obfuscation method of the invention to add a layer of obfuscation to a contract, making the contract bytecode harder to read and thereby strengthening the protection of the contract code.

Drawings

FIG. 1 is a schematic diagram of a contract code obfuscation platform based on smart contract bytecode features;

FIG. 2 is a flow diagram of a contract code obfuscation method based on smart contract bytecode features.

Detailed Description

The present invention is described in detail below with reference to the accompanying drawings and preferred embodiments, from which its objects and effects will become more apparent. It should be understood that the specific embodiments described herein merely illustrate the invention and are not intended to limit it.

As shown in FIG. 1, the contract code obfuscation platform based on smart contract bytecode features of the present invention includes:

a bytecode injector for generating insertion instructions according to the obfuscation method, inserting them at the corresponding positions of the instruction sequence to form a new instruction sequence, and sending the new instruction sequence to the jump target re-resolver; because the insertion changes the size of the original bytecode, the original jump addresses no longer correspond to the correct jump targets, so the new instruction sequence must be sent to the jump target re-resolver to correct the misaligned jump addresses.
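The need for re-resolution can be seen in a small worked sketch. It is a simplification under stated assumptions, not the platform's algorithm: the function `insert_and_fix` is hypothetical, a jump target is recognized only as a PUSH directly followed by JUMP or JUMPI (the platform instead marks such values with a simulated stack), and the corrected address is assumed to still fit in the original PUSH width.

```python
from typing import List, Tuple

Instruction = Tuple[int, bytes]  # (opcode, immediate bytes)
STOP, JUMP, JUMPI, JUMPDEST = 0x00, 0x56, 0x57, 0x5B


def size(ins: Instruction) -> int:
    return 1 + len(ins[1])


def assemble(code: List[Instruction]) -> bytes:
    return b"".join(bytes([op]) + imm for op, imm in code)


def insert_and_fix(code: List[Instruction], index: int,
                   inserted: List[Instruction]) -> List[Instruction]:
    """Insert `inserted` before code[index], then repair jump immediates."""
    # Byte offset of every original instruction before the insertion.
    old_offsets, pc = [], 0
    for ins in code:
        old_offsets.append(pc)
        pc += size(ins)
    shift = sum(size(ins) for ins in inserted)
    boundary = old_offsets[index] if index < len(code) else pc

    fixed = []
    for i, (op, imm) in enumerate(code):
        is_push = 0x60 <= op <= 0x7F
        # Simplifying assumption: PUSH directly followed by JUMP/JUMPI.
        if is_push and i + 1 < len(code) and code[i + 1][0] in (JUMP, JUMPI):
            target = int.from_bytes(imm, "big")
            if target >= boundary:
                target += shift  # code at or after the insertion point moved
            # Raises OverflowError if the address no longer fits the PUSH width.
            imm = target.to_bytes(len(imm), "big")
        fixed.append((op, imm))
    return fixed[:index] + inserted + fixed[index:]


# PUSH1 0x03; JUMP; JUMPDEST; STOP, with two JUMPDESTs injected at index 2.
original = [(0x60, b"\x03"), (JUMP, b""), (JUMPDEST, b""), (STOP, b"")]
obfuscated = insert_and_fix(original, 2, [(JUMPDEST, b""), (JUMPDEST, b"")])
new_bytecode = assemble(obfuscated)
```

Before the insertion the jump lands on the JUMPDEST at offset 3. After two bytes are injected in front of it, that JUMPDEST sits at offset 5, so the pushed immediate is rewritten from 0x03 to 0x05.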

As shown in FIG. 2, the contract code obfuscation method based on smart contract bytecode features of the present invention specifically includes the following steps:

S3: the contract code obfuscation platform converts the original bytecode into an instruction sequence, extracts the instruction positions to be rewritten and the original jump target addresses according to the obfuscation method, generates insertion instructions and inserts them at the corresponding positions of the instruction sequence, corrects the jump addresses of the instruction sequence so that they correspond to the correct jump targets, and finally converts the corrected instruction sequence into bytecode, namely the obfuscated bytecode, and outputs it. The specific process is as follows:

Specifically, there are six obfuscation methods used herein:

(1) By adding a PUSH instruction, a tool that linearly scans the bytecode mistakenly finds two contract initialization code start signatures, so that it identifies the wrong contract body code.

(2) By rewriting the bytecode in the Swarm hash segment and rewriting the instructions near the jump instructions in the contract, all jump instructions of the original contract reach their respective target addresses through the Swarm hash segment, and the control flow graph of the contract is flattened.

(3) By breaking the feature sequence of the function selector, contract analysis tools are prevented from obtaining the function signatures stored in the contract.

(4) By inserting a large number of JUMPDEST instructions, analysis tools that use symbolic execution and simulated execution are forced to maintain a large number of basic block entry states, which slows them down or even crashes them.

(5) By replacing the immediate used for a jump with the result of a series of operations on immediates, tools that assume by default that a jump target address must be an immediate cannot resolve the jump addresses in the contract.

(6) By placing the target address used for a jump into memory and retrieving it by calling a precompiled contract, a static analysis tool is misled into assuming that an on-chain contract is called (in the smart contract setting this is dynamic information, which a static analysis tool treats as unknown), and thus loses the jump target address information.

It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and although the invention has been described in detail with reference to the foregoing examples, it will be apparent to those skilled in the art that various changes in the form and details of the embodiments may be made and equivalents may be substituted for elements thereof. All modifications, equivalents and the like which come within the spirit and principle of the invention are intended to be included within the scope of the invention.

Claims

1. A contract code obfuscation platform based on smart contract bytecode features, the platform comprising:

2. A contract code obfuscation method based on smart contract bytecode features, characterized by comprising the following steps:

S2: inputting said original bytecode into the contract code obfuscation platform of claim 1 and selecting the obfuscation method to be used;

3. The contract code obfuscation method based on smart contract bytecode features according to claim 2, wherein S3 specifically includes the following sub-steps:

4. The contract code obfuscation method based on smart contract bytecode features according to claim 2, wherein the obfuscation method is any one of the following: