WO2024096822A1 - Vulnerability detection for smart contracts in blockchain platforms - Google Patents
- Publication number
- WO2024096822A1 (PCT/SG2023/050730)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- smart contract
- source code
- domain
- execution
- abstract
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/57—Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
- G06F21/577—Assessing vulnerabilities and evaluating computer system security
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Prevention of errors by analysis, debugging or testing of software
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/64—Protecting data integrity, e.g. using checksums, certificates or signatures
Definitions
- This disclosure generally relates to methods and systems for detecting vulnerabilities in smart contracts of blockchain platforms.
- Smart contracts are computer programs that run on blockchain networks to automatically execute agreements between different parties.
- the smart contract source code controls the executions, and the transactions are trackable and irreversible.
- In blockchains like Ethereum, Polygon, Solana, Polkadot, Cosmos, Aptos, Sui, and Corda, these source codes are stored on the decentralized and distributed blockchain networks. They are also involved in storing and exchanging valuable digital assets and cryptocurrencies, hence they have been a popular target for hackers.
- Recent reports by Chainalysis and Elliptic have revealed that, in the year 2022, the blockchain and decentralized finance (DeFi) industries have lost around US$ 2B due to hacks and attacks, many of which are related to smart contract vulnerabilities introduced by the mistakes of developers.
- Static-analysis-based techniques include works like Oyente, Slither, Mythril, and Securify. These techniques are currently performed directly on EVM bytecode or on custom intermediate code constructed by each technique for the Solidity smart contracts. The tools that use EVM bytecode, like Oyente and Mythril, can produce less precise results since EVM bytecode does not maintain the type information of Solidity source code.
- the present disclosure provides a system for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms, the system comprising: one or more processors (processor(s)); a memory comprising instructions that when executed by the processor(s) cause the processor(s) to: receive smart contract source code embodying smart contract implementation logic; transform the smart contract source code into smart contract intermediate code (smart contract IC); compute execution states of the smart contract implementation logic based on the smart contract IC wherein each execution state relates to a potential state of the smart contract during execution; produce one or more evaluation outcomes by evaluating the execution states for violation of a plurality of predefined error conditions; and generate a smart contract vulnerability report comprising the one or more evaluation outcomes.
- Also disclosed is a method for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms comprising: receiving smart contract source code embodying smart contract implementation logic; transforming the smart contract source code into smart contract intermediate code (smart contract IC); computing execution states of the smart contract implementation logic based on the smart contract IC, wherein each execution state relates to a potential state of the smart contract during execution; producing one or more evaluation outcomes by evaluating the execution states for violation of a plurality of predefined error conditions; and generating a smart contract vulnerability report comprising the one or more evaluation outcomes.
- Figure 1 is a method for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms, in accordance with present teachings
- Figure 2 illustrates a block diagram of a design of the vulnerability detection framework
- Figure 3 illustrates a block diagram of a Static Analyzer
- Figure 4 illustrates a block diagram of a design of an Abstract Interpreter
- Figure 5 illustrates a block diagram of a Function Analyzer that is a part of the Abstract Interpreter of Figure 3
- Figure 6 illustrates a block diagram of a Vulnerability Detector
- Figure 7 illustrates a system for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms.
- Embodiments relate to systems and methods for analysis of smart contract source code.
- the embodiments incorporate a bug detection framework for blockchain smart contracts using abstract interpretation.
- the framework may be built on top of LLVM, an industrial strength compilation framework, which supports smart contracts written in different languages, such as Solidity, Rust, Move, Golang, C++, and Java.
- Embodiments incorporate a core analysis algorithm using abstract interpretation instantiated to perform different analyses, such as integer interval analysis, pointer analysis, and control flow analysis so that our framework can detect different types of bugs in smart contracts.
- the framework has been tested with Ethereum smart contracts written in Solidity. The results include bug detection in some existing benchmarks with an accuracy of 90%.
- the locations of bugs in smart contracts are best illustrated using examples - see below.
- the example below describes a buggy smart contract named Deposit of the Ethereum blockchain.
- This smart contract is written in the Solidity programming language. It has 2 public functions named deposit (line 6) and withdraw (line 10). These functions allow users to deposit funds into or withdraw funds from the smart contract account. There is a vulnerability in the withdraw function which allows an attacker to repeatedly withdraw the funds from this contract account. In particular, this contract allows the attacker to first send an amount of funds (line 14) to his account and then update his account balance later (line 17).
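The ordering problem described above (funds sent first, balance updated later) can be illustrated with a small Python model. This is only a sketch of the general reentrancy pattern, not the Solidity source of the Deposit contract: the class names `VulnerableDeposit`, `Attacker`, and `HonestUser`, the `receive` hook, and all amounts are hypothetical.

```python
class VulnerableDeposit:
    """Toy model of a contract that pays out before updating the balance."""

    def __init__(self):
        self.balances = {}   # recorded balance per user
        self.funds = 0       # total funds held by the contract

    def deposit(self, user, amount):
        self.balances[user] = self.balances.get(user, 0) + amount
        self.funds += amount

    def withdraw(self, user):
        amount = self.balances.get(user, 0)
        if amount == 0 or self.funds < amount:
            return
        # Bug: funds are sent first, which hands control to the receiver...
        self.funds -= amount
        user.receive(self, amount)
        # ...and the recorded balance is only cleared afterwards.
        self.balances[user] = 0


class Attacker:
    """Hypothetical attacker whose receive hook re-enters withdraw."""

    def __init__(self, max_reentries):
        self.stolen = 0
        self.remaining = max_reentries

    def receive(self, contract, amount):
        self.stolen += amount
        if self.remaining > 0:
            self.remaining -= 1
            contract.withdraw(self)  # re-enter before the balance is reset


class HonestUser:
    def receive(self, contract, amount):
        pass


def run_attack():
    contract = VulnerableDeposit()
    contract.deposit(HonestUser(), 30)
    attacker = Attacker(max_reentries=2)
    contract.deposit(attacker, 10)
    contract.withdraw(attacker)
    return attacker.stolen, contract.funds
```

In this model the attacker deposits 10 but, by re-entering twice, receives 30 in total. Clearing the recorded balance before the transfer would stop the repeated withdrawal.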
- the present disclosure sets out a framework for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms.
- the framework implements method 100 of Figure 1 , comprising (step 102) receiving smart contract source code, (step 104) transforming that smart contract source code into an intermediate code, (step 106) computing execution states on the intermediate code, (step 108) evaluating the execution states for violation of predefined errors, and (step 110) generating or outputting a smart contract vulnerability report.
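The data flow of method 100 can be sketched as a short pipeline. This is a minimal illustration, not the disclosed implementation: the names `run_method_100`, `VulnerabilityReport`, `to_ic`, `compute_states`, and `checks` are assumptions, and the stages passed in at the bottom are trivial stand-ins rather than a compiler or abstract interpreter.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class VulnerabilityReport:
    # One entry per evaluated error condition.
    outcomes: List[str] = field(default_factory=list)


def run_method_100(source: str,
                   to_ic: Callable[[str], Any],
                   compute_states: Callable[[Any], Any],
                   checks: List[Callable[[Any], str]]) -> VulnerabilityReport:
    ic = to_ic(source)                                # step 104: source -> IC
    states = compute_states(ic)                       # step 106: execution states
    outcomes = [check(states) for check in checks]    # step 108: evaluate conditions
    return VulnerabilityReport(outcomes)              # step 110: build the report


# Stand-in stages (hypothetical), only to show the data flow end to end.
demo_report = run_method_100(
    "contract Deposit { }",                           # step 102: received source
    to_ic=lambda src: src.split(),
    compute_states=lambda ic: {"tokens": len(ic)},
    checks=[lambda s: "no bug found" if s["tokens"] < 100 else "bug found"],
)
```

Passing the stages in as callables mirrors the point that the later steps depend only on the intermediate code, not on the source language.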
- Smart contract implementation logic is embodied by smart contract source code. That source code can be received, per step 102, by any known means - e.g. uploading a smart contract source code file to a server running method 100, or by directly entering the source code into a platform running method 100.
- smart contracts can be written and received in various languages, including Solidity, Rust, Move, Golang, C++, and Java.
- the smart contract source code is then transformed into smart contract IC per step 104.
- An intermediate code comprises a data structure or code used by a compiler or virtual machine to represent smart contract source code.
- the smart contract IC is more suitable for analysis for bug or vulnerability detection in a structured manner.
- the smart contract IC is also agnostic to the programming language the smart contract source code is written in. Accordingly, the smart contract IC provides an efficient starting point for subsequent analysis.
- Embodiments of the present method 100 include methods to find vulnerabilities in smart contracts using abstract interpretation, a static analysis technique. Some embodiments leverage the LLVM compilation framework to analyze smart contract execution states, although other language independent ICs may be used such as that produced in the GNU Compiler Collection. For the purpose of illustration, LLVM compiler systems and LLVM bitcode will be used in the discussion below. However, the skilled person will understand the present teachings to be useable or extendable to other compiler systems and ICs as needed. To that end, step 104 involves compiling smart contract source code to LLVM bitcode using the corresponding compiler.
- the LLVM bitcode is an example of a smart contract intermediate representation (IR), interchangeably used herein with the term “intermediate code” or “IC” unless context dictates otherwise.
- Some embodiments use Solang to compile Solidity smart contracts of the Ethereum and EVM-compatible blockchains, Rustc for Rust-based smart contracts of the Solana and Polkadot blockchains, Move for the Move-based smart contracts of the Aptos and Sui blockchains, the Gollvm compiler for Golang smart contracts of the Hyperledger Fabric and Cosmos blockchains, the Clang compiler for C++ smart contracts of the EOSIO blockchain, and JLang for Java smart contracts of the Corda blockchain.
- An LLVM (or other compiler system) bitcode module consists of functions, global variables, and data types. Functions are written in an assembly-like language with an infinite number of registers.
- a function contains a list of basic blocks, forming the control flow graph (CFG) of the function. Each basic block starts with a label and contains a list of instructions, and ends with a terminator instruction.
- a terminator instruction is an instruction such as a branch to jump to (e.g. function call or decision point reflecting that there is more than one possible next operation depending on the execution state) other basic blocks or a function return. Each block is therefore non-branching - i.e. the instructions in a block are executed in a linear fashion such that, within a block, no other block is jumped to.
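The function, basic block, and terminator structure described above can be modelled with a pair of small data classes. This is a sketch under stated assumptions: the class and field names are hypothetical, the instruction strings are placeholders, and the edges between the three blocks (whose labels are borrowed from the withdraw example discussed later) are assumed for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BasicBlock:
    label: str
    instructions: List[str]        # executed linearly, no jumps inside a block
    terminator: str                # e.g. "br" (branch) or "ret" (return)
    successors: List[str] = field(default_factory=list)


@dataclass
class Function:
    name: str
    entry: str
    blocks: Dict[str, BasicBlock]

    def predecessors(self, label: str) -> List[str]:
        """Labels of blocks whose terminator can jump to `label`."""
        return [b.label for b in self.blocks.values() if label in b.successors]


# Assumed control flow: entry -> transfer -> update_balance.
withdraw = Function(
    name="withdraw",
    entry="entry",
    blocks={
        "entry": BasicBlock("entry", ["load balance"], "br", ["transfer"]),
        "transfer": BasicBlock("transfer", ["call @call"], "br", ["update_balance"]),
        "update_balance": BasicBlock("update_balance", ["store 0"], "ret"),
    },
)
```

The successor lists are what make the blocks a control flow graph; the predecessor lookup is the query an analysis needs when it merges the states that reach a block.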
- LLVM uses an infinite set of typed virtual registers to represent local variables to hold values of primitive types (such as integer, floating point, and pointer). These variables are always maintained in Static Single Assignment (SSA) form.
- LLVM also includes an explicit PHI instruction to handle variables whose values come from two or more basic blocks. This instruction corresponds directly to the PHI function of the SSA form.
- LLVM programs transfer values between registers and memory via two instructions load and store. Unlike registers, memory locations are not in SSA form because many possible locations may be modified at a single store through a pointer.
- LLVM uses an instruction called GetElementPtr, which can preserve type information while performing pointer arithmetic.
- GetElementPtr is effectively a combined operator for both field-access of data structures and element-access of arrays (one-dimensional or multidimensional) in high-level programming languages, in which its 0-based indexing sequence indicates how the memory address of the data structure’s field or array’s element is computed from the root.
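How a 0-based index sequence turns into an address offset can be sketched in Python. This is a simplified model, not LLVM's implementation: the tuple encoding of types and the helper names `size_of` and `gep_offset` are hypothetical, alignment padding is ignored, and the layout chosen for `%struct.vector` (two 32-bit integers followed by a byte array) is an assumption made for the example.

```python
def size_of(t):
    """Size in bytes of a type descriptor (padding ignored)."""
    kind = t[0]
    if kind == "int":
        return t[1]
    if kind == "array":
        return size_of(t[1]) * t[2]
    if kind == "struct":
        return sum(size_of(f) for f in t[1])
    raise ValueError(kind)


def gep_offset(pointee, indices):
    """Byte offset selected by a GetElementPtr-style index sequence.

    The first index steps over whole objects of the pointee type; each
    later index selects a struct field or an array element.
    """
    offset = indices[0] * size_of(pointee)
    current = pointee
    for idx in indices[1:]:
        kind = current[0]
        if kind == "struct":
            fields = current[1]
            offset += sum(size_of(f) for f in fields[:idx])
            current = fields[idx]
        elif kind == "array":
            elem = current[1]
            offset += idx * size_of(elem)
            current = elem
        else:
            raise TypeError("cannot index into a scalar")
    return offset


I32 = ("int", 4)
I8 = ("int", 1)
# Assumed layout: { i32, i32, [0 x i8] }
VECTOR = ("struct", [I32, I32, ("array", I8, 0)])

data_offset = gep_offset(VECTOR, [0, 2, 0])   # indices "i32 0, i32 2, i32 0"
```

Under the assumed layout, the index sequence 0, 2, 0 skips the two 32-bit fields and lands on the first byte of the third field. Because each step is driven by the type descriptor, the field and element being addressed stay known to the analysis.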
- the simplified LLVM bitcode corresponding to the withdraw function in the contract Deposit is set out below. It contains 3 basic blocks, whose labels are entry, transfer, and update_balance.
- the Solidity code that transfers funds and updates the account balance in the Deposit contract (line 14 and line 17, "Contract Deposit written in Solidity") is compiled to the function call in LLVM bitcode at lines 22 and 35 of "LLVM Bitcode of the withdraw function in contract Deposit”.
- FIG. 2 illustrates a detailed design of a static analysis framework referred to as Discover.
- step 204 (corresponding to step 104 of Figure 1) involves Discover compiling and normalizing the input smart contract into an output LLVM bitcode program at 206.
- the static analyzer component at 208 analyzes this LLVM bitcode program to find any potential bugs as described with reference to step 106 and 108. These potential bugs are identified in a vulnerability report at 210.
- the methods and systems are agnostic to the programming language of the smart contract source code.
- Supported programming languages include: Solidity (for Ethereum and EVM-based blockchains), Rust (for Solana and Polkadot blockchains), Move (for Aptos and Sui blockchains), Golang (for Hyperledger Fabric and Cosmos blockchain), C/C++ (for EOSIO blockchain), and Java (for Corda blockchain).
- Embodiments use various compilers (Solang, Rustc, Move, Gollvm, Clang, JLang) to respectively compile smart contracts written in these programming languages and disable all of their optimizations so that the output LLVM bitcode retains as much information as possible related to the original smart contracts.
- For Solang: solang compile -O none --no-constant-folding --no-strength-reduce
- The command line arguments “-O none --no-constant-folding --no-strength-reduce --no-dead-storage --no-vector-to-slice” are used to disable all optimizations, while the options “--target ewasm --emit llvm-bc” are used to generate LLVM bitcode for the Ethereum WebAssembly platform, which is compatible with the Ethereum blockchain.
- step 204 may involve promoting memory references of the bitcode program to become register references. In some embodiments, this is achieved by invoking an existing optimization pass -mem2reg of LLVM to promote memory references of the bitcode program to become register references by removing unnecessary load and store instructions and minimizing the uses of memory references.
- Step 104 may also comprise one or both of normalizing global variable initialisations and using constant expressions to represent instructions in IC. This can make it easier to implement the present analysis framework. Transformation may, for example, include passes to normalize global variables’ initializations and constant expressions representing instructions in LLVM bitcode.
- Step 104/204 may comprise declaring and initializing global variables (e.g. global variables in LLVM bitcode 206) in a section (of source code or module of source code) separated from other function definitions. This may be achieved by wrapping Global variable initializations into a wrapper bitcode function. This will avoid having to write a separate analysis function for handling the initialization of global variables. Then, this wrapper function can be analyzed similarly to other functions to capture pointer information stored in the global variables. Some embodiments of the method therefore comprise wrapping variable initialisations into a wrapper bitcode function and analyzing the wrapper bitcode function to capture pointer information.
- Transforming the smart contract source code, per step 104/204, into IC may involve replacing data structures, such as constant expressions, that are used to represent instructions.
- LLVM makes use of constant expressions.
- constant expressions can represent instructions like BitCast or GetElementPtr. When these constant expressions are used as operands of another instruction, that instruction is considered nested.
- the operand “i8* getelementptr (%struct.vector, %struct.vector* null, i32 0, i32 2, i32 0)” in the following instruction is a constant expression:
- %11 = call i32 @call(i64 9223372036854775807, i8* %address, i8* %value_transfer, i8* getelementptr (%struct.vector, %struct.vector* null, i32 0, i32 2, i32 0), i32 %length)
- embodiments extract the nested instruction, assign it to a new variable, and replace the new variable in the place of the corresponding operand of the outer instruction.
- the above instruction can be transformed into below: getelementptr (%struct.vector, %struct.vector* null, i32 0, i32 2, i32
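The extraction of nested instructions can be sketched as a small Python transformation over a toy instruction model. The `Instr` class, the `flatten` function, and the `%tmpN` naming scheme are hypothetical; real LLVM bitcode would be rewritten through the LLVM API rather than with strings.

```python
import itertools
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Instr:
    opcode: str
    operands: List[Union[str, "Instr"]]   # an Instr operand is a nested expression
    result: str = ""                      # register defined by the instruction, if any


def fresh_names():
    """Yield unused temporary register names (naming scheme is an assumption)."""
    for i in itertools.count():
        yield f"%tmp{i}"


def flatten(instr, fresh):
    """Return an equivalent list of instructions with no nested operands."""
    out = []
    operands = []
    for op in instr.operands:
        if isinstance(op, Instr):
            inner = flatten(op, fresh)        # un-nest deeper levels first
            hoisted = inner[-1]
            if not hoisted.result:
                hoisted.result = next(fresh)  # assign it to a new variable
            out.extend(inner)
            operands.append(hoisted.result)   # use the variable in place of the operand
        else:
            operands.append(op)
    out.append(Instr(instr.opcode, operands, instr.result))
    return out


gep = Instr("getelementptr", ["%struct.vector* null", "i32 0", "i32 2", "i32 0"])
call = Instr("call",
             ["i64 9223372036854775807", "%address", "%value_transfer", gep, "%length"],
             "%11")
flat = flatten(call, fresh_names())
```

After flattening, the getelementptr appears as its own instruction defining a temporary, and the call refers to that temporary, so each instruction can be handled by the analysis without special treatment of nested operands.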
- Figure 3 presents an architecture 300 of the static analyzer Discover (208 of Figure 2). It consists of 2 components: an abstract interpreter 302 and a vulnerability detector 304.
- the abstract interpreter 302 takes the smart contract intermediate code 306 (written in LLVM bitcode) of a smart contract as the input and computes the abstract state representing all or a subset of all execution states of the smart contract - step 106. Each execution state corresponds to, is, or relates to a potential state of the smart contract during execution. Then, the computed abstract states 308 are used by the vulnerability detector 304 to check if any states of the contract abstract state 308 (comprising all computed abstract states) violate any bug conditions (predefined error conditions) from the input smart contract bitcode 306.
- the report comprises one or more evaluation outcomes, each evaluation outcome being one of a confirmation that a predefined bug has not been found or a confirmation that a predefined bug has been found.
- Generating the report per steps 110, 210 and 312 may comprise identifying smart contract source code corresponding to each potential bug (evaluation outcome confirming that a predefined bug has been found). This will facilitate correction of the relevant source code.
- generating the report per steps 110, 210 and 312 may comprise a smart contract source code correction - e.g. a suggested change to the smart contract source code to correct a predefined bug - for one or more evaluation outcomes that indicate a predefined bug has been found.
- the abstract interpreter computes execution states of the smart contract implementation logic based on the smart contract IC, wherein each execution state relates to a potential state of the smart contract during execution, per step 106.
- An example of the analysis algorithm of the abstract interpreter 400 component is presented in Figure 4 and Algorithm 1.
- Computing execution states may therefore involve initializing each abstract state and/or designing each abstract state for each bug type or for a subset of bug types, thereby producing a contract analysis state (i.e.
- the abstract interpreter 400 then computes all candidate functions FuncList that need to be analyzed (line 2) and subsequently extracts a candidate function from FuncList to analyze (lines 4, 5), per 408. This function then forms an analysis input per 410 which is then analysed per 412. The analysis of each function is described with reference to Figure 5 and Algorithm 2. The output analysis state 414 of this function will be updated to the global abstract state S per 416. The list of candidate functions FuncList may also be updated with new functions that are called by the current candidate function (line 7). Moreover, a function summary of F is also created and is updated to the contract state S (line 8). After all functions have been analysed, the contract abstract state is outputted per 418.
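The function-level loop described above can be sketched as a worklist in Python. This is an assumption-laden illustration rather than Algorithm 1 itself: the names `analyze_contract` and `toy_analyze`, the shape of the state dictionary, and the choice to analyze each function once (no re-analysis to a fixpoint) are all hypothetical.

```python
def analyze_contract(functions, entry_points, analyze_function):
    """Worklist over functions.

    functions:        name -> function body
    entry_points:     initial candidate functions (FuncList)
    analyze_function: callback returning (function_state, called_names)
    """
    state = {"summaries": {}}            # contract abstract state S
    func_list = list(entry_points)
    while func_list:
        name = func_list.pop(0)          # extract a candidate function
        if name in state["summaries"]:
            continue                     # already analyzed
        fn_state, callees = analyze_function(name, functions[name], state)
        state["summaries"][name] = fn_state   # update S with the summary of F
        for callee in callees:           # add newly discovered callees
            known = callee in state["summaries"] or callee in func_list
            if callee in functions and not known:
                func_list.append(callee)
    return state


# Toy stand-in: a function body is just the list of functions it calls.
toy_functions = {
    "withdraw": ["transfer", "update_balance"],
    "transfer": [],
    "update_balance": ["transfer"],
    "unused": [],
}


def toy_analyze(name, body, state):
    return {"calls": len(body)}, body


contract_state = analyze_contract(toy_functions, ["withdraw"], toy_analyze)
```

Only functions reachable from the initial candidates are analyzed, so `unused` never receives a summary in this toy run.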
- Figure 5 and Algorithm 2 present a workflow 500 to analyze a function, which can be represented as a sequence or tree of blocks.
- the algorithm first checks from the analysis state S if the function F has been analyzed previously (line 1). If yes, then it obtains the function summary of F and uses it to compute the output state SF of F (line 2). If no function summary is found, then the algorithm will continue to analyze the function normally.
- embodiments use a variable (e.g. BlockList) to capture a list of candidate basic blocks to be analyzed. This list is firstly initialized with the entry block of the function F (line 5, Algorithm 2), per 504. Then, a candidate block B from BlockList is extracted (e.g.
- the input of block B can be computed as follows. If B is the entry block of the function F, then the input of B is the same as the input of F. Otherwise, the input of B is a merge of all abstract states corresponding to all instructions that reach B. These abstract states can be easily obtained by querying the function abstract state SF, per 510. After obtaining the input state of B, each instruction of B is analysed (e.g. using the procedure AnalyzeInstruction), per 512, to produce the block state and called functions (514). The output abstract state is updated for these instructions to the block analysis state SB (line 10), per 516. The output state SB for this block B will be updated to function analysis state SF, at 510.
- the function analysis summary (e.g. function analysis state and called functions) is computed and outputted at 518.
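The block-level workflow described above (start from the entry block, merge the states that reach a block, analyze its instructions, then schedule its successors) can be sketched in Python. This is not Algorithm 2 itself: the function names, the tuple encoding of blocks and instructions, and the interval-style `join_intervals` and `transfer_set` helpers are assumptions, and the sketch has no widening, so it assumes an acyclic CFG or a domain where states stop changing.

```python
def analyze_function(blocks, entry, input_state, transfer, join):
    """blocks: label -> (instructions, successor_labels)."""
    out_states = {}                               # per-block output states (S_B)
    preds = {label: [] for label in blocks}
    for label, (_, succs) in blocks.items():
        for s in succs:
            preds[s].append(label)
    block_list = [entry]                          # BlockList starts at the entry block
    while block_list:
        label = block_list.pop(0)
        if label == entry:
            state = dict(input_state)             # input of F
        else:                                     # merge states reaching this block
            state = join([out_states[p] for p in preds[label] if p in out_states])
        instructions, succs = blocks[label]
        for instr in instructions:
            state = transfer(instr, state)        # per-instruction analysis
        if out_states.get(label) != state:        # changed: (re)visit successors
            out_states[label] = state
            for s in succs:
                if s not in block_list:
                    block_list.append(s)
    return out_states


def join_intervals(states):
    """Merge by taking, per variable, the smallest interval covering all inputs."""
    merged = {}
    for st in states:
        for var, (lo, hi) in st.items():
            if var in merged:
                merged[var] = (min(merged[var][0], lo), max(merged[var][1], hi))
            else:
                merged[var] = (lo, hi)
    return merged


def transfer_set(instr, state):
    """Toy transfer function: ("set", var, constant) assigns a constant."""
    op, var, value = instr
    new_state = dict(state)
    if op == "set":
        new_state[var] = (value, value)
    return new_state


diamond = {
    "entry": ([("set", "x", 0)], ["then", "else"]),
    "then": ([("set", "x", 5)], ["exit"]),
    "else": ([("set", "x", 9)], ["exit"]),
    "exit": ([], []),
}
block_states = analyze_function(diamond, "entry", {}, transfer_set, join_intervals)
```

At the `exit` block the two incoming states are merged, so `x` is known only to lie between 5 and 9. That loss of precision is what lets one abstract state stand for both execution paths.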
- Algorithm 2: Algorithm to Analyze Function
- The AnalyzeInstruction function, which is called at line 10 of the function AnalyzeFunction, is implemented specifically for each analysis and each bug type, as described below:
- embodiments may use the pointer graph domain, in which case only the instructions related to handling pointers and control flow of the contracts need to be analyzed. They include the following instructions: Load, Store, BitCast, Call, GetElementPtr, PHI, Br, IndirectBr, Return.
- embodiments may use the integer interval domain, in which case only the instructions that are relevant to arithmetic computation, variable manipulation, and control flow need to be analyzed. They are the following instructions: Add, Sub, Mul, SDiv, UDiv, Cast, Load, Store, Call, Invoke, PHI, Br, IndirectBr, Return.
- Step 108 involves producing one or more evaluation outcomes by evaluating the execution states for violation of a plurality of predefined error conditions.
- Figure 6 presents a design 600 of the vulnerability (bug/potential bug) detector component 304. It is designed in a modular and extensible way to support checking different bug types, such as arithmetic, reentrancy, input validation, access control bugs. To detect each bug type, it receives input IC 602 and marks the suspicious instructions related to the bug in the input IC of the target smart contract, per 604. Afterward, it will compute the bug condition for each suspicious instruction per 606.
- a bug condition is a logical constraint indicating when the bug can occur at a corresponding instruction.
- the vulnerability detector will utilize the abstract contract state 608 computed by the abstract interpreter to check these bug conditions at the validator 610 to decide whether the suspicious instructions are potential vulnerabilities.
- the bug conditions 606 and abstract contract state 608 will be evaluated by the validator 610, to produce an evaluation outcome, based on the type of bug being detected.
- the bug condition for an Integer Overflow bug is that the sum of the values of two variables %y and %z exceeds the capacity of a 32-bit integer stored in %x (its maximum value is 2^32 - 1).
- the vulnerability detector will rely on the abstract state computed by the abstract interpreter using the integer interval domain. This abstract state contains value ranges for all variables; each range is an interval representing the lower-bound and upper-bound values that the variable can take. If the upper bound value of the addition of %y and %z exceeds 2^32 - 1, then there is a potential Integer Overflow bug.
- %11 = call i32 @call(i64 9223372036854775807, i8* %address, i8* %value_transfer, i8* getelementptr (%struct.vector, %struct.vector* null, i32 0, i32 2, i32 0), i32 %length)
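The interval check described above can be written out as a short Python sketch. The function names and the dictionary representation of the abstract state are assumptions; only the rule itself (compare the upper bound of the sum against the 32-bit maximum) comes from the text.

```python
UINT32_MAX = 2**32 - 1


def add_intervals(a, b):
    """Interval addition: lower bounds add, upper bounds add."""
    return (a[0] + b[0], a[1] + b[1])


def may_overflow_add(state, lhs, rhs, max_value=UINT32_MAX):
    """True if the upper bound of lhs + rhs can exceed max_value."""
    _, upper = add_intervals(state[lhs], state[rhs])
    return upper > max_value


# Hypothetical abstract states: variable -> (lower bound, upper bound).
risky = {"%y": (0, 2**31), "%z": (0, 2**31)}
safe = {"%y": (0, 100), "%z": (0, 200)}

risky_flag = may_overflow_add(risky, "%y", "%z")
safe_flag = may_overflow_add(safe, "%y", "%z")
```

In the first state the upper bound of the sum is 2^32, one more than the maximum, so the addition is flagged. In the second the upper bound is 300 and the addition is not flagged. A flag means overflow is possible under the computed ranges, not that it necessarily happens.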
- the condition for the reentrancy bug to occur in this instruction is that there is no check on the account balance before calling the above instruction to transfer the fund. This condition can be checked by the vulnerability detector using the control flow state computed by the abstract interpreter.
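One way such a control flow condition could be checked is to ask whether the block containing the transfer call is reachable from the function entry without passing through any block that checks the balance. The Python sketch below illustrates that idea only. The function name `reachable_without`, the block labels, and the notion of a set of "guard" blocks are assumptions rather than the disclosed detector.

```python
def reachable_without(cfg, entry, target, guards):
    """True if `target` is reachable from `entry` along a path that
    avoids every block in `guards`.

    cfg: block label -> list of successor labels
    """
    stack, seen = [entry], set()
    while stack:
        label = stack.pop()
        if label in seen or label in guards:
            continue                      # do not walk through guard blocks
        if label == target:
            return True                   # found an unguarded path to the call
        seen.add(label)
        stack.extend(cfg.get(label, []))
    return False


# Hypothetical CFG where one path skips the balance check.
unguarded = {
    "entry": ["check", "transfer"],
    "check": ["transfer"],
    "transfer": ["update_balance"],
    "update_balance": [],
}
# Hypothetical CFG where every path passes the balance check.
guarded = {
    "entry": ["check"],
    "check": ["transfer"],
    "transfer": [],
}

unguarded_flag = reachable_without(unguarded, "entry", "transfer", {"check"})
guarded_flag = reachable_without(guarded, "entry", "transfer", {"check"})
```

The first graph is flagged because `transfer` can be reached directly from `entry`; the second is not, since every path to `transfer` goes through `check`.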
- any bugs or vulnerabilities detected by the vulnerability detection component are catalogued in an evaluation outcome that summarises the vulnerabilities detected in a smart contract vulnerability report, per 612.
- the evaluation outcome may also include information relating to the abstract states and/or blocks associated with the detected vulnerabilities.
- the methods and systems for smart contract code analysis are part of a smart contract deployment platform and serve to perform analysis of smart contract program code before a potential deployment.
- Where a predefined error condition is determined to be violated by an execution state, the deployment of the smart contract is halted - this is reflected at step 112 of Figure 1.
- the embodiments analyse smart contracts before they are deployed and mitigate undesirable consequences of deployment of smart contracts with bugs or errors.
- FIG. 7 is a block diagram illustrating a system 700 for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms, implementing the methods or flows of Figures 1 to 6.
- the system includes one or more processors (processor(s)) 702 and memory 704.
- the memory 704 comprises instructions in the form of program code 706 that is executable by the processor(s) 702 to implement the methods and flows described above. This can include receiving, via network interface 708, smart contract source code embodying smart contract implementation logic.
- the transformation of the smart contract source code to smart contract intermediate code (LLVM bitcode) is performed in compiler 710.
- the execution states are then computed by the abstract interpreter 712.
- the vulnerability detector 714 evaluates the execution states and the report module 716 generates the smart contract vulnerability report.
- Although system 700 has been represented as distinct components, these components may be physically separate devices, implemented in program code, distributed across multiple systems or housed in a single system, without departing from the substantive teachings herein.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Stored Programmes (AREA)
Abstract
Methods and systems for analysis of smart contract source code comprising: receiving smart contract source code; transforming the smart contract source code into smart contract intermediate code (smart contract IC); computing execution states of the smart contract implementation logic based on the smart contract IC, wherein each execution state relates to a potential state of the smart contract during execution; evaluating the execution states for violation of a plurality of predefined error conditions; and generating a smart contract vulnerability report comprising an evaluation outcome.
Description
Vulnerability Detection for Smart Contracts in Blockchain Platforms
Technical Field
[0001] This disclosure generally relates to methods and systems for detecting vulnerabilities in smart contracts of blockchain platforms.
Background
[0002] This background description is provided for the purpose of generally presenting the context of the disclosure. Contents of this background section are neither expressly nor impliedly admitted as prior art against the present disclosure.
[0003] Smart contracts are computer programs that run on blockchain networks to automatically execute agreements between different parties. The smart contract source code controls the executions, and the transactions are trackable and irreversible. In blockchains like Ethereum, Polygon, Solana, Polkadot, Cosmos, Aptos, Sui, and Corda, these source codes are stored on the decentralized and distributed blockchain networks. They are also involved in storing and exchanging valuable digital assets and cryptocurrencies, hence they have been a popular target for hackers. Recent reports by Chainalysis and Elliptic have revealed that, in the year 2022, the blockchain and decentralized finance (DeFi) industries lost around US$ 2B due to hacks and attacks, many of which are related to smart contract vulnerabilities introduced by the mistakes of developers.
[0004] Existing works that detect vulnerabilities in smart contracts are classified into two groups: using static analysis or using dynamic testing. Dynamic testing, such as the fuzzing-based techniques, is mostly conducted directly on EVM bytecode. This technique is implemented by tools such as Echidna, ConFuzzius, sFuzz, Harvey. These tools directly run the smart contract bytecode on different generated inputs and observe if any bugs occur. Since the input space of smart contracts is often unlimited, dynamic testing can miss bugs that should be detected.
[0005] Static-analysis-based techniques include works like Oyente, Slither, Mythril, and Securify. These techniques are currently performed directly on EVM bytecode or on custom intermediate code constructed by each technique for the Solidity smart contracts. The tools that use EVM bytecode, like Oyente and Mythril, can produce less precise results since EVM bytecode does not maintain the type information of Solidity source code. Furthermore, it is difficult to extend these works to support other smart contract languages like Rust, Move, Golang, C++, and Java for Solana, Cosmos, Polkadot, Aptos, Sui, Hyperledger Fabric, EOSIO, and Corda, since the existing compilers do not support compiling Rust and Golang smart contracts to EVM bytecode or these tools’ custom intermediate code.
[0006] It is desirable, therefore, that a framework be provided that can support multiple smart contract languages while addressing the potential infinite input space.
Summary
[0007] Known dynamic testing tools do not compute abstract states representing all possible execution states of smart contracts - an issue addressed by abstract interpretation frameworks described herein.
[0008] In one embodiment, the present disclosure provides a system for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms, the system comprising: one or more processors (processor(s)); a memory comprising instructions that when executed by the processor(s) cause the processor(s) to: receive smart contract source code embodying smart contract implementation logic; transform the smart contract source code into smart contract intermediate code (smart contract IC); compute execution states of the smart contract implementation logic based on the smart contract IC wherein each execution state relates to a potential state of the smart contract during execution; produce one or more evaluation outcomes by evaluating the execution states for violation of a plurality of predefined error conditions; and generate a smart contract vulnerability report comprising the one or more evaluation outcomes.
[0009] Also disclosed is a method for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms, the method comprising: receiving smart contract source code embodying smart contract implementation logic; transforming the smart contract source code into smart contract intermediate code (smart contract IC); computing execution states of the smart contract implementation logic based on the smart contract IC, wherein each execution state relates to a potential state of the smart contract during execution; producing one or more evaluation outcomes by evaluating the execution states for violation of a plurality of predefined error conditions; and generating a smart contract vulnerability report comprising the one or more evaluation outcomes.
Brief Description of the Drawings
[0010] Some embodiments of systems and methods for analysis of smart contract source code, in accordance with present disclosure, will now be described, by way of non-limiting example only, with reference to the accompanying drawings in which:
[0011] Figure 1 is a method for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms, in accordance with present teachings;
[0012] Figure 2 illustrates a block diagram of a design of the vulnerability detection framework;
[0013] Figure 3 illustrates a block diagram of a Static Analyzer;
[0014] Figure 4 illustrates a block diagram of a design of an Abstract Interpreter;
[0015] Figure 5 illustrates a block diagram of a Function Analyzer that is a part of the Abstract Interpreter of Figure 3;
[0016] Figure 6 illustrates a block diagram of a Vulnerability Detector; and
[0017] Figure 7 illustrates a system for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms.
Detailed Description
[0018] Embodiments relate to systems and methods for analysis of smart contract source code. The embodiments incorporate a bug detection framework for blockchain smart contracts using abstract interpretation. The framework may be built on top of LLVM, an industrial-strength compilation framework, which supports smart contracts written in different languages, such as Solidity, Rust, Move, Golang, C++, and Java. Embodiments incorporate a core analysis algorithm using abstract interpretation, instantiated to perform different analyses, such as integer interval analysis, pointer analysis, and control flow analysis, so that the framework can detect different types of bugs in smart contracts. The framework has been tested with Ethereum smart contracts written in Solidity. The results include bug detection in some existing benchmarks with an accuracy of 90%.
[0019] The locations of bugs in smart contracts are best illustrated using examples - see below.
[0020] The example below describes a buggy smart contract named Deposit of the Ethereum blockchain. This smart contract is written in the Solidity programming language. It has 2 public functions named deposit (line 6) and withdraw (line 10). These functions allow users to deposit funds into or withdraw funds from the smart contract account. There is a vulnerability in the withdraw function which allows an attacker to repeatedly withdraw the funds from this contract account. In particular, this contract allows the attacker to first send an amount of funds (line 14) to his account and then update his account balance later (line 17).
Contract Deposit written in Solidity
[0021] However, when the attacker sends the funds to his account at line 14, the receive function of his Attack contract (line 11 of the "Contract Attack written in Solidity") will be executed automatically. This function will call the withdraw function again (line 13, of the "Contract Attack written in Solidity"), hence the balance update at the Deposit contract (line 17, of the "Contract Deposit written in Solidity") will never be executed. This vulnerability allows the attackers to repeatedly re-enter the withdraw function. Hence, it is called the reentrancy bug.
Contract Attack written in Solidity
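By way of non-limiting illustration only, the call sequence described in paragraphs [0020] and [0021] can be modelled in a few lines of Python. This is a toy model and not the Solidity listings referred to above: the class names `Deposit`, `Attacker` and `HonestUser`, the `max_reentries` bound, and the amounts are all assumptions chosen for the sketch.

```python
class Deposit:
    """Toy model of a vault that pays out before updating its ledger."""

    def __init__(self):
        self.balances = {}  # per-user ledger
        self.funds = 0      # total funds held by the contract

    def deposit(self, user, amount):
        self.balances[user] = self.balances.get(user, 0) + amount
        self.funds += amount

    def withdraw(self, user):
        amount = self.balances.get(user, 0)
        if amount == 0 or self.funds < amount:
            return
        self.funds -= amount
        user.receive(self, amount)  # external call happens first ...
        self.balances[user] = 0     # ... the ledger is updated only afterwards


class Attacker:
    """Re-enters withdraw from its receive hook (bounded for the demo)."""

    def __init__(self, max_reentries):
        self.stolen = 0
        self.remaining = max_reentries  # hypothetical bound, keeps the demo finite

    def receive(self, vault, amount):
        self.stolen += amount
        if self.remaining > 0:
            self.remaining -= 1
            vault.withdraw(self)  # re-enter before the ledger update runs


class HonestUser:
    def __init__(self):
        self.received = 0

    def receive(self, vault, amount):
        self.received += amount


def run_attack():
    vault = Deposit()
    victim, attacker = HonestUser(), Attacker(max_reentries=2)
    vault.deposit(victim, 20)
    vault.deposit(attacker, 10)
    vault.withdraw(attacker)
    return attacker.stolen, vault.funds
```

In this model the attacker deposits 10 units but, because each re-entry sees the stale ledger entry, drains the victim's 20 units as well. Moving the ledger update before the external call removes the effect in the model.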
[0022] In view of bugs such as that described above, the present disclosure sets out a framework for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms. The framework implements method 100 of Figure 1, comprising (step 102) receiving smart contract source code, (step 104) transforming that smart contract source code into an intermediate code, (step 106) computing execution states on the intermediate code, (step 108) evaluating the execution states for violation of predefined errors, and (step 110) generating or outputting a smart contract vulnerability report.
[0023] Smart contract implementation logic is embodied by smart contract source code. That source code can be received, per step 102, by any known means - e.g. uploading a smart contract source code file to a server running method 100, or by directly entering the source code into a platform running method 100. With reference to Figure 2, in which step 202 corresponds to step 102, smart contracts can be written and received in various languages, including Solidity, Rust, Move, Golang, C++, and Java.
[0024] The smart contract source code is then transformed into smart contract IC per step 104. An intermediate code comprises a data structure or code used by a compiler or virtual machine to represent smart contract source code. The smart contract IC is more suitable for analysis for bug or vulnerability detection in a structured manner. The smart contract IC is also agnostic to the programming language the smart contract source code is written in. Accordingly, the smart contract IC provides an efficient starting point for subsequent analysis.
[0025] Embodiments of the present method 100 include methods to find vulnerabilities in smart contracts using abstract interpretation, a static analysis technique. Some embodiments leverage the LLVM compilation framework to analyze smart contract execution states, although other language independent ICs may be used such as that produced in the GNU Compiler Collection. For the purpose of illustration, LLVM compiler systems and LLVM bitcode will be used in the discussion below. However, the skilled person will understand the present teachings to be useable or extendable to other compiler systems and ICs as needed. To that end, step 104 involves compiling smart contract source code to LLVM bitcode using the corresponding compiler. The LLVM bitcode is an example of a smart contract intermediate representation (IR), interchangeably used herein with the term "intermediate code" or "(IC)" unless context dictates otherwise.
[0026] Some embodiments use Solang to compile Solidity smart contracts of the Ethereum and EVM-compatible blockchains, Rustc for Rust-based smart contracts of the Solana and Polkadot blockchains, Move for the Move-based smart contracts of the Aptos and Sui blockchains, the Gollvm compiler for Golang smart contracts of the Hyperledger Fabric and Cosmos blockchains, the Clang compiler for C++ smart contracts of the EOSIO blockchain, and JLang for Java smart contracts of the Corda blockchain.
[0027] An LLVM (or other compiler system) bitcode module consists of functions, global variables, and data types. Functions are written in an assembly-like language with an infinite number of registers. A function contains a list of basic blocks, forming the control flow graph (CFG) of the function. Each basic block starts with a label, contains a list of instructions, and ends with a terminator instruction. A terminator instruction is an instruction such as a branch that jumps to other basic blocks (e.g. a function call or a decision point reflecting that there is more than one possible next operation depending on the execution state) or a function return. Each block is therefore non-branching - i.e. the instructions in a block are executed in a linear fashion such that, within a block, no other block is jumped to. This allows each block to be self-contained in the execution pipeline - i.e. once it is initiated, it does not require calls or branches to functions outside the block, until after termination. Similarly, in some instances blocks may be defined as commencing with a branch to the block and ending with a branch from the block. LLVM uses an infinite set of typed virtual registers to represent local variables that hold values of primitive types (such as integer, floating point, and pointer). These variables are always maintained in Static Single Assignment (SSA) form. LLVM also includes an explicit PHI instruction to handle variables whose values come from two or more basic blocks. This instruction corresponds directly to the PHI function of the SSA form.
[0028] In LLVM, programs transfer values between registers and memory via two instructions, load and store. Unlike registers, memory locations are not in SSA form because many possible locations may be modified at a single store through a pointer. To compute the memory address of sub-elements of a value of aggregate type (data structure, array), LLVM uses an instruction called GetElementPtr, which can preserve type information while performing pointer arithmetic. In essence, GetElementPtr is a combined operator for both field-access of data structures and element-access of arrays (one-dimensional or multidimensional) in high-level programming languages, in which its 0-based indexing sequence indicates how the memory address of the data structure’s field or array’s element is computed from the root.
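By way of non-limiting illustration only, the following Python sketch shows how a GetElementPtr-style index sequence can be folded into a byte offset while tracking the type reached. It assumes a simplified packed layout with no padding or alignment (real LLVM consults the target data layout), and the types `Int`, `Array`, `Struct` and the function `gep_offset` are hypothetical names introduced for the sketch.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Int:
    bits: int


@dataclass(frozen=True)
class Array:
    elem: "Type"
    count: int


@dataclass(frozen=True)
class Struct:
    fields: Tuple["Type", ...]


Type = Union[Int, Array, Struct]


def size_of(t):
    """Size in bytes under the assumed packed layout (no padding)."""
    if isinstance(t, Int):
        return t.bits // 8
    if isinstance(t, Array):
        return t.count * size_of(t.elem)
    return sum(size_of(f) for f in t.fields)


def gep_offset(base, indices):
    """Return (byte offset, resulting type) for a 0-based index sequence."""
    first, *rest = indices
    offset = first * size_of(base)  # first index steps over whole `base` objects
    current = base
    for idx in rest:
        if isinstance(current, Struct):
            offset += sum(size_of(f) for f in current.fields[:idx])
            current = current.fields[idx]  # descend into the selected field
        elif isinstance(current, Array):
            offset += idx * size_of(current.elem)
            current = current.elem         # descend into the element type
        else:
            raise TypeError("cannot index into a scalar type")
    return offset, current


# Assumed layout for illustration: two i32 fields followed by a byte array.
vector = Struct((Int(32), Int(32), Array(Int(8), 0)))
```

Under these assumptions, `gep_offset(vector, [0, 2, 0])` selects the first byte of the third field, 8 bytes from the root, and the result type remains known, which is the type-preserving property noted above.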
[0029] The simplified LLVM bitcode corresponding to the withdraw function in the contract Deposit is set out below. It contains 3 basic blocks, whose labels are entry, transfer, and update_balance. The Solidity code that transfers funds and updates the account balance in the Deposit contract (line 14 and line 17, "Contract Deposit written in Solidity") is compiled to the function call in LLVM bitcode at lines 22 and 35 of "LLVM Bitcode of the withdraw function in contract Deposit".
LLVM Bitcode of the withdraw function in contract Deposit
[0030] Figure 2 illustrates a detailed design of a static analysis framework referred to as Discover. Given an input smart contract, step 204 (corresponding to step 104 of Figure 1) involves Discover compiling and normalizing the input smart contract into an output LLVM bitcode program at 206. Then, the static analyzer component at 208 analyzes this LLVM bitcode program to find any potential bugs as described with reference to step 106 and 108. These potential bugs are identified in a vulnerability report at 210.
[0031] The methods and systems are agnostic to the programming language of the smart contract source code. Supported programming languages include: Solidity (for Ethereum and EVM-based blockchains), Rust (for Solana and Polkadot blockchains), Move (for Aptos and Sui blockchains), Golang (for Hyperledger Fabric and Cosmos blockchains), C/C++ (for EOSIO blockchain), and Java (for Corda blockchain). Embodiments use various compilers (Solang, Rustc, Move, Gollvm, Clang, and JLang) to respectively compile smart contracts written in these programming languages, and disable all of their optimizations so that the output LLVM bitcode retains as much information as possible about the original smart contracts. For example, some embodiments use the following compilation command for Solang:
solang compile -O none --no-constant-folding --no-strength-reduce --no-dead-storage --no-vector-to-slice --target ewasm --emit llvm-bc
[0032] Here, the command line arguments “-O none --no-constant-folding --no-strength-reduce --no-dead-storage --no-vector-to-slice” are used to disable all optimizations, while the options “--target ewasm --emit llvm-bc” are used to generate LLVM bitcode for the Ethereum WebAssembly platform, which is compatible with the Ethereum blockchain.
[0033] After the compilation process, step 204 (step 104) may involve promoting memory references of the bitcode program to become register references. In some embodiments, this is achieved by invoking an existing optimization pass -mem2reg of LLVM to promote memory references of the bitcode program to become register references by removing unnecessary load and store instructions and minimizing the uses of memory references.
[0034] Step 104 may also comprise one or both of normalizing global variable initialisations and using constant expressions to represent instructions in IC. This can make it easier to implement the present analysis framework. Transformation may, for example, include passes to normalize global variables’ initializations and constant expressions representing instructions in LLVM bitcode.
[0035] Step 104/204 may comprise declaring and initializing global variables (e.g. global variables in LLVM bitcode 206) in a section (of source code or module of source code) separated from other function definitions. This may be achieved by wrapping global variable initializations into a wrapper bitcode function. This avoids having to write a separate analysis function for handling the initialization of global variables. The wrapper function can then be analyzed in the same way as other functions to capture pointer information stored in the global variables. Some embodiments of the method therefore comprise wrapping variable initialisations into a wrapper bitcode function and analyzing the wrapper bitcode function to capture pointer information.
[0036] Transforming the smart contract source code, per step 104/204, into IC may comprise replacing data structures, such as constant expressions, that are used to represent instructions. LLVM makes use of constant expressions. In LLVM, constant expressions can represent instructions like BitCast or GetElementPtr. When these constant expressions are used as operands of another instruction, that instruction is considered nested. For example, in the following function call instruction at line 22 of Example "LLVM Bitcode of the withdraw function in contract Deposit", the operand “i8* getelementptr (%struct.vector, %struct.vector* null, i32 0, i32 2, i32 0)” is a constant expression:
%11 = call i32 @call(i64 9223372036854775807, i8* %address, i8* %value_transfer, i8* getelementptr (%struct.vector, %struct.vector* null, i32 0, i32 2, i32 0), i32 %length)
[0037] To handle this situation, embodiments extract the nested instruction, assign it to a new variable, and use the new variable in place of the corresponding operand of the outer instruction. For example, the above instruction can be transformed into the following:
%operand = getelementptr (%struct.vector, %struct.vector* null, i32 0, i32 2, i32 0)
%11 = call i32 @call(i64 9223372036854775807, i8* %address, i8* %value_transfer, i8* %operand, i32 %length)
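By way of non-limiting illustration only, the extraction of nested constant expressions described in paragraph [0037] can be sketched in Python as follows. The `Instr` record, the `hoist_nested` function and the `%operandN` naming scheme are assumptions made for the sketch; they do not correspond to any LLVM API, and operands are modelled as plain strings.

```python
from dataclasses import dataclass
from itertools import count
from typing import List, Union


@dataclass
class Instr:
    result: str   # e.g. "%11"; empty for a nested constant expression
    opcode: str   # e.g. "call" or "getelementptr"
    operands: List[Union[str, "Instr"]]  # a nested Instr models a constant expression


def hoist_nested(instr, fresh):
    """Return flat instructions equivalent to `instr`, inner expressions first."""
    out = []
    flat_ops = []
    for op in instr.operands:
        if isinstance(op, Instr):
            # Give the nested expression a fresh (hypothetical) variable name.
            named = Instr(f"%operand{next(fresh)}", op.opcode, op.operands)
            out.extend(hoist_nested(named, fresh))  # flatten it recursively
            flat_ops.append(named.result)           # outer instruction uses the name
        else:
            flat_ops.append(op)
    out.append(Instr(instr.result, instr.opcode, flat_ops))
    return out


nested_call = Instr(
    "%11",
    "call",
    [
        "@call",
        "%address",
        "%value_transfer",
        Instr("", "getelementptr", ["%struct.vector* null", "0", "2", "0"]),
        "%length",
    ],
)
flat = hoist_nested(nested_call, count())
```

After the pass, every operand in `flat` is a plain name, so a per-instruction analysis routine never has to recurse into operands.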
[0038] Figure 3 presents an architecture 300 of the static analyzer Discover (208 of Figure 2). It consists of 2 components: an abstract interpreter 302 and a vulnerability detector 304. The abstract interpreter 302 takes the smart contract intermediate code 306 (written in LLVM bitcode) of a smart contract as the input and computes the abstract state representing all or a subset of all execution states of the smart contract - step 106. Each execution state corresponds to, is, or relates to a potential state of the smart contract during execution. Then, the computed abstract states 308 are used by the vulnerability detector 304 to check if any states of the contract abstract state 308 (comprising all computed abstract states) violate any bug conditions (predefined error conditions) from the input smart contract bitcode 306. Finally, it will output a report at 312 (similarly steps 110 and 210 of Figures 1 and 2, respectively) containing all potential bugs that are detected. The report comprises one or more evaluation outcomes, each evaluation outcome being one of a confirmation that a predefined bug has not been found or a confirmation that a predefined bug has been found.
[0039] Generating the report per steps 110, 210 and 312 may comprise identifying smart contract source code corresponding to each potential bug (evaluation outcome confirming that a predefined bug has been found). This will facilitate correction of the relevant source code. In addition, generating the report per steps 110, 210 and 312 may comprise a smart contract source code correction - e.g. a suggested change to the smart contract source code to correct a predefined bug - for one or more evaluation outcomes that indicate a predefined bug has been found.
Abstract Interpreter Component
[0040] The abstract interpreter computes execution states of the smart contract implementation logic based on the smart contract IC, wherein each execution state relates to a potential state of the smart contract during execution, per step 106. An example of the analysis algorithm of the abstract interpreter 400 component is presented in Figure 4 and Algorithm 1. Given an input bitcode program (e.g. smart contract IC) 402, e.g. a LLVM bitcode program P, abstract interpreter 400 first computes an initial abstract state S for the program (line 1 and 404). This abstract state needs to be designed specifically for each analysis and bug type per 406. Computing execution states may therefore involve initializing each abstract state and/or designing each abstract state for each bug type or for a subset of bug types, thereby producing a contract analysis state (i.e. a state that can be functionally analysed). In framework 200, embodiments use the integer interval domain for detecting arithmetic bugs, the control flow domain for detecting reentrancy bugs, and the taintedness domain for detecting missing access control, lack of input validation, and zero address validation bugs.
[0041] The abstract interpreter 400 then computes all candidate functions FuncList that need to be analyzed (line 2) and subsequently extracts a candidate function from FuncList to analyze (lines 4, 5), per 408. This function then forms an analysis input per 410 which is then analysed per 412. The analysis of each function is described with reference to Figure 5 and Algorithm 2. The output analysis state 414 of this function will be updated to the global abstract state S per 416. The list of candidate functions FuncList may also be updated with new functions that are called by the current candidate function (line 7). Moreover, a function summary of F is also created and is updated to the contract state S (line 8). After all functions have been analysed, the contract abstract state is outputted per 418.
Algorithm 1: Analysis Algorithm
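By way of non-limiting illustration only, the function-level worklist described in paragraphs [0040] and [0041] can be sketched in Python as follows. This is a simplified sketch and not Algorithm 1 itself: it visits each function once, the contract state is modelled as a dictionary of summaries, and the names `analyze_program` and `analyze_function` (a caller-supplied callback returning a summary and the functions called) are assumptions.

```python
from collections import deque


def analyze_program(entry_points, analyze_function):
    """Analyze candidate functions until the worklist is empty."""
    state = {"summaries": {}}        # stands in for the abstract state S
    func_list = deque(entry_points)  # stands in for FuncList
    while func_list:
        func = func_list.popleft()   # extract a candidate function
        if func in state["summaries"]:
            continue                 # already summarized, nothing to redo
        summary, callees = analyze_function(func, state)
        state["summaries"][func] = summary  # record the function summary in S
        # Newly discovered callees become candidates as well.
        func_list.extend(c for c in callees if c not in state["summaries"])
    return state


# Hypothetical call graph used only to exercise the sketch.
call_graph = {"deposit": [], "withdraw": ["call"], "call": []}
result = analyze_program(
    ["deposit", "withdraw"],
    lambda f, s: (f"summary({f})", call_graph[f]),
)
```

A fuller implementation would re-analyze a function when the summary of one of its callees changes; that refinement is omitted here for brevity.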
[0042] Figure 5 and Algorithm 2 present a workflow 500 to analyze a function, which can be represented as a sequence or tree of blocks. On being given an input function 502, the algorithm first checks from the analysis state S whether the function F has been analyzed previously (line 1). If so, it obtains the function summary of F and uses it to compute the output state SF of F (line 2). If no function summary is found, the algorithm analyzes the function normally. In particular, embodiments use a variable (e.g. BlockList) to capture a list of candidate basic blocks to be analyzed. This list is first initialized with the entry block of the function F (line 5, Algorithm 2), per 504. Then, a candidate block B is extracted from BlockList (e.g. using AnalyzeFunction), per 506, and the initial analysis state SB corresponding to this block’s input is computed (line 8) to prepare for the analysis, per 508. The input of block B can be computed as follows. If B is the entry block of the function F, then the input of B is the same as the input of F. Otherwise, the input of B is a merge of all abstract states corresponding to all instructions that reach B. These abstract states can be easily obtained by querying the function abstract state SF, per 510. After obtaining the input state of B, each instruction of B is analysed (e.g. using the procedure AnalyzeInstruction), per 512, to produce the block state and called functions (514). The output abstract state for these instructions is updated to the block analysis state SB (line 10), per 516. The output state SB for this block B is then updated to the function analysis state SF, at 510.
[0043] The function analysis summary (e.g. function analysis state and called functions) is computed and outputted at 518.
Algorithm 2: Algorithm To Analyze Function
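By way of non-limiting illustration only, the block-level worklist of paragraph [0042] can be sketched in Python as follows. This is a sketch under stated assumptions and not Algorithm 2 itself: the CFG is a dictionary from block label to successor labels, `transfer` and `join` are caller-supplied functions for the chosen abstract domain, and termination relies on `transfer` being monotone over a domain of finite height. The label `done` and the set-of-visited-blocks domain are hypothetical.

```python
def analyze_blocks(cfg, entry, entry_state, transfer, join):
    """Forward worklist analysis over basic blocks; returns per-block output states."""
    preds = {b: [] for b in cfg}
    for block, succs in cfg.items():
        for s in succs:
            preds[s].append(block)

    out_states = {}       # stands in for the function analysis state SF
    block_list = [entry]  # stands in for BlockList, seeded with the entry block
    while block_list:
        block = block_list.pop(0)
        if block == entry:
            in_state = entry_state  # input of the entry block is the function input
        else:
            # Merge the states of all predecessors analyzed so far.
            in_state = join([out_states[p] for p in preds[block] if p in out_states])
        new_out = transfer(block, in_state)
        if out_states.get(block) != new_out:
            out_states[block] = new_out
            # Successors must be (re)analyzed because their input changed.
            block_list.extend(s for s in cfg[block] if s not in block_list)
    return out_states


# Hypothetical CFG: `done` is reachable directly and via the transfer path.
cfg = {
    "entry": ["transfer", "done"],
    "transfer": ["update_balance"],
    "update_balance": ["done"],
    "done": [],
}
states = analyze_blocks(
    cfg,
    "entry",
    frozenset(),
    transfer=lambda block, s: s | {block},           # record blocks possibly executed
    join=lambda known: frozenset().union(*known),    # merge = set union
)
```

In this run the block `done` is analyzed twice: once with only the state from `entry`, and again after `update_balance` has produced its output, at which point the merged input covers both paths.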
[0044] The AnalyzeInstruction function, which is called at line 10 of the function AnalyzeFunction, is implemented specifically for each analysis and each bug type. The implementations are described as follows:
[0045] For pointer analysis, embodiments may use the pointer graph domain, and only the instructions related to handling pointers and control flow of the contracts need to be analyzed. They include the following instructions: Load, Store, BitCast, Call, GetElementPtr, PHI, Br, IndirectBr, Return.
[0046] For detecting integer bugs, embodiments may use the integer interval domain, and only the instructions that are relevant to arithmetic computation, variable manipulation, and control flow manipulation need to be analyzed. They are the following instructions: Add, Sub, Mul, SDiv, UDiv, Cast, Load, Store, Call, Invoke, PHI, Br, IndirectBr, Return.
[0047] For the missing access control, lack of input validation, and zero address validation bugs, most of the instructions that handle variables, memory, and control flow of the contracts need to be analyzed. They include instructions like Load, Store, BitCast, Call, Invoke, GetElementPtr, PHI, Br, IndirectBr, Return, Add, Sub, Mul, SDiv, UDiv, Cast.
[0048] Depending on the chosen abstract domain and the targeted bug type, the AnalyzeInstruction function updates the analysis state accordingly.
Vulnerability Detection Component
[0049] Step 108 involves producing one or more evaluation outcomes by evaluating the execution states for violation of a plurality of predefined error conditions. Figure 6 presents a design 600 of the vulnerability (bug/potential bug) detector component 304. It is designed in a modular and extensible way to support checking different bug types, such as arithmetic, reentrancy, input validation, access control bugs. To detect each bug type, it receives input IC 602 and marks the suspicious instructions related to the bug in the input IC of the target smart contract, per 604. Afterward, it will compute the bug condition for each suspicious instruction per 606. A bug condition is a logical constraint indicating when the bug can occur at a corresponding instruction. The vulnerability detector will utilize the abstract contract state 608 computed by the abstract interpreter to check these bug conditions at the validator 610 to decide whether the suspicious instructions are potential vulnerabilities. The bug conditions 606 and abstract contract state 608 will be evaluated by the validator 610, to produce an evaluation outcome, based on the type of bug being detected.
[0050] For example, to detect arithmetic bugs, all instructions related to arithmetic computation, such as Add, Sub, Mul, SDiv, and UDiv, will be marked as suspicious instructions. Consider the following instruction, which performs addition over two 32-bit integers:
%x = add i32 %y, %z
[0051] The bug condition for an Integer Overflow bug is that the sum of the values of the two variables %y and %z exceeds the capacity of the 32-bit integer stored in %x (its maximum value is 2^32 - 1). To detect this Integer Overflow bug, the vulnerability detector will rely on the abstract state computed by the abstract interpreter using the integer interval domain. This abstract state contains value ranges for all variables; each range is an interval representing the lower-bound and upper-bound values that the variable can take. If the upper-bound value of the addition of %y and %z exceeds 2^32 - 1, then there is a potential Integer Overflow bug.
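By way of non-limiting illustration only, the interval-based overflow check of paragraph [0051] can be sketched in Python as follows. The `Interval` class and the `may_overflow_add` function are hypothetical names; the sketch treats values as unsigned 32-bit integers, consistent with the stated maximum of 2^32 - 1.

```python
from dataclasses import dataclass

UINT32_MAX = 2**32 - 1


@dataclass(frozen=True)
class Interval:
    lo: int  # lower bound the variable can take
    hi: int  # upper bound the variable can take

    def __add__(self, other):
        # Abstract addition: bounds are added component-wise.
        return Interval(self.lo + other.lo, self.hi + other.hi)


def may_overflow_add(y, z, max_value=UINT32_MAX):
    """Bug condition for `%x = add i32 %y, %z`: upper bound of y + z exceeds max."""
    return (y + z).hi > max_value


safe = may_overflow_add(Interval(0, 100), Interval(0, 200))
risky = may_overflow_add(Interval(0, UINT32_MAX), Interval(1, 1))
```

Because the interval over-approximates the concrete values, a positive result indicates a potential bug rather than a confirmed one, whereas a negative result rules the overflow out for every value in the ranges.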
[0052] On the other hand, to detect the reentrancy bug mentioned in Example "Contract Deposit written in Solidity", all instructions (e.g. LLVM instructions) of function calls (or, rather, the Call opcode related to function calls), are marked as suspicious instructions. For example, the following instruction in Example "LLVM Bitcode of the withdraw function in contract Deposit" is related to the transfer of funding.
%11 = call i32 @call(i64 9223372036854775807, i8* %address, i8* %value_transfer, i8* getelementptr (%struct.vector, %struct.vector* null, i32 0, i32 2, i32 0), i32 %length)
[0053] The condition for the reentrancy bug to occur in this instruction is that there is no check on the account balance before calling the above instruction to transfer the fund. This condition can be checked by the vulnerability detector using the control flow state computed by the abstract interpreter.
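By way of non-limiting illustration only, one possible way to evaluate the condition of paragraph [0053] over control flow information is sketched below in Python. The sketch assumes that blocks performing a balance check have already been identified (the `guards` set), which the present disclosure does not detail; the function name and the block labels are hypothetical. A call is flagged if some path from the entry block reaches the block containing the call without passing through a guard block.

```python
def call_reachable_without_guard(cfg, entry, call_block, guards):
    """Return True if `call_block` can be reached while bypassing every guard block."""
    stack, seen = [entry], set()
    while stack:
        block = stack.pop()
        if block == call_block:
            return True  # found a path that skipped all guards
        if block in seen or block in guards:
            continue     # do not explore past a guard block
        seen.add(block)
        stack.extend(cfg.get(block, []))
    return False


# Hypothetical CFGs: in the first, `transfer` can be reached directly from `entry`.
unguarded = {"entry": ["check", "transfer"], "check": ["transfer"], "transfer": []}
guarded = {"entry": ["check"], "check": ["transfer"], "transfer": []}

flag_unguarded = call_reachable_without_guard(unguarded, "entry", "transfer", {"check"})
flag_guarded = call_reachable_without_guard(guarded, "entry", "transfer", {"check"})
```

Here the first CFG is flagged as a potential reentrancy vulnerability and the second is not, since every path to the transfer passes through the check.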
[0054] After completion of the analysis of the smart contract source code, any bugs or vulnerabilities detected by the vulnerability detection component are catalogued in an evaluation outcome that summarises the vulnerabilities detected in a smart contract vulnerability report, per 612. In some embodiments, the evaluation outcome may also include information relating to the abstract states and/or blocks associated with the detected vulnerabilities.
[0055] In some embodiments, the methods and systems for smart contract code analysis are part of a smart contract deployment platform and serve to perform analysis of smart contract program code before a potential deployment. In the event a predefined error condition is determined to be violated by an execution state, the deployment of the smart contract is halted - this is reflected at step 112 of Figure 1. The embodiments analyse smart contracts before they are deployed and mitigate undesirable consequences of deployment of smart contracts with bugs or errors.
[0056] Figure 7 is a block diagram illustrating a system 700 for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms, implementing the methods or flows of Figures 1 to 6. The system includes one or more processors (processor(s)) 702 and memory 704. The memory 704 comprises instructions in the form of program code 706 that is executable by the processor(s) 702 to implement the methods and flows described above. This can include receiving, via network interface 708, smart contract source code embodying smart contract implementation logic. The transformation of the smart contract source code to smart contract intermediate code (LLVM bitcode) is performed in compiler 710. The execution states are then computed by the abstract interpreter 712. The vulnerability detector 714 then evaluates the execution states and the report module 716 generates the smart contract vulnerability report.
[0057] Though various components of system 700 have been represented as distinct components, these components may be physically separate devices, implemented in program code, distributed across multiple systems or housed in a single system, without departing from the substantive teachings herein.
[0058] The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that that prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavor to which this specification relates.
[0059] Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.
[0060] The scope of this disclosure encompasses all changes, substitutions, variations, alterations, and modifications to the example embodiments described or illustrated herein that a person having ordinary skill in the art would comprehend. The scope of this disclosure is not limited to the example embodiments described or illustrated herein. Moreover, although this disclosure describes and illustrates respective embodiments herein as including particular components, elements, features, functions, operations, or steps, any of these embodiments may include any combination or permutation of any of the components, elements, features, functions, operations, or steps described or illustrated anywhere herein that a person having ordinary skill in the art would comprehend. Although this disclosure describes or illustrates particular embodiments as providing particular advantages, particular embodiments may provide none, some, or all of these advantages.
Claims
1. A system for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms, the system comprising: one or more processor (processor(s)); a memory comprising instructions that when executed by the processor(s) cause the processor(s) to: receive smart contract source code embodying smart contract implementation logic; transform the smart contract source code into smart contract intermediate code (smart contract IC); compute execution states of the smart contract implementation logic based on the smart contract IC wherein each execution state relates to a potential state of the smart contract during execution; produce one or more evaluation outcomes by evaluating the execution states for violation of a plurality of predefined error conditions; and generate a smart contract vulnerability report comprising the one or more evaluation outcomes.
2. The system of claim 1, wherein the smart contract IC comprises one or more functions, each function comprising one or more blocks of non-branching instructions; and computing the execution states comprises computing an execution state of a subset of the one or more blocks.
3. The system of claim 1 or claim 2, wherein the execution states are computed in relation to an abstract domain; and the error conditions relate to vulnerabilities in relation to the abstract domain.
4. The system of claim 3, wherein the abstract domain may comprise any one of: pointer domain, arithmetic domain, control flow domain, taintedness domain.
5. The system of claim 2, wherein the processor(s) is further configured to identify a block relating to violation of one of the predefined error conditions.
6. The system of any one of claims 1 to 5, wherein the transformation of the smart contract source code into smart contract IC is agnostic to a programming language of the smart contract source code.
7. The system of any one of claims 1 to 6, wherein the smart contract IC comprises LLVM bitcode.
8. The system of any one of claims 1 to 7, wherein the system is part of a smart contract deployment platform, and processor(s) is further configured to halt the deployment of a smart contract on evaluation of an execution state being in violation of any one of the plurality of predefined error conditions.
9. A method for analyzing and detecting vulnerabilities in smart contract source code for multiple blockchain platforms, the method comprising: receiving smart contract source code embodying smart contract implementation logic; transforming the smart contract source code into smart contract intermediate code (smart contract IC); computing execution states of the smart contract implementation logic based on the smart contract IC, wherein each execution state relates to a potential state of the smart contract during execution; producing one or more evaluation outcomes by evaluating the execution states for violation of a plurality of predefined error conditions; and generating a smart contract vulnerability report comprising the one or more evaluation outcomes.
10. The method of claim 9, wherein the smart contract IC comprises one or more functions, each function comprising one or more blocks of non-branching instructions; and computing the execution states comprises computing an execution state of a subset of the one or more blocks.
11. The method of claim 9 or claim 10, wherein the execution states are computed in relation to an abstract domain; and the error conditions relate to vulnerabilities in relation to the abstract domain.
12. The method of claim 11, wherein the abstract domain may comprise any one of: pointer domain, arithmetic domain, control flow domain, taintedness domain.
13. The method of claim 10, further comprising identifying a block relating to violation of one of the predefined error conditions.
14. The method of any one of claims 9 to 14, wherein the transformation of the smart contract source code into smart contract IC is agnostic to a programming language of the smart contract source code.
15. The method of any one of claims 9 to 14, wherein the smart contract IC comprises LLVM bitcode.
16. The method of any one of claims 9 to 15, implemented in part of a smart contract deployment platform, the method further comprising halting deployment of a smart contract on evaluation of an execution state being in violation of any one of the plurality of predefined error conditions.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG10202251633U | 2022-11-04 | | |
SG10202251633U | 2022-11-04 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024096822A1 (en) | 2024-05-10 |
Family
ID=90931703
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/SG2023/050730 WO2024096822A1 (en) | 2022-11-04 | 2023-11-03 | Vulnerability detection for smart contracts in blockchain platforms |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024096822A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118337529A (en) * | 2024-06-12 | 2024-07-12 | 烟台大学 | Smart contract vulnerability detection method and device based on execution path and stack events |
CN118509159A (en) * | 2024-07-19 | 2024-08-16 | 浙江大学 | A method and device for accelerating the execution of smart contracts based on just-in-time compilation |
CN119538246A (en) * | 2024-11-01 | 2025-02-28 | 北京鼎玺盈动科技有限公司 | A method for detecting and analyzing malicious transactions in smart contracts based on dynamic data storage |
CN119991134A (en) * | 2025-04-15 | 2025-05-13 | 浙江大学 | Ethereum phishing fraud contract detection method based on transaction simulation execution |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113672515A (en) * | 2021-08-26 | 2021-11-19 | 北京航空航天大学 | A WASM smart contract vulnerability detection method based on symbolic execution |
CN115037512A (en) * | 2022-04-27 | 2022-09-09 | 中国科学院信息工程研究所 | Formalized static analysis method and device for Ethereum public chain smart contract |
2023
- 2023-11-03 WO PCT/SG2023/050730 patent/WO2024096822A1/en unknown
Non-Patent Citations (2)
Title |
---|
KALRA SUKRIT, GOEL SEEP, DHAWAN MOHAN, SHARMA SUBODH: "ZEUS: Analyzing Safety of Smart Contracts", PROCEEDINGS 2018 NETWORK AND DISTRIBUTED SYSTEM SECURITY SYMPOSIUM, INTERNET SOCIETY, RESTON, VA, 21 February 2018 (2018-02-21), XP093043462, ISBN: 978-1-891562-49-5, DOI: 10.14722/ndss.2018.23082 * |
KUSHWAHA SATPAL SINGH, JOSHI SANDEEP, SINGH DILBAG, KAUR MANJIT, LEE HEUNG-NO: "Ethereum Smart Contract Analysis Tools: A Systematic Review", IEEE ACCESS, vol. 10, 20 April 2022 (2022-04-20), pages 57037 - 57062, XP093043485, DOI: 10.1109/ACCESS.2022.3169902 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chen et al. | Gaschecker: Scalable analysis for discovering gas-inefficient smart contracts | |
WO2024096822A1 (en) | Vulnerability detection for smart contracts in blockchain platforms | |
US11036614B1 (en) | Data control-oriented smart contract static analysis method and system | |
Chen et al. | One engine to fuzz’em all: Generic language processor testing with semantic validation | |
Hills et al. | An empirical study of PHP feature usage: a static analysis perspective | |
US6973644B2 (en) | Program interpreter | |
US7380242B2 (en) | Compiler and software product for compiling intermediate language bytecodes into Java bytecodes | |
US9824214B2 (en) | High performance software vulnerabilities detection system and methods | |
JP2023545140A (en) | Methods and systems for supporting smart contracts in blockchain networks | |
US20200201838A1 (en) | Middleware to automatically verify smart contracts on blockchains | |
US20130097593A1 (en) | Computer-Guided Holistic Optimization of MapReduce Applications | |
CN104636256A (en) | Memory access abnormity detecting method and memory access abnormity detecting device | |
CN111768187A (en) | Method for deploying intelligent contract, block chain node and storage medium | |
Xue et al. | Simber: Eliminating redundant memory bound checks via statistical inference | |
US7418699B2 (en) | Method and system for performing link-time code optimization without additional code analysis | |
JP2012022686A (en) | Solution of hybrid constraints to validate specification requirements of software module | |
US20210303283A1 (en) | Generating compilable machine code programs from dynamic language code | |
Cachera et al. | Certified memory usage analysis | |
Sharma et al. | Java Ranger: Statically summarizing regions for efficient symbolic execution of Java | |
Nassirzadeh et al. | Gas gauge: A security analysis tool for smart contract out-of-gas vulnerabilities | |
US20100162211A1 (en) | Providing access to a dataset in a type-safe manner | |
Garmany et al. | Static detection of uninitialized stack variables in binary code | |
Schilling et al. | Vandalir: Vulnerability analyses based on datalog and llvm-ir | |
US9329845B2 (en) | Determining target types for generic pointers in source code | |
Jain et al. | SKLEE: A dynamic symbolic analysis tool for ethereum smart contracts (tool paper) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23886451; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |