WO2024055437A1 - Method and apparatus for detecting compatibility of contract upgrading - Google Patents

Info

Publication number
WO2024055437A1
WO2024055437A1 PCT/CN2022/135220 CN2022135220W WO2024055437A1 WO 2024055437 A1 WO2024055437 A1 WO 2024055437A1 CN 2022135220 W CN2022135220 W CN 2022135220W WO 2024055437 A1 WO2024055437 A1 WO 2024055437A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
abstract syntax
contract
syntax tree
state variable
Prior art date
Application number
PCT/CN2022/135220
Other languages
French (fr)
Chinese (zh)
Inventor
曹蓉
Original Assignee
蚂蚁区块链科技(上海)有限公司
Priority date
Filing date
Publication date
Application filed by 蚂蚁区块链科技(上海)有限公司
Publication of WO2024055437A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G06F8/65 Updates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/70 Software maintenance or management
    • G06F8/71 Version control; Configuration management

Definitions

  • the embodiments of this specification belong to the field of blockchain technology, and particularly relate to a method and device for detecting the compatibility of contract upgrades.
  • Blockchain is a new application model of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • Data blocks are combined into a chained data structure in chronological order, forming a distributed ledger that is cryptographically guaranteed to be tamper-proof and unforgeable. Because of characteristics such as decentralization, tamper resistance of information, and autonomy, blockchain has received increasing attention and application.
  • The purpose of this disclosure is to provide a method and device for detecting the compatibility of contract upgrades. The method includes: generating abstract syntax trees of the contracts before and after the upgrade; parsing the generated abstract syntax trees and sequentially extracting the basic information in the node information of each abstract syntax tree; and comparing the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.
  • A device for detecting the compatibility of contract upgrades includes: an abstract syntax tree generation unit for generating abstract syntax trees of the contracts before and after the upgrade; an extraction unit for parsing the generated abstract syntax trees and sequentially extracting the basic information in the node information of each abstract syntax tree; and a comparison unit for comparing the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.
  • a client includes: a processor and a memory storing a program, wherein when the processor executes the program, the above method is executed.
  • The way an upgraded contract is written must follow certain specifications for the upgrade to be compatible, that is, each state in the upgraded contract must remain able to read the value of the same state in the old contract. Users often ignore these specifications when writing upgraded contracts, which causes serious problems such as data loss and data confusion in the upgraded contracts.
  • A storage-data compatibility detection solution for contract upgrades, based on abstract syntax trees for Solidity and other contract types, can be implemented.
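  • The three steps of the method (generate the abstract syntax trees, extract basic node information in order, compare) can be illustrated with a minimal Python sketch. It is not the implementation of this disclosure. The AST shape used here (dictionaries with `nodeType`, `stateVariable`, `name`, `typeName`, and `nodes` fields) is a simplified, hypothetical form, and the rule that old state variables must keep their position, name, and type is an assumption chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StateVar:
    """Basic information extracted from one state-variable node."""
    name: str
    type_name: str


def extract_state_vars(ast: dict) -> List[StateVar]:
    """Walk the tree depth-first and collect state variables in declaration order."""
    found: List[StateVar] = []

    def visit(node: dict) -> None:
        if node.get("nodeType") == "VariableDeclaration" and node.get("stateVariable"):
            found.append(StateVar(node["name"], node["typeName"]))
        for child in node.get("nodes", []):
            visit(child)

    visit(ast)
    return found


def check_upgrade(old_ast: dict, new_ast: dict) -> List[str]:
    """Return a list of incompatibilities; an empty list means compatible.

    Assumed rule: every old state variable must appear at the same
    position in the new contract with the same name and type.
    """
    old_vars = extract_state_vars(old_ast)
    new_vars = extract_state_vars(new_ast)
    problems: List[str] = []
    for i, old in enumerate(old_vars):
        if i >= len(new_vars):
            problems.append(f"position {i}: '{old.name}' was removed")
        elif new_vars[i] != old:
            problems.append(
                f"position {i}: expected {old.type_name} {old.name}, "
                f"found {new_vars[i].type_name} {new_vars[i].name}"
            )
    return problems


def contract(*variables):
    """Build a hypothetical contract AST from (name, type) pairs."""
    return {
        "nodeType": "ContractDefinition",
        "nodes": [
            {"nodeType": "VariableDeclaration", "stateVariable": True,
             "name": n, "typeName": t}
            for n, t in variables
        ],
    }


OLD = contract(("storedData", "string"))
COMPATIBLE = contract(("storedData", "string"), ("counter", "uint256"))
BROKEN = contract(("counter", "uint256"), ("storedData", "string"))
```

  • In this sketch, `check_upgrade(OLD, COMPATIBLE)` reports no problems because the new variable is appended after the existing one, while `check_upgrade(OLD, BROKEN)` reports a mismatch at position 0 because the new variable displaces `storedData`.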
  • Figure 1 is a schematic diagram of deploying smart contracts in an embodiment
  • Figure 2 is a schematic diagram of calling a smart contract in an embodiment
  • Figure 3 is a schematic diagram of a block storage structure in an embodiment
  • Figure 4 is a schematic diagram of a block storage structure in an embodiment
  • Figure 5 is a schematic diagram of an MPT tree in an embodiment
  • FIG. 6 is a schematic diagram of the modules involved in the transaction processing process and the relationship between CPU, memory and disk in an embodiment
  • FIG. 7 is a schematic diagram of the EVM virtual machine module involved in the transaction processing process in an embodiment
  • Figure 8 is a schematic diagram of the slot structure in an embodiment
  • Figure 9 is a schematic diagram of the slot structure in an embodiment
  • Figure 10 is a flow chart of a method for detecting compatibility of contract upgrades in an embodiment
  • Figure 11 is a schematic diagram of the slot structure in an embodiment
  • Figure 12 is a schematic diagram of the slot structure in an embodiment.
  • Blockchains are generally divided into three types: Public Blockchain, Private Blockchain and Consortium Blockchain.
  • the most decentralized one is the public chain. Participants who join the public chain can read data records on the chain, participate in transactions, and compete for the accounting rights of new blocks.
  • each participant reflected as a participant's node on the blockchain
  • In a private chain, write permission to the network is controlled by an organization or institution, and read permission to the data is regulated by that organization.
  • a private chain can be a weakly centralized system with strict restrictions and few participating nodes. This type of blockchain is more suitable for internal use within specific organizations.
  • The consortium chain sits between the public chain and the private chain and can achieve "partial decentralization".
  • Each node in the consortium chain usually has a corresponding entity or organization; participants join the network through authorization and form a stakeholder alliance to jointly maintain the operation of the blockchain.
  • Smart contracts on the blockchain are contracts that can be triggered and executed by transactions on the blockchain system. Smart contracts can be defined in the form of code.
  • Smart contracts allow users to create and invoke complex logic in the blockchain network. This capability is what chiefly distinguishes programmable blockchains from the original blockchain technology.
  • EVM virtual machine
  • each blockchain node can run the EVM.
  • EVM is a Turing-complete virtual machine, which means that various complex logic can be implemented through it.
  • The smart contracts that users publish and call on the blockchain run on the EVM.
  • the virtual machine directly runs the virtual machine code (virtual machine bytecode, hereinafter referred to as "bytecode").
  • Smart contracts deployed on the blockchain can be in the form of bytecode.
  • the EVM of node 1 can execute the transaction and generate the corresponding contract instance.
  • "0x6f8ae93" in Figure 1 represents the address of this contract.
  • the data field of the transaction can store bytecode, and the to field of the transaction is an empty account. After the nodes reach an agreement through the consensus mechanism, the contract is successfully created, and subsequent users can call this contract.
  • a contract account corresponding to the smart contract appears on the blockchain and has a specific address.
  • the contract code and account storage will be saved in the contract account.
  • the behavior of a smart contract is controlled by the contract code, and the account storage of the smart contract saves the state of the contract.
  • smart contracts enable virtual accounts containing contract code and account storage (Storage) to be generated on the blockchain.
  • the data field containing the transaction that creates the smart contract can store the bytecode of the smart contract.
  • Bytecode consists of a series of bytes, each byte can identify an operation. Based on various considerations such as development efficiency and readability, developers can choose a high-level language to write smart contract code instead of writing bytecode directly.
  • the smart contract code written in a high-level language is compiled by a compiler to generate bytecode, which can then be deployed on the blockchain.
  • Blockchain supports many high-level languages, such as Solidity, Serpent, LLL language, etc.
  • the second line declares the character (string) state variable storedData
  • the third line declares the event (event).
  • the content stored in this event is the address of the initiator of the call and the string s.
  • Lines 4-7 define the set function, and the input parameter is the string s.
  • the operations performed by the set function include setting the input parameters to the state variable storedData and generating an event.
  • the content of the event includes the address of the initiator of the call and the string s.
  • The first topic, topic1, is generally a default value, for example the identifier of the receipt, which can be a hash value obtained by sequentially splicing the event name, event parameter types, and so on. Whether each of topic2 to topicn exists depends on whether the indexed modifier is added when the parameter is defined: if it is, the value of that parameter becomes a topic in the receipt; parameters without the indexed modifier are generally placed in data.
  • the two parameters address from and s are not modified by Indexed and are generally placed in data.
  • the code in line 6 sets the data content [msg.sender, s] in the event through the stored() event. In this way, for the event operated on line 6, the overall form is:
  • Lines 8-10 define the get function.
  • the operation of this function includes returning the value of the storedData of the query. returns(string) indicates the type of return value, and the constant modifier indicates that the function cannot modify the value of the state variable in the contract.
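  • The behavior described above for the set and get functions can be modeled in Python. This is a rough sketch of the described behavior, not the contract source, which is written in a contract language such as Solidity. The class name `ExampleContract` and the `receipts` list are hypothetical; the model follows the description that both event parameters are placed in the data part of the event.

```python
class ExampleContract:
    """Python model of the described contract (hypothetical names)."""

    def __init__(self) -> None:
        self.stored_data = ""   # models the string state variable storedData
        self.receipts = []      # models events recorded in receipts

    def set(self, msg_sender: str, s: str) -> None:
        """Store s in the state variable and record a stored() event."""
        self.stored_data = s
        # Neither parameter is indexed, so both go into the data part.
        self.receipts.append({"event": "stored", "data": [msg_sender, s]})

    def get(self) -> str:
        """Read-only query: returns the state without modifying it."""
        return self.stored_data


example = ExampleContract()
example.set("0xabc", "hello")
```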
  • the EVM of node 1 can execute the transaction and generate the corresponding contract instance.
  • the from field of the transaction in Figure 2 is the address of the account that initiated the call to the smart contract. "0x6f8ae93" in the to field represents the address of the called smart contract.
  • the data field of the transaction saves the method and parameters for calling the smart contract.
  • a value field can also be included to represent the value of the ether in the transaction.
  • the value of storedData may change. Subsequently, a client can view the current value of storedData through a certain blockchain node (such as node 6 in Figure 2).
  • Smart contracts can be executed independently on each node in the blockchain network in a prescribed manner. All execution records and data are saved on the blockchain, so when such a transaction is completed, a transaction record that cannot be tampered with and will not be lost is stored on the blockchain.
  • the storedData in the above example is the state variable, which is stored in the account storage of the smart contract.
  • accounts can usually include two types:
  • Contract account: stores the executed smart contract code and the values of the states in the smart contract code. It can usually only be activated through calls from external accounts;
  • Externally owned account: a user's account, such as an Ethereum owner's account.
  • the design of external accounts and contract accounts is actually the mapping of account addresses to account status.
  • the status of the account usually includes Nonce, Balance, Storage root, CodeHash and other fields. Nonce and Balance exist in both external accounts and contract accounts. CodeHash and Storage root attributes are generally only valid on contract accounts.
  • Nonce: a counter. For external accounts, this number can represent the number of transactions sent from the account address; for contract accounts, it can be the number of contracts created by the account.
  • Storage root: the hash of the root node of an MPT tree. This MPT tree organizes the storage of the state variables of contract accounts.
  • CodeHash: the hash value of the smart contract code. For contract accounts, this is the hash value of the smart contract; for external accounts, which contain no smart contract, the CodeHash field can generally be an empty string or an all-zero string.
  • MPT's full name is Merkle Patricia Tree, which is a tree structure that combines Merkle Tree (Merkle tree) and Patricia Tree (compressed prefix tree, a more space-saving Trie tree, dictionary tree).
  • Merkle Tree: the Merkle tree algorithm calculates a hash value for each transaction, then concatenates the hash values in pairs and hashes each pair again, repeating this all the way up to the top-level Merkle root.
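  • The pairwise hashing described above can be sketched as follows. This is a minimal illustration, assuming SHA-256 as the hash function and assuming that a level with an odd number of hashes duplicates its last hash; real blockchain systems may choose differently on both points.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    """Hash function used for the tree (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def merkle_root(transactions: List[bytes]) -> bytes:
    """Hash each transaction, then hash pairs upward until one root remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [h(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed convention: duplicate the last hash of an odd level.
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

  • Changing any single transaction changes its leaf hash and therefore every hash on the path up to the root, which is why the root can lock the whole set.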
  • Some blockchain networks use an improved MPT tree, such as a 16-fork tree structure, which is often referred to as an MPT tree.
  • the data structure of the MPT tree includes a state trie.
  • the state tree contains the key-value pair (key and value pair, also written as key-value, referred to as k-v or kv) corresponding to the storage content of each account in the blockchain network.
  • The "key" in the state tree can be a 160-bit identifier (such as the address of a blockchain account or a part of the hash value of the address, hereafter collectively referred to as the account address). This account address is distributed along the path from the root node of the state tree to a leaf node.
  • The "values" in the state tree are generated by encoding the blockchain account's information using the Recursive-Length Prefix (RLP) encoding method.
  • Contract accounts are used to store status related to smart contracts. After the smart contract is deployed on the blockchain, a corresponding contract account will be generated. This contract account generally has some states, which are defined by the state variables in the smart contract and generate new values when the smart contract is executed.
  • the smart contract usually refers to a contract defined in the form of code in the blockchain environment that can automatically execute the terms. Once an event triggers a clause in the contract (execution conditions are met), the code can be executed automatically.
  • the relevant status of the contract is stored in the storage trie.
  • the hash value of the root node of the storage trie is stored in the above-mentioned storageroot, thereby locking all the status of the contract to the contract account through hash.
  • the storage trie is also an MPT tree structure, which stores the key-value mapping from state address to state value. Part of the information from the root node of the storage trie tree to the leaf nodes is arranged sequentially to store the address of a state, and the value of the state is stored in the leaf node.
  • The block header of each block includes several fields, such as the previous block hash previous_Hash (Prev Hash in the figure), the random number Nonce (in some blockchain systems the Nonce is not a random number, or the Nonce in the block header is not enabled), the timestamp Timestamp, the block number Block Num, the state root hash State_Root, the transaction root hash Transaction_Root, the receipt root hash Receipt_Root, and so on.
  • the Prev Hash in the block header of the next block (such as block N+1) points to the previous block (such as block N), which is the hash value of the previous block.
  • State_Root, Transaction_Root and Receipt_Root respectively lock the state collection, transaction collection and receipt collection.
  • the state collection, transaction collection and receipt collection organize states, transactions and receipts in the form of trees respectively.
  • it can be the same tree structure or different tree structures.
  • the same MPT structure is used.
  • a two-level MPT structure is included: the leaf nodes of the upper-level MPT structure include two types: external accounts and contract accounts; each contract account includes the next-level MPT structure, the leaf nodes of the next level include the value of the state in the contract account.
  • FIG 4 is a schematic structural diagram of blockchain data storage.
  • state_root is the hash value of the root of the MPT tree composed of the states of all accounts in the current block, that is, state_root points to a state trie in the form of an MPT.
  • the root node of this MPT tree is generally an extension node (Extension Node) or a branch node (Branch Node). What is stored in state_root is generally the hash value of this root node.
  • the root node can be connected to one or more layers of Extension Node/Branch Node below. These multi-layer tree nodes can be collectively called intermediate nodes (Internal Node).
  • a part of the value in each node from the root node of this MPT to the leaf node can be concatenated in order to form the account address and serve as the key.
  • the account information stored in the leaf node is the value corresponding to the account address.
  • This key can also be a part of sha3(Address), that is, a part of the hash value of the account address (taking sha3 as an example hash algorithm), and the stored value can be rlp(Account), the RLP encoding of the account information.
  • the account information is a four-tuple consisting of [nonce, balance, storageRoot, codeHash].
  • Contract accounts generally include Nonce, Balance, Storage root, and CodeHash. Among them, Nonce is the transaction counter of the contract account; Balance is the account balance; Storage root corresponds to another MPT, through which Storage root can be linked to contract-related status information; CodeHash is the hash value of the contract code. Whether it is an external account or a contract account, its account information is generally located in a separate leaf node (Leaf Node). From the Extension Node/Branch Node of the root node to the Leaf Node of each account, there may be several branch nodes and extension nodes in the middle.
  • the state trie can be a tree in the form of MPT, which is generally a 16-fork tree, that is, each layer can have up to 16 child nodes.
  • Extension Node it is used to store common prefixes. It generally has one child node, and this child node can be Branch Node.
  • Branch Node it can have up to 16 child nodes, which may include Extension Node and/or Leaf Node.
  • this Storage Trie tree also stores key-value pairs.
  • the key indicates the address of the state variable. Its value can be the result of processing the state variable declaration position in the contract (a value counting from 0) after certain rules, such as sha3 (the position where the state variable is declared), or sha3 (contract name + location of state variable declaration). value is used to store the value of a state variable (for example, an RLP-encoded value).
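  • Because the key is derived from the declaration position, moving a state variable to a different position changes the key under which its value is looked up. The sketch below illustrates this, assuming the rule sha3(position) with the position encoded as a 32-byte big-endian integer. It uses the standard-library SHA3-256 as a stand-in, which is not the same function as the Keccak-256 used by Ethereum, so the resulting keys will not match a real chain.

```python
import hashlib


def storage_key(position: int) -> bytes:
    """Derive a storage key from a declaration position counted from 0.

    Assumptions: 32-byte big-endian encoding of the position, and
    SHA3-256 standing in for the chain's sha3 function.
    """
    return hashlib.sha3_256(position.to_bytes(32, "big")).digest()


# Hypothetical layouts: the upgraded contract declares a new variable first.
old_layout = {"storedData": 0}
new_layout = {"owner": 0, "storedData": 1}

old_key = storage_key(old_layout["storedData"])
new_key = storage_key(new_layout["storedData"])
```

  • Here `old_key` and `new_key` differ, so the upgraded contract would no longer read the value written by the old contract for `storedData`. This is the kind of incompatibility that comparing declaration order before and after an upgrade is meant to catch.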
  • this Storage trie can also be a tree in the form of MPT, which is generally a 16-fork tree, that is, for Branch Node, it can have up to 16 child nodes, and these child nodes may include Extension Node and/or Leaf Node. For Extension Node, it generally can have 1 child node, and this child node can be Branch Node or Leaf Node.
  • the Leaf Node Account P of the state Trie in Figure 4 is a contract account, and its Storage Root locks all states in the contract storage. These states are organized into MPT trees, and the tree structure is such as the Storage trie linked to the Storage Root. In the Storage trie of this link, take Leaf Node State Variable N as an example.
  • If Leaf Node State Variable N holds the value of storedData in the aforementioned contract code example, its key is sha3(the declaration position of storedData, detailed later) and its value is s. (For simplicity, the encoding format of the value, such as RLP, is omitted here; later cases are similar and are not described again.)
  • The key is distributed sequentially from the root node to the leaf node of the storage Trie (that is, Leaf Node State Variable N).
  • Leaf Node Account C in the state Trie in Figure 4 is an external account, and its key is sha3 (Address C), which is the hash value of the address of account C (the hash algorithm uses the sha3 algorithm, for example).
  • The stored value can be Account, where the account information Account is a tuple composed of [nonce, balance].
  • The key of Account C, derived from its address, is distributed sequentially from the root node to the leaf node of the state Trie (i.e., Leaf Node Account C).
  • the leaf nodes of A1, A2 and A3 store the information of the external account
  • the leaf node of A4 stores the information of the contract account.
  • the next-level MPT which constitutes a Storage Trie, which is used to store the state variables in the contract account.
  • For leaf node A16, the key is formed by sequentially combining slot f in root node A10 (Branch Node), the shared nibble a in intermediate node A13 (Extension Node), slot 9 of intermediate node A14 (Branch Node), and the key-end 9365 in leaf node A16, which gives fa99365.
  • The prefix field is used to indicate the tree node type. For example, 0 indicates an Extension Node containing an even number of shared nibbles, and 1 indicates an Extension Node containing an odd number of shared nibble(s); 2 indicates a Leaf Node containing an even number of nibbles, and 3 indicates a Leaf Node containing an odd number of nibble(s).
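  • The prefix rule and the key assembly described above can be written as a short sketch. The function names are hypothetical, and nibbles are represented as hexadecimal characters in a string for readability.

```python
from typing import Iterable


def node_prefix(is_leaf: bool, nibbles: str) -> int:
    """Return 0/1 for an Extension Node and 2/3 for a Leaf Node.

    The low bit records whether the number of nibbles is odd.
    """
    odd = len(nibbles) % 2
    return (2 if is_leaf else 0) + odd


def assemble_key(path_parts: Iterable[str]) -> str:
    """Concatenate the nibbles met on the path from the root to a leaf."""
    return "".join(path_parts)


# Path to leaf A16: slot f, shared nibble a, slot 9, key-end 9365.
KEY_A16 = assemble_key(["f", "a", "9", "9365"])
```

  • With these definitions, `KEY_A16` equals `"fa99365"`, and a leaf whose key-end is `9365` (four nibbles) gets prefix 2.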
  • the hash value of the entire content of the next tree node is filled in the corresponding position of the previous tree node.
  • the key-value mapping of each tree node is actually stored, where value includes the content stored in the tree node, and the corresponding key is the hash value of the overall content of the tree node.
  • the tree nodes k-v actually stored in the database are as follows:
  • Key | Value
    H(A9) | Prev Hash:, Nonce:, Timestamp:, Block Num:, State Root:H(A8), Transaction Root:, Receipt Root:, ...
    H(A8) | prefix:0, shared nibble(s):a7, next node:H(A7)
    H(A7) | 0:, 1:H(A1), 2:, 3:, 4:, 5:, 6:, 7:H(A6), 8:, 9:, a:, b:, c:, d:, e:, f:H(A3), value:
    H(A1) | prefix:2, Key-end:1335, balance:45.0ETH, nonce:n1
    H(A6) | value:prefix:0, shared nibble(s):d3, next node:H(A5)
    H(A3) | prefix:2, Key-end:9365, balance:1.1ETH, nonce:n3
    H(A5) | 0:,
  • H() is used to represent hash calculation. In this way, the hash value of the next tree node is anchored in the previous tree node. Through such layers of hashing, the root hash of the entire state trie tree is obtained, and the root hash is locked into the state root field of the block header.
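  • The anchoring of each node's hash in its parent can be sketched with an in-memory dictionary standing in for the key-value database. This is an illustration only: SHA-256 and JSON serialization are assumptions, the field names are hypothetical, and the three-node tree is far smaller than a real state trie.

```python
import hashlib
import json
from typing import Dict


def put(db: Dict[str, bytes], node: dict) -> str:
    """Store a node under the hash of its encoded content; return that hash."""
    encoded = json.dumps(node, sort_keys=True).encode("utf-8")
    key = hashlib.sha256(encoded).hexdigest()
    db[key] = encoded
    return key


def build_state_root(db: Dict[str, bytes], balance: str) -> str:
    """Build leaf -> branch -> extension, each parent holding its child's hash."""
    leaf = put(db, {"prefix": 2, "key_end": "1335", "balance": balance})
    branch = put(db, {"1": leaf})
    extension = put(db, {"prefix": 0, "shared_nibbles": "a7", "next_node": branch})
    return extension  # this value would be locked into the block header
```

  • Calling `build_state_root` with two different balances yields two different roots, because the changed leaf hash propagates through the branch and extension nodes.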
  • the code of the blockchain platform can include P2P (Peer to Peer, point-to-point) module, consensus module, execution module and storage module.
  • P2P is a form of computer network. Unlike common web networks, P2P is decentralized.
  • the P2P module can complete the distributed dissemination of data.
  • data can be transmitted and received in a point-to-point manner through the P2P module.
  • Different participants can establish a distributed blockchain network through deployed nodes.
  • The ledger, constructed as a chain of blocks, is stored on each node (or on most nodes, such as consensus nodes) in the distributed blockchain network. This is also called a decentralized (or multi-centered) distributed ledger.
  • Such a blockchain system needs to solve the problem of consistency and correctness of respective ledger data on multiple decentralized (or multi-centered) nodes.
  • Each node runs the same blockchain platform program.
  • The consensus module can ensure that all honest nodes have the same transactions, thereby ensuring that all honest nodes obtain consistent execution results for the same transactions, and that the transactions and execution results are packaged to generate blocks.
  • Current mainstream consensus mechanisms include Proof of Work (POW), Proof of Stake (POS), Delegated Proof of Stake (DPOS), the Practical Byzantine Fault Tolerance (PBFT) algorithm, the Honey Badger Byzantine Fault Tolerance (HoneyBadgerBFT) algorithm, and so on.
  • the consensus module can generally also generate the timestamp of the block corresponding to the current transaction set, etc.
  • the execution module can execute transactions, including ordinary transfer transactions and transactions involving contracts, before or after the consensus module completes consensus.
  • The execution module can introduce a virtual machine, such as the Ethereum Virtual Machine (EVM), to execute the code of the smart contract. The EVM shields the differences in hardware configuration and software environment between nodes, so that the process and results of executing a smart contract are the same on every node, and its sandbox environment prevents the execution of smart contracts from affecting the blockchain platform code, other programs, or the operating system on the host.
  • the nodes can determine the transaction content and transaction sequence in a transaction set through the consensus module, and then output a deterministic transaction set of consensus results to the execution module.
  • the execution module generates execution results by executing ordinary transfer transactions/transactions involving contracts and sends them to the storage module.
  • the storage module can be responsible for storing execution results to the node's local persistent storage medium.
  • a blockchain node physically includes CPU, memory, disk, etc.
  • the blockchain platform code executed by this blockchain node can include P2P module, consensus module, execution module and storage module.
  • the function implementation of the P2P module, consensus module and execution module generally requires the participation of CPU and memory.
  • the storage module can include a building tree module, a block header generation module, a WAL (Write Ahead Log) module, and a state database module.
  • the building tree module is used to build a tree (such as an MPT tree) based on the state k-v passed in by the execution module, such as the aforementioned state trie and storage trie, so as to obtain the k-v of the tree node, which generally requires the participation of CPU and memory.
  • the block header generation module is used to generate the block header based on the root node of the tree built by the building tree module and some other data (such as the previous block hash, timestamp, block number, etc.), which generally requires the participation of CPU and memory.
  • The WAL module persistently stores the tree nodes k-v generated by the building tree module before they are written to the state database module. This prevents data loss if a situation such as a power outage occurs while the tree nodes are being written to the state database module, and allows the data to be recovered when this happens. It generally requires the participation of CPU, memory, and disk.
  • The state database module is used to store the tree nodes k-v in Table 1 built by the building tree module on the persistent storage device. Since the tree node data will eventually be written to the persistent storage medium (such as the disk in the figure), the state database module generally requires the participation of disk in addition to CPU and memory.
  • The above-mentioned Merkle tree structures, such as the MPT and Libra's SMT (Sparse Merkle Tree, similar to MPT), are held by the building tree module in the form of the correspondence shown in Table 1 above and are stored in memory.
  • The Merkle tree above is a prefix tree (dictionary tree), which can organize the data and obtain a unique Merkle root for the organized data.
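  • The write-ahead idea can be sketched as follows: each tree node k-v is appended to a log file and flushed to disk before it is applied, so the entries can be replayed after an interruption. This is a simplified illustration, not the module of this disclosure. The class name `WalStore`, the JSON-lines log format, and the in-memory dictionary standing in for the state database are all assumptions.

```python
import json
import os
import tempfile
from typing import Dict


class WalStore:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, log_path: str) -> None:
        self.log_path = log_path
        self.state: Dict[str, str] = {}  # stands in for the state database

    def put(self, key: str, value: str) -> None:
        # 1. Append the record to the log and force it to disk.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"k": key, "v": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Only then apply the write.
        self.state[key] = value

    def recover(self) -> Dict[str, str]:
        """Rebuild the state by replaying the log from the beginning."""
        self.state = {}
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as log:
                for line in log:
                    record = json.loads(line)
                    self.state[record["k"]] = record["v"]
        return self.state


log_file = os.path.join(tempfile.mkdtemp(), "nodes.wal")
store = WalStore(log_file)
store.put("H(A1)", "prefix:2,Key-end:1335")

# A fresh instance models a restart in which the in-memory state was lost.
restarted = WalStore(log_file)
recovered = restarted.recover()
```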
  • the leaf nodes can save the state value, and the root node to the intermediate node to the leaf node implement lexicographic indexing of the state key.
  • These tree nodes are encoded as Key according to certain rules, and their contents are encoded as Value, and are eventually stored in the underlying database.
  • LSM: Log-Structured Merge-Tree.
  • NoSQL Key-Value DB: DB stands for database; a Key-Value DB is also referred to as a KVDB.
  • The databases are levelDB and Libra's RocksDB. Both KVDBs are based on the LSM storage engine.
  • the transaction that creates the smart contract is sent to the blockchain. After consensus, each node of the blockchain can execute the transaction. At this time, a contract account corresponding to the smart contract appears on the blockchain (including, for example, the account's Identity, the contract's hash value Codehash, and the root StorageRoot of the contract storage), and has a specific address.
  • The contract code and account storage can be saved in the storage of the contract account, as shown in Figure 7.
  • the behavior of a smart contract is controlled by the contract code, and the account storage of the smart contract saves the state of the contract.
  • smart contracts enable virtual accounts containing contract code and account storage (Storage) to be generated on the blockchain.
  • the blockchain node can receive a transaction request to call the deployed smart contract.
  • the transaction request can include the address of the called contract, the function in the called contract and the input parameters.
  • each node of the blockchain can independently execute the specified smart contract call.
  • the left side of Figure 7 shows an example of a smart contract written in solidity and its compilation and execution process.
  • the smart contract is compiled by a compiler to generate bytecode.
  • the solc in the picture is solidity's command line compiler.
  • Smart contracts written through solidity can be compiled through the command line tool solc with parameters, thereby generating bytecode that can be run on the EVM.
  • the smart contract can be successfully created on the blockchain.
  • a contract account corresponding to the smart contract is generated on the blockchain.
  • the contract account includes, for example, the contract counter Nonce, the balance of the account, the hash value of the contract bytecode Codehash, the root StorageRoot of the contract storage, etc.
  • the contract will have a specific address on the chain, which is the contract address.
  • This contract address is, for example, calculated by hashing the address of the external account where the contract is deployed and its counter nonce.
  • sha3 (rlp.encode([address_sender,nonce]))
  • rlp is an encoding format, as mentioned above. Different blockchains may replace it with another encoding format or skip encoding altogether, so rlp is omitted below.
  • sha3 is a hash algorithm, for example the commonly used keccak256.
  • rlp represents an encoding format
  • rlp.encode([address_sender,nonce]) represents rlp encoding the content in parentheses.
  • the [address_sender, nonce] in parentheses indicates the sequential concatenation of two fields: address_sender, the address of the external account that deploys the contract, and its counter nonce. For example, the keccak256 algorithm yields a hash value 256 bits (32 bytes) long. The address of the deployed contract on the blockchain can be derived from this hash value (for example, by taking the first 20 bytes). The balance of the account can be set to the default value of 0, or set when the deployment is completed.
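  • The address derivation described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `hashlib.sha3_256` only stands in for keccak256 (the two pad their input differently, so the digests will not match a real chain), rlp encoding is skipped and replaced by plain concatenation, the nonce is serialized as 8 big-endian bytes, and `contract_address` and the sample `sender` are hypothetical names.

```python
import hashlib

def contract_address(sender: bytes, nonce: int) -> bytes:
    """Hash the deployer address and its nonce, then keep the first 20 bytes."""
    # Assumption: plain concatenation replaces rlp.encode([address_sender, nonce]).
    payload = sender + nonce.to_bytes(8, "big")
    # Assumption: sha3_256 is a stand-in for keccak256 (same 32-byte output size).
    digest = hashlib.sha3_256(payload).digest()  # 256 bits = 32 bytes
    return digest[:20]

sender = bytes.fromhex("11" * 20)  # hypothetical external account address
addr = contract_address(sender, 0)
print(len(addr), addr.hex())
```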
  • the hash value of the contract bytecode, Codehash can be calculated by the blockchain platform by hashing the contract bytecode.
  • the root StorageRoot of contract storage can be a default value or a hash value calculated based on the root node of the underlying storage Trie. This generally depends on whether initialization operations are performed in the deployed contract, such as executing the constructor in the contract. If the deployed contract contains a constructor, it generally includes the work of initializing some state variables that will eventually be stored in the underlying database. This initialization work can be performed in the virtual machine. After initializing the state variables, as mentioned above, an MPT tree can be constructed, so that the root node of the MPT tree can be obtained, and then the hash value of the root node can be obtained. If the deployed contract does not contain a constructor, the specific function does not need to be executed. Instead, the blockchain platform gives StorageRoot a default value, such as a hash value of empty content.
  • smart contracts usually define contracts in the form of code in a blockchain environment that can automatically execute terms.
  • the terms are usually related to business-level logic. Therefore, the contract code as a whole reflects the business logic. As the business develops and changes, the business logic may change, and at this time the contract code also needs to be adjusted. In addition, the code of the contract may have loopholes and need to be repaired, or the upgrade of the language version in which the contract is written will also bring about upgrade requirements for the contract. In the above situations, the deployed contract usually needs to be upgraded.
  • the new contract will have a different address than the original contract.
  • the contract address can be calculated by hashing the address of the external account where the contract is deployed and its counter nonce, for example, sha3([address_sender,nonce]). Contracts deployed by different contract deployers have different contract addresses. Even for the same contract deployer, a new transaction is initiated when upgrading the contract, so the nonce, which serves as the transaction counter, has changed. Therefore, the address of the newly deployed contract also changes, and the storage of the new contract is different from that of the old contract.
  • the state variables in the contract storage of the new contract can only be set and read from the block where the new contract is deployed, but the state in the old contract cannot be accessed.
  • the upgraded contract maintains the same contract address as the contract before the upgrade. In this way, the same contract storage space can be maintained after the upgrade as before the upgrade, historical data will not be lost, and users do not have to change the contract address entered when calling the contract.
  • the generation rules of contract account addresses can be set to be independent of nonce. Furthermore, they can be set to be independent of the deployer.
  • the address of the contract account can be determined by the name of the contract, such as sha3([name_contract]). This ensures that as long as the contract has the same name, its address on the blockchain will be the same.
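  • The difference between the two address rules can be illustrated with a short Python sketch. It assumes `hashlib.sha3_256` as a stand-in for keccak256, an 8-byte big-endian nonce, and the first 20 bytes of the digest as the address; the function names, the deployer address and the contract name "demo" are hypothetical.

```python
import hashlib

def address_from_nonce(sender: bytes, nonce: int) -> bytes:
    # Nonce-dependent rule, modeled on sha3([address_sender, nonce]).
    return hashlib.sha3_256(sender + nonce.to_bytes(8, "big")).digest()[:20]

def address_from_name(name: str) -> bytes:
    # Name-dependent rule, modeled on sha3([name_contract]).
    return hashlib.sha3_256(name.encode("utf-8")).digest()[:20]

deployer = bytes.fromhex("22" * 20)  # hypothetical deployer account

# Redeploying from the same account uses a new nonce, so the address moves.
moved = address_from_nonce(deployer, 5) != address_from_nonce(deployer, 6)
# The name-based rule ignores deployer and nonce, so an upgrade that keeps
# the contract name lands on the same address and the same storage.
stable = address_from_name("demo") == address_from_name("demo")
print(moved, stable)
```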
  • contract accounts generally include Nonce, Balance, Storage root, and CodeHash, where CodeHash is the hash value of the contract bytecode.
  • the bytecode of the upgraded contract is different from the bytecode before the upgrade. Therefore, the CodeHash in the contract account will change after the contract is upgraded. That is, the Codehash in the contract account will generally be updated after the contract is upgraded.
  • for a state variable r, the value of r is read starting from the block in which the contract is upgraded. Since the key changes, the value that r had before the contract upgrade cannot be read correctly.
  • code example 2 is the code example of the old contract before the upgrade:
  • ID is of uint256 type, which is 32 bytes in solidity
  • sex is of bool type, which is 1 byte. Because these two variables are declared before the functions, they are generally treated as persistent state variables, that is, they are stored in the underlying database.
  • the setID() function is defined on lines 6-8.
  • the public after the setID() function is used as a modifier to indicate that the setID() function serves as an internal/external interface function.
  • uint256 represents an unsigned integer of 256 bits, with a length of 32 bytes.
  • assign the value of parameter x to ID to implement the externally provided interface, and set the parameter x input by the user to the value of the state variable ID.
  • the getID() function is defined on lines 10-12. The view after the getID() function indicates that the function can only read state variables but cannot modify them.
  • lines 22-24 define the version() function, and the return value of this function is uint256 type.
  • the operation in the function body is to return 1. Users can call this function, which returns the version of the current contract, here version is 1.
  • state variables that need to be stored persistently are generally a paired key-value structure.
  • the key represents the address of the state variable, and the value represents the value of the state variable.
  • two state variables ID and sex are declared in the header, and each state variable will have a key. It should be noted that the space occupied by these two state variables is fixed, namely 32 bytes and 1 byte.
  • Each contract generally has its own storage space.
  • This storage space is virtual, and the capacity can be a very large array, such as an array with 2^256 elements, numbered from 0 to 2^256 - 1. Each element can occupy a certain length, such as 32 bytes. Each element is called a slot here, as shown in Figure 8.
  • the values of the two state variables, ID and sex, can be stored in slots 0 and 1, for example. It should be noted that the total storage space of 2^256 slots is the total capacity of the virtual space. In other words, unused slots will not occupy the actual storage space of the underlying database.
  • demo1 contract written in a high-level language such as solidity is compiled by a compiler to generate bytecode.
  • the execution of the contract can be shown in Figure 7.
  • a transaction that calls a contract in Figure 2 is sent to the blockchain network, and after consensus, each node can execute the transaction.
  • the to field of the transaction indicates the address of the called contract.
  • Any node can find the storage of the contract account based on the address of the contract, and then can read the Codehash from the storage of the contract account, and then find the corresponding contract bytecode based on the Codehash.
  • the node can load the bytecode of the contract from storage into the virtual machine.
  • the interpreter interprets and executes it, including parsing the bytecode of the called contract (Parse, such as parsing Push, Add, SGET, SSTORE, Pop, etc.) to obtain the operation codes (OPcode) and functions, storing these OPcodes in the memory space (memory) opened by the virtual machine (Alloc in the figure; after program execution is completed, the corresponding memory is released, such as Free in the figure), and also obtaining the jump position (JumpCode) of the called function in the memory space.
  • the virtual machine loads and executes the bytecode of the contract, which may generate status and/or read status, thereby requiring access to the underlying database.
  • the virtual machine needs to easily access the underlying KV database.
  • To access the KV database you can generally use pointer-like data access capabilities. For example, if you need to read the value corresponding to a key from the KV database, you need to know the key of this data before accessing it.
  • the execution module (including the virtual machine therein) executes the contract to generate kv
  • the execution module (including the virtual machine therein) executes the contract to generate k.
  • This k is the key generated by the execution module or blockchain platform, here called the state key.
  • this state key needs to be built into the MPT tree to obtain a series of tree node keys from the MPT root node-intermediate node-leaf node. If it is a read operation, the corresponding value can be found in the state database module according to the tree node key. If it is a write operation, a series of tree node key-values from the MPT root node-intermediate node-leaf node are generated, and these tree node kv are written to the state database module in an append manner.
  • the compiler compilation process can roughly include steps such as lexical/syntactic analysis based on the abstract syntax tree, filling symbols based on the symbol table, semantic analysis, and code generation. Among them, during the lexical/grammatical analysis based on the abstract syntax tree, the position information of the contract's state variables can be generated. For example, the two state variables ID and sex in the above demo1 are located at 0 and 1 respectively, which can correspond to the following two positions in the aforementioned slot:
  • the positions of the two slots above are each 256 bits, written as 32 bytes in hexadecimal (0x).
  • each space-separated segment of four hexadecimal digits after 0x above represents 2 bytes, so there are 16 such segments in total.
  • these two positions in the slot can be used to replace the identifiers of the two state variables, such as the above two 256 bits replacing ID and sex respectively.
  • the storage location can be pre-allocated for each data to be stored according to the field sort order during compilation. This is equivalent to specifying a fixed data pointer in advance.
  • the operation on the ID is the operation on 0x000...00 (that is, the position 0 of the above slot, with ellipses replacing the many 0s in the middle).
  • the operation of sex is the operation of the slot position 0x000...01
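  • The position assignment described above can be sketched in Python. This is a simplified model that follows the text in giving each state variable its own 32-byte slot in declaration order (a real compiler may pack small variables together); `assign_slots` and `short` are hypothetical helper names.

```python
from typing import Dict, List

def assign_slots(state_variables: List[str]) -> Dict[str, bytes]:
    """Map each state variable, in declaration order, to a 32-byte slot position."""
    return {
        name: index.to_bytes(32, "big")
        for index, name in enumerate(state_variables)
    }

def short(slot: bytes) -> str:
    """Abbreviate a slot position the way the text does, e.g. 0x000...01."""
    digits = slot.hex()
    return f"0x{digits[:3]}...{digits[-2:]}"

demo1_slots = assign_slots(["ID", "sex"])
for name, slot in demo1_slots.items():
    print(name, short(slot))
```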
  • the setID() function in the contract is called, and the input parameter is the string "0001".
  • the slot position 0x000...00, which holds ID, is stored as 0001.
  • the virtual machine can push the 32-byte slot 0x000...00 onto the stack, and then push the corresponding value onto the stack.
  • the OPcode of the called function is obtained from memory and starts execution. According to the first-in-last-out or last-in-first-out characteristics of the stack, the value is popped from the stack, and then the slot is popped from the stack to form a slot-value pair.
  • the contract virtual machine executes the current opcode, that is, writes the value of value to the storage at the slot location.
  • the stack generally uses 32 bytes as a storage unit, which is equal to the length of one slot.
  • the corresponding value may be less than, equal to, or greater than 32 bytes, and the value may occupy one or more units in the stack.
  • the setSex() function in the contract is called, and the input parameter is "1" (for example, 1 means a boy, 0 means a girl), then when the virtual machine executes the transaction, 1 is stored in the slot 0x000...01, which holds sex.
  • the above content can also be shown in Figure 9.
  • the bool type is stored, which only occupies 1 byte.
  • the bool value 1 can be stored in the lowest 8 bits (1 byte) of this slot, as shown at the bottom of Figure 9.
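  • As a small illustration of how a 1-byte bool can sit at the low-order end of a 32-byte slot, consider the sketch below. It assumes a big-endian slot word in which the value is right-aligned and the remaining bytes are zero; `encode_slot_value` is a hypothetical helper.

```python
def encode_slot_value(value: int) -> bytes:
    """Encode a small value as a 32-byte slot word, right-aligned (big-endian)."""
    return value.to_bytes(32, "big")

word = encode_slot_value(1)  # the bool value 1 (true)
print(word.hex())

# Only the lowest byte carries the bool; the other 31 bytes stay zero.
low_byte = word[-1]
padding = word[:-1]
```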
  • the virtual machine or blockchain platform can convert the slots in the two slot-value pairs (1) and (2) into state keys.
  • the state key can be obtained by splicing the contract address + slot position.
  • the address of the demo1 contract is 20 bytes long, namely 0x3321dcaf8911d3842e14a7a415be2fb1a337f43e. Then, by splicing the contract address + slot position,
  • the state key of (1) is:
  • the state key of (2) is:
  • these keys can be converted into the storage trie tree as shown in Figure 5 by the tree building module. Then, the tree building module constructs the value in the state key-value into a leaf node of the MPT tree according to the tree structure. It should be noted that, as mentioned above, the state key is divided into several small segments and stored in the tree nodes in order from the root of the storage trie tree to the leaf nodes. As for which segment of the state key is stored in each tree node, it depends on the common prefix between the state key and other state keys in the tree.
  • the slot for reading and writing state variables is fixed during the compilation process. As mentioned above, it is determined by the contract address and the slot position of the state variable.
  • the state key can be obtained by concatenating the contract address and slot. In this way, running the same contract and reading and writing the same state variables will use the same slot and correspond to a fixed state key.
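  • The concatenation rule can be sketched in Python as follows. The contract address is the demo1 address quoted in the text; the byte-level layout (20 address bytes followed by the 32-byte slot position) and the helper name `state_key` are assumptions made for illustration.

```python
CONTRACT_ADDRESS = bytes.fromhex("3321dcaf8911d3842e14a7a415be2fb1a337f43e")

def state_key(contract_address: bytes, slot_index: int) -> bytes:
    """Build a state key as contract address + 32-byte slot position."""
    if len(contract_address) != 20:
        raise ValueError("contract address must be 20 bytes")
    return contract_address + slot_index.to_bytes(32, "big")

key_for_slot_0 = state_key(CONTRACT_ADDRESS, 0)
key_for_slot_1 = state_key(CONTRACT_ADDRESS, 1)
print(key_for_slot_0.hex())
print(key_for_slot_1.hex())
```

  • Because both inputs are fixed once the contract is compiled and deployed, the same state variable always maps to the same state key, which is what makes a change in slot order after an upgrade harmful.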
  • code example 3 is a code example of the upgraded new contract:
  • demo2 compared to demo1, inserts a new state variable age in line 4 of demo2. In this way, the sex originally located on line 4 in demo1 is moved backward in demo2 and becomes line 5.
  • demo2 adds new write and read functions related to state variables in lines 15-17 and 19-21, which are setAge() and getAge() respectively.
  • the new demo2 code will still be compiled by the compiler to obtain bytecode. During the compilation process, similarly, the compiler will generate slot positions for the three state variables ID, age, and sex in the above demo2, which are:
  • these three positions in the slot can be used to replace the identifiers of the three state variables.
  • the above three 256 bits replace ID, age, and sex respectively.
  • the sex declared on line 5 of the upgraded demo2 code is clearly the same state variable, with the same meaning, as the sex declared on line 4 of the demo1 code before the upgrade.
  • the operation of the slot position 0x000...01 after the upgrade changes to the operation of age, which is obviously inconsistent with the sex of the slot position 0x000...01 before the upgrade. Then, if you execute the operation of reading age in the upgraded contract bytecode, you will read the actual sex value at this position before the upgrade, causing confusion.
  • the operation of sex has changed to the operation of the slot position 0x000...02. Then, if you perform the operation of reading sex from the contract bytecode after the upgrade, since there was no value at this position before, you can only read the default null value or 0 value, but cannot read the correct value.
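  • The misread described above can be reproduced with a toy storage model in Python. This is only a sketch: storage is a dictionary from slot index to value, each variable takes the slot given by its declaration order, unset slots read as 0, and the stored values (ID = 7, sex = 1) are made up for illustration.

```python
OLD_LAYOUT = ["ID", "sex"]         # demo1 declaration order
NEW_LAYOUT = ["ID", "age", "sex"]  # demo2 declaration order, age inserted

def write(storage: dict, layout: list, name: str, value: int) -> None:
    storage[layout.index(name)] = value

def read(storage: dict, layout: list, name: str) -> int:
    # Slots that were never written read back as the default value 0.
    return storage.get(layout.index(name), 0)

storage = {}
write(storage, OLD_LAYOUT, "ID", 7)   # written by the old contract
write(storage, OLD_LAYOUT, "sex", 1)  # written by the old contract

# The upgraded contract reads the same storage with its own layout.
age_after_upgrade = read(storage, NEW_LAYOUT, "age")  # slot 1 still holds old sex
sex_after_upgrade = read(storage, NEW_LAYOUT, "sex")  # slot 2 was never written
print(age_after_upgrade, sex_after_upgrade)
```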
  • the compiler compilation process can roughly include lexical/syntactic analysis based on the abstract syntax tree, filling symbols based on the symbol table, semantic analysis, and code generation.
  • lexical/grammatical analysis can be performed on the smart contract code before and after the upgrade based on the abstract syntax tree, and the abstract syntax tree of the contract before and after the upgrade can be generated.
  • the above code example 4 is the generated abstract syntax tree of the demo1 contract before the upgrade, and is marked with //... comments. Among them, the storage location in line 11 is the calculation rule for slot.
  • the information of the contract state variables in the abstract syntax tree is described. Information related to a state variable is placed in a node. Regarding the two state variables ID and sex, they are actually divided into two nodes. Lines 5-19 are the first node about ID, and lines 20-34 are the second node about sex. Each node contains several pieces of information. On the whole, the information inside a node can be called the node information of the abstract syntax tree.
  • the above code example 5 is the generated abstract syntax tree of the upgraded demo2 contract. Since a line declaring the age variable is added between the original ID and sex in the demo2 code (line 4 in demo2), the age node information in lines 20-34 is inserted between the ID node information in lines 5-19 and the sex node information in lines 35-50 of the abstract syntax tree of demo2.
  • S120 Parse the generated abstract syntax tree, and sequentially extract basic information from the node information of each abstract syntax tree.
  • the basic information here can at least include the node order, and further include the state variable name and/or type.
  • the following example is to extract the abstract syntax tree node information including node order, state variable name and state variable type from the demo1 contract code before the upgrade:
  • Node information 1 ID {typeString:uint256,...}
  • Node information 2 sex {typeString:bool,...}
  • Node information 1 ID {typeString:uint256,...}
  • Node information 2 age {typeString:uint256,...}
  • Node information 3 sex {typeString:bool,...}
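  • The extraction step S120 can be sketched in Python over a JSON-like abstract syntax tree. The tree below is hand-written and heavily abridged; the field names `nodeType`, `name`, `stateVariable` and `typeDescriptions.typeString` are assumed to follow the compiler's JSON output, and `extract_state_variables` is a hypothetical helper rather than the method of this application.

```python
from typing import Any, Dict, List, Tuple

def extract_state_variables(ast: Dict[str, Any]) -> List[Tuple[int, str, str]]:
    """Walk the tree in order and collect (node order, name, type) of state variables."""
    result: List[Tuple[int, str, str]] = []

    def visit(node: Any) -> None:
        if isinstance(node, dict):
            if node.get("nodeType") == "VariableDeclaration" and node.get("stateVariable"):
                type_string = node["typeDescriptions"]["typeString"]
                result.append((len(result) + 1, node["name"], type_string))
            for child in node.values():
                visit(child)
        elif isinstance(node, list):
            for child in node:
                visit(child)

    visit(ast)
    return result

# Hand-written, abridged stand-in for the demo1 abstract syntax tree.
demo1_ast = {"nodeType": "ContractDefinition", "name": "demo1", "nodes": [
    {"nodeType": "VariableDeclaration", "name": "ID", "stateVariable": True,
     "typeDescriptions": {"typeString": "uint256"}},
    {"nodeType": "VariableDeclaration", "name": "sex", "stateVariable": True,
     "typeDescriptions": {"typeString": "bool"}},
    {"nodeType": "FunctionDefinition", "name": "setID", "parameters": {"parameters": [
        {"nodeType": "VariableDeclaration", "name": "x", "stateVariable": False,
         "typeDescriptions": {"typeString": "uint256"}}]}},
]}

print(extract_state_variables(demo1_ast))
```

  • The function parameter x is skipped because its stateVariable field is false, so only ID and sex are reported, in declaration order.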
  • the basic information in the abstract syntax tree node information before and after the upgrade in S120 above can be obtained as the following comparison table:
  • the node information of the abstract syntax trees before and after the upgrade can be placed side by side in the same row, in node order. Scanning and comparing row by row then determines whether the left and right entries are the same. In other words, the basic information in node information with the same node number in the abstract syntax trees before and after the upgrade is compared, preferably in the order of node numbers.
  • In Table 2 above, when the row of node information 2 is scanned, the state variable names are compared and found to be different, so the conclusion that the upgrade is incompatible can be drawn.
  • the state key of the state variable is generated by concatenating the contract address and slot.
  • the slots generated in the node information 2 line are the same, for example, 0x000...01, and are not related to the variable names.
  • the value generated by the same slot or state key according to the contract logic after the contract upgrade is another state variable different from the state variable before the upgrade, which is generally inconsistent.
  • the reason it is said to be generally inconsistent is that if only the name of the state variable is changed before and after the upgrade, the contract logic involving the state variable does not change, and no incompatibility is actually caused. However, such a contract upgrade situation is relatively rare.
  • conflicting slots can also be obtained, such as node order 2 in Table 2 above.
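  • The row-by-row comparison and the reporting of conflicting positions can be sketched in Python. The sketch compares only positions present in both contracts and represents basic information as (name, type) pairs; `find_conflicts` is a hypothetical helper name.

```python
from typing import List, Tuple

NodeInfo = Tuple[str, str]  # (state variable name, state variable type)

def find_conflicts(before: List[NodeInfo], after: List[NodeInfo]):
    """Compare node information with the same node number, in node order."""
    conflicts = []
    for order, (old, new) in enumerate(zip(before, after), start=1):
        if old != new:  # a different name or a different type at this position
            conflicts.append((order, old, new))
    return conflicts

demo1 = [("ID", "uint256"), ("sex", "bool")]
demo2 = [("ID", "uint256"), ("age", "uint256"), ("sex", "bool")]

conflicts = find_conflicts(demo1, demo2)
print(conflicts)
```

  • For demo1 and demo2 the only shared position that differs is node order 2, where sex (bool) before the upgrade meets age (uint256) after it.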
  • the basic information/node information of conflicting slot locations can be fed back to developers, for example by generating logs or alarms, or through screen prompts, emails, instant messages, etc. This helps developers or the platform locate the incompatibility and is particularly helpful for developers making modifications.
  • in demo2', compared with demo1, the state variable age is added after the original ID and sex.
  • abstract syntax tree of demo2' is as follows:
  • Node information 1 ID {typeString:uint256,...}
  • Node information 2 sex {typeString:bool,...}
  • Node information 3 age {typeString:uint256,...}
  • the state variable ID and sex before the upgrade have not changed.
  • the state variable names, state variable types and abstract syntax tree node order of these state variables have not changed in the code after the upgrade.
  • the ID and sex after the upgrade keep the same slots as before the upgrade and are therefore compatible.
  • the upgraded state variable is newly added after the pre-upgraded state variable. According to the slot generation rules, it will not affect the slot of the previous state variable, and will not affect the state key, so a compatible conclusion can be drawn. Obviously, in this case, it is easier to compare in node number order.
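  • The append-only case can be checked with a prefix test, sketched below in Python. Treating an upgrade that drops trailing state variables as incompatible is an assumption of this sketch, not something the text states; `is_append_only` is a hypothetical helper name.

```python
from typing import List, Tuple

NodeInfo = Tuple[str, str]  # (state variable name, state variable type)

def is_append_only(before: List[NodeInfo], after: List[NodeInfo]) -> bool:
    """True if every pre-upgrade variable keeps its position, name and type."""
    # Assumption: fewer variables after the upgrade counts as incompatible.
    return len(after) >= len(before) and after[:len(before)] == before

demo1 = [("ID", "uint256"), ("sex", "bool")]
demo2 = [("ID", "uint256"), ("age", "uint256"), ("sex", "bool")]        # age inserted
demo2_prime = [("ID", "uint256"), ("sex", "bool"), ("age", "uint256")]  # age appended

print(is_append_only(demo1, demo2_prime), is_append_only(demo1, demo2))
```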
  • the state variables ID, age, and sex in the above example are of type uint256, uint256, and bool respectively, where uint256 is 256 bits, that is, 32 bytes, and the bool type is 1 byte.
  • the type of these state variables determines that the length of the data is fixed, or fixed-length.
  • there are types such as uint, uint8, uint128, etc., which are also fixed-length.
  • An array of a certain number of fixed-length elements is also fixed-length.
  • uint[2] includes 2 elements. Each element is a uint type of 32 bytes, so the overall uint[2] is 64 bytes.
  • dictionary is a data type of indefinite length.
  • the storage layout of the dictionary is to store Key and its corresponding value, and each Key corresponds to one storage.
  • the storage location corresponding to a Key is keccak256(key.slot), where "." is the splicing (concatenation) symbol, the key before "." is the key of a dictionary element, and the slot after "." is the slot position at which the dictionary name is located.
  • the number of elements in this dictionary is uncertain, and the lengths of key and value in the elements are also uncertain.
  • after the demo2 contract has been called several times, there may be two elements in dictionary a, which are:
  • the name a of the dictionary can be stored.
  • the key of the first item in dictionary a is u1 and the value is 0x18.
  • the storage location of u1 can be keccak256 ("u1".0x000...03), and value can be stored in one or multiple consecutive slots starting from this location (for example, the data length of value is greater than 32 bytes).
  • the key of the second item in dictionary a is u2, and the value is a long hexadecimal number.
  • the storage location of u2 can be keccak256 ("u2".0x000...03), and value can be stored in one or multiple consecutive slots starting from this location (for example, the data length of value is greater than 32 bytes).
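  • The location rule for dictionary elements can be sketched in Python. The sketch assumes `hashlib.sha3_256` as a stand-in for keccak256 (so the resulting numbers will not match a real chain), encodes the key as raw bytes followed by the 32-byte slot position of the dictionary, and uses hypothetical helper names `element_slot` and `value_slots`.

```python
import hashlib
from typing import List

SLOT_COUNT = 2 ** 256  # size of the virtual slot space

def element_slot(key: bytes, mapping_slot: int) -> int:
    """Starting slot of one dictionary element: hash(key . slot)."""
    data = key + mapping_slot.to_bytes(32, "big")  # "." means concatenation
    # Assumption: sha3_256 stands in for keccak256.
    return int.from_bytes(hashlib.sha3_256(data).digest(), "big")

def value_slots(key: bytes, mapping_slot: int, value_length: int) -> List[int]:
    """Consecutive slots occupied by a value of value_length bytes."""
    start = element_slot(key, mapping_slot)
    count = max(1, -(-value_length // 32))  # ceiling division, at least one slot
    return [(start + i) % SLOT_COUNT for i in range(count)]

u1_slots = value_slots(b"u1", 3, 1)   # short value: fits in one slot
u2_slots = value_slots(b"u2", 3, 40)  # 40-byte value: needs two slots
print(len(u1_slots), len(u2_slots))
```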
  • the value length of u1 is less than 32 bytes, and its value can be stored in one slot.
  • the location of this slot is the value of keccak256("u1".0x000...03), for example:
  • the value length of u2 is greater than 32 bytes and less than 64 bytes, so its value can be stored in two consecutive slots. The starting position of these two slots is, for example, the value of keccak256("u2".0x000...03). The locations of the two slots are then:
  • Whether the data of the above-mentioned variable length is compatible can also be determined by executing the above-mentioned processes of S110 to S130.
  • the abstract syntax tree node information generated for the above dictionary structure is:
  • each slot in the variable-length data structure needs to be traversed, because the positions of these slots may be the basis for calculating the final state key.
  • the starting position can be used as the basis for calculating the state key.
  • composite structure of structure and dictionary is as follows solidity code:
  • a structure StructDemo is declared in lines 7-10, which includes two elements, c_ of uint256 type and d_ of bytes type. Then, a dictionary is declared on line 11. This dictionary is a mapping of uint to the structure StructDemo.
  • the abstract syntax tree generated according to S110 includes:
  • Lines 6-17 of the above abstract syntax tree are the node information of uint256 c_ in the structure StructDemo, and lines 18-31 are the node information of bytes d_ in the structure StructDemo.
  • the bytes type is also 32 bytes. It should be noted that lines 10 and 23 are both "stateVariable": false, indicating that neither c_ nor d_ is a state variable, and neither will be stored in the underlying database.
  • the generated abstract syntax tree is parsed, and the basic information in the node information of each abstract syntax tree is sequentially and recursively extracted.
  • although stateVariable in the node information of the structure's own members is false, the structure, as a value in the dictionary, has its own slot and state key.
  • Node information 1 ID {typeString:uint256,...}
  • Node information 2 age {typeString:uint256,...}
  • Node information 3 sex {typeString:bool,...}
  • Node information 7 map2_StructDemo_1:c_ {typeString:uint256,...}
  • Node information 8 map2_StructDemo_2:d_ {typeString:bytes,...}
  • Node information 1 ID {typeString:uint256,...}
  • Node information 2 age {typeString:uint256,...}
  • Node information 3 sex {typeString:bool,...}
  • node information 2 the nested struct structure needs to be compared recursively. If the abstract syntax tree node information extracted before and after the upgrade, node information 4 and the structures nested in it are consistent, then it is compatible (assuming that node information 1, 2, and 3 are all consistent), otherwise it is incompatible.
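  • The recursive comparison of a nested structure can be sketched in Python by flattening each tree into an ordered list before comparing. The node shape (`name`, `type`, `members`), the qualified-name format and the sample layouts are hypothetical and simpler than the node information listed above.

```python
from typing import Any, Dict, List, Tuple

def flatten(nodes: List[Dict[str, Any]], prefix: str = "") -> List[Tuple[str, str]]:
    """Recursively list (qualified name, type) pairs in declaration order."""
    flat: List[Tuple[str, str]] = []
    for node in nodes:
        name = prefix + node["name"]
        flat.append((name, node["type"]))
        # Descend into nested members (struct fields) before the next sibling.
        flat.extend(flatten(node.get("members", []), prefix=name + ":"))
    return flat

def struct_map(members: List[Dict[str, Any]]) -> Dict[str, Any]:
    return {"name": "map2", "type": "mapping(uint256 => struct StructDemo)",
            "members": members}

c_field = {"name": "c_", "type": "uint256"}
d_field = {"name": "d_", "type": "bytes"}

before = [struct_map([c_field, d_field])]
after_same = [struct_map([c_field, d_field])]
after_swapped = [struct_map([d_field, c_field])]  # member order changed

print(flatten(before) == flatten(after_same), flatten(before) == flatten(after_swapped))
```

  • Flattening first keeps the comparison itself a plain ordered list comparison, so a change deep inside a nested structure is detected in the same way as a change to a top-level state variable.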
  • the upgraded contract writing method needs to meet certain specifications to make the upgrade compatible before and after, that is, the state in the upgraded new contract needs to maintain the ability to read the value of the same state in the old contract. Users often ignore these specifications when writing upgraded contracts, resulting in serious problems such as data loss and confusion in the upgraded contracts.
  • a solidity contract upgrade storage data compatibility detection solution based on abstract syntax trees can be implemented.
  • a contract upgrade compatibility detection device of this application includes: an abstract syntax tree generation unit, used to generate abstract syntax trees of the contract before and after the upgrade; an extraction unit, used to parse the generated abstract syntax trees and sequentially extract the basic information in the node information of each abstract syntax tree; and a comparison unit, used to compare the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.
  • the extraction unit parses the generated abstract syntax tree, and if the state variable in the node information is true, sequentially extracts the basic information in the node information of each abstract syntax tree.
  • the abstract syntax tree generation unit performs lexical/grammatical analysis on the smart contract code before and after the upgrade based on the abstract syntax tree, and generates the abstract syntax tree of the contract before and after the upgrade.
  • the basic information includes the node order of the abstract syntax tree, and further includes state variable names and/or types.
  • the comparison unit compares basic information in node information with the same node number in the abstract syntax tree before and after the upgrade.
  • the comparison unit compares the basic information in the node information with the same node number in the abstract syntax tree before and after the upgrade according to the node number sequence.
  • the comparison unit determines incompatibility if the state variable names obtained by comparison are different, or if the state variable types obtained by comparison are different, or if both the state variable names and the state variable types are different.
  • the comparison unit compares the upgraded state variable with a new state variable added after the pre-upgraded state variable, and determines that it is compatible.
  • the extraction unit parses the generated abstract syntax tree, and sequentially and recursively extracts basic information in the node information of each abstract syntax tree.
  • the detection device also includes a feedback unit, where the comparison result of the comparison unit is incompatible, and the feedback unit feeds back the basic information/node information of the conflicting slot position.
  • the following introduces a client embodiment of the present application, which includes: a processor, a memory, and a program stored therein.
  • when the processor executes the program, the method described in any of the above embodiments is performed, for example to detect the compatibility of a contract upgrade.
  • PLD Programmable Logic Device
  • FPGA Field Programmable Gate Array
  • HDL Hardware Description Language
  • the controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller.
  • examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicon Labs C8051F320. A memory controller can also be implemented as part of the memory's control logic.
  • in addition to implementing the controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Therefore, this controller can be considered a hardware component, and the devices included in it for implementing various functions can also be considered structures within the hardware component. The devices for implementing various functions can even be considered both software modules implementing the method and structures within the hardware component.
  • the systems, devices, modules or units described in the above embodiments may be implemented by computer chips or entities, or by products with certain functions.
  • a typical implementation device is a server system.
  • the computer that implements the functions of the above embodiments may be, for example, a personal computer, a laptop computer, a vehicle-mounted human-computer interaction device, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet, a wearable device, or a combination of any of these devices.
  • the functions are divided into various modules and described separately.
  • the functions of each module can be implemented in the same or multiple software and/or hardware, or the modules that implement the same function can be implemented by a combination of multiple sub-modules or sub-units, etc.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation, there may be other ways of dividing them.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the devices or units may be in electrical, mechanical or other forms.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in a process or processes of the flowchart and/or a block or blocks of the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operating steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in a process or processes of the flowchart and/or a block or blocks of the block diagram.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-permanent storage in computer-readable media, random access memory (RAM) and/or non-volatile memory in the form of read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
  • Computer-readable media includes both permanent and non-permanent, removable and non-removable media that can store information by any method or technology.
  • Information may be computer-readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD), magnetic tape storage, graphene storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • computer-readable media does not include transitory media, such as modulated data signals and carrier waves.
  • one or more embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, one or more embodiments of the present description may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • program modules may also be practiced in distributed computing environments where tasks are performed by remote processing devices connected through a communications network.
  • program modules may be located in both local and remote computer storage media including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

One or more embodiments of the present description provide a method and apparatus for detecting compatibility of contract upgrading, and a client, which are applied to the field of blockchains. The method for detecting compatibility of contract upgrading comprises: generating abstract syntax trees of a contract before and after upgrading; parsing the generated abstract syntax trees, and sequentially extracting basic information in node information of each abstract syntax tree; and comparing the basic information in the node information of the abstract syntax trees before and after upgrading to obtain a compatibility conclusion.

Description

Method and device for detecting compatibility of contract upgrades
Technical field
The embodiments of this specification belong to the field of blockchain technology, and in particular relate to a method and device for detecting the compatibility of contract upgrades.
Background
Blockchain is a new application model of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. In a blockchain system, data blocks are linked in chronological order into a chained data structure, forming a distributed ledger that is cryptographically guaranteed to be tamper-proof and unforgeable. Because blockchain is decentralized, tamper-resistant, and autonomous, it has received increasing attention and seen increasingly wide application.
Summary of the invention
The purpose of this disclosure is to provide a method and device for detecting the compatibility of contract upgrades, including: a method for detecting the compatibility of contract upgrades, including: generating abstract syntax trees of the contract before and after the upgrade; parsing the generated abstract syntax trees and sequentially extracting the basic information in the node information of each abstract syntax tree; and comparing the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.
A device for detecting the compatibility of contract upgrades, including: an abstract syntax tree generation unit, used to generate abstract syntax trees of the contract before and after the upgrade; an extraction unit, used to parse the generated abstract syntax trees and sequentially extract the basic information in the node information of each abstract syntax tree; and a comparison unit, used to compare the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.
A client, including a processor and a memory storing a program, wherein the above method is performed when the processor executes the program.
The way an upgraded contract is written must follow certain rules for the versions before and after the upgrade to be compatible; that is, the state in the upgraded new contract must retain the ability to read the values of the same state in the old contract. Users often overlook these rules when writing upgraded contracts, which leads to serious problems such as data loss and data corruption in the upgraded contract. The above solution provides abstract-syntax-tree-based detection of storage-data compatibility for upgrades of contracts written in Solidity and similar languages.
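The generate, extract, and compare steps summarized above can be sketched roughly in Python. This is a minimal illustration, not the rule set of this disclosure: the `StateVar` type, the `check_compatibility` function, and the "existing variables keep their position, type, and name" rule are assumptions, and the lists of state variables are written by hand instead of being extracted from real abstract syntax trees.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StateVar:
    name: str  # identifier declared in the contract
    type: str  # declared type, e.g. "string" or "uint256"


def check_compatibility(old: List[StateVar], new: List[StateVar]) -> List[str]:
    """Return a list of problems; an empty list means no issue was found.

    Assumed rule (hypothetical): every state variable of the old contract
    must keep its declaration position, type, and name in the new contract.
    """
    problems = []
    if len(new) < len(old):
        problems.append("state variables were removed")
    for i, (o, n) in enumerate(zip(old, new)):
        if o.type != n.type:
            problems.append(f"position {i}: type changed from {o.type} to {n.type}")
        elif o.name != n.name:
            problems.append(f"position {i}: name changed from {o.name} to {n.name}")
    return problems


old_vars = [StateVar("storedData", "string")]
ok_upgrade = old_vars + [StateVar("counter", "uint256")]  # appended at the end
bad_upgrade = [StateVar("counter", "uint256")] + old_vars  # inserted in front

print(check_compatibility(old_vars, ok_upgrade))
print(check_compatibility(old_vars, bad_upgrade))
```

Appending a variable leaves the old declarations untouched, while inserting one in front shifts them, which is the kind of change such a check is meant to flag.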
Description of the drawings
To explain the technical solutions of the embodiments of this specification more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in this specification; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a schematic diagram of deploying a smart contract in an embodiment;
Figure 2 is a schematic diagram of calling a smart contract in an embodiment;
Figure 3 is a schematic diagram of a block storage structure in an embodiment;
Figure 4 is a schematic diagram of a block storage structure in an embodiment;
Figure 5 is a schematic diagram of an MPT tree in an embodiment;
Figure 6 is a schematic diagram of the modules involved in transaction processing and their relationship to the CPU, memory, and disk in an embodiment;
Figure 7 is a schematic diagram of the EVM virtual machine module involved in transaction processing in an embodiment;
Figure 8 is a schematic diagram of the slot structure in an embodiment;
Figure 9 is a schematic diagram of the slot structure in an embodiment;
Figure 10 is a flow chart of a method for detecting the compatibility of contract upgrades in an embodiment;
Figure 11 is a schematic diagram of the slot structure in an embodiment;
Figure 12 is a schematic diagram of the slot structure in an embodiment.
Detailed description
To enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this specification, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this specification without creative effort shall fall within the scope of protection of this specification.
Blockchains are generally divided into three types: public blockchains, private blockchains, and consortium blockchains. There are also combinations of these types, such as private + consortium and consortium + public. The public blockchain is the most decentralized. Participants who join a public blockchain can read data records on the chain, take part in transactions, and compete for the right to record new blocks. Each participant (embodied as the participant's node on the blockchain) can freely join and leave the network and perform related operations. A private blockchain is the opposite: write permission to the network is controlled by a single organization or institution, and read permission is regulated by that organization. In short, a private blockchain can be a weakly centralized system with few, strictly limited participating nodes. This type of blockchain is better suited to internal use within a specific institution. A consortium blockchain lies between the public and private types and can achieve "partial decentralization". Each node in a consortium blockchain usually corresponds to an entity or organization; participants join the network through authorization and form an alliance of stakeholders that jointly maintains the operation of the blockchain.
Whether public, private, or consortium, a blockchain can provide smart contract functionality in addition to supporting transfers of the chain's native assets between accounts. A smart contract on a blockchain is a contract whose execution can be triggered by a transaction on the blockchain system. Smart contracts can be defined in the form of code.
Smart contracts allow users to create and invoke complex logic in the blockchain network, which is the greatest challenge that sets programmable blockchains apart from the original blockchain technology. The core of a programmable blockchain is the virtual machine (EVM), and every blockchain node can run the EVM. The EVM is a Turing-complete virtual machine, which means that all kinds of complex logic can be implemented with it. Smart contracts that users publish and call on the blockchain run on the EVM. What the virtual machine actually runs directly is virtual machine code (virtual machine bytecode, hereinafter "bytecode"). A smart contract deployed on the blockchain can be in the form of bytecode.
For example, as shown in Figure 1, after Bob sends a transaction containing the information for creating a smart contract to the blockchain network, the EVM of node 1 can execute this transaction and generate the corresponding contract instance. "0x6f8ae93…" in Figure 1 represents the address of this contract; the data field of the transaction can hold the bytecode, and the to field of the transaction is an empty account. Once the nodes reach agreement through the consensus mechanism, the contract is successfully created, and users can subsequently call it.
After the contract is created, a contract account corresponding to the smart contract appears on the blockchain with a specific address, and the contract code and account storage are saved in this contract account. The behavior of a smart contract is controlled by the contract code, while the account storage of the smart contract holds the state of the contract. In other words, a smart contract causes a virtual account containing contract code and account storage (Storage) to be created on the blockchain.
As mentioned above, the data field of a transaction that creates a smart contract can hold the bytecode of that smart contract. Bytecode consists of a sequence of bytes, each of which can identify one operation. For reasons of development efficiency, readability, and so on, developers need not write bytecode directly and can instead write smart contract code in a high-level language. Smart contract code written in a high-level language is compiled by a compiler into bytecode, which can then be deployed on the blockchain. Many high-level languages are supported, such as Solidity, Serpent, and LLL.
Taking Solidity as an example, a contract written in it is very similar to a class in an object-oriented programming language. A contract can declare several kinds of members, including state variables, functions, function modifiers, and events. A state variable is a value stored in the account storage of the smart contract and is used to hold the state of the contract.
The following is code example 1, a simple smart contract written in Solidity:
Figure PCTCN2022135220-appb-000001
Code example 1. SimpleStorage code
In code example 1 above, line 2 declares the state variable storedData of type string, and line 3 declares an event whose stored content is the address of the initiator that called the contract and the string s. Lines 4-7 define the set function, whose input parameter is the string s. The set function assigns the input parameter to the state variable storedData and generates an event whose content includes the address of the initiator that called the contract and the string s.
As mentioned above, state variables are ultimately stored in the database. A generated event generally has the following form:
[topic1][topic2]...[topicn][data]
Here, the first topic, topic1, is generally a default value, for example the identifier of the receipt, which can be a hash taken over the event name, the event parameter types, and so on, concatenated in order. Whether each of topic2 to topicn exists depends on whether the indexed modifier was added when the parameter was defined: if it was, the value of that parameter becomes a topic in the receipt, whereas parameters without the indexed modifier are generally placed in data. In code example 1 above, the event declared on line 3 has two parameters, address from and s, neither of which is marked indexed, so both are generally placed in data. The code on line 6 uses the stored() event to set the data content of the event to [msg.sender, s]. The event produced by line 6 therefore has the overall form:
[topic1: event identifier][data: msg.sender, s]
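The split between topics and data described above can be illustrated with a small Python sketch. The `build_event_log` function and its parameter triples are hypothetical, values are kept as plain Python objects instead of 32-byte words, and `hashlib.sha3_256` is only a stand-in for the hash used on a real chain (typically Keccak-256, which produces different digests).

```python
import hashlib
from typing import List, Tuple


def build_event_log(name: str, params: List[Tuple[str, object, bool]]) -> dict:
    """Build a simplified event record.

    params holds (type, value, is_indexed) triples in declaration order.
    """
    signature = f"{name}({','.join(t for t, _, _ in params)})"
    # topic1: hash of the event name and parameter types (stand-in hash, see above)
    topics = [hashlib.sha3_256(signature.encode()).hexdigest()]
    data = []
    for _type, value, is_indexed in params:
        # indexed parameters become topics; all others go into data
        (topics if is_indexed else data).append(value)
    return {"topics": topics, "data": data}


# Neither parameter is indexed, as in code example 1: only topic1 plus data.
log = build_event_log("stored", [("address", "0xabc", False), ("string", "hello", False)])
print(len(log["topics"]), log["data"])
```

Marking the first parameter as indexed in this sketch would move its value out of `data` and into a second topic.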
Lines 8-10 define the get function, which returns the queried value of storedData. returns(string) indicates the type of the return value, and the constant modifier indicates that the function cannot modify the values of the state variables in the contract.
In addition, as shown in Figure 2, after Bob sends a transaction containing the information for calling a smart contract to the blockchain network, the EVM of node 1 can execute this transaction and generate the corresponding contract instance. In Figure 2, the from field of the transaction is the address of the account that initiates the call, "0x6f8ae93…" in the to field represents the address of the called smart contract, and the data field holds the method and parameters of the call. A value field may also be included to indicate the amount of ether in the transaction. After the smart contract is called, the value of storedData may change. Later, a client can view the current value of storedData through a blockchain node (for example node 6 in Figure 2).
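The difference between the creation transaction of Figure 1 (empty to field, bytecode in data) and the call transaction of Figure 2 (contract address in to, method and parameters in data) can be modelled as follows. The `Transaction` class, the `kind` helper, and the sample addresses and payloads are illustrative assumptions, not an actual client API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    sender: str         # the "from" field
    to: Optional[str]   # None models the empty "to" of a creation transaction
    data: bytes         # bytecode (creation) or method and parameters (call)
    value: int = 0      # optional amount transferred with the transaction


def kind(tx: Transaction) -> str:
    """Classify a transaction by whether its "to" field is empty."""
    return "create" if tx.to is None else "call"


create_tx = Transaction(sender="0xb0b", to=None, data=b"\x60\x60\x60\x40")
call_tx = Transaction(sender="0xb0b", to="0xc0ffee", data=b"set(hello)")
print(kind(create_tx), kind(call_tx))
```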
A smart contract can be executed independently on every node in the blockchain network in a prescribed manner, and all execution records and data are stored on the blockchain, so once such a transaction completes, the blockchain holds a transaction record that cannot be tampered with and will not be lost.
As mentioned above, storedData in the example is a state variable, which is stored in the account storage of the smart contract. In blockchain networks that support smart contracts, accounts are usually of two types:
Contract account: stores the smart contract code to be executed and the values of the state in that code; it can usually be activated only by a call from an external account;
Externally owned account: a user's account, for example the account of an ether holder.
The design of external accounts and contract accounts is in effect a mapping from account address to account state. The account state usually includes the fields Nonce, Balance, Storage root, and CodeHash. Nonce and Balance exist in both external accounts and contract accounts. The CodeHash and Storage root attributes are generally meaningful only for contract accounts.
Nonce: a counter. For an external account, this number can represent the number of transactions sent from the account address; for a contract account, it can be the number of contracts created by the account.
Balance: the amount of ether owned by this address.
Storage root: the hash of the root node of an MPT that organizes the storage of the contract account's state variables.
CodeHash: the hash of the smart contract code. For a contract account, this is the hash of the smart contract; for an external account, which contains no smart contract, the CodeHash field can generally be an empty string or an all-zero string.
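The four account fields listed above can be pictured as a small record. The following Python sketch is only an illustration: the `AccountState` class and the `EMPTY` placeholder are assumptions, and the empty byte string stands for the "empty string / all-zero string" mentioned in the text.

```python
from dataclasses import dataclass

EMPTY = b""  # placeholder for the empty / all-zero value of an unused field


@dataclass
class AccountState:
    nonce: int = 0               # transaction counter (or created-contract counter)
    balance: int = 0             # amount of ether owned by the address
    storage_root: bytes = EMPTY  # root hash of the account's storage trie
    code_hash: bytes = EMPTY     # hash of the contract code

    def is_contract(self) -> bool:
        # In this sketch an account counts as a contract account
        # exactly when it carries a code hash.
        return self.code_hash != EMPTY


external = AccountState(nonce=3, balance=100)
contract = AccountState(nonce=1, storage_root=b"\x11" * 32, code_hash=b"\x22" * 32)
print(external.is_contract(), contract.is_contract())
```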
MPT stands for Merkle Patricia Tree, a tree structure that combines the Merkle tree with the Patricia tree (a compressed prefix tree, a more space-efficient form of trie, or dictionary tree). In a Merkle tree, a hash value is computed for each transaction, the hashes are then concatenated in pairs and hashed again, and so on up to the top-level Merkle root. Some blockchain networks use an improved MPT, for example with a 16-ary tree structure, which is usually also simply called an MPT.
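The pairwise hashing that produces a Merkle root can be sketched in a few lines of Python. The choice of SHA-256 and the rule of duplicating the last hash when a level has an odd number of entries are assumptions made for this illustration; actual chains differ in both respects.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Hash every leaf, then hash adjacent pairs until one root remains."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # odd count: duplicate the last hash (assumed rule)
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"tx1", b"tx2", b"tx3"])
print(root.hex())
```

Because every parent hash depends on its children, changing any single transaction changes the root.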
The data structures built on the MPT include the state tree (state trie). The state trie holds a key-value pair (also written key-value, k-v, or kv) for the stored content of each account in the blockchain network. The key in the state trie can be a 160-bit identifier (for example the address of a blockchain account, or part of the hash of the address; both are referred to below as the account address), and this account address is distributed across the nodes on the path from the root node of the state trie to a leaf node. The value in the state trie is generated by encoding the account's information with Recursive-Length Prefix (RLP) encoding. As mentioned above, for an external account the value includes nonce and balance; for a contract account it includes nonce, balance, codehash, and storageroot.
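To make the RLP step concrete, here is a compact Python sketch of the encoding rules as they are commonly documented for Ethereum. The helper names (`rlp_encode`, `int_to_bytes`) and the sample account values are illustrative; integers are assumed to be encoded as big-endian bytes without leading zeros, and decoding is omitted.

```python
from typing import List, Union

Item = Union[bytes, List["Item"]]


def _length_prefix(length: int, offset: int) -> bytes:
    """Short payloads get a one-byte prefix; longer ones a length-of-length prefix."""
    if length < 56:
        return bytes([offset + length])
    len_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(len_bytes)]) + len_bytes


def rlp_encode(item: Item) -> bytes:
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single byte below 0x80 encodes as itself
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(x) for x in item)  # lists are encoded recursively
    return _length_prefix(len(payload), 0xC0) + payload


def int_to_bytes(n: int) -> bytes:
    """Big-endian bytes without leading zeros; 0 becomes the empty string."""
    return n.to_bytes((n.bit_length() + 7) // 8, "big")


# [nonce, balance, storageRoot, codeHash] for an external account (sample values).
account = [int_to_bytes(1), int_to_bytes(1000), b"", b""]
encoded_account = rlp_encode(account)
print(encoded_account.hex())
```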
A contract account is used to store the state related to a smart contract. After a smart contract is deployed on the blockchain, a corresponding contract account is created. This contract account generally has some states, which are defined by the state variables in the smart contract and take new values when the smart contract is executed. A smart contract here generally means a contract, defined as code in a blockchain environment, whose terms can be executed automatically: once an event triggers a term in the contract (the execution conditions are met), the code executes automatically. In the blockchain, the state of a contract is stored in the storage trie, and the hash of the storage trie's root node is stored in the storageroot mentioned above, so that all the states of the contract are locked to the contract account through this hash. The storage trie is also an MPT structure and stores the key-value mapping from state address to state value. Part of the information in each node on the path from the root of the storage trie to a leaf node, arranged in order, forms the address of a state, and the leaf node stores the value of that state.
As shown in Figure 3, in some blockchain data storage, the block header of each block includes several fields, such as the previous block hash previous_Hash (Prev Hash in the figure), the nonce Nonce (in some blockchain systems this Nonce is not a random number, and in some the Nonce in the block header is not used), the timestamp Timestamp, the block number Block Num, the state root hash State_Root, the transaction root hash Transaction_Root, and the receipt root hash Receipt_Root. The Prev Hash in the header of the next block (for example block N+1) points to the previous block (for example block N); that is, it is the hash value of the previous block. In this way, each block locks its predecessor through the block header. State_Root, Transaction_Root, and Receipt_Root lock the state set, the transaction set, and the receipt set respectively, each of which organizes its states, transactions, or receipts as a tree. These trees may share the same structure or use different structures; for example, the MPT-based blockchain network described above uses the same MPT structure for all three. In some tree structures holding smart-contract state, there are two levels of MPT: the leaf nodes of the upper-level MPT are of two types, external accounts and contract accounts, and each contract account contains a lower-level MPT whose leaf nodes hold the values of the states in the contract account.
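How the Prev Hash field locks each block to its predecessor can be shown with a toy chain of headers. In this Python sketch the header layout, the JSON serialization, and the SHA-256 hash are all simplifying assumptions; real block headers carry more fields and use a chain-specific encoding.

```python
import hashlib
import json
from typing import List, Optional


def header_hash(header: dict) -> str:
    """Hash a header over a canonical (key-sorted) JSON serialization."""
    encoded = json.dumps(header, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def make_header(prev: Optional[dict], state_root: str) -> dict:
    """Create a header whose prev_hash is the hash of the previous header."""
    return {
        "prev_hash": header_hash(prev) if prev is not None else "0" * 64,
        "block_num": prev["block_num"] + 1 if prev is not None else 0,
        "state_root": state_root,
    }


def chain_is_valid(headers: List[dict]) -> bool:
    """Every header must carry the hash of the header before it."""
    return all(
        headers[i]["prev_hash"] == header_hash(headers[i - 1])
        for i in range(1, len(headers))
    )


genesis = make_header(None, "root-0")
block1 = make_header(genesis, "root-1")
block2 = make_header(block1, "root-2")
print(chain_is_valid([genesis, block1, block2]))
```

Altering any field of an earlier header, including its state root, changes that header's hash and breaks the link stored in the next block.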
Figure 4 is a schematic diagram of the structure of blockchain data storage. With reference to Figure 3, state_root is the hash of the root of the MPT formed from the states of all accounts in the current block; that is, state_root points to a state trie in MPT form. The root node of this MPT is generally an Extension Node or a Branch Node, and state_root generally stores the hash of this root node. The root node can be connected to one or more layers of Extension Nodes and Branch Nodes below it; these multi-layer tree nodes can be collectively called intermediate nodes (Internal Nodes). Part of the value held in each node on the path from the MPT root to a leaf node, concatenated in order, forms the account address and serves as the key, and the account information stored in the leaf node is the value corresponding to that account address, so that together they form a key-value pair. The key can also be part of sha3(Address), that is, part of the hash of the account address (the hash algorithm being, for example, sha3), and the stored value can be rlp(Account), that is, the RLP encoding of the account information. The account information is a four-tuple [nonce, balance, storageRoot, codeHash]. As mentioned above, an external account generally has only the nonce and balance items, and its storageRoot and codeHash fields store an empty string or all-zero string by default. In other words, an external account stores neither a contract nor the state variables produced by executing a contract. A contract account generally includes Nonce, Balance, Storage root, and CodeHash, where Nonce is the transaction counter of the contract account, Balance is the account balance, Storage root corresponds to another MPT through which the state information related to the contract can be reached, and CodeHash is the hash of the contract code. Whether for an external account or a contract account, the account information is generally located in a separate Leaf Node. The path from the root Extension Node or Branch Node to the Leaf Node of an account may pass through several branch nodes and extension nodes.
The state trie may be a tree in MPT form, generally a 16-ary tree, that is, a node can have up to 16 child nodes. An Extension Node stores a common prefix and generally has one child node, which may be a Branch Node. A Branch Node can have up to 16 child nodes, which may include Extension Nodes and/or Leaf Nodes.
For a contract account in the state trie, its storage_Root points to another tree, also in MPT form, which stores the data of the state variables involved in contract execution. The MPT pointed to by storage_Root is the Storage Trie; that is, storage_Root is the hash of the root node of the Storage Trie. In general, the Storage Trie also stores key-value pairs. The key indicates the address of a state variable; it may be the result of processing the position at which the state variable is declared in the contract (counted from 0) according to some rule, for example sha3(declaration position of the state variable) or sha3(contract name + declaration position of the state variable). The value stores the value of the state variable (for example, an RLP-encoded value). The pieces of data stored along the path from the root node through the intermediate nodes to the leaf node are concatenated to form the key, and the leaf node stores the value. As mentioned above, the Storage Trie may also be a tree in MPT form, generally a 16-ary tree: a Branch Node can have up to 16 child nodes, which may include Extension Nodes and/or Leaf Nodes, while an Extension Node generally has one child node, which may be a Branch Node or a Leaf Node.
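The two key-derivation rules mentioned above, sha3(declaration position) and sha3(contract name + declaration position), can be sketched as below. The function name, the 32-byte big-endian encoding of the position, and the use of `hashlib.sha3_256` as the hash are assumptions made for illustration; an actual platform may encode and hash differently.

```python
import hashlib


def storage_key(position: int, contract_name: str = "") -> str:
    """Derive a Storage Trie key from a state variable's declaration position.

    With an empty contract_name this models sha3(position); otherwise it
    models sha3(contract name + position).
    """
    data = contract_name.encode("utf-8") + position.to_bytes(32, "big")
    return hashlib.sha3_256(data).hexdigest()


key_first = storage_key(0)            # first declared state variable
key_second = storage_key(1)           # second declared state variable
key_scoped = storage_key(0, "demo")   # same position, scoped by a contract name
```

Because the key depends only on the declaration position (and optionally the contract name), two variables declared at different positions map to different leaves of the Storage Trie.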
For example, Leaf Node Account P of the state trie in Figure 4 is a contract account whose Storage Root commits to all of the states in that contract's storage. These states are organized as an MPT, whose structure is shown by the Storage Trie linked from that Storage Root. In this linked Storage Trie, take Leaf Node State Variable N as an example. If it holds the value of storedData from the contract code example above, its key is sha3(declaration position of storedData), described in detail later, and its value is s (for brevity, the encoding format of the value, such as RLP, is omitted here and in similar cases below). The nibbles of the key are distributed in order along the path from the root node of the Storage Trie to the leaf node (that is, Leaf Node State Variable N).
As another example, Leaf Node Account C of the state trie in Figure 4 is an external account. Its key is sha3(Address C), that is, the hash of the address of account C (computed, for example, with the sha3 algorithm), and its stored value may be (Account), where the account information Account is the pair [nonce, balance]. As mentioned above, because Account C is an external account, its account information consists of nonce and balance (codehash and storage root are omitted here and in similar cases below). For example, for an external account with a nonce of 20 and a balance of 4550, the leaf node Leaf Node Account C stores nonce=20 and balance=4550. The hashed address of Account C is the key, and its nibbles are distributed in order along the path from the root node of the state trie to that leaf node.
These states, including the k-v pairs of external accounts and of contract accounts, are ultimately stored in a database. The database does not store the account states directly, that is, it does not store the account k-v pairs themselves; instead, it stores a k-v pair for each tree node.
As shown in the example of Figure 5, in the upper-level MPT structure, the key of leaf node A1 is formed by concatenating, in order, the shared nibbles a7 in root node A8 (Extension Node), slot 1 of intermediate node A7 (Branch Node), and the key-end 1335 in leaf node A1, giving a711335; this leaf node stores Balance=45.0ETH, Nonce=n1. The key of leaf node A2 is formed by concatenating the shared nibbles a7 in root node A8 (Extension Node), slot 7 of intermediate node A7 (Branch Node), the shared nibbles d3 in node A6 (Extension Node), slot 3 of intermediate node A5 (Branch Node), and the key-end 7 in leaf node A2, giving a77d337; this leaf node stores Balance=1.00WEI, Nonce=n2. The key of leaf node A3 is formed by concatenating the shared nibbles a7 in root node A8 (Extension Node), slot f of intermediate node A7 (Branch Node), and the key-end 9365 in leaf node A3, giving a7f9365; this leaf node stores Balance=1.1ETH, Nonce=n3. The key of leaf node A4 is formed by concatenating the shared nibbles a7 in root node A8 (Extension Node), slot 7 of intermediate node A7 (Branch Node), the shared nibbles d3 in node A6 (Extension Node), slot 9 of intermediate node A5 (Branch Node), and the key-end 7 in leaf node A4, giving a77d397; this leaf node stores Balance=0.12ETH, Nonce=n4, CodeHash=c1, Storage root=s1. Here s1 may be H(A10), the hash of the root node A10 of the next-level tree. Leaf nodes A1, A2 and A3 store the information of external accounts, and leaf node A4 stores the information of a contract account. A contract account contains a next-level MPT, the Storage Trie, which stores the state variables of that contract account.
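The way a key is consumed along the path of the upper-level trie in Figure 5 can be sketched as a small lookup routine. For readability the nodes are indexed by their figure labels (A1 to A8) rather than by hash, and the dictionary layout and the function name `lookup` are illustrative assumptions, not the platform's data structures.

```python
# Upper-level trie of Figure 5, indexed by figure label for readability.
NODES = {
    "A8": {"type": "extension", "shared": "a7", "next": "A7"},
    "A7": {"type": "branch", "slots": {"1": "A1", "7": "A6", "f": "A3"}},
    "A6": {"type": "extension", "shared": "d3", "next": "A5"},
    "A5": {"type": "branch", "slots": {"3": "A2", "9": "A4"}},
    "A1": {"type": "leaf", "key_end": "1335",
           "value": {"balance": "45.0ETH", "nonce": "n1"}},
    "A2": {"type": "leaf", "key_end": "7",
           "value": {"balance": "1.00WEI", "nonce": "n2"}},
    "A3": {"type": "leaf", "key_end": "9365",
           "value": {"balance": "1.1ETH", "nonce": "n3"}},
    "A4": {"type": "leaf", "key_end": "7",
           "value": {"balance": "0.12ETH", "nonce": "n4",
                     "codehash": "c1", "storage_root": "H(A10)"}},
}


def lookup(key: str, root: str = "A8"):
    """Walk from the root, consuming nibbles of `key`; return the leaf value or None."""
    label, rest = root, key
    while label is not None:
        node = NODES[label]
        if node["type"] == "leaf":
            # The remaining nibbles must equal the leaf's key-end.
            return node["value"] if rest == node["key_end"] else None
        if node["type"] == "extension":
            # The shared nibbles must be a prefix of what remains.
            if not rest.startswith(node["shared"]):
                return None
            rest = rest[len(node["shared"]):]
            label = node["next"]
        else:
            # Branch node: the next nibble selects one of up to 16 slots.
            if not rest:
                return None
            label = node["slots"].get(rest[0])
            rest = rest[1:]
    return None
```

For instance, `lookup("a77d397")` passes through A8, A7, A6 and A5 before reaching the contract account stored in A4, while a key with no matching slot yields `None`.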
As shown in the example of Figure 5, in the lower-level MPT structure, the key of leaf node A11 is formed by concatenating, in order, slot 3 of root node A10 (Branch Node) and the key-end 35b2e4 in leaf node A11, giving 335b2e4; this leaf node stores "Zhang San_A=20", which indicates, for example, that Zhang San's share of the type-A digital asset defined in the contract is 20, that is, Zhang San's balance of type-A assets is 20. The key of leaf node A12 is formed by concatenating slot 7 of root node A10 (Branch Node) and the key-end c25988 in leaf node A12, giving 7c25988; this leaf node stores "Li Si_B=50", which indicates, for example, that Li Si's share of the type-B digital asset defined in the contract is 50, that is, Li Si's balance of type-B assets is 50. The key of leaf node A15 is formed by concatenating slot f of root node A10 (Branch Node), the shared nibble a in intermediate node A13 (Extension Node), slot 6 of intermediate node A14 (Branch Node), and the key-end be33 in leaf node A15, giving fa6be33; this leaf node stores "storedData=s". The key of leaf node A16 is formed by concatenating slot f of root node A10 (Branch Node), the shared nibble a in intermediate node A13 (Extension Node), slot 9 of intermediate node A14 (Branch Node), and the key-end 9365 in leaf node A16, giving fa99365; this leaf node stores "Wang Wu_A=35", which indicates, for example, that Wang Wu's share of the type-A digital asset defined in the contract is 35, that is, Wang Wu's balance of type-A assets is 35.
In the node structure of the MPT described above, a prefix indicates the tree node type: for example, 0 denotes an Extension Node containing an even number of shared nibbles (shared half-bytes), 1 denotes an Extension Node containing an odd number of shared nibble(s), 2 denotes a Leaf Node containing an even number of nibbles, and 3 denotes a Leaf Node containing an odd number of nibble(s).
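The prefix rule above reduces to a small function of the node type and the parity of the nibble count. The sketch below is illustrative; the function name is hypothetical.

```python
def node_prefix(is_leaf: bool, nibbles: str) -> int:
    """Return the prefix for an Extension Node or Leaf Node.

    0/1: Extension Node with an even/odd number of shared nibbles.
    2/3: Leaf Node with an even/odd number of nibbles.
    """
    odd = len(nibbles) % 2
    return (2 if is_leaf else 0) + odd
```

Under this rule an Extension Node sharing `a7` gets prefix 0, one sharing the single nibble `a` gets prefix 1, a Leaf Node with key-end `1335` gets prefix 2, and a Leaf Node with key-end `7` gets prefix 3.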
In the node structure described above, the hash of the entire content of a child tree node is written into the corresponding position of its parent tree node. The database actually stores a key-value mapping for each tree node, where the value contains the content stored in that tree node and the key is the hash of the entire content of that tree node. The tree node k-v pairs actually stored in the database are therefore as shown in the following table:
Key | Value (logical view; RLP encoding is required in persistent storage)
H(A9) | Prev Hash:, Nonce:, Timestamp:, Block Num:, State Root: H(A8), Transaction Root:, Receipt Root:, …
H(A8) | prefix: 0, shared nibble(s): a7, next node: H(A7)
H(A7) | 0:, 1: H(A1), 2:, 3:, 4:, 5:, 6:, 7: H(A6), 8:, 9:, a:, b:, c:, d:, e:, f: H(A3), value:
H(A1) | prefix: 2, Key-end: 1335, balance: 45.0ETH, nonce: n1
H(A6) | prefix: 0, shared nibble(s): d3, next node: H(A5)
H(A3) | prefix: 2, Key-end: 9365, balance: 1.1ETH, nonce: n3
H(A5) | 0:, 1:, 2:, 3: H(A2), 4:, 5:, 6:, 7:, 8:, 9: H(A4), a:, b:, c:, d:, e:, f:, value:
H(A2) | prefix: 3, Key-end: 7, balance: 1.00WEI, nonce: n2
H(A4) | prefix: 3, Key-end: 7, balance: 0.12ETH, nonce: n4, codehash: c1, storage root: H(A10)
H(A10) | 0:, 1:, 2:, 3: H(A11), 4:, 5:, 6:, 7: H(A12), 8:, 9:, a:, b:, c:, d:, e:, f: H(A13), value:
H(A11) | prefix: 2, Key-end: 35b2e4, value: Zhang San_A=20
H(A12) | prefix: 2, Key-end: c25988, value: Li Si_B=50
H(A13) | prefix: 1, shared nibble(s): a, next node: H(A14)
H(A14) | 0:, 1:, 2:, 3:, 4:, 5:, 6: H(A15), 7:, 8:, 9: H(A16), a:, b:, c:, d:, e:, f:, value:
H(A15) | prefix: 2, Key-end: be33, value: storedData=s
H(A16) | prefix: 2, Key-end: 9365, value: Wang Wu_A=35
Table 1. Tree node k-v pairs actually stored in the database
In Table 1 above, H() denotes a hash computation. In this way, the hash of each child tree node is anchored in its parent tree node. Hashing layer by layer yields the root hash of the entire state trie, and this root hash is committed to the state root field of the block header.
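The layer-by-layer anchoring can be sketched on the lower-level trie of Figure 5 (nodes A10 to A16). This is a simplified model under stated assumptions: JSON serialization replaces RLP, `hashlib.sha3_256` plays the role of H(), and the helper names `put` and `build_storage_trie` are hypothetical.

```python
import hashlib
import json


def put(db: dict, content: dict) -> str:
    """Store a tree node under the hash of its whole content and return that hash."""
    encoded = json.dumps(content, sort_keys=True).encode("utf-8")  # stand-in for RLP
    key = hashlib.sha3_256(encoded).hexdigest()                    # stand-in for H()
    db[key] = content
    return key


def build_storage_trie(db: dict, stored_data: str) -> str:
    """Build nodes A10..A16 bottom-up so each parent embeds its children's hashes."""
    a11 = put(db, {"prefix": 2, "key_end": "35b2e4", "value": "Zhang San_A=20"})
    a12 = put(db, {"prefix": 2, "key_end": "c25988", "value": "Li Si_B=50"})
    a15 = put(db, {"prefix": 2, "key_end": "be33", "value": f"storedData={stored_data}"})
    a16 = put(db, {"prefix": 2, "key_end": "9365", "value": "Wang Wu_A=35"})
    a14 = put(db, {"6": a15, "9": a16})                         # Branch Node
    a13 = put(db, {"prefix": 1, "shared": "a", "next": a14})    # Extension Node
    return put(db, {"3": a11, "7": a12, "f": a13})              # root Branch Node A10


db_before, db_after = {}, {}
root_before = build_storage_trie(db_before, "s")
root_after = build_storage_trie(db_after, "t")  # only storedData differs
```

Changing a single leaf value changes that leaf's hash, which changes every ancestor up to the root, so `root_before` and `root_after` differ even though the other leaves are identical.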
In some blockchain systems, the code of the blockchain platform may include a P2P (Peer to Peer) module, a consensus module, an execution module and a storage module. P2P is a way of organizing a computer network; unlike the common web, P2P is distributed and decentralized. The P2P module handles the distributed propagation of data: through it, a blockchain node can send and receive data in a point-to-point manner. Different participants can establish a distributed blockchain network through the nodes they deploy. A ledger built on the chained block structure is stored on every node (or on most nodes, such as consensus nodes) of the distributed blockchain network; this is also called a decentralized (or multi-centered) distributed ledger. Such a blockchain system must ensure the consistency and correctness of the ledger data held on the multiple decentralized (or multi-centered) nodes. Every node runs the same blockchain platform program. Within the designed fault-tolerance limits, the consensus module ensures that all honest nodes hold the same transactions, so that all honest nodes obtain the same execution results for the same transactions and package the transactions and execution results into blocks. Current mainstream consensus mechanisms include Proof of Work (POW), Proof of Stake (POS), Delegated Proof of Stake (DPOS), the Practical Byzantine Fault Tolerance (PBFT) algorithm, the Honey Badger Byzantine Fault Tolerance (HoneyBadgerBFT) algorithm, and others. During consensus, the consensus module can generally also generate data such as the timestamp of the block corresponding to the current transaction set. The execution module executes transactions, including ordinary transfer transactions and transactions involving contracts, either before or after the consensus module completes consensus. For transactions involving contracts, the execution module may introduce a virtual machine, such as the Ethereum Virtual Machine (EVM), to execute the smart contract code. The EVM shields the differences in hardware configuration and software environment among nodes, so that the process and result of executing a smart contract are the same on every node, and its sandbox environment prevents contract execution from affecting the blockchain platform code, other programs or the operating system on the host. In one consortium blockchain scenario, the nodes use the consensus module to determine the content and order of the transactions in a transaction set, and then output the agreed, deterministic transaction set to the execution module. The execution module executes the ordinary transfer transactions and contract transactions, generates the execution results, and sends them to the storage module. The storage module is responsible for storing the execution results in the node's local persistent storage medium.
As shown in Figure 6, a blockchain node physically includes a CPU, memory, disk and so on. The blockchain platform code executed by this node may include a P2P module, a consensus module, an execution module and a storage module. The functions of the P2P module, the consensus module and the execution module generally require the CPU and memory. The storage module may include a tree-building module, a block header generation module, a WAL (Write Ahead Log) module and a state database module. The tree-building module builds trees (for example, MPTs), such as the state trie and storage trie described above, from the state k-v pairs passed in by the execution module, thereby obtaining the k-v pairs of the tree nodes; this generally requires the CPU and memory. The block header generation module generates the block header from the root node of the tree built by the tree-building module and other data (such as the previous block hash, the timestamp and the block number); this also generally requires the CPU and memory. The WAL module persists the tree node k-v pairs generated by the tree-building module before they are written to the state database module, so that data lost during that write, for example because of a power failure, can be recovered; this generally requires the CPU, memory and disk. The state database module stores the tree node k-v pairs built by the tree-building module, as in Table 1, on a persistent storage device; because the tree node data is ultimately written to a persistent storage medium (such as the disk in the figure), the state database module generally requires the disk in addition to the CPU and memory.
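The role of the WAL module can be illustrated with a minimal sketch, assuming a line-oriented JSON log file and an in-memory dictionary standing in for the state database. The class name `NodeStore`, its methods and the `crash_before_db` flag are hypothetical, and a production WAL would also truncate or checkpoint the log after a successful write.

```python
import json
import os
import tempfile


class NodeStore:
    """Toy storage module: write-ahead log first, state database second."""

    def __init__(self, wal_path: str):
        self.wal_path = wal_path
        self.db = {}  # stands in for the on-disk state database

    def commit(self, nodes: dict, crash_before_db: bool = False) -> None:
        # 1. Persist the tree node k-v pairs to the log and force them to disk.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps(nodes) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        if crash_before_db:
            return  # simulate a power failure before the database write
        # 2. Only then apply them to the state database.
        self.db.update(nodes)

    def recover(self) -> None:
        """Replay every logged batch into the state database."""
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, encoding="utf-8") as wal:
            for line in wal:
                if line.strip():
                    self.db.update(json.loads(line))


def demo() -> dict:
    """Lose a batch in a simulated crash, then recover it from the log."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nodes.wal")
        NodeStore(path).commit({"H(A15)": "storedData=s"}, crash_before_db=True)
        restarted = NodeStore(path)
        restarted.recover()
        return restarted.db
```

In `demo()`, the batch never reaches the first store's database, yet the restarted store rebuilds it from the log, which is the guarantee the WAL module is described as providing.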
In terms of storage structure, the Merkle tree structures described above, such as the MPT and Libra's SMT (Sparse Merkle Tree, similar to the MPT), reside in the tree-building module in the form of the correspondences shown in Table 1 and are held in memory. The upper-layer Merkle tree is a prefix tree (trie): it organizes the data and yields a unique Merkle root for the organized data. Leaf nodes hold the state values, and the path from the root node through the intermediate nodes to a leaf node provides a lexicographic index over the state keys. Each tree node is encoded as a Key according to some rule, with its content encoded as the Value, and is ultimately stored in the lower-layer database. The database is usually a NoSQL Key-Value DB (DataBase; a Key-Value DB is also abbreviated as KVDB) with an LSM (Log-Structured Merge-Tree) structure; it resides in the state database module and is stored on disk. Specific examples include levelDB and the RocksDB used by Libra. Both of these KVDBs are based on an LSM storage engine.
As mentioned above, a transaction that creates a smart contract is sent to the blockchain and, after consensus, can be executed by each blockchain node. A contract account corresponding to the smart contract then appears on the blockchain (including, for example, the account's Identity, the contract's hash Codehash and the contract storage root StorageRoot) with a specific address; the contract code and account storage can be kept in the Storage of this contract account, as shown in Figure 7. The behavior of a smart contract is controlled by the contract code, while the account storage of the smart contract holds the contract's state. In other words, a smart contract causes a virtual account containing contract code and account storage (Storage) to be created on the blockchain. A contract deployment transaction or contract update transaction generates or changes the value of Codehash. Subsequently, a blockchain node can receive a transaction request that calls the deployed smart contract; the request may include the address of the called contract, the function to be called in the contract, and the input parameters. In general, after the transaction request passes consensus, each blockchain node independently executes the specified smart contract call.
The left side of Figure 7 shows an example of a smart contract written in solidity together with its compilation and execution. The smart contract is compiled by a compiler into bytecode. The solc shown in the figure is solidity's command-line compiler; a smart contract written in solidity can be compiled with the solc command-line tool and suitable parameters to generate bytecode that runs on the EVM. Through the contract deployment process shown in Figures 1 and 2 above, the smart contract can be created on the blockchain. After deployment, a contract account corresponding to the smart contract is generated on the blockchain; it includes, for example, the contract counter Nonce, the account balance Balance, the hash of the contract bytecode Codehash, and the contract storage root StorageRoot. The contract has a specific address on the chain, namely the contract address.
The contract address is obtained, for example, by hashing the address of the external account that deploys the contract together with that account's counter nonce. Specifically, it may be sha3(rlp.encode([address_sender,nonce])). Here sha3 denotes a class of hash algorithms, of which keccak256 is a commonly used example. As noted above, rlp is an encoding format, and rlp.encode([address_sender,nonce]) denotes RLP-encoding the content in the parentheses; different blockchains may substitute another encoding format or not re-encode at all, so rlp is omitted in what follows. [address_sender,nonce] denotes the sequential concatenation of two fields: the address address_sender of the external account deploying the contract and its counter nonce. With the keccak256 algorithm, for example, the result is a hash 256 bits (32 bytes) long, from which the on-chain address of the deployed contract can be derived (for example, by taking the first 20 bytes). The account balance Balance may be set to a default value, for example 0, when deployment completes. The hash of the contract bytecode, Codehash, can be computed by the blockchain platform by hashing the contract bytecode. The contract storage root StorageRoot may be a default value or a hash computed from the root node of the underlying Storage Trie. Which one applies generally depends on whether the deployed contract performs an initialization operation, such as executing the contract's constructor. If the deployed contract contains a constructor, it generally initializes some state variables that will ultimately be stored in the underlying database, and this initialization can be executed in the virtual machine. From the initialized state variables, as described above, an MPT can be built, giving its root node and hence the hash of that root node. If the deployed contract contains no constructor, no specific function needs to be executed; instead, the blockchain platform assigns StorageRoot a default value, for example the hash of empty content.
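The address derivation sha3([address_sender,nonce]) can be sketched as below. The sketch makes several assumptions for illustration: `hashlib.sha3_256` replaces keccak256, the nonce is encoded as 8 big-endian bytes and concatenated directly (RLP is omitted, as in the text), and the first 20 bytes of the digest are taken as the address. The function name and the sample sender address are hypothetical.

```python
import hashlib


def contract_address(address_sender: bytes, nonce: int) -> str:
    """Derive a contract address from the deployer's address and nonce."""
    data = address_sender + nonce.to_bytes(8, "big")   # sequential concatenation
    digest = hashlib.sha3_256(data).digest()           # 256 bits = 32 bytes
    return digest[:20].hex()                           # keep the first 20 bytes


sender = bytes.fromhex("22" * 20)   # hypothetical deployer address
first = contract_address(sender, 0)
second = contract_address(sender, 1)
```

Because the nonce increases with each transaction from the same sender, `first` and `second` differ: two deployments by the same account yield two different contract addresses.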
After a contract is deployed, it can subsequently be called, as described above. As shown in Figure 2, after Bob sends a transaction that calls the smart contract to the blockchain network, the contract executes and sets the state variable to the string "hello". Similarly, in Figure 2 Alice sends a transaction that calls the contract and thereby reads the value of the state variable through contract execution.
As also mentioned above, a smart contract in a blockchain environment is usually a contract defined in code whose terms can be executed automatically. These terms usually relate to business-level logic, so the contract code as a whole embodies the business logic. As the business develops and changes, the business logic may change, and the contract code must then be adjusted. In addition, the contract code may contain vulnerabilities that need to be fixed, or an upgrade of the language version in which the contract is written may require the contract to be upgraded. In all of these situations, the deployed contract usually needs to be upgraded.
One way to upgrade a contract is to deploy a new contract, which will have an address different from that of the old contract. Specifically, as described above, the contract address may be obtained by hashing the address of the external account deploying the contract together with its counter nonce, for example sha3([address_sender,nonce]). Contracts deployed by different deployers have different addresses. Even for the same deployer, upgrading the contract initiates a new transaction, so the nonce, as a transaction counter, changes, and the address of the newly deployed contract changes as well. As a result, the storage of the new contract also differs from that of the old contract: the state variables in the new contract's storage can be set and read only from the block in which the new contract was deployed, and the state in the old contract cannot be accessed.
In some consortium blockchains, a contract upgrade must guarantee the following:
First, the upgraded contract keeps the same contract address as the contract before the upgrade. Only then does the contract keep the same storage space after the upgrade, so that historical data is not lost and users do not have to change the contract address they supply when calling the contract.
Second, compatibility before and after the upgrade must be guaranteed; that is, for each state retained in the upgraded contract, the new contract must still be able to read the value that the same state had in the old contract.
Regarding the first point, the rule for generating contract account addresses can be made independent of the nonce and, further, independent of the deployer. For example, the address of the contract account can be determined by the contract name, such as sha3([name_contract]). This guarantees that contracts with the same name have the same address on the blockchain. As mentioned above, a contract account generally includes Nonce, Balance, Storage root and CodeHash, where CodeHash is the hash of the contract bytecode. The bytecode of the upgraded contract differs from the bytecode before the upgrade, so the CodeHash in the contract account changes; that is, the Codehash in the contract account is generally updated when the contract is upgraded.
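A name-based address rule of the form sha3([name_contract]) and its effect on an upgrade can be sketched as follows. The account layout, the function names `deploy` and `upgrade`, the use of `hashlib.sha3_256` and the 20-byte truncation are assumptions for illustration only.

```python
import hashlib


def address_from_name(name_contract: str) -> str:
    """Derive the contract address from the contract name alone."""
    return hashlib.sha3_256(name_contract.encode("utf-8")).digest()[:20].hex()


def deploy(accounts: dict, name: str, bytecode: bytes) -> str:
    """Create a contract account with an empty storage root."""
    addr = address_from_name(name)
    accounts[addr] = {
        "code_hash": hashlib.sha3_256(bytecode).hexdigest(),
        "storage_root": hashlib.sha3_256(b"").hexdigest(),  # hash of empty content
    }
    return addr


def upgrade(accounts: dict, name: str, new_bytecode: bytes) -> str:
    """Replace the code of an existing contract; address and storage are kept."""
    addr = address_from_name(name)
    accounts[addr]["code_hash"] = hashlib.sha3_256(new_bytecode).hexdigest()
    return addr


accounts = {}
addr_v1 = deploy(accounts, "demo", b"old bytecode")
snapshot = dict(accounts[addr_v1])
addr_v2 = upgrade(accounts, "demo", b"new bytecode")
```

After the upgrade the address and storage root are unchanged while the code hash differs, which matches the first requirement: the upgraded contract keeps its address and therefore its storage space.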
Regarding the second point, the essence is this: for one and the same state variable, say state variable r, if its key changes after the contract upgrade, then reads of r starting from the block in which the contract was upgraded use the changed key and cannot correctly return the value r had before the upgrade.
For example, code example 2 below shows the old contract before the upgrade:
Figure PCTCN2022135220-appb-000002
Code example 2. Solidity code of demo1
In the demo1 code above, lines 3-4 declare two state variables, ID and sex. ID is of type uint256, which occupies 32 bytes in Solidity; sex is of type bool, which occupies 1 byte. Variables declared before the functions in this way are generally treated as persistently stored state variables, that is, they are stored in the underlying database.
Lines 6-8 of demo1 define the setID() function. The public modifier after setID() indicates that the function serves as an interface that can be called both internally and externally. setID() takes a parameter x of type uint256, a 256-bit unsigned integer with a length of 32 bytes. The function body assigns the value of x to ID, thereby providing an external interface that sets the state variable ID to the value x supplied by the user. Lines 10-12 define the getID() function. The view modifier after getID() indicates that the function can only read state variables and cannot modify them.
Lines 14-16 and 18-20 of demo1 define the setSex() and getSex() functions, respectively. Their meaning is similar to the above and is not repeated here.
Lines 22-24 of demo1 define the version() function, whose return type is uint256. The function body returns 1. A user can call this function to obtain the version of the current contract, which here is 1.
As mentioned above, a state variable that needs persistent storage is generally a key-value pair, where the key represents the address of the state variable and the value represents its value. The demo1 code declares two state variables, ID and sex, at the top, and each state variable has its own key. Note that the space occupied by each of these two state variables is fixed, namely 32 bytes and 1 byte.
Each contract generally has its own storage space. This storage space is virtual, and its capacity can be a very large array, for example an array of 2^256 elements numbered from 0 to 2^256-1. Each element occupies a fixed length, for example 32 bytes, and is referred to here as a slot, as shown in Figure 8. The values of the two state variables ID and sex can, for example, be stored in slots 0 and 1. Note that the 2^256 slots are the total capacity of the virtual space; unused slots do not occupy actual storage space in the underlying database.
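The idea of a virtual array of 2^256 slots in which unused slots cost nothing can be sketched in Python with a dictionary. The class and method names are illustrative assumptions, not part of any blockchain platform.

```python
SLOT_COUNT = 2 ** 256  # total capacity of the virtual storage space
SLOT_SIZE = 32         # bytes per slot


class ContractStorage:
    """Virtual array of 2**256 slots; only written slots are materialised."""

    def __init__(self):
        self._slots = {}  # slot index -> 32-byte value

    def store(self, slot: int, value: int) -> None:
        if not 0 <= slot < SLOT_COUNT:
            raise IndexError("slot index out of range")
        self._slots[slot] = value.to_bytes(SLOT_SIZE, "big")

    def load(self, slot: int) -> int:
        # A slot that was never written reads as all zero bytes.
        return int.from_bytes(self._slots.get(slot, bytes(SLOT_SIZE)), "big")

    def used_slots(self) -> int:
        return len(self._slots)


storage = ContractStorage()
storage.store(0, 1)               # e.g. the value of ID in slot 0
storage.store(SLOT_COUNT - 1, 5)  # the very last slot is still addressable
print(storage.used_slots())       # only two entries actually exist
```

A sparse mapping is chosen here because the address space is far too large to allocate; it mirrors the statement that unused slots take no space in the underlying database.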
As mentioned above, the demo1 contract, written in a high-level language such as Solidity, is compiled by a compiler into bytecode.
The execution of a contract can proceed as shown in Figure 7. For example, a transaction that calls a contract, as in Figure 2, is sent to the blockchain network, and after consensus each node can execute the transaction. The to field of the transaction gives the address of the called contract. Any such node can locate the storage of the contract account from the contract address, read the CodeHash from that storage, and find the corresponding contract bytecode from the CodeHash. The node can load the contract bytecode from storage into the virtual machine. The interpreter then interprets and executes it. This includes parsing the bytecode of the called contract (Parse, e.g. parsing Push, Add, SGET, SSTORE, Pop) to obtain the opcodes (OPcode) and functions, and storing these opcodes in a memory space (Memory) allocated by the virtual machine (alloc in the figure; the corresponding release of memory after execution ends is Free in the figure), while also obtaining the jump position (JumpCode) of the called function in the memory space. Generally, after the Gas required to execute the contract has been calculated and found sufficient, execution jumps to the corresponding Memory address, fetches the opcodes of the called function, and begins executing them, performing data computation (Data Computation) and stack push/pop operations (Stack) on the data the opcodes operate on. During this process some context (Context) information of the contract may also be needed, such as the block number or information about the initiator of the contract call; this information can be obtained from the Context (Get operation). Finally, the resulting state is stored in the database (Storage) by calling the storage interface. Note that functions in a contract may also be executed during contract creation, for example initialization functions; in that case the code is likewise parsed, jump instructions are generated and stored in Memory, the Stack is operated on, and so on.
Through the above process, the virtual machine loads and executes the contract bytecode, which may generate and/or read state and therefore requires access to the underlying database. The virtual machine needs convenient access to the underlying KV database, which generally works in a pointer-like way: to read the value corresponding to a key from the KV database, the key of that data must be known before the access.
As described above with reference to Figures 5 and 6, for a write operation the execution module (including the virtual machine in it) executes the contract and produces a key-value pair (kv); for a read operation it produces a key (k). This k is the key generated by the execution module or the blockchain platform and is referred to here as the state key. The tree-building module needs to build this state key into the MPT, yielding a series of tree-node keys along the path from the MPT root node through intermediate nodes to the leaf node. For a read operation, the corresponding value can be looked up in the state database module by tree-node key. For a write operation, a series of tree-node key-value pairs from the root node through intermediate nodes to the leaf node is generated, and these tree-node kv pairs are written to the state database module in append mode.
While the virtual machine executes the contract bytecode, the position of a given state variable must be fixed so that the key produced by contract execution is fixed. This position is generally fixed once the contract code is finalized. The work of fixing state variable positions is therefore usually done by the compiler at compile time and is not directly related to the virtual machine.
Compilation can roughly include lexical/syntax analysis based on the abstract syntax tree, filling in symbols according to the symbol table, semantic analysis, and code generation. In the lexical/syntax analysis step, the position information of the contract's state variables can be generated. For example, the two state variables ID and sex in demo1 have positions 0 and 1, which correspond to the following two slot positions:
0x0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0x0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001
Each of the two slot positions above is 256 bits, i.e. 32 bytes, written in hexadecimal (0x). In each space-separated segment after 0x, the 4 consecutive hexadecimal digits represent 2 bytes, so there are 16 such segments in total.
In the contract bytecode produced by the compiler, these two slot positions can then replace the identifiers of the two state variables; that is, the two 256-bit values above stand for ID and sex, respectively. Thus, when a data type has a fixed size, the compiler can pre-allocate a storage location for each item to be stored according to the order of the fields, which amounts to specifying a fixed data pointer in advance.
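The pre-allocation by declaration order can be sketched in Python as follows. The sketch assumes one slot per state variable, as in the example in the text; it does not model any packing of small variables that a real compiler may perform, and the function names are illustrative.

```python
def assign_slots(state_variables):
    """Map each state variable name to a slot index by declaration order."""
    # Assumption: every variable gets its own slot, as in the text's example.
    return {name: index for index, name in enumerate(state_variables)}


def format_slot(slot: int) -> str:
    """Render a slot index as 0x followed by 16 groups of 4 hex digits."""
    digits = f"{slot:064x}"  # 256 bits = 64 hexadecimal digits
    return "0x" + " ".join(digits[i:i + 4] for i in range(0, 64, 4))


slots = assign_slots(["ID", "sex"])  # declaration order in demo1
for name, slot in slots.items():
    print(name, format_slot(slot))
```

Since the mapping depends only on the order of declarations, any change to that order changes the slot a given variable name resolves to.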
When the virtual machine loads and executes the contract bytecode, an operation on ID is an operation on 0x000...00 (slot position 0 above, with the ellipsis standing for the run of zeros in the middle). Similarly, an operation on sex is an operation on slot position 0x000...01.
For example, suppose a transaction calls the setID() function of this contract with the input parameter "0001". While executing the transaction, the virtual machine stores 0001 at slot position 0x000...00. Specifically, the virtual machine can push the 32-byte slot 0x000...00 onto the stack and then push the corresponding value onto the stack. During interpretation, the opcodes of the called function are fetched from memory and executed. Because the stack is last-in, first-out, the value is popped first and then the slot, forming a slot-value pair; the contract virtual machine then executes the current opcode, which writes the value to the storage at the slot position. In this process the stack generally uses 32 bytes as one storage unit, equal to the length of one slot, while the corresponding value may be smaller than, equal to, or larger than 32 bytes, so the value may occupy one or more units on the stack.
As another example, suppose a transaction calls the setSex() function of this contract with the input parameter "1" (for example, 1 for male and 0 for female). While executing the transaction, the virtual machine stores 1 in slot 0x000...01.
The results of executing the two contract calls above can be written simply as:
0x000...00:0001                               (1)
0x000...01:1                                  (2)
The above is also illustrated in Figure 9. Slot 0x000...01 stores a bool, which occupies only 1 byte, so the bool value 1 can be stored in the low-order 8 bits of this slot, as shown at the bottom of Figure 9.
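The push and pop sequence described above for setID() and setSex() can be sketched with a toy stack machine in Python. The class is a hypothetical illustration that follows the order given in the text (slot pushed first, value pushed second); the operand order of a particular virtual machine may differ.

```python
class TinyVM:
    """Toy stack machine with a persistent slot -> value storage."""

    def __init__(self):
        self.stack = []    # each entry models one 32-byte stack unit
        self.storage = {}  # slot position -> stored value

    def push(self, word: int) -> None:
        self.stack.append(word)

    def sstore(self) -> None:
        value = self.stack.pop()  # pushed last, so popped first
        slot = self.stack.pop()   # pushed first, so popped second
        self.storage[slot] = value


vm = TinyVM()

# setID(0001): ID lives at slot 0x000...00
vm.push(0x00)
vm.push(1)
vm.sstore()

# setSex(1): sex lives at slot 0x000...01
vm.push(0x01)
vm.push(1)
vm.sstore()

print(vm.storage)
```

After both calls the storage holds the two slot-value pairs written as (1) and (2) in the text, and the stack is empty again.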
After the virtual machine finishes execution, the virtual machine or the blockchain platform can convert the slots in the two slot-value pairs (1) and (2) into state keys. Specifically, a state key can be obtained by concatenating the contract address and the slot position. For example, suppose the address of the demo1 contract is 20 bytes long and is 0x3321dcaf8911d3842e14a7a415be2fb1a337f43e. Then, after concatenating the contract address and the slot position:
the state key of (1) is:
0x3321dcaf8911d3842e14a7a415be2fb1a337f43e0000000000000000000000000000000000000000000000000000000000000000
the state key of (2) is:
0x3321dcaf8911d3842e14a7a415be2fb1a337f43e0000000000000000000000000000000000000000000000000000000000000001
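The concatenation of contract address and slot position can be sketched in Python as follows. The function name is an illustrative assumption; the address is the example address from the text.

```python
def state_key(contract_address: str, slot: int) -> str:
    """Concatenate a 20-byte contract address and a 32-byte slot position."""
    addr = contract_address.lower()
    if addr.startswith("0x"):
        addr = addr[2:]
    if len(addr) != 40:
        raise ValueError("expected a 20-byte (40 hex digit) address")
    # 32-byte slot position, zero-padded to 64 hexadecimal digits
    return "0x" + addr + f"{slot:064x}"


DEMO1_ADDRESS = "0x3321dcaf8911d3842e14a7a415be2fb1a337f43e"

key_id = state_key(DEMO1_ADDRESS, 0)   # state key for ID
key_sex = state_key(DEMO1_ADDRESS, 1)  # state key for sex
print(key_id)
print(key_sex)
```

Each resulting key is 52 bytes long (20 bytes of address followed by 32 bytes of slot), and two keys of the same contract differ only in the slot part.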
When these state key-value pairs are stored in the underlying database, as shown in Figure 6, the tree-building module can map the keys onto the storage trie shown in Figure 5. The tree-building module then places the value of each state key-value pair into a leaf node of this MPT according to the tree structure. As mentioned above, the state key is split into several short segments, which are stored in the tree nodes in order along the path from the root of the storage trie to the leaf node. Which segment of the state key each tree node stores depends on the prefixes the state key shares with other state keys in the tree. From the leaf node up through the intermediate nodes to the root node, a chain of hash values changes, and the tree-building module builds the kv pairs of these tree nodes. The tree-building module then sends the changed tree-node kv pairs to the state database module, which finally stores them in the state database shown in Figure 6.
In the above process, the slot used to read and write a state variable is fixed at compile time, and the state key is determined by the contract address and the slot position of the state variable; for example, the state key can be obtained by concatenating the contract address and the slot. Thus, when the same contract runs, reads and writes of the same state variable use the same slot, which corresponds to a fixed state key.
As mentioned above, some blockchains upgrade contracts in a way that keeps the contract address the same before and after the upgrade. Even so, ensuring compatibility before and after the upgrade, that is, ensuring that the state in the upgraded contract can still read the values of the same state in the old contract, remains a challenge.
For example, code example 3 below shows the new contract after the upgrade:
Figure PCTCN2022135220-appb-000003
Figure PCTCN2022135220-appb-000004
Code example 3. Solidity code of demo2
Compared with demo1, the demo2 code above inserts a new state variable age at line 4. As a result, sex, which was on line 4 in demo1, moves down to line 5 in demo2. In addition, demo2 adds write and read functions for the new state variable, setAge() and getAge(), at lines 15-17 and 19-21.
The new demo2 code is likewise compiled into bytecode. During compilation, the compiler similarly generates slot positions for the three state variables ID, age, and sex in demo2, namely:
0x0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0x0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001
0x0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0002
In the contract bytecode produced by the compiler, these three slot positions can then replace the identifiers of the three state variables; that is, the three 256-bit values above stand for ID, age, and sex, respectively. When the compiled bytecode is loaded into the virtual machine and executed, an operation on ID is an operation on slot position 0x000...00, an operation on age is an operation on slot position 0x000...01, and an operation on sex is an operation on slot position 0x000...02.
Clearly, sex on line 5 of the upgraded demo2 code is the same state variable as sex on line 4 of the pre-upgrade demo1 code. With this kind of upgrade, however, an operation on slot position 0x000...01 after the upgrade becomes an operation on age, which does not match the sex stored at slot position 0x000...01 before the upgrade. If the upgraded bytecode reads age, it reads what is actually the pre-upgrade value of sex at that position, which causes confusion. Conversely, an operation on sex after the upgrade becomes an operation on slot position 0x000...02. If the upgraded bytecode reads sex, no value was previously stored at that position, so only the default empty or zero value is read instead of the correct value.
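The mismatch described above can be reproduced with a small Python simulation. The layouts are the slot assignments given in the text for demo1 and demo2; the helper functions and the shared dictionary standing in for contract storage are illustrative assumptions.

```python
DEMO1_LAYOUT = {"ID": 0, "sex": 1}            # slots before the upgrade
DEMO2_LAYOUT = {"ID": 0, "age": 1, "sex": 2}  # slots after inserting age


def write(storage, layout, name, value):
    storage[layout[name]] = value


def read(storage, layout, name):
    # A slot that was never written reads as the default value 0.
    return storage.get(layout[name], 0)


storage = {}  # same contract address, so the same storage before and after

# State written by the old contract
write(storage, DEMO1_LAYOUT, "ID", 1)
write(storage, DEMO1_LAYOUT, "sex", 1)

# State read by the upgraded contract
id_after = read(storage, DEMO2_LAYOUT, "ID")    # slot 0: still correct
age_after = read(storage, DEMO2_LAYOUT, "age")  # slot 1: really the old sex
sex_after = read(storage, DEMO2_LAYOUT, "sex")  # slot 2: never written

print(id_after, age_after, sex_after)
```

Only ID survives the upgrade intact: age picks up the value that was stored for sex, and sex falls back to the default.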
This application provides a scheme for detecting the compatibility of a contract upgrade, which, as shown in Figure 10, includes:
S110: Generate the abstract syntax trees of the contract before and after the upgrade.
As mentioned above, compilation can roughly include lexical/syntax analysis based on the abstract syntax tree, filling in symbols according to the symbol table, semantic analysis, and code generation.
Here, lexical/syntax analysis can be performed on the smart contract code before and after the upgrade to generate the abstract syntax trees of the two versions. Specifically, the Solidity compiler can be invoked with the "--ast-compact-json" option, taking the contract source code before and after the upgrade as input, to generate an abstract syntax tree JSON (JavaScript Object Notation) file for each version.
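Invoking the compiler for both versions might look like the following Python sketch. It assumes the Solidity compiler is installed as solc on the PATH; the file names demo1.sol and demo2.sol are hypothetical. The raw output is returned unparsed, because any framing text the compiler prints around the JSON may depend on its version.

```python
import shutil
import subprocess


def build_ast_command(source_path: str, solc: str = "solc") -> list:
    """Command line that asks the compiler for the compact JSON AST."""
    return [solc, "--ast-compact-json", source_path]


def dump_ast(source_path: str) -> str:
    """Run the compiler and return its raw standard output."""
    if shutil.which("solc") is None:
        raise RuntimeError("solc was not found on PATH")
    result = subprocess.run(
        build_ast_command(source_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical source files for the old and the new contract version
commands = [build_ast_command(path) for path in ("demo1.sol", "demo2.sol")]
for command in commands:
    print(" ".join(command))
```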
For example, lexical/syntax analysis of the pre-upgrade demo1 contract generates the following abstract syntax tree:
Figure PCTCN2022135220-appb-000005
Figure PCTCN2022135220-appb-000006
Code example 4. Abstract syntax tree of demo1
Code example 4 above is the generated abstract syntax tree of the pre-upgrade demo1 contract, with comments marked by //.... The storage location on line 11 is the rule for computing the slot. In the abstract syntax tree above, the entries under nodes starting at line 3 describe the contract's state variables, with the information for each state variable placed in one node. The two state variables ID and sex thus correspond to two nodes: lines 5-19 are the first node, for ID, and lines 20-34 are the second node, for sex. Each node in turn contains several pieces of information. Taken together, the information inside a node is referred to as the node information of the abstract syntax tree.
For example, lexical/syntax analysis of the upgraded demo2 contract generates the following abstract syntax tree of the contract after the upgrade:
Figure PCTCN2022135220-appb-000007
Figure PCTCN2022135220-appb-000008
Code example 5. Abstract syntax tree of demo2
Code example 5 above is the generated abstract syntax tree of the upgraded demo2 contract. Because the demo2 code adds a line declaring the age variable between the original ID and sex (line 4 of demo2), the age node information at lines 20-34 is inserted in the demo2 abstract syntax tree between the ID node information at lines 5-19 and the sex node information at lines 35-50.
S120: Parse the generated abstract syntax trees and extract, in order, the basic information from the node information of each abstract syntax tree.
Code examples 4 and 5 above are parsed; that is, the generated abstract syntax trees are parsed and the basic information in the node information of each tree is extracted in order. The basic information includes at least the node order and may further include the state variable name and/or type.
The following example shows the abstract syntax tree node information, comprising node order, state variable name, and state variable type, extracted for the pre-upgrade demo1 contract code:
Node information 1: ID{typeString:uint256,...}
Node information 2: sex{typeString:bool,...}
Similarly, the abstract syntax tree node information, comprising node order, state variable name, and state variable type, extracted for the upgraded demo2 contract code is as follows:
Node information 1: ID{typeString:uint256,...}
Node information 2: age{typeString:uint256,...}
Node information 3: sex{typeString:bool,...}
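The extraction in S120 can be sketched in Python as a walk over the AST JSON. The sketch assumes the field names used by the Solidity compiler's compact AST (nodeType, stateVariable, name, typeDescriptions.typeString) and a single contract per file; the small dictionary below is a hand-written stand-in for real compiler output, not the tree from code example 4.

```python
def extract_state_variables(ast: dict) -> list:
    """Return node order, name and type of each state variable, in order."""
    found = []

    def visit(node):
        if not isinstance(node, dict):
            return
        if node.get("nodeType") == "VariableDeclaration" and node.get("stateVariable"):
            found.append({
                "order": len(found) + 1,
                "name": node.get("name"),
                "type": node.get("typeDescriptions", {}).get("typeString"),
            })
        for child in node.get("nodes", []):
            visit(child)

    visit(ast)
    return found


# Hand-written stand-in for the demo1 AST (assumed shape, heavily abridged)
DEMO1_AST = {
    "nodeType": "SourceUnit",
    "nodes": [{
        "nodeType": "ContractDefinition",
        "name": "demo1",
        "nodes": [
            {"nodeType": "VariableDeclaration", "name": "ID",
             "stateVariable": True, "typeDescriptions": {"typeString": "uint256"}},
            {"nodeType": "VariableDeclaration", "name": "sex",
             "stateVariable": True, "typeDescriptions": {"typeString": "bool"}},
            {"nodeType": "FunctionDefinition", "name": "setID"},
        ],
    }],
}

demo1_info = extract_state_variables(DEMO1_AST)
for info in demo1_info:
    print(info)
```

Function definitions and other node types are skipped, so the resulting list contains only the ordered state variable entries that the comparison step needs.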
S130: Compare the basic information in the node information of the abstract syntax trees before and after the upgrade to reach a compatibility conclusion.
For example, from the basic information in the abstract syntax tree node information before and after the upgrade obtained in S120 above, the following comparison table can be produced:
Basic information in the abstract syntax tree node information before the upgrade | Basic information in the abstract syntax tree node information after the upgrade
Node information 1: ID{typeString:uint256,...} | Node information 1: ID{typeString:uint256,...}
Node information 2: sex{typeString:bool,...} | Node information 2: age{typeString:uint256,...}
(none) | Node information 3: sex{typeString:bool,...}
Table 2. Comparison of abstract syntax tree node information before and after the upgrade
In Table 2, the node information of the abstract syntax trees before and after the upgrade is placed row by row in node order. By scanning and comparing row by row, it can be determined whether the left and right entries are the same. In other words, the basic information in the node information with the same node number in the two abstract syntax trees is compared, preferably in node-number order. In Table 2, when the row for node information 2 is scanned, the comparison shows that the state variable names differ, from which an incompatibility conclusion can be drawn.
This is because, as described above, slots are generated from the node information according to the slot generation rule, and the state key of a state variable is then generated by concatenating the contract address and the slot. That is, the slot generated for the node information 2 row is the same before and after the upgrade, for example 0x000...01, and is unrelated to the variable name. Consequently, after the upgrade the contract logic writes to the same slot, or state key, the value of a state variable different from the one before the upgrade, which is generally inconsistent. It is "generally" inconsistent because, if the upgrade merely renames a state variable without any change to the contract logic involving it, no incompatibility actually results; such upgrades are rare, however, and in most cases the contract logic changes to some extent when a state variable is renamed. The state variable type can therefore be taken into account as well. If the state variable type differs within the same row of node information, an incompatibility conclusion can also be drawn. And if both the state variable name and the state variable type differ within the same row, a stronger incompatibility conclusion can be drawn.
In addition, when comparing the basic information in the node information of the abstract syntax trees before and after the upgrade, if the node containing a state variable name after the upgrade is at a different position in the node order than the node containing the same state variable name before the upgrade, incompatibility can be concluded with high probability. This is because the same state variable name before and after an upgrade generally refers to the same state variable; if the abstract syntax tree nodes containing that state variable are at different positions before and after the upgrade, the slots generally differ, and reads and writes of the data become inconsistent.
In fact, if the comparison in S130 concludes that the upgrade is incompatible, the conflicting slot can also be identified, for example the one at node order 2 in Table 2 above. The basic information/node information of the conflicting slot position can then be fed back to the developer, for example by recording it in logs or alerts, or by notification via on-screen prompts, email, or instant messages, to help the developer or the platform locate the incompatibility and, in particular, to help the developer fix it.
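The row-by-row comparison in S130, together with the check for a variable name that changed position, can be sketched in Python as follows. The function names and the report format are illustrative assumptions; only positions present in both versions are compared, and the slot index is taken to be the node order minus one, as in the example.

```python
def compare_layouts(old, new):
    """Compare (name, type) pairs position by position and list conflicts."""
    conflicts = []
    for order, (before, after) in enumerate(zip(old, new), start=1):
        changed = []
        if before[0] != after[0]:
            changed.append("name")
        if before[1] != after[1]:
            changed.append("type")
        if changed:
            conflicts.append({
                "node": order,
                "slot": order - 1,  # assumption: one slot per node, from 0
                "before": before,
                "after": after,
                "changed": changed,
            })
    return conflicts


def moved_variables(old, new):
    """Names present in both versions whose node order differs."""
    old_pos = {name: i for i, (name, _) in enumerate(old, start=1)}
    return {
        name: (old_pos[name], i)
        for i, (name, _) in enumerate(new, start=1)
        if name in old_pos and old_pos[name] != i
    }


DEMO1 = [("ID", "uint256"), ("sex", "bool")]
DEMO2 = [("ID", "uint256"), ("age", "uint256"), ("sex", "bool")]

conflicts = compare_layouts(DEMO1, DEMO2)
moved = moved_variables(DEMO1, DEMO2)
for conflict in conflicts:
    print("incompatible at node", conflict["node"], conflict["changed"])
print("moved:", moved)
```

For demo1 and demo2 the sketch reports a single conflict at node order 2, where both name and type differ, and shows that sex moved from node 2 to node 3; a report of this kind could be written to a log or sent as a notification.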
There is also the case where the upgraded contract, compared with the contract before the upgrade, only appends new state variables after the original ones; compatibility can then be concluded. Consider the following sample code:
Figure PCTCN2022135220-appb-000009
Code Example 6. Solidity code of demo2'
As can be seen, compared with demo1, the demo2' code appends the state variable age after the original ID and sex. The abstract syntax tree of demo2' is as follows:
Figure PCTCN2022135220-appb-000010
Figure PCTCN2022135220-appb-000011
Code Example 7. Abstract syntax tree of demo2'
In accordance with S120, the abstract syntax tree node information extracted from the upgraded demo2' contract code, comprising node order, state variable name, and state variable type, is as follows:
Node information 1: ID{typeString:uint256,...}
Node information 2: sex{typeString:bool,...}
Node information 3: age{typeString:uint256,...}
Comparing the basic information in the abstract syntax tree node information before and after the upgrade yields the following table:
Basic information in the node information before the upgrade | Basic information in the node information after the upgrade
Node information 1: ID{typeString:uint256,...} | Node information 1: ID{typeString:uint256,...}
Node information 2: sex{typeString:bool,...} | Node information 2: sex{typeString:bool,...}
(none) | Node information 3: age{typeString:uint256,...}
Table 3. Comparison of abstract syntax tree node information before and after the upgrade
As Table 3 shows, the state variables ID and sex keep the same state variable name, state variable type, and abstract syntax tree node order in the upgraded code. ID and sex therefore keep the same slots as before the upgrade and are compatible. The new state variable is appended after the pre-upgrade state variables, so under the slot generation rules it does not affect the slots of the preceding state variables, and hence does not affect their state keys; compatibility can therefore be concluded. This case is clearly easier to handle when the comparison proceeds in node-number order.
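The append-only case can be expressed as a prefix check: every pre-upgrade row must reappear unchanged, at the same position, in the upgraded layout. The sketch below assumes each row is reduced to a (name, type) pair; the function name `is_append_only` and the sample lists are illustrative.

```python
def is_append_only(before, after):
    """True when `after` begins with exactly the rows of `before`, in order."""
    if len(after) < len(before):
        return False  # a state variable was removed
    return list(after[:len(before)]) == list(before)


demo1 = [("ID", "uint256"), ("sex", "bool")]
demo2_prime = [("ID", "uint256"), ("sex", "bool"), ("age", "uint256")]
inserted_in_middle = [("ID", "uint256"), ("age", "uint256"), ("sex", "bool")]

print(is_append_only(demo1, demo2_prime))         # True: age is appended last
print(is_append_only(demo1, inserted_in_middle))  # False: sex moved to position 3
```

Because the check walks the rows in node order, it stops caring about the layout as soon as the old rows are exhausted, which is why appended variables do not disturb the result.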
Note that the state variables ID, age, and sex in the examples above have the types uint256, uint256, and bool respectively, where uint256 is 256 bits (32 bytes) and bool is 1 byte. These types fix the length of the data; that is, the data is fixed-length. Types such as uint, uint8, and uint128 are likewise fixed-length. An array with a fixed number of fixed-length elements is also fixed-length: uint[2], for example, has 2 elements, each a 32-byte uint, so uint[2] is 64 bytes in total.
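The size arithmetic in the preceding paragraph can be checked with a small helper. This is a sketch that covers only the types named in the text (bool, uintN, and fixed-size arrays of them); the function name `fixed_size_bytes` is an assumption, and the array size is computed as element size times element count, as the text does.

```python
import re


def fixed_size_bytes(type_string):
    """Size in bytes of a fixed-length type such as 'uint256', 'bool', 'uint[2]'."""
    array = re.fullmatch(r"(.+)\[(\d+)\]", type_string)
    if array:
        # Fixed-size array: element size multiplied by the number of elements.
        return fixed_size_bytes(array.group(1)) * int(array.group(2))
    if type_string == "bool":
        return 1
    uint = re.fullmatch(r"uint(\d*)", type_string)
    if uint:
        bits = int(uint.group(1) or 256)  # plain 'uint' is an alias of uint256
        return bits // 8
    raise ValueError(f"not a fixed-length type handled by this sketch: {type_string}")


for t in ("uint256", "bool", "uint128", "uint[2]"):
    print(t, fixed_size_bytes(t))
```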
Besides fixed-length data, there is variable-length data, whose size cannot be known in advance. Its storage location cannot be determined directly at compile time the way it is for fixed-length data, so a different scheme is used. An example is the dictionary (mapping), a data type of indeterminate length. A dictionary is laid out by storing each key with its corresponding value, one storage entry per key. The storage location for a key is keccak256(key.slot), where "." denotes concatenation, the key before "." is the key of a dictionary element, and the slot after "." is the position of the slot that holds the dictionary name. For example, if the demo2 contract declares mapping(uint256=>string) a after ID, age, and sex, the number of elements in this dictionary is not fixed, nor are the lengths of the keys and values. After the demo2 contract has been called several times, dictionary a might hold 2 elements:
a["u1"]=0x18;
a["u2"]=0xac5b4fc54a5fa637d8c9853ada1430ea9203817e8a97df1f85f8e63a30f6713d7d3f68f79db22df669f5dbe17d43a16c720fe92edd5d87843ebf0b0b59;
The slot 0x000...06 can then store the dictionary name a. The first entry of dictionary a has key u1 and value 0x18. The storage location for u1 can be keccak256("u1".0x000...03); the value is stored in one slot, or in several consecutive slots (for example when the value is longer than 32 bytes), starting at that position. Similarly, the second entry of dictionary a has key u2 and a long hexadecimal value. The storage location for u2 can be keccak256("u2".0x000...03), with the value stored in one or several consecutive slots starting at that position.
Take as an example the case where the value of u1 is shorter than 32 bytes and the value of u2 is longer than 32 bytes but shorter than 64 bytes.
As shown in Figure 11, the value of u1 is shorter than 32 bytes and can be stored in 1 slot, whose position is the value of keccak256("u1",0x000...03), for example:
0x5b4ded6cc1629f138186f4b0795004adbed7ec13374d15ca04ec96f149132460
The value of u2 is longer than 32 bytes and shorter than 64 bytes and can be stored in 2 consecutive slots. If their starting position is the value of keccak256("u2",0x000...03), the two slots are at, for example:
0x90191b3f1d96c216c6a6637b9c8498bc25cc907afe246d611b3a8bf727bc081d
0x90191b3f1d96c216c6a6637b9c8498bc25cc907afe246d611b3a8bf727bc081e
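The way a long value spreads over consecutive slots can be sketched as plain integer arithmetic on the starting position. The example below takes the starting slot for u2 from the text as given and does not recompute the hash: Python's standard library has no Keccak-256 (`hashlib.sha3_256` is the standardized SHA-3 variant and produces different digests). The helper name `slots_for_value` and the 40-byte length are assumptions for illustration.

```python
SLOT_BYTES = 32  # each slot holds 32 bytes


def slots_for_value(start_slot, value_len_bytes):
    """List the consecutive slot positions (0x-prefixed hex) a value occupies."""
    count = max(1, -(-value_len_bytes // SLOT_BYTES))  # ceiling division
    return [f"0x{start_slot + i:064x}" for i in range(count)]


# Starting position quoted in the text for the value of u2.
u2_start = 0x90191b3f1d96c216c6a6637b9c8498bc25cc907afe246d611b3a8bf727bc081d

# Assumed length of 40 bytes: more than 32 and fewer than 64, so two slots.
u2_slots = slots_for_value(u2_start, 40)
for slot in u2_slots:
    print(slot)
```

The second slot is simply the starting position plus one, which is why the two positions listed above differ only in their last hexadecimal digit.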
Compatibility of such variable-length data can also be determined by performing S110 to S130 above. For example, the abstract syntax tree node information generated for the dictionary structure above is:
Figure PCTCN2022135220-appb-000012
Code Example 8. Abstract syntax tree of the dictionary structure
When S120 is performed, each slot in the variable-length data structure also needs to be traversed, because the positions of these slots may be the basis for computing the final state keys. In particular, where a value occupies several consecutive slots, as with u2, the starting position can be used as the basis for computing the state key.
There are also composite structures, for example a struct combined with a dictionary, as in the following Solidity code:
Figure PCTCN2022135220-appb-000013
Code Example 9. Solidity code of the composite structure
In the code above, lines 7-10 declare a struct StructDemo with two members: c_ of type uint256 and d_ of type bytes. Line 11 then declares a dictionary that maps uint to the struct StructDemo. For the struct and dictionary parts of the code above, the abstract syntax tree generated in S110 includes:
Figure PCTCN2022135220-appb-000014
Figure PCTCN2022135220-appb-000015
Code Example 10. Abstract syntax tree of the composite structure
Lines 6-17 of the abstract syntax tree above are the node information of uint256 c_ in the struct StructDemo, and lines 18-31 are the node information of bytes d_ in the struct StructDemo; the bytes type is also 32 bytes. Note that lines 10 and 23 both read "stateVariable":false, indicating that neither c_ nor d_ is a state variable, so neither is stored in the underlying database.
S120 therefore does not extract the struct members c_ and d_, whose "stateVariable" is false, because at this point they are not stored directly in the underlying database. For dictionary a, however, "stateVariable" is true, so the basic information in its abstract syntax tree node information is extracted: dictionary a is stored in the underlying database, and only through its elements are the struct members c_ and d_ stored there. In other words, the struct is merely declared. Thus, in S120, the generated abstract syntax tree is parsed and, for nodes whose state-variable attribute is true, the basic information in each node's information is extracted in order.
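The filtering step in S120 can be sketched as a pass over the contract's top-level AST nodes that keeps only declarations whose "stateVariable" flag is true. The keys "stateVariable" and "typeString" appear in the text; the surrounding layout used here ("nodeType", "typeDescriptions", "members") and the hand-written node list are assumptions modeled on a compiler-style JSON AST, since the original figures are not reproduced.

```python
def extract_state_variables(contract_nodes):
    """Return (node order, name, typeString) for nodes flagged as state variables."""
    rows = []
    for node in contract_nodes:
        is_declaration = node.get("nodeType") == "VariableDeclaration"
        if is_declaration and node.get("stateVariable"):
            type_string = node["typeDescriptions"]["typeString"]
            rows.append((len(rows) + 1, node["name"], type_string))
    return rows


# Hypothetical, hand-written nodes: a plain variable, a struct declaration,
# and a mapping whose values are that struct.
nodes = [
    {"nodeType": "VariableDeclaration", "name": "ID", "stateVariable": True,
     "typeDescriptions": {"typeString": "uint256"}},
    {"nodeType": "StructDefinition", "name": "StructDemo", "members": [
        {"nodeType": "VariableDeclaration", "name": "c_", "stateVariable": False,
         "typeDescriptions": {"typeString": "uint256"}},
        {"nodeType": "VariableDeclaration", "name": "d_", "stateVariable": False,
         "typeDescriptions": {"typeString": "bytes"}},
    ]},
    {"nodeType": "VariableDeclaration", "name": "map2_", "stateVariable": True,
     "typeDescriptions": {"typeString": "mapping(uint256 => struct StructDemo)"}},
]

rows = extract_state_variables(nodes)
for row in rows:
    print(row)
```

The struct declaration contributes no row of its own here; only ID and the mapping are kept, matching the point that the struct is merely declared.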
In addition, for composite structures, such as the dictionary above whose values are structs, the generated abstract syntax tree is parsed and the basic information in each node's information is extracted sequentially and recursively. Although the "stateVariable" attribute in the struct's own node information is false, as a dictionary value it has its own slot and state key. Extracting the basic information sequentially and recursively expands every state variable that is actually stored in the underlying database, so that none is missed. Recursion here means expanding a nested structure definition to obtain the data structures it contains; if there are further levels of nesting, the contained data structures are again obtained recursively.
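The recursive expansion can be sketched as follows, under simplifying assumptions: struct definitions are given as a dictionary from struct name to a list of (member name, member type) pairs, a struct is recognized by the substring "struct <Name>" in the type string, and expanded members are named with a dotted path. None of these conventions come from the source, and the sketch does not guard against self-referential structs.

```python
import re


def expand(name, type_string, structs, rows):
    """Append `name`, then recursively the members of any struct in its type."""
    rows.append((name, type_string))
    for qualified in re.findall(r"struct ([\w.]+)", type_string):
        struct_name = qualified.split(".")[-1]  # drop a contract prefix, if any
        for member_name, member_type in structs.get(struct_name, []):
            expand(f"{name}.{member_name}", member_type, structs, rows)
    return rows


# Hypothetical definitions: Outer nests Inner, one more level than in the text.
structs = {
    "StructDemo": [("c_", "uint256"), ("d_", "bytes")],
    "Outer": [("inner_", "struct StructDemo"), ("flag_", "bool")],
}

flat = expand("map2_", "mapping(uint256 => struct StructDemo)", structs, [])
nested = expand("m_", "mapping(uint256 => struct Outer)", structs, [])
for row in flat + nested:
    print(row)
```

Each extra level of nesting is handled by the same call, so deeper structures need no additional code.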
In this example, suppose the contract code initializes 2 entries of map2_, namely:
Figure PCTCN2022135220-appb-000016
Figure PCTCN2022135220-appb-000017
The abstract syntax tree node information extracted from the demo2″ contract code, comprising node order, state variable name, and state variable type, is then as follows:
Node information 1: ID{typeString:uint256,...}
Node information 2: age{typeString:uint256,...}
Node information 3: sex{typeString:bool,...}
Node information 4: map2_{typeString:mapping(uint256=>struct demo″_.StructDemo),...}
Node information 5: map2_StructDemo_0:c_{typeString:uint256,...}
Node information 6: map2_StructDemo_0:d_{typeString:bytes,...}
Node information 7: map2_StructDemo_1:c_{typeString:uint256,...}
Node information 8: map2_StructDemo_1:d_{typeString:bytes,...}
If another, upgraded contract that includes ID, age, sex, and map2_ yields extracted abstract syntax tree node information (node order, state variable name, and state variable type) consistent with the above, the two are compatible; if it is inconsistent, they are incompatible. The slots in this example are illustrated in Figure 12.
In another example, suppose map2_ is not initialized in the contract code; the abstract syntax tree node information obtained for the contract code, comprising node order, state variable name, and state variable type, is then, for example:
Node information 1: ID{typeString:uint256,...}
Node information 2: age{typeString:uint256,...}
Node information 3: sex{typeString:bool,...}
Node information 4: map2_{typeString:mapping(uint256=>struct demo″_.StructDemo),...}
Then, when node information 4 is compared before and after the contract upgrade, the nested struct must be compared recursively. If, in the abstract syntax tree node information extracted before and after the upgrade, node information 4 and the structures nested in it are consistent, the contracts are compatible (assuming node information 1, 2, and 3 are also consistent); otherwise they are incompatible.
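A recursive comparison of nested structures might look like the sketch below. It assumes each row is a (name, type, members) triple, where members is the list of rows nested inside a struct; this representation and the names `same_layout` and `layout` are illustrative. The check covers only the strict "consistent" case described here and does not treat appended variables specially.

```python
def same_layout(old, new):
    """Recursively compare layouts given as lists of (name, type, members)."""
    if len(old) != len(new):
        return False
    for (old_name, old_type, old_members), (new_name, new_type, new_members) in zip(old, new):
        if (old_name, old_type) != (new_name, new_type):
            return False
        if not same_layout(old_members, new_members):
            return False  # a difference inside the nested struct
    return True


STRUCT = [("c_", "uint256", []), ("d_", "bytes", [])]
SWAPPED = [("d_", "bytes", []), ("c_", "uint256", [])]  # members reordered


def layout(struct_members):
    """Build the four-row layout of the example with the given struct members."""
    return [
        ("ID", "uint256", []),
        ("age", "uint256", []),
        ("sex", "bool", []),
        ("map2_", "mapping(uint256 => struct StructDemo)", struct_members),
    ]


unchanged = same_layout(layout(STRUCT), layout(STRUCT))
reordered = same_layout(layout(STRUCT), layout(SWAPPED))
print(unchanged, reordered)
```

In the second comparison the top-level rows are identical, including the type string of map2_, and the difference is found only by descending into the struct members.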
An upgraded contract must be written according to certain rules to remain compatible across the upgrade: the state in the new contract must retain the ability to read the values of the same state in the old contract. Users often overlook these rules when writing upgraded contracts, which leads to serious problems such as data loss and data corruption after the upgrade. The examples above provide an abstract-syntax-tree-based scheme for detecting storage data compatibility of Solidity contract upgrades.
The following describes an apparatus of this application for detecting the compatibility of a contract upgrade, comprising: an abstract syntax tree generation unit, configured to generate the abstract syntax trees of the contract before and after the upgrade; an extraction unit, configured to parse the generated abstract syntax trees and sequentially extract the basic information in the node information of each abstract syntax tree; and a comparison unit, configured to compare the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.
The extraction unit parses the generated abstract syntax tree and, for nodes whose state-variable attribute in the node information is true, sequentially extracts the basic information in the node information of each abstract syntax tree.
The abstract syntax tree generation unit performs abstract-syntax-tree-based lexical/syntactic analysis on the smart contract code before and after the upgrade, and generates the abstract syntax trees of the contract before and after the upgrade.
The basic information includes the node order of the abstract syntax tree, and further includes the state variable name and/or type.
The comparison unit compares the basic information in the node information having the same node number in the abstract syntax trees before and after the upgrade.
The comparison unit compares, in node-number order, the basic information in the node information having the same node number in the abstract syntax trees before and after the upgrade.
During comparison by the comparison unit: if the state variable names differ, the result is incompatible; or, if the state variable types differ, the result is incompatible; or, if both the state variable names and the state variable types differ, the result is incompatible.
Where the comparison unit finds that the upgraded state variables consist of the pre-upgrade state variables followed by newly appended state variables, it determines that the upgrade is compatible.
For composite structures, the extraction unit parses the generated abstract syntax tree and sequentially and recursively extracts the basic information in the node information of each abstract syntax tree.
The detection apparatus further includes a feedback unit; when the comparison result of the comparison unit is incompatible, the feedback unit feeds back the basic information/node information of the conflicting slot position.
The following describes a client embodiment of this application, comprising: a processor; and a memory storing a program, wherein when the processor executes the program, the method of any of the above embodiments is performed, for purposes such as detecting the compatibility of a contract upgrade.
In the 1990s, an improvement to a technology could be clearly classified as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, transistor, or switch) or a software improvement (an improvement to a method flow). With the development of technology, however, many of today's method-flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be implemented with hardware modules. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of making integrated circuit chips by hand, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compilers used in program development; the source code to be compiled must likewise be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can readily be obtained by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.
A controller may be implemented in any suitable manner. For example, a controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (for example, software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, besides implementing a controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the means included in it for implementing various functions can be regarded as structures within the hardware component. The means for implementing various functions can even be regarded both as software modules implementing the method and as structures within the hardware component.
The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a server system. This application does not exclude that, as computer technology develops, the computer implementing the functions of the above embodiments may be, for example, a personal computer, a laptop computer, an in-vehicle human-machine interaction device, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Although one or more embodiments of this specification provide method operation steps as described in the embodiments or flowcharts, more or fewer operation steps may be included on the basis of conventional or non-inventive means. The order of steps listed in the embodiments is only one of many possible execution orders and is not the only one. When an actual apparatus or end product executes the method, the steps may be executed sequentially or in parallel according to the embodiments or the methods shown in the drawings (for example, in a parallel-processor or multi-threaded environment, or even a distributed data processing environment). The terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, product, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, product, or device. Without further limitation, the presence of additional identical or equivalent elements in a process, method, product, or device that includes the stated elements is not excluded. Words such as "first" and "second", if used, denote names and do not indicate any particular order.
For convenience of description, the above apparatus is described as divided into modules by function. When one or more embodiments of this specification are implemented, the functions of the modules may be implemented in one or more pieces of software and/or hardware, and a module implementing a given function may be implemented by a combination of several sub-modules or sub-units. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; other divisions are possible in actual implementation, for example several units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.
The present disclosure is described with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the disclosure. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operation steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
在一个典型的配置中,计算设备包括一个或多个处理器(CPU)、输入/输出接口、网络接口和内存。In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
内存可能包括计算机可读介质中的非永久性存储器,随机存取存储器(RAM)和/或非易失性内存等形式,如只读存储器(ROM)或闪存(flash RAM)。内存是计算机可读介质的示例。Memory may include non-permanent storage in computer-readable media, random access memory (RAM) and/or non-volatile memory in the form of read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
计算机可读介质包括永久性和非永久性、可移动和非可移动媒体可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括,但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁带磁磁盘存储、石墨烯存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。按照本文中的界定,计算机可读介质不包括暂存电脑可读媒体(transitory media),如调制的数据信号和载波。Computer-readable media includes both persistent and non-volatile, removable and non-removable media that can be implemented by any method or technology for storage of information. Information may be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), and read-only memory. (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, Magnetic tape, magnetic tape storage, graphene storage or other magnetic storage devices or any other non-transmission medium can be used to store information that can be accessed by a computing device. As defined in this article, computer-readable media does not include transitory media, such as modulated data signals and carrier waves.
Those skilled in the art should understand that one or more embodiments of this specification may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. One or more embodiments of this specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner; for parts that are the same or similar across embodiments, reference may be made from one to another, and each embodiment focuses on its differences from the others. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiments. In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this specification. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples.
The foregoing describes only examples of one or more embodiments of this specification and is not intended to limit them. To those skilled in the art, various modifications and changes may be made to one or more embodiments of this specification. Any modification, equivalent replacement, or improvement made within the spirit and principles of this specification shall fall within the scope of the claims.

Claims (21)

  1. A method for detecting the compatibility of a contract upgrade, comprising:
    generating abstract syntax trees of the contract before and after the upgrade;
    parsing the generated abstract syntax trees and sequentially extracting the basic information from the node information of each abstract syntax tree; and
    comparing the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.
  2. The method according to claim 1, wherein parsing the generated abstract syntax trees and sequentially extracting the basic information from the node information of each abstract syntax tree comprises:
    parsing the generated abstract syntax trees and, for node information in which the state-variable attribute is true, sequentially extracting the basic information from the node information of each abstract syntax tree.
  3. The method according to claim 1, wherein generating the abstract syntax trees of the contract before and after the upgrade comprises:
    performing lexical/syntactic analysis, in accordance with the abstract syntax tree, on the smart contract code before and after the upgrade to generate the abstract syntax trees of the contract before and after the upgrade.
  4. The method according to claim 1, wherein the basic information comprises the node order of the abstract syntax tree and further comprises state variable names and/or types.
  5. The method according to claim 1, wherein comparing the basic information in the node information of the abstract syntax trees before and after the upgrade comprises:
    comparing the basic information in the node information of nodes having the same node number in the abstract syntax trees before and after the upgrade.
  6. The method according to claim 5, wherein the basic information in the node information of nodes having the same node number in the abstract syntax trees before and after the upgrade is compared in node-number order.
  7. The method according to claim 5 or 6, wherein:
    if the comparison finds that the state variable names differ, the upgrade is incompatible; or,
    if the comparison finds that the state variable types differ, the upgrade is incompatible; or,
    if both the state variable names and the state variable types differ, the upgrade is incompatible.
  8. The method according to claim 5 or 6, wherein the upgrade is determined to be compatible if the state variables after the upgrade consist of the pre-upgrade state variables with new state variables appended after them.
  9. The method according to claim 1, wherein parsing the generated abstract syntax trees and sequentially extracting the basic information from the node information of each abstract syntax tree comprises:
    for a compound structure, parsing the generated abstract syntax trees and sequentially and recursively extracting the basic information from the node information of each abstract syntax tree.
  10. The method according to claim 1, further comprising, if the upgrade is incompatible, feeding back the basic information/node information of the conflicting slot position.
  11. An apparatus for detecting the compatibility of a contract upgrade, comprising:
    an abstract syntax tree generation unit, configured to generate abstract syntax trees of the contract before and after the upgrade;
    an extraction unit, configured to parse the generated abstract syntax trees and sequentially extract the basic information from the node information of each abstract syntax tree; and
    a comparison unit, configured to compare the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.
  12. The detection apparatus according to claim 11, wherein the extraction unit parses the generated abstract syntax trees and, for node information in which the state-variable attribute is true, sequentially extracts the basic information from the node information of each abstract syntax tree.
  13. The detection apparatus according to claim 11, wherein the abstract syntax tree generation unit performs lexical/syntactic analysis, in accordance with the abstract syntax tree, on the smart contract code before and after the upgrade to generate the abstract syntax trees of the contract before and after the upgrade.
  14. The detection apparatus according to claim 11, wherein the basic information comprises the node order of the abstract syntax tree and further comprises state variable names and/or types.
  15. The detection apparatus according to claim 11, wherein the comparison unit compares the basic information in the node information of nodes having the same node number in the abstract syntax trees before and after the upgrade.
  16. The detection apparatus according to claim 15, wherein the comparison unit compares, in node-number order, the basic information in the node information of nodes having the same node number in the abstract syntax trees before and after the upgrade.
  17. The detection apparatus according to claim 15 or 16, wherein the comparison unit determines that:
    if the comparison finds that the state variable names differ, the upgrade is incompatible; or,
    if the comparison finds that the state variable types differ, the upgrade is incompatible; or,
    if both the state variable names and the state variable types differ, the upgrade is incompatible.
  18. The detection apparatus according to claim 15 or 16, wherein the comparison unit determines that the upgrade is compatible if the comparison finds that the state variables after the upgrade consist of the pre-upgrade state variables with new state variables appended after them.
  19. The detection apparatus according to claim 11, wherein,
    for a compound structure, the extraction unit parses the generated abstract syntax trees and sequentially and recursively extracts the basic information from the node information of each abstract syntax tree.
  20. The detection apparatus according to claim 11, further comprising a feedback unit, wherein, when the comparison result of the comparison unit is incompatible, the feedback unit feeds back the basic information/node information of the conflicting slot position.
  21. A client, comprising:
    a processor; and
    a memory storing a program, wherein, when the processor executes the program, the method according to any one of claims 1-10 is performed.
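
The method of claims 1-10 can be illustrated with a minimal Python sketch. It assumes a simplified, hypothetical node format (dictionaries with `name`, `type`, `stateVariable` and optional `members` keys) rather than the output of any particular compiler, and the function names `extract_state_variables` and `check_upgrade` are likewise illustrative. Treating a removed trailing variable as incompatible is an assumption, since the claims address only changed and appended variables. The reported `position` is the index in the flattened declaration order, which stands in here for the slot position.

```python
from typing import Any, Dict, List, Optional, Tuple

Node = Dict[str, Any]      # hypothetical, simplified AST node
Entry = Tuple[str, str]    # (qualified state variable name, type)


def extract_state_variables(nodes: List[Node], prefix: str = "") -> List[Entry]:
    """Walk nodes in declaration order and return (name, type) pairs."""
    entries: List[Entry] = []
    for node in nodes:
        # Top-level nodes are kept only if flagged as state variables (claim 2);
        # members of a compound structure are always followed.
        if not prefix and not node.get("stateVariable", False):
            continue
        name = prefix + node["name"]
        entries.append((name, node["type"]))
        # Compound structures are expanded recursively, in order (claim 9).
        entries.extend(extract_state_variables(node.get("members", []), name + "."))
    return entries


def check_upgrade(old_nodes: List[Node],
                  new_nodes: List[Node]) -> Tuple[bool, Optional[Dict[str, Any]]]:
    """Compare entries position by position; return (compatible, conflict)."""
    old = extract_state_variables(old_nodes)
    new = extract_state_variables(new_nodes)
    for position, old_entry in enumerate(old):
        # Assumption: a variable missing after the upgrade is a conflict.
        new_entry = new[position] if position < len(new) else None
        if new_entry != old_entry:
            # A differing name, type, or both is incompatible (claim 7);
            # the conflicting position is fed back (claim 10).
            return False, {"position": position, "old": old_entry, "new": new_entry}
    # Any remaining new entries were appended after the old ones (claim 8).
    return True, None


old_contract = [
    {"name": "owner", "type": "address", "stateVariable": True},
    {"name": "total", "type": "uint256", "stateVariable": True},
]
appended = old_contract + [{"name": "paused", "type": "bool", "stateVariable": True}]
retyped = [old_contract[0], {"name": "total", "type": "uint128", "stateVariable": True}]

print(check_upgrade(old_contract, appended))  # appended variable: compatible
print(check_upgrade(old_contract, retyped))   # changed type: conflict at position 1
```

Flattening compound structures into dotted names keeps the comparison a single ordered pass, so a member inserted in the middle of a structure shifts every later entry and is reported at the first position where the two lists diverge.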
PCT/CN2022/135220 2022-09-14 2022-11-30 Method and apparatus for detecting compatibility of contract upgrading WO2024055437A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211117310.2A CN115454475A (en) 2022-09-14 2022-09-14 Method and device for detecting compatibility of contract upgrading
CN202211117310.2 2022-09-14

Publications (1)

Publication Number Publication Date
WO2024055437A1 true WO2024055437A1 (en) 2024-03-21

Family

ID=84302804

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/135220 WO2024055437A1 (en) 2022-09-14 2022-11-30 Method and apparatus for detecting compatibility of contract upgrading

Country Status (2)

Country Link
CN (1) CN115454475A (en)
WO (1) WO2024055437A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110154311A1 (en) * 2009-12-18 2011-06-23 Michael Acker Generating a where-used objects list for updating data
CN110532176A (en) * 2019-07-31 2019-12-03 平安科技(深圳)有限公司 A kind of formalization verification method, electronic device and the storage medium of intelligence contract
CN111581077A (en) * 2020-04-08 2020-08-25 腾讯科技(深圳)有限公司 Intelligent contract testing method and device
CN112631656A (en) * 2021-01-06 2021-04-09 中山大学 Intelligent contract optimization method and device based on source code
CN114611074A (en) * 2022-03-09 2022-06-10 河海大学 Method, system, equipment and storage medium for obfuscating source code of solid language

Also Published As

Publication number Publication date
CN115454475A (en) 2022-12-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22958624; Country of ref document: EP; Kind code of ref document: A1)