WO2024055437A1

WO2024055437A1 - Method and apparatus for detecting compatibility of contract upgrading

Info

Publication number: WO2024055437A1
Application number: PCT/CN2022/135220
Authority: WO
Inventors: 曹蓉
Original assignee: 蚂蚁区块链科技(上海)有限公司
Priority date: 2022-09-14
Filing date: 2022-11-30
Publication date: 2024-03-21
Also published as: CN115454475A

Abstract

One or more embodiments of the present description provide a method and apparatus for detecting compatibility of contract upgrading, and a client, which are applied to the field of blockchains. The method for detecting compatibility of contract upgrading comprises: generating abstract syntax trees of a contract before and after upgrading; parsing the generated abstract syntax trees, and sequentially extracting basic information in node information of each abstract syntax tree; and comparing the basic information in the node information of the abstract syntax trees before and after upgrading to obtain a compatibility conclusion.

Description

A method and device for detecting compatibility of contract upgrades

Technical field

The embodiments of this specification belong to the field of blockchain technology, and particularly relate to a method and device for detecting the compatibility of contract upgrades.

Background

Blockchain is a new application model of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. In a blockchain system, data blocks are combined in chronological order into a chained data structure, and cryptography guarantees that the resulting distributed ledger cannot be tampered with or forged. Because of characteristics such as decentralization, tamper resistance of information, and autonomy, blockchain has received more and more attention and application.

Summary of the invention

The purpose of this disclosure is to provide a method and device for detecting the compatibility of contract upgrades, including:

A method for detecting the compatibility of contract upgrades, including: generating abstract syntax trees of the contract before and after the upgrade; parsing the generated abstract syntax trees and sequentially extracting the basic information in the node information of each abstract syntax tree; and comparing the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.

A device for detecting the compatibility of contract upgrades, including: an abstract syntax tree generation unit for generating abstract syntax trees of the contract before and after the upgrade; an extraction unit for parsing the generated abstract syntax trees and sequentially extracting the basic information in the node information of each abstract syntax tree; and a comparison unit for comparing the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.

A client includes: a processor and a memory storing a program, wherein when the processor executes the program, the above method is executed.

An upgraded contract must be written according to certain specifications for the upgrade to be compatible, that is, each state in the upgraded new contract must still be able to read the value of the same state in the old contract. Users often ignore these specifications when writing upgraded contracts, which leads to serious problems such as data loss and data confusion in the upgraded contracts. Through the above embodiments, a compatibility detection solution for the stored data of contract upgrades can be implemented based on abstract syntax trees of Solidity and other contract languages.
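As a minimal sketch of the comparison step, the following Python fragment compares the state variables of an old and a new contract position by position. It assumes that the basic information extracted from each abstract syntax tree has already been reduced to an ordered list of (name, type) pairs; the names StateVar and check_upgrade are hypothetical, and the rule applied here (existing variables keep their position, name and type, and new variables may only be appended) is an assumption made for illustration rather than the rule set of the described method.

```python
from typing import List, NamedTuple, Tuple


class StateVar(NamedTuple):
    """Basic information of one state-variable node (hypothetical shape)."""
    name: str
    type_name: str


def check_upgrade(old: List[StateVar], new: List[StateVar]) -> Tuple[bool, List[str]]:
    """Compare declarations position by position.

    Assumed rule: every variable of the old contract must appear at the same
    position with the same name and type in the new contract; the new contract
    may only append variables at the end.
    """
    problems: List[str] = []
    for index, old_var in enumerate(old):
        if index >= len(new):
            problems.append(f"position {index}: '{old_var.name}' was removed")
        elif new[index] != old_var:
            problems.append(
                f"position {index}: '{old_var.name}: {old_var.type_name}' became "
                f"'{new[index].name}: {new[index].type_name}'"
            )
    return (not problems, problems)


# Stand-ins for what an AST parser would extract from each contract version.
old_vars = [StateVar("storedData", "string")]
appended = old_vars + [StateVar("counter", "uint256")]
reordered = [StateVar("counter", "uint256"), StateVar("storedData", "string")]

ok_appended, _ = check_upgrade(old_vars, appended)
ok_reordered, issues = check_upgrade(old_vars, reordered)
```

Under the assumed rule, appending a variable is reported as compatible, while inserting a variable in front of an existing one is reported as incompatible because the existing variable no longer sits at its original position.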

Description of drawings

In order to explain the technical solutions of the embodiments of this specification more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some of the embodiments recorded in this specification; those of ordinary skill in the art can obtain other drawings based on these drawings without creative effort.

Figure 1 is a schematic diagram of deploying smart contracts in an embodiment;

Figure 2 is a schematic diagram of calling a smart contract in an embodiment;

Figure 3 is a schematic diagram of a block storage structure in an embodiment;

Figure 4 is a schematic diagram of a block storage structure in an embodiment;

Figure 5 is a schematic diagram of an MPT tree in an embodiment;

Figure 6 is a schematic diagram of the modules involved in the transaction processing process and the relationship between CPU, memory and disk in an embodiment;

Figure 7 is a schematic diagram of the EVM virtual machine module involved in the transaction processing process in an embodiment;

Figure 8 is a schematic diagram of the slot structure in an embodiment;

Figure 9 is a schematic diagram of the slot structure in an embodiment;

Figure 10 is a flow chart of a method for detecting compatibility of contract upgrades in an embodiment;

Figure 11 is a schematic diagram of the slot structure in an embodiment;

Figure 12 is a schematic diagram of the slot structure in an embodiment.

Detailed description

In order to enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this specification, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this specification without creative effort should fall within the scope of protection of this specification.

Blockchains are generally divided into three types: public blockchains, private blockchains and consortium blockchains. There are also combinations of these types, such as private chain + consortium chain and consortium chain + public chain. Among them, the public chain is the most decentralized. Participants who join a public chain can read the data records on the chain, participate in transactions, and compete for the accounting rights of new blocks. Moreover, each participant (represented by the participant's node on the blockchain) can freely join and leave the network and perform related operations. In a private chain, by contrast, the write permission of the network is controlled by an organization or institution, and the read permission is regulated by that organization. Simply put, a private chain can be a weakly centralized system with strict restrictions and few participating nodes, which makes this type of blockchain more suitable for internal use within a specific organization. The consortium chain lies between the public chain and the private chain and can achieve "partial decentralization". Each node in a consortium chain usually has a corresponding entity or organization; participants join the network through authorization and form a stakeholder alliance to jointly maintain the operation of the blockchain.

Whether it is a public chain, a private chain or a consortium chain, in addition to supporting the transfer of native assets on the blockchain between accounts, it can also provide smart contract functions. Smart contracts on the blockchain are contracts that can be triggered and executed by transactions on the blockchain system. Smart contracts can be defined in the form of code.

Smart contracts allow users to create and invoke complex logic in the blockchain network. This is the biggest feature that distinguishes a programmable blockchain from the original blockchain technology. The core of such a programmable blockchain is the virtual machine, for example the Ethereum Virtual Machine (EVM), and each blockchain node can run the EVM. The EVM is a Turing-complete virtual machine, which means that various kinds of complex logic can be implemented through it. When users publish and call smart contracts on the blockchain, the contracts run on the EVM. In fact, the virtual machine directly runs virtual machine code (virtual machine bytecode, hereinafter referred to as "bytecode"). Smart contracts deployed on the blockchain can be in the form of bytecode.

For example, as shown in Figure 1, after Bob sends a transaction containing smart contract creation information to the blockchain network, the EVM of node 1 can execute the transaction and generate the corresponding contract instance. "0x6f8ae93..." in Figure 1 represents the address of this contract. The data field of the transaction can store bytecode, and the to field of the transaction is an empty account. After the nodes reach an agreement through the consensus mechanism, the contract is successfully created, and subsequent users can call this contract.

After the contract is created, a contract account corresponding to the smart contract appears on the blockchain and has a specific address. The contract code and account storage will be saved in the contract account. The behavior of a smart contract is controlled by the contract code, and the account storage of the smart contract saves the state of the contract. In other words, smart contracts enable virtual accounts containing contract code and account storage (Storage) to be generated on the blockchain.

As mentioned above, the data field of the transaction that creates the smart contract can store the bytecode of the smart contract. Bytecode consists of a series of bytes, and each byte can identify an operation. For reasons such as development efficiency and readability, developers can write smart contract code in a high-level language instead of writing bytecode directly. The smart contract code written in a high-level language is compiled by a compiler into bytecode, which can then be deployed on the blockchain. Many high-level languages are supported, such as Solidity, Serpent and LLL.

Take the Solidity language as an example. Contracts written in it are very similar to classes in object-oriented programming languages. A variety of members can be declared in a contract, including state variables, functions, function modifiers, events, etc. State variables are values stored in a smart contract's account storage and are used to save the state of the contract.

The following is code example 1 of a simple smart contract written in Solidity language:

Code example 1. SimpleStorage code

In code example 1 above, the second line declares the state variable storedData of type string, and the third line declares an event whose content is the address of the initiator of the call and the string s. Lines 4-7 define the set function, whose input parameter is the string s. The set function assigns the input parameter to the state variable storedData and generates an event whose content includes the address of the initiator of the call and the string s.

As mentioned before, state variables will eventually be stored in the database. The generated events are generally in the following form:

[topic1][topic2]...[topicn][data]

Here, the first topic, topic1, is generally a default value, for example the identifier of the event, which can be a hash value obtained by sequentially splicing the event name, the event parameter types, and so on. Whether each of topic2 to topicn exists depends on whether the indexed modifier is added when the parameter is defined. If it is added, the value of the parameter becomes a topic in the receipt; parameters without the indexed modifier are generally placed in data. In code example 1 above, when the event is declared in line 3, the two parameters address from and string s are not modified by indexed and are therefore generally placed in data. The code in line 6 sets the data content [msg.sender, s] of the event through the stored() event. In this way, for the event generated on line 6, the overall form is:

[topic1: event identification] [data: msg.sender, s]

Lines 8-10 define the get function. The operation of this function includes returning the value of the storedData of the query. returns(string) indicates the type of return value, and the constant modifier indicates that the function cannot modify the value of the state variable in the contract.
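The behavior described for code example 1 can be illustrated with a small Python stand-in. This is only a sketch of the described behavior, not the Solidity contract itself: the class and attribute names are hypothetical, the caller address is made up, and topic1 is represented by a plain placeholder string rather than a real hash value.

```python
from typing import List, Tuple


class SimpleStorage:
    """Python stand-in for the contract described above (not Solidity)."""

    def __init__(self) -> None:
        self.stored_data: str = ""  # plays the role of the state variable storedData
        self.events: List[Tuple[str, Tuple[str, str]]] = []  # generated events

    def set(self, sender: str, s: str) -> None:
        """Assign s to the state variable, then record a stored(from, s) event."""
        self.stored_data = s
        topic1 = "stored(address,string)"  # placeholder for the event identifier
        # Neither parameter is indexed, so both go into the data part:
        # [topic1: event identification] [data: msg.sender, s]
        self.events.append((topic1, (sender, s)))

    def get(self) -> str:
        """Read-only query: returns the stored value without changing any state."""
        return self.stored_data


contract = SimpleStorage()
contract.set("0xabc", "hello")  # "0xabc" is a made-up caller address
current_value = contract.get()
```

After the call to set, get returns the string that was stored, and the event list holds one entry whose data part carries the caller address and the string.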

In addition, as shown in Figure 2, after Bob sends a transaction containing smart contract call information to the blockchain network, the EVM of node 1 can execute the transaction and generate the corresponding contract instance. The from field of the transaction in Figure 2 is the address of the account that initiated the call to the smart contract. "0x6f8ae93..." in the to field represents the address of the called smart contract. The data field of the transaction saves the method and parameters for calling the smart contract. In addition, a value field can also be included to represent the value of the ether in the transaction. After calling the smart contract, the value of storedData may change. Subsequently, a client can view the current value of storedData through a certain blockchain node (such as node 6 in Figure 2).

Smart contracts can be executed independently on each node in the blockchain network in a prescribed manner. All execution records and data are saved on the blockchain, so when such a transaction is completed, a transaction record that cannot be tampered with and will not be lost is saved on the blockchain.

As mentioned before, the storedData in the above example is the state variable, which is stored in the account storage of the smart contract. In various blockchain networks that introduce smart contracts, accounts can usually include two types:

Contract account: stores the executed smart contract code and the value of the state in the smart contract code. It can usually only be activated through external account calls;

Externally owned account: A user's account, such as an Ethereum owner's account.

The design of external accounts and contract accounts is actually the mapping of account addresses to account status. The status of the account usually includes Nonce, Balance, Storage root, CodeHash and other fields. Nonce and Balance exist in both external accounts and contract accounts. CodeHash and Storage root attributes are generally only valid on contract accounts.

Nonce: Counter. For external accounts, this number can represent the number of transactions sent from the account address; for contract accounts, it can be the number of contracts created by the account.

Balance: The number of ethers owned by this address.

Storage root: The hash of the root node of an MPT tree. This MPT tree organizes the storage of state variables of contract accounts.

CodeHash: The hash value of the smart contract code. For contract accounts, this is the hash value of the smart contract; for external accounts, since smart contracts are not included, the CodeHash field can generally be an empty string/all 0 string.

MPT stands for Merkle Patricia Tree, a tree structure that combines the Merkle tree and the Patricia tree (a compressed prefix tree, that is, a more space-saving trie or dictionary tree). In a Merkle tree, a hash value is calculated for each transaction, adjacent hash values are then concatenated in pairs and hashed again, and this is repeated up to the top-level Merkle root. Some blockchain networks use an improved MPT, for example a 16-ary tree structure, which is usually also simply called an MPT tree.
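The pairwise hashing of a Merkle tree can be sketched as follows. This is a minimal illustration only: SHA-256 is assumed as the hash function, the last hash of a level with an odd number of entries is assumed to be duplicated, and the function name merkle_root is hypothetical. A particular blockchain may choose differently on each of these points.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(items: List[bytes]) -> bytes:
    """Hash every item, then hash adjacent pairs level by level up to one root."""
    if not items:
        raise ValueError("at least one item is required")
    level = [sha256(item) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last hash on odd levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"tx1", b"tx2", b"tx3"])
```

Changing any single item changes the root, which is what allows the root to lock the whole collection.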

The data structure of the MPT tree includes a state tree (state trie). The state trie contains the key-value pairs (also written as key-value, k-v or kv) corresponding to the storage content of each account in the blockchain network. The "key" in the state trie can be a 160-bit identifier (such as the address of a blockchain account or a part of the hash value of the address, hereafter collectively referred to as the account address), which is distributed along the path from the root node of the state trie to the leaf node. The "value" in the state trie is generated by encoding the information of the blockchain account with the Recursive Length Prefix (RLP) encoding. As mentioned before, for external accounts, the values include nonce and balance; for contract accounts, the values include nonce, balance, codehash, and storageroot.

Contract accounts are used to store status related to smart contracts. After the smart contract is deployed on the blockchain, a corresponding contract account will be generated. This contract account generally has some states, which are defined by the state variables in the smart contract and generate new values when the smart contract is executed. The smart contract usually refers to a contract defined in the form of code in the blockchain environment that can automatically execute the terms. Once an event triggers a clause in the contract (execution conditions are met), the code can be executed automatically. In the blockchain, the relevant status of the contract is stored in the storage trie. The hash value of the root node of the storage trie is stored in the above-mentioned storageroot, thereby locking all the status of the contract to the contract account through hash. The storage trie is also an MPT tree structure, which stores the key-value mapping from state address to state value. Part of the information from the root node of the storage trie tree to the leaf nodes is arranged sequentially to store the address of a state, and the value of the state is stored in the leaf node.

As shown in Figure 3, in some blockchain data storage, the block header of each block includes several fields, such as the previous block hash previous_Hash (Prev Hash in the figure), the random number Nonce (in some blockchain systems the Nonce is not a random number, or the Nonce in the block header is not enabled), the timestamp Timestamp, the block number Block Num, the state root hash State_Root, the transaction root hash Transaction_Root, the receipt root hash Receipt_Root, and so on. The Prev Hash in the block header of the next block (such as block N+1) points to the previous block (such as block N), that is, it holds the hash value of the previous block. In this way, each block locks its previous block through the block header. State_Root, Transaction_Root and Receipt_Root lock the state collection, the transaction collection and the receipt collection respectively. These three collections organize states, transactions and receipts in the form of trees, which can use the same tree structure or different tree structures; for example, the above-mentioned blockchain networks using the MPT structure use the same MPT structure for all three. Tree structures that include smart contract state collections contain a two-level MPT structure: the leaf nodes of the upper-level MPT are of two types, external accounts and contract accounts; each contract account includes a next-level MPT, whose leaf nodes hold the values of the states in the contract account.
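How each block locks its predecessor through Prev Hash can be sketched in a few lines. The field set below follows the header fields listed above, but the class name, the helper chain_is_valid, the sample values, and the use of a JSON dump hashed with SHA-256 are assumptions made purely for illustration; real systems use their own encoding and hash function.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class BlockHeader:
    prev_hash: str
    block_num: int
    timestamp: int
    state_root: str
    transaction_root: str
    receipt_root: str

    def hash(self) -> str:
        # Assumption: hash a canonical JSON dump of the header fields.
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()


def chain_is_valid(headers: List[BlockHeader]) -> bool:
    """Each header's prev_hash must equal the hash of the header before it."""
    return all(
        headers[i].prev_hash == headers[i - 1].hash() for i in range(1, len(headers))
    )


block_n = BlockHeader("00" * 32, 1, 1000, "s1", "t1", "r1")
block_n1 = BlockHeader(block_n.hash(), 2, 1010, "s2", "t2", "r2")
# Same position as block_n, but with a modified state root.
tampered = BlockHeader("00" * 32, 1, 1000, "sX", "t1", "r1")
```

Replacing block_n with the tampered header breaks the link, because the Prev Hash stored in block_n1 no longer matches the hash of its predecessor.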

Figure 4 is a schematic structural diagram of blockchain data storage. As shown in Figure 3, state_root is the hash value of the root of the MPT tree composed of the states of all accounts in the current block, that is, state_root points to a state trie in the form of an MPT. The root node of this MPT tree is generally an extension node (Extension Node) or a branch node (Branch Node), and what is stored in state_root is generally the hash value of this root node. The root node can be connected to one or more layers of Extension Nodes/Branch Nodes below it; these multi-layer tree nodes can be collectively called intermediate nodes (Internal Node). A part of the value in each node on the path from the root node of this MPT to a leaf node can be concatenated in order to form the account address, which serves as the key, and the account information stored in the leaf node is the value corresponding to that account address. In this way, a key-value pair is formed. The key can also be a part of sha3(Address), that is, a part of the hash value of the account address (taking sha3 as an example of the hash algorithm), and the stored value can be rlp(Account), that is, the RLP encoding of the account information. The account information is a four-tuple [nonce, balance, storageRoot, codeHash]. As mentioned before, an external account generally has only the two items nonce and balance, and its storageRoot and codeHash fields store empty strings/all-0 strings by default. In other words, an external account stores neither a contract nor the state variables generated by executing a contract. A contract account generally includes Nonce, Balance, Storage root and CodeHash. Nonce is the transaction counter of the contract account; Balance is the account balance; Storage root corresponds to another MPT, through which the contract-related state information can be reached; CodeHash is the hash value of the contract code. Whether for an external account or a contract account, the account information is generally located in a separate leaf node (Leaf Node). Between the Extension Node/Branch Node at the root and the Leaf Node of each account, there may be several branch nodes and extension nodes.
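To make rlp(Account) concrete, the following is a minimal sketch of an RLP encoder for byte strings, non-negative integers and nested lists. The function names are hypothetical, and the account values used at the end (an external account with nonce 20, balance 4550, and empty storageRoot and codeHash) are illustrative only.

```python
from typing import List, Union

RLPItem = Union[bytes, int, List["RLPItem"]]


def _length_prefix(length: int, offset: int) -> bytes:
    """Short payloads carry their length in one byte; long ones add length bytes."""
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item: RLPItem) -> bytes:
    if isinstance(item, int):
        # Non-negative integers become big-endian bytes without leading zeros.
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single small byte is its own encoding
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(child) for child in item)
    return _length_prefix(len(payload), 0xC0) + payload


# [nonce, balance, storageRoot, codeHash] of an external account (illustrative values)
account = rlp_encode([20, 4550, b"", b""])
```

The encoded account is a short list whose four items appear in the order nonce, balance, storageRoot, codeHash; the two empty fields each encode to the single byte 0x80.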

The state trie can be a tree in the form of an MPT, generally a 16-ary tree, that is, each node can have up to 16 child nodes. An Extension Node is used to store a common prefix; it generally has one child node, which can be a Branch Node. A Branch Node can have up to 16 child nodes, which may include Extension Nodes and/or Leaf Nodes.

For a contract account in the state trie, its storage_Root points to another tree in the form of an MPT, which stores the data of the state variables involved in contract execution. This tree is the Storage Trie, and storage_Root is the hash value of the root node of the Storage Trie. The Storage Trie generally also stores key-value pairs. The key indicates the address of a state variable; it can be the result of processing the declaration position of the state variable in the contract (a value counting from 0) according to certain rules, such as sha3(declaration position of the state variable) or sha3(contract name + declaration position of the state variable). The value stores the value of the state variable (for example, an RLP-encoded value). A part of the data stored in each node on the path from the root node through the intermediate nodes to the leaf node is concatenated to form the key, and the value is stored in the leaf node. As mentioned earlier, the Storage Trie can also be a tree in the form of an MPT, generally a 16-ary tree: a Branch Node can have up to 16 child nodes, which may include Extension Nodes and/or Leaf Nodes, and an Extension Node generally has one child node, which can be a Branch Node or a Leaf Node.
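The derivation of a storage key from the declaration position can be sketched as follows. The text leaves the exact rule open, so everything specific here is an assumption: the position is serialized as a 32-byte big-endian integer, the optional contract name is simply prepended, and Python's hashlib.sha3_256 (NIST SHA3-256) stands in for sha3; a given blockchain may use a different variant such as Keccak-256, which produces different digests. The function name storage_key is hypothetical.

```python
import hashlib


def storage_key(position: int, contract_name: str = "") -> str:
    """Return a hex key for the state variable declared at `position` (from 0)."""
    # Assumed serialization: optional contract name followed by a 32-byte position.
    preimage = contract_name.encode("utf-8") + position.to_bytes(32, "big")
    # hashlib.sha3_256 is NIST SHA3-256, used here only as a stand-in for sha3.
    return hashlib.sha3_256(preimage).hexdigest()


key0 = storage_key(0)  # first declared state variable, e.g. storedData
key1 = storage_key(1)  # a second state variable would get a different key
named_key0 = storage_key(0, "SimpleStorage")
```

Because the key depends on the declaration position, changing the order of declarations in an upgraded contract changes which key each variable reads from.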

For example, the Leaf Node Account P of the state trie in Figure 4 is a contract account, and its Storage Root locks all states in the contract storage. These states are organized into an MPT tree, namely the Storage trie linked from the Storage Root. In this Storage trie, take Leaf Node State Variable N as an example. If it holds the value of storedData in the aforementioned contract code example, its key is sha3(declaration position of storedData) (the declaration position is detailed later), and its value is s (for simplicity, the encoding format of the value, such as RLP, is omitted here and in similar cases below). The key is distributed sequentially along the path from the root node of the Storage trie to the leaf node (that is, Leaf Node State Variable N).

For another example, Leaf Node Account C in the state trie in Figure 4 is an external account, and its key is sha3(Address C), that is, the hash value of the address of account C (taking sha3 as an example of the hash algorithm). The stored value can be rlp(Account), where the account information Account is a tuple [nonce, balance]. As mentioned before, since Account C is an external account, its account information consists of nonce and balance (codehash and storage root are omitted here and in similar cases below). For example, if an external account has a nonce of 20 and a balance of 4550, then Leaf Node Account C stores nonce=20 and balance=4550. The key derived from the address of Account C is distributed sequentially along the path from the root node of the state trie to the leaf node (that is, Leaf Node Account C).

These states, including k-v of external accounts and k-v of contract accounts, are ultimately stored in the database. The storage in the database does not directly store the status of these accounts, that is, it does not directly store the k-v of these accounts, but stores the k-v value of each tree node itself.

As shown in the example of Figure 5, in the upper-level MPT structure, the key of leaf node A1 is formed by sequentially combining the shared nibbles a7 in root node A8 (Extension Node), slot 1 of intermediate node A7 (Branch Node), and the key-end 1335 in leaf node A1, which gives a711335. Balance=45.0ETH and Nonce=n1 are stored in this leaf node. The key of leaf node A2 is formed by sequentially combining the shared nibbles a7 in root node A8 (Extension Node), slot 7 of intermediate node A7 (Branch Node), the shared nibbles d3 in node A6 (Extension Node), slot 3 of intermediate node A5 (Branch Node), and the key-end 7 in leaf node A2, which gives a77d337. Balance=1.00WEI and Nonce=n2 are stored in this leaf node. The key of leaf node A3 is formed by sequentially combining the shared nibbles a7 in root node A8 (Extension Node), slot f of intermediate node A7 (Branch Node), and the key-end 9365 in leaf node A3, which gives a7f9365. Balance=1.1ETH and Nonce=n3 are stored in this leaf node. The key of leaf node A4 is formed by sequentially combining the shared nibbles a7 in root node A8 (Extension Node), slot 7 of intermediate node A7 (Branch Node), the shared nibbles d3 in node A6 (Extension Node), slot 9 of intermediate node A5 (Branch Node), and the key-end 7 in leaf node A4, which gives a77d397. Balance=0.12ETH, Nonce=n4, CodeHash=c1 and Storage root=s1 are stored in this leaf node, where s1 can be H(A10), the hash of the root node A10 of the next-level tree. The leaf nodes A1, A2 and A3 store the information of external accounts, and the leaf node A4 stores the information of a contract account. A contract account contains a next-level MPT, which constitutes a Storage Trie used to store the state variables of the contract account.
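The way the four keys above are assembled can be checked with a short sketch. The path segments are copied from the description of Figure 5; the helper name assemble_key and the dictionary layout are hypothetical.

```python
from typing import Dict, Iterable, List


def assemble_key(path_parts: Iterable[str]) -> str:
    """Concatenate the nibble strings met on the way from the root to a leaf."""
    return "".join(path_parts)


# Shared nibbles, slot indexes and key-ends in root-to-leaf order.
paths: Dict[str, List[str]] = {
    "A1": ["a7", "1", "1335"],
    "A2": ["a7", "7", "d3", "3", "7"],
    "A3": ["a7", "f", "9365"],
    "A4": ["a7", "7", "d3", "9", "7"],
}
keys = {leaf: assemble_key(parts) for leaf, parts in paths.items()}
```

Leaves A2 and A4 share the longer prefix a77d3 and differ only in the slot chosen in Branch Node A5, which is why a single Extension Node A6 can serve both.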

As shown in the example of Figure 5, in the next-level MPT structure, the key of leaf node A11 is formed by sequentially combining slot 3 of root node A10 (Branch Node) and the key-end 35b2e4 in leaf node A11, which gives 335b2e4. "Zhang San_A=20" is stored in this leaf node; for example, it means that the share of the type A digital assets defined in the contract that belongs to Zhang San is 20, that is, the balance of Zhang San's type A assets is 20. The key of leaf node A12 is formed by sequentially combining slot 7 of root node A10 (Branch Node) and the key-end c25988 in leaf node A12, which gives 7c25988. "Li Si_B=50" is stored in this leaf node; for example, it means that the share of the type B digital assets defined in the contract that belongs to Li Si is 50, that is, the balance of Li Si's type B assets is 50. The key of leaf node A15 is formed by sequentially combining slot f of root node A10 (Branch Node), the shared nibble a in intermediate node A13 (Extension Node), slot 6 of intermediate node A14 (Branch Node), and the key-end be33 in leaf node A15, which gives fa6be33. "storedData=s" is stored in this leaf node. The key of leaf node A16 is formed by sequentially combining slot f of root node A10 (Branch Node), the shared nibble a in intermediate node A13 (Extension Node), slot 9 of intermediate node A14 (Branch Node), and the key-end 9365 in leaf node A16, which gives fa99365. "Wang Wu_A=35" is stored in this leaf node; for example, it means that the share of the type A digital assets defined in the contract that belongs to Wang Wu is 35, that is, the balance of Wang Wu's type A assets is 35.

In the node composition of the above MPT tree, the prefix prefix is used to indicate the tree node type. For example, 0 indicates an Extension Node containing an even number of shared nibbles (shared nibbles), and 1 indicates an Extension Node containing an odd number of shared nibble(s). Use 2 to represent a Leaf Node containing an even number of nibbles, and use 3 to represent a Leaf Node containing an odd number of nibble(s).
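The prefix rule can be written as a small function: the node type contributes 0 (Extension Node) or 2 (Leaf Node), and an odd number of nibbles adds 1. The function name node_prefix is hypothetical.

```python
def node_prefix(nibbles: str, is_leaf: bool) -> int:
    """Return the prefix flag: 0/1 for Extension Nodes, 2/3 for Leaf Nodes."""
    odd = len(nibbles) % 2  # 1 when the number of nibbles is odd, otherwise 0
    return (2 if is_leaf else 0) + odd


extension_even = node_prefix("a7", is_leaf=False)
extension_odd = node_prefix("a", is_leaf=False)
leaf_even = node_prefix("1335", is_leaf=True)
leaf_odd = node_prefix("7", is_leaf=True)
```

For example, an Extension Node with the shared nibbles a7 gets prefix 0, while a Leaf Node whose key-end is the single nibble 7 gets prefix 3.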

In the above node composition, the hash value of the entire content of the next tree node is filled in the corresponding position of the previous tree node. In the database, the key-value mapping of each tree node is actually stored, where value includes the content stored in the tree node, and the corresponding key is the hash value of the overall content of the tree node. In this way, the tree nodes k-v actually stored in the database are as follows:

Key	Value (logical representation; RLP encoding is required in persistent storage)
H(A9)	Prev Hash:,Nonce:,Timestamp:,Block Num:,State Root:H(A8),Transaction Root:,Receipt Root:,…
H(A8)	prefix:0,shared nibble(s):a7,next node:H(A7)
H(A7)	0:,1:H(A1),2:,3:,4:,5:,6:,7:H(A6),8:,9:,a:,b:,c:,d:,e:,f:H(A3),value:
H(A1)	prefix:2,Key-end:1335,balance:45.0ETH,nonce:n1
H(A6)	prefix:0,shared nibble(s):d3,next node:H(A5)
H(A3)	prefix:2,Key-end:9365,balance:1.1ETH,nonce:n3
H(A5)	0:,1:,2:,3:H(A2),4:,5:,6:,7:,8:,9:H(A4),a:,b:,c:,d:,e:,f:,value:
H(A2)	prefix:3,Key-end:7,balance:1.00WEI,nonce:n2
H(A4)	prefix:3,Key-end:7,balance:0.12ETH,nonce:n4,codehash:c1,storage root:H(A10)
H(A10)	0:,1:,2:,3:H(A11),4:,5:,6:,7:H(A12),8:,9:,a:,b:,c:,d:,e:,f:H(A13),value:
H(A11)	prefix:2,Key-end:35b2e4,value:Zhang San_A=20
H(A12)	prefix:2,Key-end:c25988,value:Li Si_B=50
H(A13)	prefix:1,shared nibble(s):a,next node:H(A14)
H(A14)	0:,1:,2:,3:,4:,5:,6:H(A15),7:,8:,9:H(A16),a:,b:,c:,d:,e:,f:,value:
H(A15)	prefix:2,Key-end:be33,value:storedData=s
H(A16)	prefix:2,Key-end:9365,value:Wang Wu_A=50

Table 1. Tree nodes k-v actually stored in the database

In Table 1 above, H() is used to represent hash calculation. In this way, the hash value of the next tree node is anchored in the previous tree node. Through such layers of hashing, the root hash of the entire state trie tree is obtained, and the root hash is locked into the state root field of the block header.
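The layer-by-layer anchoring described above can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not taken from the embodiments: SHA-256 stands in for the chain's hash function, RLP encoding is skipped, and the node contents are abbreviated strings modelled loosely on Table 1. The helper names `store_node` and `build_path` are hypothetical.

```python
import hashlib


def store_node(db, content):
    """Store a tree node under the hash of its whole content and return that key."""
    # Assumption: SHA-256 over the raw text; a real chain hashes the RLP-encoded node.
    key = hashlib.sha256(content.encode("utf-8")).hexdigest()
    db[key] = content
    return key


def build_path(balance):
    """Build leaf -> branch -> extension -> block header, each anchoring its child's hash."""
    db = {}
    leaf = store_node(db, f"prefix:2,Key-end:1335,balance:{balance},nonce:n1")
    branch = store_node(db, f"1:{leaf},value:")
    state_root = store_node(db, f"prefix:0,shared nibble(s):a7,next node:{branch}")
    header = store_node(db, f"Block Num:1,State Root:{state_root}")
    return db, state_root, header


db_old, root_old, header_old = build_path("45.0ETH")
db_new, root_new, header_new = build_path("46.0ETH")

# Changing one leaf changes every hash on the path up to the block header.
print(root_old != root_new)
print(header_old != header_new)
```

Because each key is the hash of the node's content, and that content embeds the child's key, any change in a leaf is reflected in the state root locked into the block header.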

In some blockchain systems, the code of the blockchain platform can include a P2P (peer-to-peer) module, a consensus module, an execution module and a storage module. P2P is a form of computer network. Unlike common web networks, P2P is decentralized and distributed. The P2P module handles the distributed dissemination of data: through it, blockchain nodes transmit and receive data in a point-to-point manner. Different participants can establish a distributed blockchain network through their deployed nodes. The ledger constructed using a chained block structure is stored on each node (or on most nodes, such as consensus nodes) in the distributed blockchain network. This is also called a decentralized (or multi-centered) distributed ledger. Such a blockchain system needs to keep the ledger data consistent and correct across multiple decentralized (or multi-centered) nodes. Each node runs the same blockchain platform program. Under the design of certain fault-tolerance requirements, the consensus module can ensure that all honest nodes have the same transactions, thereby ensuring that all honest nodes obtain consistent execution results for the same transactions, and the transactions and execution results are packaged to generate blocks. The current mainstream consensus mechanisms include Proof of Work (POW), Proof of Stake (POS), Delegated Proof of Stake (DPOS), the Practical Byzantine Fault Tolerance (PBFT) algorithm, the Honey Badger Byzantine Fault Tolerance (HoneyBadgerBFT) algorithm, etc. During the consensus process, the consensus module can generally also generate the timestamp of the block corresponding to the current transaction set, etc. The execution module can execute transactions, including ordinary transfer transactions and transactions involving contracts, before or after the consensus module completes consensus.
For transactions involving contracts, the execution module can introduce a virtual machine, such as the Ethereum Virtual Machine (EVM), to execute the code of the smart contract. The EVM shields the differences in hardware configuration and software environment among nodes, so that the process and results of executing smart contracts are the same on every node, and its sandbox environment prevents the execution of smart contracts from affecting the blockchain platform code, other programs, or the operating system on the host. For a consortium chain, the nodes can determine the transaction content and transaction sequence in a transaction set through the consensus module, and then output a deterministic transaction set of consensus results to the execution module. The execution module generates execution results by executing ordinary transfer transactions and transactions involving contracts, and sends them to the storage module. The storage module can be responsible for storing execution results to the node's local persistent storage medium.

As shown in Figure 6, a blockchain node physically includes a CPU, memory, disk, etc. The blockchain platform code executed by this blockchain node can include a P2P module, a consensus module, an execution module and a storage module. The functions of the P2P module, consensus module and execution module generally require the participation of CPU and memory. The storage module can include a building tree module, a block header generation module, a WAL (Write Ahead Log) module, and a state database module. Among them, the building tree module is used to build a tree (such as an MPT tree), for example the aforementioned state trie and storage trie, based on the state k-v passed in by the execution module, so as to obtain the k-v of the tree nodes; this generally requires the participation of CPU and memory. The block header generation module is used to generate the block header based on the root node of the tree built by the building tree module and some other data (such as the previous block hash, timestamp, block number, etc.), which generally requires the participation of CPU and memory. The WAL module is used to persistently store the tree nodes k-v generated by the building tree module before they are written to the state database module, so that data lost in situations such as a power outage during that write can be recovered; this generally requires the participation of CPU, memory, and disk. The state database module is used to store the tree nodes k-v in Table 1, built by the building tree module, on the persistent storage device. Since the tree node data will eventually be written to the persistent storage medium (such as the disk in the figure), the state database module generally requires the participation of disk in addition to CPU and memory.

In terms of storage structure, the above-mentioned Merkle tree structures, such as the above-mentioned MPT and Libra's SMT (Sparse Merkle Tree, similar to MPT), are located in the building tree module in the form of the correspondence in Table 1 above and are held in memory. The above Merkle tree is a prefix tree (dictionary tree), which can organize the data and obtain a unique Merkle root for the organized data. The leaf nodes can save the state values, and the path from the root node through the intermediate nodes to the leaf nodes implements lexicographic indexing of the state key. These tree nodes are encoded as Keys according to certain rules, their contents are encoded as Values, and they are eventually stored in the underlying database. Most such databases are NoSQL Key-Value DBs (DataBase; a Key-Value DB is also referred to as a KVDB) based on LSM (Log-Structured Merge-Tree); the database is located in the state database module and is saved on disk. Specific examples are LevelDB and RocksDB, the latter used by Libra. Both KVDBs are based on the LSM storage engine.

As mentioned above, the transaction that creates the smart contract is sent to the blockchain. After consensus, each node of the blockchain can execute the transaction. At this time, a contract account corresponding to the smart contract appears on the blockchain (including, for example, the account's Identity, the contract's hash value Codehash, and the root StorageRoot of the contract storage), and has a specific address. The contract code and account storage can be saved in the storage of the contract account, as shown in Figure 7. The behavior of a smart contract is controlled by the contract code, and the account storage of the smart contract saves the state of the contract. In other words, smart contracts enable virtual accounts containing contract code and account storage (Storage) to be generated on the blockchain. For contract deployment transactions or contract update transactions, the value of Codehash will be generated or changed. Subsequently, the blockchain node can receive a transaction request to call the deployed smart contract. The transaction request can include the address of the called contract, the function in the called contract and the input parameters. Generally, after the transaction request passes consensus, each node of the blockchain can independently execute the specified smart contract call.

The left side of Figure 7 shows an example of a smart contract written in Solidity and its compilation and execution process. The smart contract is compiled by a compiler to generate bytecode. The solc in the figure is Solidity's command-line compiler. Smart contracts written in Solidity can be compiled with the command-line tool solc, given suitable parameters, to generate bytecode that can run on the EVM. After the process of deploying the contract in Figure 1 and Figure 2 above, the smart contract can be successfully created on the blockchain. After the contract is deployed, a contract account corresponding to the smart contract is generated on the blockchain. The contract account includes, for example, the contract counter Nonce, the balance of the account, the hash value of the contract bytecode Codehash, the root StorageRoot of the contract storage, etc. The contract will have a specific address on the chain, which is the contract address.

This contract address is, for example, calculated by hashing the address of the external account that deploys the contract together with its counter nonce, for example sha3(rlp.encode([address_sender,nonce])). As mentioned above, rlp is an encoding format; different blockchains may replace it with other encoding formats or skip re-encoding altogether, so rlp is omitted below. Here, sha3 is a type of hash algorithm, such as the commonly used keccak256, and rlp.encode([address_sender,nonce]) denotes rlp-encoding the content in parentheses. The [address_sender,nonce] in brackets denotes the sequential concatenation of two fields: the address address_sender of the external account that deploys the contract, and its counter nonce. For example, the keccak256 algorithm yields a hash value 256 bits (32 bytes) long, from which the address of the deployed contract on the blockchain can be derived (for example, by taking the first 20 bytes). The balance of the account can be set to the default value of 0, or be set when the deployment is completed. The hash value of the contract bytecode, Codehash, can be calculated by the blockchain platform by hashing the contract bytecode. The root StorageRoot of the contract storage can be a default value or a hash value calculated based on the root node of the underlying storage trie. This generally depends on whether initialization operations are performed in the deployed contract, such as executing the constructor in the contract. If the deployed contract contains a constructor, it generally includes the work of initializing some state variables that will eventually be stored in the underlying database. This initialization work can be performed in the virtual machine.
After the state variables are initialized, as mentioned above, an MPT tree can be constructed, so that the root node of the MPT tree can be obtained, and then the hash value of the root node can be obtained. If the deployed contract does not contain a constructor, no such function needs to be executed. Instead, the blockchain platform gives StorageRoot a default value, such as the hash value of empty content.

After the contract is deployed, as mentioned above, it can be called later. As shown in Figure 2, after Bob initiates a transaction that calls the smart contract to the blockchain network, the contract is executed, thereby setting the state variable to the string "hello". Similar to Figure 2, Alice initiates a transaction that calls the contract, thereby reading the value of the state variable through the execution of the contract.

As mentioned above, a smart contract is usually a contract defined in the form of code in a blockchain environment, whose terms can be executed automatically. The terms are usually related to business-level logic, so the contract code as a whole reflects the business logic. As the business develops and changes, the business logic may change, and the contract code then needs to be adjusted as well. In addition, the contract code may have vulnerabilities that need to be fixed, or an upgrade of the language version in which the contract is written may require the contract to be upgraded. In all these situations, the deployed contract usually needs to be upgraded.

One way to upgrade a contract is to deploy a new contract. The new contract will have a different address from the original contract. Specifically, as mentioned above, the contract address can be calculated by hashing the address of the external account that deploys the contract together with its counter nonce, for example, sha3([address_sender,nonce]). Contracts deployed by different contract deployers have different contract addresses; even for the same contract deployer, since a new transaction is initiated when upgrading the contract, the nonce serving as the transaction counter has changed. Therefore, the address of the newly deployed contract will also change. In this way, the storage of the new contract is also different from that of the old contract. The state variables in the contract storage of the new contract can only be set and read from the block where the new contract is deployed, and the state in the old contract cannot be accessed.
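As a minimal sketch of why a redeployed contract lands at a new address, the following derives an address from sha3([address_sender,nonce]) with the first 20 bytes taken, as in the example above. Several points are assumptions made for illustration: `hashlib.sha3_256` (NIST SHA3-256) stands in for keccak256, which the Python standard library does not provide; rlp encoding is omitted, as in the text; the 8-byte big-endian nonce encoding and the deployer address are hypothetical.

```python
import hashlib


def contract_address(address_sender, nonce):
    """sha3([address_sender, nonce]), keeping the first 20 bytes as the address."""
    # Assumption: plain concatenation with an 8-byte nonce; rlp is omitted as in the text.
    preimage = address_sender + nonce.to_bytes(8, "big")
    digest = hashlib.sha3_256(preimage).digest()  # 32 bytes = 256 bits
    return digest[:20]


deployer = bytes.fromhex("11" * 20)  # hypothetical external account address

old_address = contract_address(deployer, 7)  # original deployment
new_address = contract_address(deployer, 8)  # redeployment in a later transaction

print(old_address.hex())
print(new_address.hex())
print(old_address != new_address)
```

The same deployer with a different nonce yields a different address, so the new contract has a separate storage space.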

In some consortium chains, contract upgrades need to ensure the following:

First, the upgraded contract maintains the same contract address as the contract before the upgrade. In this way, the same contract storage space can be maintained after the upgrade as before the upgrade, historical data will not be lost, and users do not have to change the contract address entered when calling the contract.

Second, it is also necessary to ensure compatibility before and after the upgrade, that is, the state in the upgraded new contract needs to maintain the ability to read the value of the same state in the old contract.

For the first point mentioned above, the generation rules of contract account addresses can be set to be independent of nonce. Furthermore, they can be set to be independent of the deployer. For example, the address of the contract account can be determined by the name of the contract, such as sha3([name_contract]). This ensures that as long as the contract has the same name, its address on the blockchain will be the same. As mentioned above, contract accounts generally include Nonce, Balance, StorageRoot, and Codehash, where Codehash is the hash value of the contract bytecode. The bytecode of the upgraded contract is different from the bytecode before the upgrade. Therefore, the Codehash in the contract account will generally be updated after the contract is upgraded.
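A minimal sketch of the name-based rule follows. It assumes `hashlib.sha3_256` as a stand-in for the chain's sha3, a UTF-8 encoding of the contract name, and the first 20 bytes of the hash as the address; the two bytecode values and the helper names are hypothetical placeholders.

```python
import hashlib


def address_from_name(name_contract):
    """sha3([name_contract]): the address depends only on the contract name."""
    return hashlib.sha3_256(name_contract.encode("utf-8")).digest()[:20]


def code_hash(bytecode):
    """Codehash: hash of the contract bytecode."""
    return hashlib.sha3_256(bytecode).digest()


bytecode_v1 = bytes.fromhex("6001")  # hypothetical bytecode before the upgrade
bytecode_v2 = bytes.fromhex("6002")  # hypothetical bytecode after the upgrade

address_before = address_from_name("demo")
address_after = address_from_name("demo")

print(address_before == address_after)                   # address is unchanged
print(code_hash(bytecode_v1) != code_hash(bytecode_v2))  # Codehash is updated
```

With this rule the account, and therefore its storage space, stays at the same address across the upgrade, while only the Codehash field changes.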

The second point above essentially concerns the same state variable, such as a state variable r. If the key of r changes after the contract is upgraded, then when r is read from the block of the upgraded contract onward, the value that r held before the upgrade cannot be read correctly, because the key has changed.

For example, the following code example 2 is the code example of the old contract before the upgrade:

Code example 2. solidity code of demo1

In the above demo1 code, two state variables, ID and sex, are declared on lines 3 and 4 respectively. ID is of uint256 type, which is 32 bytes in Solidity; sex is of bool type, which is 1 byte. Variables declared before the functions in this way are generally treated as persistently stored state variables, that is, they are stored in the underlying database.

In the above demo1 code, the setID() function is defined on lines 6-8. The public after the setID() function is a modifier indicating that setID() serves as an internal/external interface function. The setID() function has a parameter x of uint256 type; uint256 denotes a 256-bit unsigned integer, 32 bytes long. The function body assigns the value of parameter x to ID, so that this externally provided interface sets the state variable ID to the parameter x input by the user. The getID() function is defined on lines 10-12. The view after the getID() function indicates that the function can only read state variables and cannot modify them.

In the above demo1 code, the setSex() function and the getSex() function are defined in lines 14-16 and 18-20 respectively. The meaning of the code is similar to the above and will not be described again.

In the above demo1 code, lines 22-24 define the version() function, and the return value of this function is uint256 type. The operation in the function body is to return 1. Users can call this function, which returns the version of the current contract, here version is 1.

As mentioned before, state variables that need to be stored persistently generally form a paired key-value structure. The key represents the address of the state variable, and the value represents the value of the state variable. In the above code of demo1, two state variables, ID and sex, are declared at the top of the contract, and each state variable has a key. It should be noted that the space occupied by these two state variables is fixed, namely 32 bytes and 1 byte.

Each contract generally has its own storage space. This storage space is virtual, and its capacity can be a very large array, such as an array with 2²⁵⁶ elements, numbered from 0 to 2²⁵⁶-1. Each element can occupy a certain length, such as 32 bytes. Each element is called a slot here, as shown in Figure 8. The values of the two state variables, ID and sex, can be stored in slots 0 and 1, for example. It should be noted that the total storage space of 2²⁵⁶ slots is the total capacity of the virtual space. In other words, unused slots will not occupy the actual storage space of the underlying database.

As mentioned before, the demo1 contract written in a high-level language such as solidity is compiled by a compiler to generate bytecode.

The execution of the contract is shown in Figure 7. For example, a transaction that calls a contract, as in Figure 2, is sent to the blockchain network, and after consensus each node can execute the transaction. The to field of the transaction indicates the address of the called contract. Any node can find the storage of the contract account based on the address of the contract, read the Codehash from that storage, and then find the corresponding contract bytecode based on the Codehash. The node can load the bytecode of the contract from storage into the virtual machine. The interpreter (Interpreter) then interprets and executes it. This includes parsing the bytecode of the called contract (Parse, such as parsing Push, Add, SGET, SSTORE, Pop, etc.) to obtain the operation codes (OPcode) and functions, storing these OPcodes in the memory space (Memory) allocated by the virtual machine (alloc in the figure; after program execution completes, the memory is released, shown as Free in the figure), and obtaining the jump position (JumpCode) of the called function in the memory space. Generally, after the gas required to execute the contract has been calculated and found sufficient, execution jumps to the corresponding address in Memory to obtain the OPcode of the called function and starts executing it, performing calculations on the data operated on by the OPcode (Data Computation) and push/pop operations on the stack (Stack) to complete the data calculation. During this process, some context information of the contract may also be needed, such as the block number and information about the initiator of the contract call. This information can be obtained from the Context (Get operation). Finally, the generated state is stored in the database storage (Storage) by calling the storage interface.
It should be noted that during contract creation, certain functions in the contract may also be executed, such as functions for initialization operations. In that case, the code is likewise parsed, jump instructions are generated and stored in Memory, data is operated on in the Stack, and so on.

Through the above process, the virtual machine loads and executes the bytecode of the contract, which may generate status and/or read status, thereby requiring access to the underlying database. The virtual machine needs to easily access the underlying KV database. To access the KV database, you can generally use pointer-like data access capabilities. For example, if you need to read the value corresponding to a key from the KV database, you need to know the key of this data before accessing it.

As previously described with respect to Figures 5, 6 and corresponding text, for write operations, the execution module (including the virtual machine therein) executes the contract to generate kv, and for read operations, the execution module (including the virtual machine therein) executes the contract to generate k. This k is the key generated by the execution module or blockchain platform, here called the state key. In the tree building module, this state key needs to be built into the MPT tree to obtain a series of tree node keys from the MPT root node-intermediate node-leaf node. If it is a read operation, the corresponding value can be found in the state database module according to the tree node key. If it is a write operation, a series of tree node key-values from the MPT root node-intermediate node-leaf node are generated, and these tree node kv are written to the state database module in an append manner.

When the virtual machine executes the contract bytecode, the position of a given state variable needs to be fixed, so that the key generated by the contract execution is fixed. This position is usually fixed once the contract code is determined. Therefore, the fixed position of a state variable is usually determined during compilation and is not directly related to the virtual machine.

The compiler compilation process can roughly include steps such as lexical/syntactic analysis based on the abstract syntax tree, filling symbols based on the symbol table, semantic analysis, and code generation. Among them, during the lexical/grammatical analysis based on the abstract syntax tree, the position information of the contract's state variables can be generated. For example, the two state variables ID and sex in the above demo1 are located at 0 and 1 respectively, which can correspond to the following two positions in the aforementioned slot:

0x0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

0x0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001

The two slot positions above are each 256 bits, written as 32 bytes in hexadecimal (0x). Each space-separated segment of four hexadecimal digits after 0x represents 2 bytes, so there are 16 such segments in total.

In this way, in the contract bytecode compiled by the compiler, these two positions in the slot can be used to replace the identifiers of the two state variables, such as the above two 256 bits replacing ID and sex respectively. In this way, when the data type is a fixed-size value, the storage location can be pre-allocated for each data to be stored according to the field sort order during compilation. This is equivalent to specifying a fixed data pointer in advance.
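The pre-allocation by field order can be sketched as follows. This is an illustration under the text's simplified model of one slot per state variable, not a description of any particular compiler; the helper names `assign_slots` and `slot_hex` are hypothetical.

```python
def assign_slots(state_variables):
    """Give each state variable the next slot index, in declaration order."""
    return {name: index for index, name in enumerate(state_variables)}


def slot_hex(index):
    """Render a slot index as a 256-bit position: 16 groups of 4 hex digits."""
    digits = format(index, "064x")  # 64 hex digits = 32 bytes = 256 bits
    groups = [digits[i:i + 4] for i in range(0, 64, 4)]
    return "0x" + " ".join(groups)


layout = assign_slots(["ID", "sex"])  # declaration order in demo1
for name, index in layout.items():
    print(name, slot_hex(index))
```

Since the index depends only on the order of declaration, the same source code always yields the same slot for the same variable.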

When the virtual machine loads and executes the contract bytecode, an operation on ID is an operation on 0x000...00 (that is, position 0 of the above slots, with the ellipsis replacing the many 0s in the middle). Similarly, an operation on sex is an operation on slot position 0x000...01.

For example, in a transaction that calls a contract, the setID() function in the contract is called, and the input parameter is the string "0001". When the virtual machine executes the transaction, the value 0001 is stored at slot position 0x000...00. Specifically, the virtual machine can push the 32-byte slot 0x000...00 onto the stack, and then push the corresponding value onto the stack. During interpretation and execution, the OPcode of the called function is obtained from memory and starts execution. According to the first-in-last-out (last-in-first-out) characteristic of the stack, the value is popped from the stack, and then the slot is popped from the stack to form a slot-value pair. The contract virtual machine then executes the current opcode, that is, writes the value to the storage at the slot location. In the above process, the stack generally uses 32 bytes as a storage unit, which is equal to the length of one slot. The corresponding value may be less than, equal to, or greater than 32 bytes, so the value may occupy one or more units in the stack.
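The push and pop sequence can be sketched as below. The sketch follows the order described in this paragraph (slot pushed first, value second), models the stack as a Python list and storage as a dictionary, and uses integer 1 for the value 0001; these modelling choices and the names `word` and `execute_store` are assumptions for illustration.

```python
WORD = 32  # stack unit and slot length, in bytes


def word(n):
    """Encode an integer as one 32-byte big-endian stack unit."""
    return n.to_bytes(WORD, "big")


def execute_store(stack, storage):
    """Pop the value, then the slot, and write the slot-value pair to storage."""
    value = stack.pop()  # last in, first out: the value comes off first
    slot = stack.pop()   # then the slot position
    storage[slot] = value


storage = {}
stack = []
stack.append(word(0))  # slot position 0x000...00 of ID
stack.append(word(1))  # value passed to setID(), here 0001
execute_store(stack, storage)

print(storage[word(0)].hex())
```

After the store, the stack is empty again and the storage holds exactly one slot-value pair.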

For example, in a transaction that calls a contract, the setSex() function in the contract is called, and the input parameter is "1" (for example, 1 means a boy, 0 means a girl). When the virtual machine executes the transaction, the value 1 is stored at slot position 0x000...01.

The results generated by the execution of the above two contract calls can be simply expressed as follows:

0x000...00:0001 (1)

0x000...01:1 (2)

The above content can also be shown in Figure 9. As for slot 0x000...01, it stores a bool value, which occupies only 1 byte. The bool value 1 can be stored in the lowest byte (the lower 8 bits) of this slot, as shown at the bottom of Figure 9.

After the virtual machine finishes execution, the virtual machine or blockchain platform can convert the slots in the two slot-value pairs (1) and (2) into state keys. Specifically, the state key can be obtained by splicing the contract address + slot position. For example, the address of the demo1 contract is 20 bytes long, namely 0x3321dcaf8911d3842e14a7a415be2fb1a337f43e. Then, by splicing contract address + slot position:

The state key of (1) is:

0x3321dcaf8911d3842e14a7a415be2fb1a337f43e0000000000000000000000000000000000000000000000000000000000000000

The state key of (2) is:

0x3321dcaf8911d3842e14a7a415be2fb1a337f43e0000000000000000000000000000000000000000000000000000000000000001
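The splicing can be checked with a short sketch. It assumes a 20-byte contract address and a 32-byte big-endian slot position, as in the example above; the function name `state_key` is hypothetical.

```python
CONTRACT_ADDRESS = "3321dcaf8911d3842e14a7a415be2fb1a337f43e"  # demo1 address from the text


def state_key(contract_address_hex, slot):
    """State key = 20-byte contract address followed by the 32-byte slot position."""
    address = bytes.fromhex(contract_address_hex)
    if len(address) != 20:
        raise ValueError("contract address must be 20 bytes")
    return "0x" + (address + slot.to_bytes(32, "big")).hex()


key_id = state_key(CONTRACT_ADDRESS, 0)   # state key of (1)
key_sex = state_key(CONTRACT_ADDRESS, 1)  # state key of (2)

print(key_id)
print(key_sex)
print(len(key_id) - 2)  # hex digits after 0x: 40 for the address + 64 for the slot
```

Each key is 52 bytes long, and two keys of the same contract differ only in their trailing slot part.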

During the process of storing these generated state key-values in the underlying database, as shown in Figure 6, these keys can be converted into the storage trie tree as shown in Figure 5 by the tree building module. Then, the tree building module constructs the value in the state key-value into a leaf node of the MPT tree according to the tree structure. It should be noted that, as mentioned above, the state key is divided into several small segments and stored in the tree nodes in order from the root of the storage trie tree to the leaf nodes. As for which segment of the state key is stored in each tree node, it depends on the common prefix between the state key and other state keys in the tree. From the leaf node upward through the intermediate node to the root node, a series of hash value changes will occur, and the tree building module will build the kv of these tree nodes. Furthermore, the building tree module sends these changed tree nodes kv to the state database module, and finally the state database module stores them in the state database shown in Figure 6.

In the above process, the slot used for reading and writing a state variable is fixed during compilation. As mentioned above, the state key is determined by the contract address and the slot position of the state variable, and can be obtained by concatenating the two. In this way, running the same contract and reading and writing the same state variable always uses the same slot and corresponds to a fixed state key.

As mentioned before, some blockchains perform contract upgrades to keep the upgraded contract at the same contract address as the pre-upgraded contract. Despite this, it is still challenging to ensure compatibility before and after upgrade, that is, the state in the upgraded new contract maintains the ability to read the value of the same state in the old contract.

For example, the following code example 3 is a code example of the upgraded new contract:

Code example 3. solidity code of demo2

The above code of demo2, compared to demo1, inserts a new state variable age in line 4 of demo2. In this way, the sex originally located on line 4 in demo1 is moved backward in demo2 and becomes line 5. In addition, demo2 adds new write and read functions related to state variables in lines 15-17 and 19-21, which are setAge() and getAge() respectively.

The new demo2 code will still be compiled by the compiler to obtain bytecode. During the compilation process, similarly, the compiler will generate slot positions for the three state variables ID, age, and sex in the above demo2, which are:

0x0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0002

In this way, in the contract bytecode compiled by the compiler, these three slot positions can be used to replace the identifiers of the three state variables. For example, the above three 256-bit positions replace ID, age, and sex respectively. When the compiled contract bytecode is loaded into the virtual machine and executed, an operation on ID is an operation on slot position 0x000...00, an operation on age is an operation on slot position 0x000...01, and an operation on sex is an operation on slot position 0x000...02.

It can be seen that sex on line 5 of the upgraded demo2 code is the same state variable, with the same meaning, as sex on line 4 of the demo1 code before the upgrade. With the above upgrade method, however, an operation on slot position 0x000...01 after the upgrade is an operation on age, which is inconsistent with the sex stored at slot position 0x000...01 before the upgrade. If the upgraded contract bytecode reads age, it actually reads the sex value stored at this position before the upgrade, causing confusion. Conversely, after the upgrade an operation on sex becomes an operation on slot position 0x000...02. If the upgraded contract bytecode reads sex, since there was no value at this position before the upgrade, it can only read the default null value or 0, and cannot read the correct value.
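The mismatch can be reproduced with a small sketch. The storage is modelled as a dictionary keyed by slot index with 0 as the default for unused slots, and the two layouts are written out by hand from the slot positions given above; the function name `read` and the stored values are illustrative assumptions.

```python
old_layout = {"ID": 0, "sex": 1}            # demo1, before the upgrade
new_layout = {"ID": 0, "age": 1, "sex": 2}  # demo2, after the upgrade

# State written by the old contract, e.g. setID(1) and setSex(1).
storage = {}
storage[old_layout["ID"]] = 1
storage[old_layout["sex"]] = 1


def read(layout, name):
    """Read a state variable through a given layout; unused slots read as 0."""
    return storage.get(layout[name], 0)


print(read(new_layout, "ID"))   # same slot as before, so the old ID value
print(read(new_layout, "age"))  # slot 1 still holds the old sex value
print(read(new_layout, "sex"))  # slot 2 was never written, so the default 0
```

Only ID keeps its slot across the upgrade, so it is the only variable whose old value is still read correctly.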

This application provides a solution for detecting the compatibility of contract upgrades, as shown in Figure 10, including:

S110: Generate abstract syntax trees of contracts before and after the upgrade.

As mentioned above, the compiler compilation process can roughly include lexical/syntactic analysis based on the abstract syntax tree, filling symbols based on the symbol table, semantic analysis, and code generation.

Here, lexical/syntactic analysis can be performed on the smart contract code before and after the upgrade based on the abstract syntax tree, and the abstract syntax trees of the contract before and after the upgrade can be generated. Specifically, the Solidity compiler can be run with the "--ast-compact-json" option on the contract source code before and after the upgrade, to generate the abstract syntax tree JSON (JavaScript Object Notation) files of the contract before and after the upgrade respectively.

For example, for the demo1 contract before the above upgrade, lexical/grammatical analysis is performed based on the abstract syntax tree, and the abstract syntax tree of the contract before the upgrade is generated as follows:

Code example 4.Abstract syntax tree of demo1

The above code example 4 is the generated abstract syntax tree of the demo1 contract before the upgrade, and is marked with //... comments. Among them, the storage location in line 11 is the calculation rule for slot. In the abstract syntax tree above, from the 3rd line nodes onwards, the information of the contract state variables in the abstract syntax tree is described. Information related to a state variable is placed in a node. Regarding the two state variables ID and sex, they are actually divided into two nodes. Lines 5-19 are the first node about ID, and lines 20-34 are the second node about sex. Each node contains several pieces of information. On the whole, the information inside a node can be called the node information of the abstract syntax tree.

For example, for the upgraded demo2 contract above, lexical/grammatical analysis is performed based on the abstract syntax tree, and the abstract syntax tree of the contract after the upgrade is generated as follows:

Code example 5. Abstract syntax tree of demo2

Code example 5 above is the abstract syntax tree generated for the upgraded demo2 contract. Because a line declaring the age variable is added between the original ID and sex in the demo2 code (line 4 of demo2), the age node information in lines 20-34 is inserted between the ID node information in lines 5-19 and the sex node information in lines 35-50 of the demo2 abstract syntax tree.

S120: Parse the generated abstract syntax tree, and sequentially extract basic information from the node information of each abstract syntax tree.

Parse the above code examples 4 and 5, that is, parse the generated abstract syntax tree, and extract the basic information in the node information of each abstract syntax tree in order. The basic information here can at least include the node order, and further include the state variable name and/or type.

The following example is to extract the abstract syntax tree node information including node order, state variable name and state variable type from the demo1 contract code before the upgrade:

Node information 1: ID{typeString:uint256,...}

Node information 2: sex{typeString:bool,...}

Similarly, extract the abstract syntax tree node information including node order, state variable name and state variable type of the upgraded demo2 contract code, as follows:

Node information 1: ID{typeString:uint256,...}

Node information 2: age{typeString:uint256,...}

Node information 3: sex{typeString:bool,...}
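The extraction in S120 can be sketched in Python as follows. This is a minimal sketch, not the implementation described in this application: the function and class names are hypothetical, the field names ("nodeType", "stateVariable", "typeDescriptions", "typeString") are assumed to follow the compiler's compact JSON AST, and the demo2 tree is a trimmed, hand-written stand-in rather than real compiler output.

```python
from typing import NamedTuple


class BasicInfo(NamedTuple):
    order: int        # 1-based node order among the state variables
    name: str         # state variable name
    type_string: str  # state variable type, e.g. "uint256"


def extract_basic_info(ast: dict) -> list[BasicInfo]:
    """Collect basic information for every state variable, in declaration order."""
    infos: list[BasicInfo] = []
    for top in ast.get("nodes", []):
        if top.get("nodeType") != "ContractDefinition":
            continue
        for node in top.get("nodes", []):
            is_state_var = (
                node.get("nodeType") == "VariableDeclaration"
                and node.get("stateVariable") is True
            )
            if is_state_var:
                type_string = node.get("typeDescriptions", {}).get("typeString", "")
                infos.append(BasicInfo(len(infos) + 1, node["name"], type_string))
    return infos


def _var(name: str, type_string: str) -> dict:
    """Hand-written stand-in for one VariableDeclaration node."""
    return {
        "nodeType": "VariableDeclaration",
        "name": name,
        "stateVariable": True,
        "typeDescriptions": {"typeString": type_string},
    }


# Trimmed, hand-written AST for demo2 (not real compiler output).
demo2_ast = {
    "nodeType": "SourceUnit",
    "nodes": [{
        "nodeType": "ContractDefinition",
        "name": "demo2",
        "nodes": [_var("ID", "uint256"), _var("age", "uint256"), _var("sex", "bool")],
    }],
}

for info in extract_basic_info(demo2_ast):
    print(f"Node information {info.order}: {info.name}{{typeString:{info.type_string},...}}")
```

The node order is simply the position of each state variable in the list, which is what the comparison in the next step relies on.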

S130: Compare the basic information in the node information of the abstract syntax tree before and after the upgrade, and obtain a compatibility conclusion.

For example, the basic information in the abstract syntax tree node information before and after the upgrade in S120 above can be obtained as the following comparison table:

Basic information in the abstract syntax tree node information before the upgrade	Basic information in the abstract syntax tree node information after the upgrade
Node information 1: ID{typeString:uint256,...}	Node information 1: ID{typeString:uint256,...}
Node information 2: sex{typeString:bool,...}	Node information 2: age{typeString:uint256,...}
	Node information 3: sex{typeString:bool,...}

Table 2. Comparison table of node information of the abstract syntax tree before and after the upgrade

In Table 2 above, the node information of the abstract syntax trees before and after the upgrade is placed, in order, in the same rows. Scanning and comparing row by row then shows whether the left and right entries are the same. In other words, the basic information in the node information with the same node number in the abstract syntax trees before and after the upgrade is compared, preferably in node number order. In Table 2, when the row of node information 2 is scanned, the state variable names are found to differ, and an incompatible conclusion can be drawn.

This is because, as mentioned earlier, slots are generated from the node information according to the slot generation rules, and the state key of a state variable is generated by concatenating the contract address and the slot. In other words, the slots generated for the node information 2 row are the same on both sides, for example 0x000...01, and do not depend on the variable names. After the upgrade, the value that the contract logic reads or writes through the same slot or state key therefore belongs to a state variable different from the one before the upgrade, which generally makes the data inconsistent. It is only "generally" inconsistent because, if only the name of the state variable is changed and the contract logic involving that state variable is unchanged, no incompatibility actually results. Such upgrades are rare, however; in most cases, a change to a state variable name is accompanied by some change to the contract logic. The judgment can therefore also be based on the state variable type: if the state variable type changes in the node information of the same row, an incompatible conclusion can likewise be drawn. If both the state variable name and the state variable type in the same row are compared, a stronger incompatible conclusion can be obtained.

In addition, when comparing the basic information in the node information of the abstract syntax trees before and after the upgrade, if the node containing a given state variable name after the upgrade is in a different position from the node containing the same state variable name before the upgrade, an incompatible conclusion can be drawn with high probability. This is because the same state variable name before and after the upgrade generally refers to the same state variable; if the order of the abstract syntax tree node containing that state variable differs before and after the upgrade, the slots generally differ as well, and reads and writes of the data become confused.

In fact, if the comparison in S130 concludes that the versions are incompatible, the conflicting slots can also be identified, such as node order 2 in Table 2 above. The basic information or node information at the conflicting slot positions can then be fed back to developers, for example by generating logs or alarms, or by notification through on-screen prompts, emails, instant messages, and the like, so that developers or the platform can locate the incompatibility. This is particularly helpful when developers make modifications.
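The row-by-row comparison of S130 and the feedback of conflicting positions can be sketched in Python as follows. This is a minimal sketch with hypothetical names; each layout is assumed to be the list of (state variable name, type) pairs already extracted in node order. Treating a variable that disappears after the upgrade as a conflict is an assumption of the sketch, since the text does not discuss that case.

```python
Layout = list[tuple[str, str]]  # (state variable name, type string) in node order


def find_conflicts(before: Layout, after: Layout) -> list[dict]:
    """Compare rows with the same node number and describe each mismatch."""
    conflicts = []
    for index, (old_name, old_type) in enumerate(before):
        order = index + 1
        if index >= len(after):
            # Assumption: a variable that disappears is treated as a conflict.
            conflicts.append({"order": order, "before": (old_name, old_type),
                              "after": None, "reason": "missing after upgrade"})
            continue
        new_name, new_type = after[index]
        differing = []
        if old_name != new_name:
            differing.append("name")
        if old_type != new_type:
            differing.append("type")
        if differing:
            conflicts.append({"order": order, "before": (old_name, old_type),
                              "after": (new_name, new_type),
                              "reason": " and ".join(differing) + " differ"})
    return conflicts


def moved_variables(before: Layout, after: Layout) -> list[tuple[str, int, int]]:
    """Names present on both sides whose node order changed: (name, old, new)."""
    new_order = {name: i + 1 for i, (name, _type) in enumerate(after)}
    return [(name, i + 1, new_order[name])
            for i, (name, _type) in enumerate(before)
            if name in new_order and new_order[name] != i + 1]


demo1 = [("ID", "uint256"), ("sex", "bool")]
demo2 = [("ID", "uint256"), ("age", "uint256"), ("sex", "bool")]

for conflict in find_conflicts(demo1, demo2):
    print(f"incompatible at node {conflict['order']}: {conflict['reason']}")
for name, old, new in moved_variables(demo1, demo2):
    print(f"{name} moved from node {old} to node {new}")
```

For demo1 and demo2 the sketch reports node 2 as the conflicting position and shows that sex has moved from node 2 to node 3, which is the kind of information that could be written to a log or sent to the developer.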

There is also a situation where, compared with the pre-upgrade state variables, new state variables are added after the original state variables, and a compatible conclusion can be drawn. For example, the following sample code:

Code example 6. Solidity code for demo2'

It can be seen that in the code of demo2', compared with demo1, the state variable age is added after the original ID and sex. In this way, the abstract syntax tree of demo2' is as follows:

Code example 7. Abstract syntax tree of demo2'

According to S120, extract the abstract syntax tree node information including node order, state variable name and state variable type of the upgraded demo2' contract code, as follows:

Node information 1: ID{typeString:uint256,...}

Node information 2: sex{typeString:bool,...}

Node information 3: age{typeString:uint256,...}

In this way, the basic information in the abstract syntax tree node information before and after the upgrade can be obtained as the following comparison table:

Basic information in the abstract syntax tree node information before the upgrade	Basic information in the abstract syntax tree node information after the upgrade
Node information 1: ID{typeString:uint256,...}	Node information 1: ID{typeString:uint256,...}
Node information 2: sex{typeString:bool,...}	Node information 2: sex{typeString:bool,...}
	Node information 3: age{typeString:uint256,...}

Table 3. Comparison table of node information of the abstract syntax tree before and after the upgrade

As can be seen from Table 3, the pre-upgrade state variables ID and sex are unchanged: their state variable names, state variable types, and abstract syntax tree node order are all the same in the upgraded code. The upgraded ID and sex therefore keep the same slots as before the upgrade and are compatible. The new state variable is added after the pre-upgrade state variables; according to the slot generation rules, it does not affect the slots of the earlier state variables or their state keys, so a compatible conclusion can be drawn. In this case, comparing in node number order is clearly the easier approach.
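The append-only case can be checked very compactly. The following Python sketch, with a hypothetical function name, treats an upgrade as compatible when every pre-upgrade row reappears unchanged at the same position and any new rows come only at the end.

```python
def is_append_only(before: list[tuple[str, str]], after: list[tuple[str, str]]) -> bool:
    """True when the pre-upgrade rows are an unchanged prefix of the post-upgrade rows."""
    return len(after) >= len(before) and after[:len(before)] == before


demo1 = [("ID", "uint256"), ("sex", "bool")]
demo2 = [("ID", "uint256"), ("age", "uint256"), ("sex", "bool")]        # age inserted
demo2_prime = [("ID", "uint256"), ("sex", "bool"), ("age", "uint256")]  # age appended

print(is_append_only(demo1, demo2_prime))  # demo2' keeps ID and sex in place
print(is_append_only(demo1, demo2))        # demo2 shifts sex to node 3
```

A prefix test is used here because an unchanged prefix implies that the name, type, and node order of every existing state variable are preserved.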

It should be noted that the state variables ID, age, and sex in the above example are of types uint256, uint256, and bool respectively, where uint256 is 256 bits, that is, 32 bytes, and bool is 1 byte. The types of these state variables fix the length of the data, so they are fixed-length. Other types such as uint, uint8, and uint128 are also fixed-length, and so is an array of a fixed number of fixed-length elements. For example, uint[2] has 2 elements, each a 32-byte uint, so uint[2] is 64 bytes in total.
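The fixed-length arithmetic above can be written out as a small Python helper. This is a minimal sketch with a hypothetical function name that covers only the types named in the text (bool, the uint family, and fixed-size arrays of them); it computes the data length in bytes and does not model how the compiler packs values into slots.

```python
import re


def fixed_size_bytes(type_string: str) -> int:
    """Data length in bytes of a fixed-length type such as "uint128" or "uint[2]"."""
    array = re.fullmatch(r"(.+)\[(\d+)\]", type_string)
    if array:
        # A fixed-size array is element size times element count.
        return fixed_size_bytes(array.group(1)) * int(array.group(2))
    if type_string == "bool":
        return 1
    uint = re.fullmatch(r"uint(\d*)", type_string)
    if uint:
        bits = int(uint.group(1) or 256)  # plain "uint" means uint256
        return bits // 8
    raise ValueError(f"not a fixed-length type handled by this sketch: {type_string}")


for t in ("uint256", "bool", "uint", "uint8", "uint128", "uint[2]"):
    print(t, fixed_size_bytes(t))
```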

In addition to fixed-length data, there is also variable-length data, whose size cannot be predicted. In this case the storage location cannot be determined directly at compile time as it is for fixed-length data, and a different scheme is used. For example, a dictionary (mapping) is a variable-length data type. The storage layout of a dictionary stores each key together with its corresponding value, and each key corresponds to one storage location. The storage location for a key is keccak256(key.slot), where "." denotes concatenation, the key before "." is the key of a dictionary element, and the slot after "." is the position of the slot in which the dictionary itself is declared. For example, in the demo2 contract, after ID, age, and sex, there is mapping (uint256 => string) a. The number of elements in this dictionary is not fixed, and the lengths of the key and value of each element are not fixed either. After the demo2 contract has been called several times, dictionary a may contain two elements:

a["u1"] = 0x18;

a["u2"] = 0xac5B4FC54A5FA637D8C9853ADA1430EA9203817DF1F1F85F8E63A30F6713F68F79DB22DBE17D43A16C720EDD5D8787843EBF0B0B59;

The name a of the dictionary can then be stored in slot 0x000...03. The key of the first item in dictionary a is u1 and its value is 0x18. The storage location for u1 can be keccak256("u1".0x000...03), and the value can be stored in one slot, or in multiple consecutive slots starting from this location (for example, when the data length of the value is greater than 32 bytes). Similarly, the key of the second item in dictionary a is u2, and its value is a long hexadecimal number. The storage location for u2 can be keccak256("u2".0x000...03), and the value can likewise be stored in one slot or in multiple consecutive slots starting from this location.

Take as an example the case where the value length of u1 is less than 32 bytes and the value length of u2 is greater than 32 bytes and less than 64 bytes.

As shown in Figure 11, the value length of u1 is less than 32 bytes, and its value can be stored in one slot. The location of this slot is, for example, the value of keccak256("u1",0x000...03), for example:

0x5b4ded6cc1629f138186f4b0795004adbed7ec13374d15ca04ec96f149132460

The value length of u2 is greater than 32 bytes and less than 64 bytes, so its value can be stored in two consecutive slots. The starting position of these two slots is, for example, the value of keccak256("u2",0x000...03). The locations of the two slots are then:

0x90191b3f1d96c216c6a6637b9c8498bc25cc907afe246d611b3a8bf727bc081d

0x90191b3f1d96c216c6a6637b9c8498bc25cc907afe246d611b3a8bf727bc081e
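The consecutive-slot arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch with a hypothetical function name that follows the description above: it takes the starting slot as given and only enumerates the slots a value of a given length would occupy. The keccak256 step itself is not computed, because the Python standard library does not provide Keccak-256 (hashlib.sha3_256 is the standardized SHA-3 variant, which uses different padding).

```python
SLOT_BYTES = 32  # one slot holds 32 bytes


def slots_for_value(start_slot_hex: str, value_length: int) -> list[str]:
    """Slots occupied by a value of value_length bytes starting at start_slot_hex."""
    count = max(1, -(-value_length // SLOT_BYTES))  # ceiling division, at least one slot
    start = int(start_slot_hex, 16)
    return [f"0x{start + i:064x}" for i in range(count)]


u1_start = "0x5b4ded6cc1629f138186f4b0795004adbed7ec13374d15ca04ec96f149132460"
u2_start = "0x90191b3f1d96c216c6a6637b9c8498bc25cc907afe246d611b3a8bf727bc081d"

print(slots_for_value(u1_start, 1))   # value shorter than 32 bytes: one slot
print(slots_for_value(u2_start, 53))  # value between 32 and 64 bytes: two slots
```

With the starting position of u2 taken from the text, the second slot is the starting position plus one, matching the two locations listed above.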

Whether such variable-length data is compatible can also be determined by executing S110 to S130 above. For example, the abstract syntax tree node information generated for the above dictionary structure is:

Code example 8. Abstract syntax tree of dictionary structure

During the execution of S120, each slot in the variable-length data structure needs to be traversed, because the positions of these slots may be the basis for calculating the final state key. In particular, for situations where a value like u2 occupies multiple consecutive slots, the starting position can be used as the basis for calculating the state key.

In addition, there are composite structures. For example, a composite of a structure and a dictionary is shown in the following Solidity code:

Code Example 9. Solidity code for composite structures

In the above code, a structure StructDemo is declared on lines 7-10; it contains two members, c_ of type uint256 and d_ of type bytes. A dictionary is then declared on line 11, mapping uint to the structure StructDemo. For the structure and dictionary part of the above code, the abstract syntax tree generated according to S110 includes:

Code example 10. Abstract syntax tree of compound structure

Lines 6-17 of the above abstract syntax tree are the node information of uint256 c_ in the structure StructDemo, and lines 18-31 are the node information of bytes d_ in the structure StructDemo. The bytes type is also 32 bytes. It should be noted that lines 10 and 23 both read "stateVariable": false, indicating that neither c_ nor d_ is a state variable, and neither will be stored in the underlying database.

In S120, therefore, the structure members c_ and d_ with "stateVariable": false are not extracted on their own, because here c_ and d_ are not stored directly in the underlying database; the structure definition is only a declaration. For dictionary a, however, "stateVariable" is true, so the basic information in its abstract syntax tree node information is extracted: dictionary a is stored in the underlying database, and its elements, which contain the structure members c_ and d_, are stored with it. Thus, in S120, the generated abstract syntax tree is parsed and, where the state variable attribute in the node information is true, the basic information in the node information of each abstract syntax tree is extracted in order.

In addition, for compound structures, such as the case above where the value of a dictionary is a structure, the generated abstract syntax tree is parsed and the basic information in the node information of each abstract syntax tree is extracted sequentially and recursively. This is because, although the "stateVariable" attribute in the node information of the structure itself is false, as a value in the dictionary it has its own slot and state key. Extracting the basic information of the node information sequentially and recursively expands all state variables that will actually be stored in the underlying database, so that none are missed. The recursion expands a nested structure definition to obtain the data structures it contains; if there are further levels of nesting, the contained data structures continue to be obtained recursively.
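The recursive expansion can be sketched in Python as follows. This is a minimal sketch over a simplified, hypothetical in-memory model rather than the compiler's AST: state variables are (name, type string) pairs, structure definitions are a dictionary from structure name to member list, and type strings are written without the contract qualifier. The dotted row names are also an assumption of the sketch; it expands the structure definition once per state variable and does not enumerate individual dictionary elements.

```python
from typing import Optional

Structs = dict[str, list[tuple[str, str]]]  # struct name -> [(member name, type string)]


def mapping_value_type(type_string: str) -> Optional[str]:
    """Value type of a "mapping(K => V)" type string, or None for other types."""
    if not (type_string.startswith("mapping(") and type_string.endswith(")")):
        return None
    inner = type_string[len("mapping("):-1]
    _key, _sep, value = inner.partition(" => ")
    return value


def expand(name: str, type_string: str, structs: Structs,
           _active: tuple[str, ...] = ()) -> list[tuple[str, str]]:
    """Row for a variable followed by rows for any nested structure members."""
    rows = [(name, type_string)]
    inner = type_string
    while (value := mapping_value_type(inner)) is not None:
        inner = value  # unwrap nested mappings down to the final value type
    if inner.startswith("struct "):
        struct_name = inner[len("struct "):]
        if struct_name not in _active:  # guard against self-referencing structures
            for member_name, member_type in structs.get(struct_name, []):
                rows.extend(expand(f"{name}.{member_name}", member_type,
                                   structs, _active + (struct_name,)))
    return rows


structs = {"StructDemo": [("c_", "uint256"), ("d_", "bytes")]}
state_variables = [
    ("ID", "uint256"),
    ("age", "uint256"),
    ("sex", "bool"),
    ("map2_", "mapping(uint256 => struct StructDemo)"),
]

rows = [row for var in state_variables for row in expand(*var, structs)]
for order, (name, type_string) in enumerate(rows, start=1):
    print(f"Node information {order}: {name}{{typeString:{type_string},...}}")
```

Only declared state variables are used as starting points, so structure members are reached solely through a state variable that contains them, which mirrors the "stateVariable" rule described above.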

In this example, assume that two map2_ are initialized in the contract code, respectively:

Then extract the abstract syntax tree node information of the obtained demo2″ contract code including node order, state variable name and state variable type, as follows:

Node information 1: ID{typeString:uint256,...}

Node information 2: age{typeString:uint256,...}

Node information 3: sex{typeString:bool,...}

Node information 4: map2_{typeString:mapping(uint256 => struct demo″_.StructDemo),...}

Node information 5: map2_StructDemo_0:c_{typeString:uint256,...}

Node information 6: map2_StructDemo_0:d_{typeString:bytes,...}

Node information 7: map2_StructDemo_1:c_{typeString:uint256,...}

Node information 8: map2_StructDemo_1:d_{typeString:bytes,...}

If another upgraded contract includes ID, age, sex, and map2_, and the abstract syntax tree node information extracted from its code, including node order, state variable name, and state variable type, is consistent with the above, the upgrade is compatible; if it is inconsistent, the upgrade is incompatible. The slot diagram of the above example is shown in Figure 12.

In another example, assuming that map2_ is not initialized in the contract code, the resulting abstract syntax tree node information including node order, state variable name, and state variable type of the contract code is as follows:

Node information 1: ID{typeString:uint256,...}

Node information 2: age{typeString:uint256,...}

Node information 3: sex{typeString:bool,...}

Then, when comparing node information 4 before and after the contract upgrade, the nested struct needs to be compared recursively. If node information 4 and the structures nested in it are consistent in the abstract syntax tree node information extracted before and after the upgrade, the upgrade is compatible (assuming that node information 1, 2, and 3 are all consistent); otherwise it is incompatible.

An upgraded contract must be written according to certain rules for the upgrade to be compatible, that is, each state in the upgraded contract must still be able to read the value of the same state in the old contract. Users often overlook these rules when writing upgraded contracts, which leads to serious problems such as data loss and data confusion in the upgraded contract. Through the above examples, a storage data compatibility detection solution for Solidity contract upgrades, based on abstract syntax trees, can be implemented.

The following introduces a contract upgrade compatibility detection device of this application, which includes: an abstract syntax tree generation unit, used to generate abstract syntax trees of the contract before and after the upgrade; an extraction unit, used to parse the generated abstract syntax trees and sequentially extract the basic information in the node information of each abstract syntax tree; and a comparison unit, used to compare the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.

The extraction unit parses the generated abstract syntax tree, and if the state variable in the node information is true, sequentially extracts the basic information in the node information of each abstract syntax tree.

The abstract syntax tree generation unit performs lexical/grammatical analysis on the smart contract code before and after the upgrade based on the abstract syntax tree, and generates the abstract syntax tree of the contract before and after the upgrade.

The basic information includes the node order of the abstract syntax tree, and further includes state variable names and/or types.

The comparison unit compares basic information in node information with the same node number in the abstract syntax tree before and after the upgrade.

The comparison unit compares the basic information in the node information with the same node number in the abstract syntax tree before and after the upgrade according to the node number sequence.

During the comparison process of the comparison unit, if the state variable names obtained by comparison are different, it is incompatible; or if the state variable types obtained by comparison are different, it is incompatible; or if the state variable names and state variable types are both different, it is incompatible.

When, compared with the pre-upgrade state variables, the upgraded contract only adds new state variables after the pre-upgrade state variables, the comparison unit determines that the upgrade is compatible.

For compound structures, the extraction unit parses the generated abstract syntax tree, and sequentially and recursively extracts basic information in the node information of each abstract syntax tree.

The detection device also includes a feedback unit. When the comparison result of the comparison unit is incompatible, the feedback unit feeds back the basic information/node information of the conflicting slot position.

The following introduces a client embodiment of the present application, which includes a processor, a memory, and a program stored in the memory. When the processor executes the program, the method described in any of the above embodiments is executed, for example to detect the compatibility of contract upgrades.

In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to circuit structures such as diodes, transistors, and switches) or a software improvement (an improvement to a method flow). However, with the development of technology, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented with hardware entity modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. Designers program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and produce a dedicated integrated circuit chip. Moreover, instead of manually making integrated circuit chips, this programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the two most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog.
Those skilled in the art should also know that by simply logically programming the method flow using the above-mentioned hardware description languages and programming it into the integrated circuit, the hardware circuit that implements the logical method flow can be easily obtained.

The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, or the form of logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the memory's control logic. Those skilled in the art also know that, in addition to implementing the controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the devices included in it for implementing various functions can also be regarded as structures within the hardware component. The devices for implementing various functions can even be regarded both as software modules implementing the method and as structures within the hardware component.

The systems, devices, modules or units described in the above embodiments may be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a server system. Of course, this application does not rule out that, with the future development of computer technology, the computer that implements the functions of the above embodiments may be, for example, a personal computer, a laptop computer, a vehicle-mounted human-computer interaction device, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet, a wearable device, or a combination of any of these devices.

Although one or more embodiments of this specification provide method operation steps as described in the embodiments or flow charts, more or fewer operation steps may be included based on conventional or non-inventive means. The sequence of steps listed in the embodiments is only one of many possible execution sequences and does not represent the only one. When an actual device or terminal product executes the method, the steps may be executed sequentially or in parallel according to the methods shown in the embodiments or figures (for example, in a parallel-processor or multi-threaded environment, or even a distributed data processing environment). The terms "comprises," "includes," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, product or apparatus that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, product or apparatus. Without further limitation, the presence of additional identical or equivalent elements in a process, method, product or apparatus including the stated elements is not excluded. Words such as "first" and "second" are used to express names and do not indicate any specific order.

For convenience of description, the above device is described with its functions divided into various modules. Of course, when implementing one or more embodiments of this specification, the functions of the modules can be implemented in one or more pieces of software and/or hardware, or a module that implements a given function can be implemented by a combination of multiple sub-modules or sub-units. The device embodiments described above are only illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be electrical, mechanical, or in other forms.

The disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each process and/or block in the flowchart illustrations and/or block diagrams, and combinations of processes and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operating steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

Memory may include non-permanent storage in computer-readable media, random access memory (RAM) and/or non-volatile memory in the form of read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.

Computer-readable media include persistent and non-persistent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape storage, graphene storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.

It should be understood by those skilled in the art that one or more embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, one or more embodiments of the present description may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

One or more embodiments of this specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types. One or more embodiments of the present description may also be practiced in distributed computing environments where tasks are performed by remote processing devices connected through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including storage devices.

Each embodiment in this specification is described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is substantially similar to the method embodiment, its description is relatively brief; for relevant details, refer to the corresponding parts of the description of the method embodiment. In this specification, reference to the terms "one embodiment," "some embodiments," "an example," "specific examples," or "some examples" means that specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of this specification. The schematic use of these terms does not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples, and those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those embodiments or examples, unless they are inconsistent with each other.

The above descriptions are only examples of one or more embodiments of this specification, and are not intended to limit one or more embodiments of this specification. To those skilled in the art, various modifications and changes may be made to one or more embodiments of this specification. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and principles of this specification shall be included in the scope of the claims.
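As a purely illustrative, non-limiting sketch of the comparison described in the embodiments above, the following Python fragment extracts the name and type of each state variable from two abstract syntax trees in declaration order and compares them position by position. It assumes the trees are available as nested dictionaries resembling a Solidity compiler's JSON AST, with `nodeType`, `stateVariable`, `name`, and `typeDescriptions.typeString` fields; that layout, the function names `extract_state_variables` and `find_conflicts`, and the treatment of a removed variable as a conflict are assumptions made for this example and are not taken from this specification. Recursive handling of compound structures is omitted.

```python
from typing import Any, Dict, List, Tuple

StateVar = Tuple[str, str]  # (name, type string)


def extract_state_variables(contract_node: Dict[str, Any]) -> List[StateVar]:
    """Return (name, type) for each state variable, in declaration order."""
    variables: List[StateVar] = []
    for node in contract_node.get("nodes", []):
        is_declaration = node.get("nodeType") == "VariableDeclaration"
        # Only nodes flagged as state variables take part in the comparison.
        if is_declaration and node.get("stateVariable") is True:
            type_string = node.get("typeDescriptions", {}).get("typeString", "")
            variables.append((node["name"], type_string))
    return variables


def find_conflicts(old_contract: Dict[str, Any],
                   new_contract: Dict[str, Any]) -> List[str]:
    """Compare state variables at the same position; an empty list means compatible."""
    old_vars = extract_state_variables(old_contract)
    new_vars = extract_state_variables(new_contract)
    conflicts: List[str] = []
    for position, (old_var, new_var) in enumerate(zip(old_vars, new_vars)):
        if old_var[0] != new_var[0]:
            conflicts.append(f"position {position}: name {old_var[0]!r} -> {new_var[0]!r}")
        if old_var[1] != new_var[1]:
            conflicts.append(f"position {position}: type {old_var[1]!r} -> {new_var[1]!r}")
    # Assumption for this sketch: dropping an existing variable is a conflict.
    for position in range(len(new_vars), len(old_vars)):
        conflicts.append(f"position {position}: {old_vars[position][0]!r} removed")
    # Variables appended after all pre-upgrade variables produce no conflict.
    return conflicts


def var(name: str, type_string: str) -> Dict[str, Any]:
    """Build a hypothetical state-variable node for the demonstration."""
    return {
        "nodeType": "VariableDeclaration",
        "stateVariable": True,
        "name": name,
        "typeDescriptions": {"typeString": type_string},
    }


old = {"nodes": [var("owner", "address"), var("total", "uint256")]}
appended = {"nodes": [var("owner", "address"), var("total", "uint256"), var("paused", "bool")]}
swapped = {"nodes": [var("total", "uint256"), var("owner", "address")]}

print(find_conflicts(old, appended))  # -> []
for line in find_conflicts(old, swapped):
    print(line)
```

In this sketch, appending `paused` after the existing variables yields no conflicts, while swapping `owner` and `total` reports both a name and a type difference at each position, which corresponds to feeding back the conflicting positions.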

Claims

A method for detecting the compatibility of contract upgrades, including:

generating abstract syntax trees of a contract before and after the upgrade;

parsing the generated abstract syntax trees and sequentially extracting basic information from the node information of each abstract syntax tree; and

comparing the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.
The method according to claim 1, wherein parsing the generated abstract syntax trees and sequentially extracting basic information from the node information of each abstract syntax tree includes:

parsing the generated abstract syntax trees and, if the state variable attribute in the node information is true, sequentially extracting the basic information from the node information of each abstract syntax tree.
The method according to claim 1, wherein generating abstract syntax trees of the contract before and after the upgrade includes:

performing lexical/syntactic analysis on the smart contract code before and after the upgrade based on the abstract syntax tree, and generating the abstract syntax trees of the contract before and after the upgrade.
The method of claim 1, wherein the basic information includes the node order of the abstract syntax tree, and further includes state variable names and/or types.
The method according to claim 1, wherein comparing the basic information in the node information of the abstract syntax trees before and after the upgrade includes:

comparing the basic information in the node information having the same node number in the abstract syntax trees before and after the upgrade.
The method according to claim 5, wherein the basic information in the node information having the same node number in the abstract syntax trees before and after the upgrade is compared in node number sequence.
The method according to claim 5 or 6, wherein:

if the comparison shows that the state variable names differ, the upgrade is incompatible; or,

if the comparison shows that the state variable types differ, the upgrade is incompatible; or,

if the comparison shows that both the state variable name and the state variable type differ, the upgrade is incompatible.
The method according to claim 5 or 6, wherein, if a state variable of the upgraded contract is a new state variable added after the pre-upgrade state variables, the upgrade is determined to be compatible.
The method according to claim 1, wherein parsing the generated abstract syntax trees and sequentially extracting basic information from the node information of each abstract syntax tree includes:

for compound structures, parsing the generated abstract syntax trees and extracting the basic information from the node information of each abstract syntax tree sequentially and recursively.
The method according to claim 1, further comprising, if the compatibility conclusion is incompatible, feeding back the basic information/node information of the conflicting slot position.
A compatibility detection device for contract upgrades, including:

an abstract syntax tree generation unit, configured to generate abstract syntax trees of a contract before and after the upgrade;

an extraction unit, configured to parse the generated abstract syntax trees and sequentially extract basic information from the node information of each abstract syntax tree; and

a comparison unit, configured to compare the basic information in the node information of the abstract syntax trees before and after the upgrade to obtain a compatibility conclusion.
The detection device according to claim 11, wherein the extraction unit parses the generated abstract syntax trees and, if the state variable attribute in the node information is true, sequentially extracts the basic information from the node information of each abstract syntax tree.
The detection device according to claim 11, wherein the abstract syntax tree generation unit performs lexical/grammatical analysis on the smart contract code before and after the upgrade based on the abstract syntax tree, and generates the abstract syntax tree of the contract before and after the upgrade.
The detection device according to claim 11, wherein the basic information includes the node order of the abstract syntax tree, and further includes the name and/or type of the state variable.
The detection device according to claim 11, wherein the comparison unit compares basic information in node information with the same node number in the abstract syntax tree before and after the upgrade.
The detection device according to claim 15, wherein the comparison unit compares the basic information in the node information with the same node number in the abstract syntax tree before and after the upgrade according to the node number sequence.
The detection device according to claim 15 or 16, wherein the comparison unit determines that:

if the comparison shows that the state variable names differ, the upgrade is incompatible; or,

if the comparison shows that the state variable types differ, the upgrade is incompatible; or,

if the comparison shows that both the state variable name and the state variable type differ, the upgrade is incompatible.
The detection device according to claim 15 or 16, wherein, if the comparison unit finds that a state variable of the upgraded contract is a new state variable added after the pre-upgrade state variables, the comparison unit determines that the upgrade is compatible.
The detection device according to claim 11, wherein,

for compound structures, the extraction unit parses the generated abstract syntax trees and extracts the basic information from the node information of each abstract syntax tree sequentially and recursively.
The detection device according to claim 11, further comprising a feedback unit, wherein, when the comparison result of the comparison unit is incompatible, the feedback unit feeds back the basic information/node information of the conflicting slot position.
A client, including:

a processor; and

a memory storing a program, wherein when the processor executes the program, the method according to any one of claims 1-10 is performed.