WO2020258855A1

WO2020258855A1 - Blockchain-based hierarchical storage method and apparatus, and electronic device

Info

Publication number: WO2020258855A1
Application number: PCT/CN2020/072026
Authority: WO
Inventors: 陆钟豪; 卓海振; 俞本权
Original assignee: 创新先进技术有限公司
Priority date: 2019-06-28
Filing date: 2020-01-14
Publication date: 2020-12-30
Also published as: CN110334154B; CN110334154A; TW202101249A; TWI737157B

Abstract

A blockchain-based hierarchical storage method. Data nodes on a Merkle state tree into which account state data of a blockchain is organized are stored in a database as key-value pairs; the key is a two-tuple composed of a node ID and a block number; the database comprises multi-level data storage; and the block number indicates the block in which the data node was located when its data was updated. The method comprises: when any level of target data storage meets a data migration condition, determining a block number interval to be migrated in the target data storage; determining a migration critical value on the basis of the block number interval; iteratively traversing the two-tuples stored in the target data storage to find a target data node whose block number in the two-tuple is less than the migration critical value; determining whether the target data storage stores a data node that has the same node ID as the target data node and whose block number in the two-tuple is greater than that of the target data node and less than the migration critical value; and if so, migrating the target data node to the next level of data storage.

Description

Block chain-based hierarchical storage method and device, and electronic equipment

Technical field

One or more embodiments of this specification relate to the field of blockchain technology, and in particular to a method and device for hierarchical storage based on blockchain, and electronic equipment.

Background

Blockchain technology, also known as distributed ledger technology, is an emerging technology in which several computing devices participate in "bookkeeping" and jointly maintain a complete distributed database. Because blockchain technology is decentralized, open, and transparent, allows every computing device to participate in database records, and supports rapid data synchronization between computing devices, it has been widely applied in many fields.

Summary of the invention

This specification proposes a blockchain-based hierarchical storage method. The data nodes on the Merkle state tree into which the account state data of the blockchain is organized are stored in a database in the form of Key-Value key-value pairs, wherein the key of each Key-Value key-value pair is a two-tuple consisting of the node ID of the data node and the block number marked for the data node; the database includes multi-level data storage; the block number indicates the block in which the data node was located when its data was updated. The method includes:

When any level of target data storage in the database meets the data migration condition, determine the block number interval corresponding to the data node to be migrated to the next level of data storage in the target data storage;

Determine the migration threshold (also referred to as the migration critical value) based on the block number interval, wherein the migration threshold is a block number threshold greater than every block number in the block number interval;

Iteratively traverse the two-tuple of data nodes stored in the target data store, and search for a target data node whose block number in the two-tuple is less than the migration threshold;

According to the two-tuple of the target data node, further search whether the target data storage stores a data node that has the same node ID as the target data node and whose block number in the two-tuple is greater than that of the target data node and less than the migration threshold; if so, migrate the target data node to the next level of data storage below the target data storage; otherwise, keep the target data node in the target data storage.
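The steps above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: each level of data storage is modeled as an in-memory Python dict keyed by the two-tuple (node ID, block number), and the function name `migrate` and the sample node IDs are assumptions made for the example.

```python
from typing import Dict, Tuple

Key = Tuple[str, int]  # two-tuple: (node ID, block number)


def migrate(target: Dict[Key, bytes], lower: Dict[Key, bytes], threshold: int) -> None:
    """Move superseded node versions from `target` to the next level `lower`."""
    # sorted() takes a snapshot of the keys, so entries can be removed safely.
    for node_id, block in sorted(target):
        if block >= threshold:
            continue  # not a migration candidate
        # Is there a newer version of the same node below the threshold?
        superseded = any(
            other_id == node_id and block < other_block < threshold
            for other_id, other_block in target
        )
        if superseded:
            lower[(node_id, block)] = target.pop((node_id, block))
        # Otherwise this is the latest version below the threshold; keep it.


# Hypothetical data: node "a7" was updated in blocks 1, 3, and 9.
level0 = {("a7", 1): b"s1", ("a7", 3): b"s2", ("a7", 9): b"s3", ("a7f", 2): b"x"}
level1: Dict[Key, bytes] = {}
migrate(level0, level1, threshold=5)
```

With a threshold of 5, only the version ("a7", 1) moves down, because ("a7", 3) supersedes it below the threshold; ("a7", 3) and ("a7f", 2) stay because later blocks may still reuse them.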

Optionally, the determining a migration threshold based on the block number interval includes:

If the block number interval is open on the right, determine the right endpoint value of the block number interval as the migration threshold;

If the block number interval is closed on the right, determine the sum of the right endpoint value of the block number interval and the block number increment step of the blockchain as the migration threshold.
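As a small worked illustration of the two cases above, the following sketch computes the threshold from the right endpoint of the interval. The function name and the default step of 1 are assumptions for the example; the specification only requires that the step equal the block number increment of the blockchain.

```python
def migration_threshold(right_endpoint: int, right_closed: bool, step: int = 1) -> int:
    """Return the migration threshold for a block number interval."""
    if right_closed:
        # [a, b]: block b itself must be covered, so move one step past it.
        return right_endpoint + step
    # [a, b): b is already greater than every block number in the interval.
    return right_endpoint
```

For example, the intervals [50, 100) and [50, 99] both yield a threshold of 100 when the step is 1.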

Optionally, searching whether the target data storage stores a data node that has the same node ID as the target data node and whose block number in the two-tuple is greater than that of the target data node and less than the migration threshold includes:

Searching the target data storage for the data node that has the same node ID as the target data node and whose block number in the two-tuple is the largest among the block numbers less than the migration threshold;

Further determining whether the block number in the two-tuple of the found data node is greater than the block number in the two-tuple of the target data node.
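This two-step lookup maps naturally onto a store that keeps keys in sorted order. The sketch below assumes the two-tuples are held in a sorted Python list and uses the standard `bisect` module to stand in for an ordered seek; the function names are hypothetical.

```python
import bisect
from typing import List, Optional, Tuple

Key = Tuple[str, int]  # (node ID, block number)


def latest_below(sorted_keys: List[Key], node_id: str, threshold: int) -> Optional[Key]:
    """Find the key with this node ID and the largest block number < threshold."""
    # Index of the first key >= (node_id, threshold); its predecessor is the candidate.
    i = bisect.bisect_left(sorted_keys, (node_id, threshold))
    if i > 0 and sorted_keys[i - 1][0] == node_id:
        return sorted_keys[i - 1]
    return None


def should_migrate(sorted_keys: List[Key], target: Key, threshold: int) -> bool:
    """A target node is migrated only if a newer version exists below the threshold."""
    found = latest_below(sorted_keys, target[0], threshold)
    return found is not None and found[1] > target[1]


keys = sorted([("a7", 1), ("a7", 3), ("a7", 9), ("a7f", 2)])
```

Looking up the single largest version below the threshold avoids scanning every version of the node: if that version is the target itself, nothing newer exists below the threshold and the target stays where it is.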

Optionally, the method further includes:

Determine the data nodes on the Merkle state tree of the latest block on which a data update occurred; construct Key-Value key-value pairs for those data nodes, and store the Key-Value key-value pairs in the highest-level data storage of the database;

Wherein, the key of each Key-Value key-value pair is a two-tuple composed of the block number of the latest block and the node ID of the data node; the Value of the Key-Value key-value pair is the data content contained in the data node.
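A minimal sketch of this write path is shown below, assuming the levels are modeled as a list of dicts with index 0 as the highest level. The function name, the node ID strings, and the byte values are illustrative assumptions only.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, int]  # (node ID, block number)


def store_updated_nodes(
    levels: List[Dict[Key, bytes]], block_number: int, updated: Dict[str, bytes]
) -> None:
    """Write every node updated in `block_number` into the highest level (L0)."""
    top = levels[0]
    for node_id, content in updated.items():
        # The key carries both the node ID and the block in which it changed.
        top[(node_id, block_number)] = content


levels: List[Dict[Key, bytes]] = [{}, {}]  # L0 and L1
store_updated_nodes(levels, 100, {"a7": b"root-v1", "a7f": b"leaf-v1"})
store_updated_nodes(levels, 101, {"a7f": b"leaf-v2"})
```

Because the block number is part of the key, the update in block 101 does not overwrite the version written in block 100; both versions coexist until a later migration moves the older one down.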

Optionally, the Merkle state tree is a Merkle tree variant that incorporates the tree structure of the Trie dictionary tree; the node ID of a data node is the character prefix corresponding to the path from the root node of the Merkle state tree to that data node.

Optionally, the Merkle state tree is a Merkle Patricia Tree state tree.

Optionally, the database is a LevelDB database, or a database based on the LevelDB architecture.

Optionally, the database is a RocksDB database based on the LevelDB architecture.

Optionally, the storage media corresponding to the multi-level data storage differ in read and write performance, wherein the read and write performance of the storage medium corresponding to a higher-level data storage is higher than that of the storage medium corresponding to a lower-level data storage.

This specification also proposes a blockchain-based hierarchical storage device. The data nodes on the Merkle state tree into which the account state data of the blockchain is organized are stored in a database in the form of Key-Value key-value pairs, wherein the key of each Key-Value key-value pair is a two-tuple consisting of the node ID of the data node and the block number marked for the data node; the database includes multi-level data storage; the block number indicates the block in which the data node was located when its data was updated. The device includes:

A determining module, which, when any level of target data storage in the database meets the data migration condition, determines the block number interval corresponding to the data nodes in the target data storage that are to be migrated to the next level of data storage, and determines the migration threshold based on the block number interval, wherein the migration threshold is a block number threshold greater than every block number in the block number interval;

A search module, which iteratively traverses the two-tuples of the data nodes stored in the target data storage and searches for a target data node whose block number in the two-tuple is less than the migration threshold;

A migration module, which, according to the two-tuple of the target data node, further searches whether the target data storage stores a data node that has the same node ID as the target data node and whose block number in the two-tuple is greater than that of the target data node and less than the migration threshold; if so, migrates the target data node to the next level of data storage below the target data storage; otherwise, keeps the target data node in the target data storage.

Optionally, the determining module:

Optionally, the migration module further:

Searches the target data storage for the data node that has the same node ID as the target data node and whose block number in the two-tuple is the largest among the block numbers less than the migration threshold; and further determines whether the block number in the two-tuple of the found data node is greater than the block number in the two-tuple of the target data node.

Optionally, the device further includes:

A storage module, which determines the data nodes on the Merkle state tree of the latest block on which a data update occurred, constructs Key-Value key-value pairs for those data nodes, and stores the Key-Value key-value pairs in the highest-level data storage of the database;

Optionally, the Merkle state tree is a Merkle Patricia Tree state tree.

With the above technical solution, the data nodes on the Merkle state tree stored in the database can be pruned: data nodes that record historical state data are removed from the Merkle state tree and migrated to the next level of data storage, while data nodes that record the latest state data continue to be retained in the current level of data storage, thereby completing hierarchical storage of the Merkle state tree stored in the database.

Description of the drawings

FIG. 1 is a schematic diagram of organizing account state data of a blockchain into an MPT state tree according to an exemplary embodiment;

Fig. 2 is a schematic diagram of node multiplexing on an MPT state tree provided by an exemplary embodiment;

Figure 3 is a flowchart of a block chain-based hierarchical storage method provided by an exemplary embodiment;

Fig. 4 is a schematic structural diagram of an electronic device provided by an exemplary embodiment;

Fig. 5 is a block diagram of a block chain-based hierarchical storage device provided by an exemplary embodiment.

Detailed description

Here, exemplary embodiments will be described in detail, and examples thereof are shown in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numbers in different drawings indicate the same or similar elements. The implementation manners described in the following exemplary embodiments do not represent all implementation manners consistent with one or more embodiments of this specification. On the contrary, they are merely examples of devices and methods consistent with some aspects of one or more embodiments of this specification as detailed in the appended claims.

It should be noted that in other embodiments, the steps of the corresponding method are not necessarily executed in the order shown and described in this specification. In some other embodiments, the method may include more or fewer steps than described in this specification. In addition, a single step described in this specification may be decomposed into multiple steps in other embodiments, and multiple steps described in this specification may be combined into a single step in other embodiments.

Blockchains are generally divided into three types: public blockchains (Public Blockchain), private blockchains (Private Blockchain), and consortium blockchains (Consortium Blockchain). In addition, there are various combinations, such as private chain + consortium chain and consortium chain + public chain. The most decentralized type is the public chain, represented by Bitcoin and Ethereum. Participants who join a public chain can read the data records on the chain, participate in transactions, and compete for the accounting rights of new blocks.

Moreover, each participant (ie, node) can freely join and exit the network, and perform related operations. The private chain is the opposite. The write permission of the network is controlled by an organization or institution, and the data read permission is regulated by the organization. In simple terms, the private chain can be a weakly centralized system with strict restrictions and few participating nodes. This type of blockchain is more suitable for internal use by specific institutions.

Based on the basic characteristics of the blockchain, the blockchain is usually composed of several blocks. A time stamp corresponding to the creation time of the block is recorded in these blocks, and all the blocks strictly follow the time stamp recorded in the block to form a time-ordered data chain.

Real data generated in the physical world can be constructed into a standard transaction format supported by the blockchain and then published to the blockchain. The node devices in the blockchain perform consensus processing on the transaction; after consensus is reached, the node device acting as the accounting node packages the transaction into a block for persistent storage in the blockchain.

In the field of blockchain, an important concept is the account. Taking Ethereum as an example, Ethereum usually divides accounts into external accounts and contract accounts: an external account is directly controlled by a user, whereas a contract account is created by a user through an external account and contains contract code (i.e., a smart contract).

Of course, for some blockchain projects derived from the Ethereum architecture (such as the Ant blockchain), the account types supported by the blockchain can also be further extended, which is not particularly limited in this specification.

For accounts in the blockchain, a structure is usually used to maintain the account status of the account. When the transaction in the block is executed, the status of the account related to the transaction in the blockchain usually changes.

Taking Ethereum as an example, the structure of an account usually includes fields such as Balance, Nonce, Code, and storage, where:

The Balance field is used to maintain the current account balance of the account;

The Nonce field is used to maintain the number of transactions of the account; it is a counter that ensures each transaction is processed once and only once, effectively preventing replay attacks.

The code field is used to maintain the contract code of the account; in actual applications, the code field usually only maintains the hash value of the contract code; therefore, the code field is usually also called the codehash field. For external accounts, this field is empty.

The storage field is used to maintain the storage of the account (empty by default). In practical applications, the storage field only maintains the root node of the MPT (Merkle Patricia Trie) tree constructed based on the storage content of the account; therefore, the storage field is usually also called the storageRoot field.

Among them, for the external account, the code field and storage field shown above are null values.

Most blockchain projects use Merkle trees, or data structures based on Merkle trees, to store and maintain data. Taking Ethereum as an example, Ethereum uses the MPT tree (a Merkle tree variant) as its form of data organization to organize and manage important data such as account state and transaction information.

Ethereum has designed three MPT trees for the data that needs to be stored and maintained in the blockchain, namely the MPT state tree, the MPT transaction tree and the MPT receipt tree.

The MPT state tree is the MPT tree into which the account state data (state) of all accounts in the blockchain is organized; the MPT transaction tree is the MPT tree into which the transaction data (transaction) in a block is organized; the MPT receipt tree is the MPT tree into which the transaction receipts (receipt) generated for each transaction after the transactions in a block are executed are organized. The hash values of the root nodes of the MPT state tree, MPT transaction tree, and MPT receipt tree are all added to the block header.

Among them, the MPT transaction tree and the MPT receipt tree correspond to the blocks, and each block has its own MPT transaction tree and MPT receipt tree. The MPT state tree is a global MPT tree, which does not correspond to a specific block, but covers the account state data of all accounts in the blockchain.

The organized MPT transaction tree, MPT receipt tree, and MPT state tree will eventually be stored in a Key-Value database (for example, LevelDB) that uses a multi-level data storage structure.

A database with such a multi-level storage structure can usually be divided into n levels of data storage; for example, the levels can be designated L0, L1, L2, L3, ..., L(n-1) in sequence. Among these levels, the smaller the level number, the higher the level; for example, L0 stores the data of the most recent blocks, L1 stores the data of the next most recent blocks, and so on.

Among them, the read and write performance of storage media corresponding to all levels of data storage may also generally have performance differences; the read and write performance of storage media corresponding to high-level data storage (that is, with a smaller level number) can be higher than that of low-level data storage The read and write performance of the storage medium corresponding to the data storage.

For example, in practical applications, high-level data storage can use storage media with higher read and write performance; while low-level data storage can use storage media with low unit cost and larger capacity.

In practical applications, as the block height increases, the data stored in the database comes to contain a large amount of historical data; moreover, the smaller the block number, the older the data in the block and the less important it is. Therefore, to reduce the overall storage cost, data at different block heights usually needs to be treated differently;

For example, data in a block with a smaller block number can be stored on a storage medium with a lower cost; and data in a block with a larger block number can be stored on a storage medium with a higher cost.

When performing hierarchical storage of data such as the MPT transaction tree, MPT receipt tree, and MPT state tree stored in the database, the MPT transaction tree and MPT receipt tree correspond to individual blocks and are therefore data that is independent between blocks. Hierarchical storage of the MPT transaction tree and MPT receipt tree is consequently straightforward; for example, it can be completed by directly migrating data according to the block number to which the nodes on the MPT transaction tree and MPT receipt tree belong.

Based on this, this specification does not elaborate on the hierarchical storage of the MPT transaction tree and the MPT receipt tree, but focuses on the hierarchical storage of the MPT state tree.

Please refer to Fig. 1. Fig. 1 is a schematic diagram of organizing the account state data of the blockchain into an MPT state tree shown in this specification.

The MPT tree is an improved Merkle tree variant that combines the advantages of the Merkle tree and the Trie dictionary tree (also called the prefix tree).

The MPT tree usually includes three types of data nodes, which are leaf nodes, extension nodes, and branch nodes.

A leaf node is expressed as a [key, value] key-value pair, where the key is a special hexadecimal encoding.

An extension node is also a [key, value] key-value pair, but its value is the hash value (hash pointer) of another node; in other words, it links to another node through the hash pointer.

A branch node is a list of length 17. Because keys in the MPT tree are encoded in a special hexadecimal representation, the first 16 elements correspond to the 16 possible hexadecimal characters of a key (one character corresponds to one nibble), and the last element holds a value. If a [key, value] pair terminates at this branch node, the last element holds that value; that is, a branch node can be either the end of a search path or an intermediate node on the path.

Suppose the account state data that needs to be organized into the MPT state tree is shown in Table 1 below:

Table 1

In Table 1, the account address is a string of hexadecimal characters. The account state state is a structure composed of the aforementioned Balance, Nonce, Code, and storage fields.

Finally, the MPT state tree organized according to the account state data in Table 1 is shown in Figure 1. As shown in Figure 1, this MPT state tree is composed of 4 leaf nodes, 2 branch nodes, and 2 extension nodes.

In Figure 1, the prefix field is a field shared by extension nodes and leaf nodes. In practical applications, the value of the prefix field can be used to indicate the node type.

The value of the prefix field is 0, which means that the extension node contains an even number of nibbles; a nibble is half a byte, that is, 4 binary bits. One nibble corresponds to one character of the account address.

The value of the prefix field is 1, which means that the extension node contains an odd number of nibble(s);

The value of the prefix field is 2, which means that the leaf node contains an even number of nibbles;

The value of the prefix field is 3, which means that the leaf node contains an odd number of nibble(s).

Since the branch node is a prefix node of a parallel single nibble, the branch node does not have the above prefix field.
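The four prefix values listed above can be read as two independent flags: the higher bit distinguishes leaf nodes from extension nodes, and the lower bit records whether the number of nibbles is odd. The sketch below decodes a prefix value on that basis; the function name is hypothetical and the code covers only the four values described here.

```python
from typing import Tuple


def decode_prefix(prefix: int) -> Tuple[str, str]:
    """Map a prefix value (0-3) to (node type, parity of the nibble count)."""
    if prefix not in (0, 1, 2, 3):
        raise ValueError("prefix must be 0, 1, 2, or 3")
    node_type = "leaf" if prefix & 0b10 else "extension"
    parity = "odd" if prefix & 0b01 else "even"
    return node_type, parity
```

For instance, `decode_prefix(3)` yields `("leaf", "odd")`, matching the fourth case above.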

The Shared nibble field in an extension node corresponds to the key of the key-value pair contained in the extension node and represents the common character prefix shared by account addresses; for example, all account addresses in the above table have the common character prefix a7. The Next Node field is filled with the hash value (hash pointer) of the next node.

The hexadecimal character fields 0 to f in a branch node correspond to the key of the key-value pair contained in the branch node; if the branch node is an intermediate node on the search path of an account address on the MPT tree, the Value field of the branch node can be empty. The 0 to f fields are filled with the hash value of the next node.

The Key-end field in a leaf node corresponds to the key of the key-value pair contained in the leaf node and represents the last few characters of the account address. The keys of the nodes on the search path from the root node to the leaf node together constitute a complete account address. The Value field of the leaf node is filled with the account state data corresponding to the account address; for example, the structure composed of the aforementioned Balance, Nonce, Code, and storage fields can be encoded and then filled into the Value field of the leaf node.

Further, the node on the MPT state tree as shown in Figure 1 is finally stored in the database in the form of Key-Value key-value pairs;

Among them, when a node on the MPT state tree is stored in the database, the key of its key-value pair is the hash value of the data content contained in the node, and the Value of its key-value pair is the data content contained in the node.

That is, when the node on the MPT state tree is stored in the database, the hash value of the data content contained in the node can be calculated (that is, the hash calculation is performed on the entire node), and the calculated hash value is used as the key. The data content contained in the node is used as the value to generate Key-Value key-value pairs; then, the generated Key-Value key-value pairs are stored in the database.

Because a node on the MPT state tree is stored with the hash value of its data content as the key and the data content itself as the value, a node on the MPT state tree can usually be queried by content addressing, using the hash value of its data content as the key. With content addressing, nodes with identical content can usually be "multiplexed" (reused) to save storage space.
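Content addressing and the resulting node reuse can be illustrated with a short sketch. It assumes an in-memory dict as the database and uses SHA-256 from Python's standard library purely as a stand-in hash function; the function name and the sample node content are hypothetical.

```python
import hashlib
from typing import Dict


def put_node(db: Dict[bytes, bytes], content: bytes) -> bytes:
    """Store a node under the hash of its content and return that hash."""
    key = hashlib.sha256(content).digest()  # stand-in hash, an assumption
    db[key] = content  # identical content always maps to the same key
    return key


db: Dict[bytes, bytes] = {}
k1 = put_node(db, b"leaf:9365:state3")
k2 = put_node(db, b"leaf:9365:state3")  # same content written again
```

Writing the same content twice produces the same key and a single stored entry, which is why identical nodes in different blocks' trees occupy storage only once.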

As shown in Fig. 2, Fig. 2 is a schematic diagram of node multiplexing on an MPT state tree shown in this specification.

In practical applications, every time the blockchain generates a new latest block, after the transactions in that block are executed, the account states of the accounts related to those transactions usually change accordingly;

For example, when a "transfer transaction" in the block is executed, the balances of the transferor account and the transferee account related to the "transfer transaction" (that is, the values of the Balance fields of these accounts) usually change accordingly.

After a node device finishes executing the transactions in the latest block generated by the blockchain, because the account states in the blockchain have changed, the node device needs to construct an MPT state tree from the current account state data of all accounts in the blockchain, in order to maintain the latest state of all accounts.

That is, whenever a latest block is generated and the execution of its transactions changes account states in the blockchain, the node device needs to rebuild an MPT state tree based on the latest account state data of all accounts in the blockchain.

In other words, every block in the blockchain has a corresponding MPT state tree, which maintains the latest account state of all accounts in the blockchain after the transactions in that block are executed.

It should be noted that executing the transactions in the latest block may change the account state of only some accounts. Therefore, when updating the MPT state tree, it is not necessary to rebuild a complete MPT state tree from the current state data of all accounts; it is only necessary to update, on the basis of the MPT state tree corresponding to the block before the latest block, the nodes corresponding to the accounts whose state has changed. For the nodes corresponding to accounts whose state has not changed, since no data update has occurred on these nodes, the corresponding nodes on the MPT state tree of the block before the latest block can be reused directly.

As shown in Figure 2, assume that the account state data in Table 1 is the latest account state of all accounts on the blockchain after the transactions in Block N are executed; the MPT state tree organized from the account state data in Table 1 is still as shown in Figure 1.

Suppose that when the transactions in Block N+1 are executed, the account state of the account address "a7f9365" in Table 1 is updated from "state3" to "state5". In this case, when the MPT state tree is updated for Block N+1, it is not necessary to rebuild an MPT state tree from the current state data of all accounts in the blockchain after the transactions in Block N+1 are executed.

Referring to Figure 2, in this case it is only necessary to update, on the MPT tree corresponding to Block N (that is, the MPT state tree shown in Figure 1), the Value in the leaf node whose "key-end" is "9365" from "state3" to "state5", and then to update the hash pointers of all nodes on the path from the root node to that leaf node. That is, when a leaf node on the MPT state tree is updated, the hash value of the whole leaf node changes, so the hash pointers of all nodes on the path from the root node to the leaf node must be updated accordingly. For example, continuing with Figure 2, in addition to updating the Value in the leaf node whose "key-end" is "9365", it is also necessary to update the hash pointer to that leaf node filled in the f field of the leaf node's parent branch node (Branch Node); further, tracing back toward the root node, it is necessary to update the hash pointer to that branch node filled in the "Next Node" field of the branch node's parent, the root node (Root Extension Node).

In addition to the above updated nodes, other nodes that have not been updated can directly reuse the corresponding nodes on the MPT state tree of Block N;

Because the MPT tree corresponding to Block N must ultimately be retained as historical data, when the MPT state tree is updated for Block N+1, the updated nodes are not modified in place on the original nodes of the MPT state tree corresponding to Block N; instead, they are re-created on the MPT tree corresponding to Block N+1.

That is, for the MPT state tree corresponding to Block N+1, only a small number of updated nodes actually need to be re-created. The other nodes, which have not been updated, can directly reuse the corresponding nodes on the MPT state tree of Block N.

For example, as shown in Figure 2, only one extension node serving as the root node, one branch node, and one leaf node need to be re-created for the MPT state tree corresponding to Block N+1. For the nodes that have not been updated, node reuse is completed by adding, to the re-created nodes on the MPT state tree of Block N+1, hash pointers to the corresponding nodes on the MPT state tree of Block N. The pre-update nodes on the MPT state tree corresponding to Block N are kept as historical account state data; for example, the leaf node shown in Figure 2 whose "key-end" is "9365" and whose Value is "state3" is retained as historical data. The above example illustrates the case where only a small number of nodes on the MPT state tree of Block N+1 have content updates, so that most nodes of the previous block, Block N, can be reused. In practical applications, the MPT state tree of Block N+1 may also contain nodes that are new relative to the MPT state tree of Block N.

In this case, although the newly added node cannot be reused directly from the MPT tree of the previous block Block N, it may be "multiplexed" from the MPT state tree of the earlier block;

For example, a new node on the MPT state tree of Block N+1, although it does not appear on the MPT state tree of Block N, may appear on the MPT state tree of an earlier block, such as Block N-1; in that case, the new node on the MPT state tree of Block N+1 can directly reuse the corresponding node on the MPT state tree of Block N-1.

It can be seen that there are two kinds of "multiplexing" situations in the node reuse of the MPT state tree:

In one case, if only a small number of nodes in the MPT state tree of a block undergo content updates, most nodes in the previous block can be "reused";

Another situation is that if the MPT state tree of a block has a new node compared to the MPT state tree of the previous block, the corresponding node on the MPT state tree of the earlier block can be "multiplexed".
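The first reuse case can be sketched with a deliberately simplified, content-addressed trie. The sketch is not an MPT implementation: it has no extension nodes or prefix encoding, it assumes all paths have the same length, and it uses JSON plus SHA-256 as stand-ins for the real encoding and hash. The function names and sample paths are hypothetical.

```python
import hashlib
import json
from typing import Dict, Optional


def put(db: Dict[str, dict], node: dict) -> str:
    """Store a node under the hash of its serialized content."""
    raw = json.dumps(node, sort_keys=True).encode()
    key = hashlib.sha256(raw).hexdigest()
    db[key] = node
    return key


def update(db: Dict[str, dict], root: Optional[str], path: str, value: str) -> str:
    """Return the root hash of a new tree; only nodes on `path` are re-created."""
    if not path:
        return put(db, {"value": value})
    node = db[root] if root is not None else {"children": {}}
    children = dict(node.get("children", {}))  # copy: the old node stays intact
    children[path[0]] = update(db, children.get(path[0]), path[1:], value)
    return put(db, {"children": children})


db: Dict[str, dict] = {}
root_n = update(db, None, "a1", "state1")
root_n = update(db, root_n, "f9", "state3")   # tree after Block N
before = len(db)
root_n1 = update(db, root_n, "f9", "state5")  # tree after Block N+1
```

After the last update, only three nodes are new (leaf, intermediate node, root); the subtree under "a" is shared by both roots, and the tree reachable from `root_n` is unchanged and remains available as historical data.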

However, through node multiplexing, although the storage space of the database can be saved, there may be complex multiplexing relationships between the nodes on the MPT state tree of each block. The node on the MPT state tree of each block, All may be reused by the next block, or several consecutive blocks after the next block; therefore, this complex node multiplexing relationship is bound to cause difficulties in the hierarchical storage of the MPT state tree.

For example, when some nodes need to be migrated as historical data from the current level of data storage to the next level of data storage, these nodes may still be reused by the next block, or even by several blocks after the next block. Because it is impossible to accurately predict which blocks will reuse these nodes, the nodes on the MPT state tree stored in the database cannot be pruned accurately. Here, pruning refers to clearing the reuse relationships between the nodes on the MPT state trees of the blocks, removing the nodes that record historical state data from the MPT state tree, and retaining the nodes that record the latest state data. In this case, the demand for hierarchical storage obviously cannot be met.

Based on this, this specification proposes a method for hierarchical storage of a Merkle state tree composed of blockchain account state data.

In implementation, the account state data of the blockchain can still be organized into a Merkle state tree, and the data nodes on the Merkle state tree are stored, in the form of Key-Value key-value pairs, in a database with a multi-level data storage structure; for example, the data structure of the MPT tree can still be used to organize the account state data of the blockchain into an MPT state tree. For the nodes on the Merkle state tree stored in the database in the form of Key-Value key-value pairs, a block number can be marked for each node in the key of its Key-Value key-value pair; the block number specifically indicates the block in which the node's data was updated;

For example, in one embodiment, after the transactions in the latest block generated by the blockchain are executed, the nodes whose data has been updated on the Merkle state tree of the latest block can be determined based on the execution results of those transactions; the nodes whose data has been updated usually include nodes whose values have been updated and newly added nodes. After these nodes are determined, the block number of the latest block can be marked in the keys of the Key-Value key-value pairs corresponding to these nodes, indicating that the data of these nodes was updated in the block corresponding to that block number.

When the data nodes on the Merkle state tree are stored in the database in the form of Key-Value key-value pairs, the hash value of the content contained in a data node may no longer be used as the key of the Key-Value key-value pair; instead, the key is a two-tuple composed of the node ID of the data node and the block number indicating the block in which the data of the data node was updated.

Further, when any level of target data storage in the database meets the data migration condition, for example, when the storage capacity of the target data storage reaches a threshold, the block number interval corresponding to the nodes in the target data storage that need to be migrated to the next level of data storage can first be determined, and a migration threshold is then determined based on the block number interval; the migration threshold is a block number threshold greater than the block numbers in the block number interval;

For example, in one embodiment, if the right end of the block number interval is open, the right end value of the block number interval is determined as the migration threshold; for example, if the block number interval is [a, b), then b is determined as the migration threshold. If the right end of the block number interval is closed, the sum of the right end value of the block number interval and the increment step size of the block numbers of the blockchain is determined as the migration threshold; for example, if the block number interval is [a, b] and the block numbers of the blockchain are incremented with a step size of 1 (that is, the block numbers are densely incremented), then b+1 is determined as the migration threshold.

When the migration critical value is determined, iteratively traverse the two-tuple of data nodes stored in the target data storage to find the target data node whose block number in the two-tuple is less than the migration critical value;

When the target data node is found, addressing continues in the target data storage according to the two-tuple of the target data node, to further search whether the target data storage stores a data node that has the same node ID as the target data node and whose block number in the two-tuple is greater than that of the target data node and less than the migration threshold;

If yes, the target data node can be migrated to the next level of data storage of the target data storage; for example, the found target data node is written to the next level of data storage, and after the write succeeds, the target data node is cleared from the target data storage. Otherwise, the target data node can be kept in the target data storage.

Through the above technical solutions, the nodes on the Merkle state tree stored in the database can be accurately pruned: the nodes that record historical state data are removed from the Merkle state tree and migrated to the next level of data storage, while the nodes that record the latest state data continue to be stored and retained in the current level of data storage, thereby completing hierarchical storage of the Merkle state tree stored in the database.

Please refer to FIG. 3, which is a flowchart of a blockchain-based hierarchical storage method provided by an exemplary embodiment. The method is applied to a blockchain node device. The data nodes on the Merkle state tree into which the account state data of the blockchain is organized are stored in the database in the form of Key-Value key-value pairs; the key of each Key-Value key-value pair is a two-tuple composed of the node ID of the data node and the block number marked for the data node; the database includes multi-level data storage; the block number indicates the block in which the data of the data node was updated. The method includes the following steps:

Step 302: When any level of target data storage in the database meets the data migration condition, determine the block number interval corresponding to the data node to be migrated to the next level of data storage in the target data storage;

Step 304: Determine a migration threshold based on the block number interval, where the migration threshold is a block number threshold greater than the block numbers in the block number interval;

Step 306: Iteratively traverse the two-tuple of data nodes stored in the target data store, and search for a target data node whose block number in the two-tuple is less than the migration threshold;

Step 308: According to the two-tuple of the target data node, further search whether the target data storage stores a data node that has the same node ID as the target data node and whose block number in the two-tuple is greater than that of the target data node and less than the migration threshold; if yes, migrate the target data node to the next level of data storage of the target data storage; otherwise, keep the target data node in the target data storage.

The above-mentioned database may specifically be a Key-Value database with a multi-level data storage structure; for example, in the illustrated embodiment, the database may be a LevelDB database, or a database based on the LevelDB architecture; for example, the RocksDB database is a typical database based on the LevelDB architecture.

The account state data in the blockchain can be organized into the data structure of a Merkle state tree and stored in the above-mentioned database; for example, the Merkle state tree may be an MPT tree, and the data structure of the MPT tree may be used to organize the account state data of the blockchain into an MPT state tree.

The following uses the data structure of the MPT tree to organize the account state data in the blockchain into an MPT state tree as an example to describe the technical solutions of this specification in detail;

Among them, it should be emphasized that using the data structure of the MPT tree to organize the account state data in the blockchain is only exemplary.

In practical applications, for blockchain projects derived from the Ethereum architecture, in addition to improved versions of the Merkle tree such as the MPT tree, other Merkle tree variants similar to the MPT tree that incorporate a Trie tree structure can also be used; they are not listed one by one in this specification.

In this specification, a user client connected to the blockchain can package data into a standard transaction format supported by the blockchain and then publish it to the blockchain; the node devices in the blockchain can, based on the consensus algorithm in use and together with other node devices, reach consensus on the transactions that user clients publish to the blockchain, so as to generate the latest block for the blockchain;

The consensus algorithms supported in a blockchain are usually divided into consensus algorithms in which node devices need to compete for the bookkeeping right of each bookkeeping cycle, and consensus algorithms in which the bookkeeping node is elected in advance for each bookkeeping cycle (no competition for the bookkeeping right is needed).

For example, the former is represented by consensus algorithms such as Proof of Work (POW), Proof of Stake (POS), and Delegated Proof of Stake (DPOS); the latter is represented by consensus algorithms such as Practical Byzantine Fault Tolerance (PBFT).

In a blockchain network that adopts consensus algorithms such as Proof of Work (POW), Proof of Stake (POS), and Delegated Proof of Stake (DPOS), the node devices competing for the bookkeeping right can execute a transaction after receiving it. One of the node devices competing for the bookkeeping right may win the current round of competition and become the bookkeeping node. The bookkeeping node can package the received transaction together with other transactions to generate the latest block, and send the generated latest block to the other node devices for consensus.

For a blockchain network that adopts consensus algorithms such as Practical Byzantine Fault Tolerance (PBFT), the node device that holds the bookkeeping right has been agreed upon before the current round of bookkeeping. Therefore, after a node device receives a transaction, if it is not the bookkeeping node of the current round, it can send the transaction to the bookkeeping node.

The bookkeeping node of the current round can execute the transaction during or before the process of packaging the transaction together with other transactions to generate the latest block. After the bookkeeping node packages the transaction together with other transactions to generate the latest block, it can send the generated latest block, or the block header of the latest block, to the other node devices for consensus.

As mentioned above, no matter which of the consensus algorithms shown above is adopted by the blockchain, the bookkeeping node of the current round can package the received transactions to generate the latest block, and send the generated latest block, or the block header of the latest block, to the other node devices for consensus verification. If the other node devices receive the latest block or its block header and find no problem after verification, the latest block can be appended to the end of the original blockchain to complete the bookkeeping process of the blockchain.

In this specification, after a node device in the blockchain executes the transactions packaged in the latest block generated by consensus, the account states related to these executed transactions in the blockchain usually change accordingly. Therefore, after the transactions packaged in the latest block are executed, the node device can organize the latest account state data of all accounts in the blockchain into the data structure of an MPT state tree.

When the MPT state tree is organized according to the latest account state data of all accounts in the blockchain, the nodes on the MPT tree corresponding to the block before the latest block can still be reused as shown in Figure 2, which will not be repeated in this specification.

After the node device organizes the MPT state tree according to the latest account state data of all accounts in the blockchain, the data nodes on the MPT state tree can be stored, in the form of Key-Value key-value pairs, in a Key-Value database with a multi-level data storage structure.

For example, in practical applications, the data nodes on the MPT state trees corresponding to the latest several blocks can be stored by default in the highest level, L0, data storage of the database. The data nodes on the MPT state trees corresponding to the next latest several blocks can be stored in the next highest level, L1, data storage of the database; and so on. The number of blocks whose MPT state trees are stored in each level of data storage is not particularly limited in this specification; for example, it can be specified that the L0 data storage stores the MPT state trees of the latest N blocks, the L1 data storage stores the MPT state trees of the next latest N blocks; and so on.
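The level assignment in the example above can be sketched as follows. This is only an illustrative sketch: the function name `level_for_block` and its parameters are hypothetical, block numbers are assumed to increase by 1, and the assumption that the lowest level absorbs all older blocks is made here for illustration and is not stated in this specification.

```python
def level_for_block(block: int, latest: int, blocks_per_level: int, num_levels: int) -> int:
    """Return the storage level (0 = L0, 1 = L1, ...) holding the state tree of `block`.

    Assumes block numbers increase by 1 and that each level holds
    `blocks_per_level` blocks, with the lowest level taking everything older.
    """
    if block > latest:
        raise ValueError("block is newer than the latest block")
    age = latest - block  # 0 for the latest block
    return min(age // blocks_per_level, num_levels - 1)


# With N = 10 blocks per level and latest block 100:
# blocks 91..100 map to L0, blocks 81..90 map to L1.
l0 = level_for_block(95, latest=100, blocks_per_level=10, num_levels=3)
l1 = level_for_block(90, latest=100, blocks_per_level=10, num_levels=3)
```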

In this specification, a block number can be marked for each data node on the constructed MPT state tree; the block number marked for a data node on the MPT state tree specifically indicates the block in which the data of the data node was updated.

For example, as shown in Figure 2, take the nodes whose data is updated on the MPT state tree corresponding to Block N+1 as an example. The block number marked for these nodes is N+1, indicating that the data of these nodes was updated after the transactions in Block N+1 were executed.

During implementation, the node device can start a "block status update thread" locally to maintain and update the node states on the MPT state tree. After the transactions in the latest block generated by the blockchain are executed, the "block status update thread" can determine the nodes whose data has been updated on the MPT state tree corresponding to the latest block; the nodes whose data has been updated usually include nodes whose values have been updated and newly added nodes.

For example, in implementation, the nodes whose data has been updated on the MPT state tree corresponding to the latest block can be determined directly based on the execution results of the transactions in the latest block; alternatively, they can be determined by checking whether each node on the MPT state tree is a reused node; for example, if a node is a reused node, the data of that node has not been updated;

After determining the nodes whose data has been updated on the MPT state tree corresponding to the latest block, the "block status update thread" can mark these nodes with the block number of the latest block, indicating that the data of these nodes was updated in the block corresponding to that block number.

In this way, whenever the blockchain produces a new latest block, the "block status update thread" can promptly mark the block number for the nodes whose data has been updated on the MPT state tree corresponding to the latest block. By traversing the nodes on the MPT state tree stored in the above database, the block in which each node's data was updated can then be learned from the block number marked for that node. Because the value of each node on the MPT state tree (especially a leaf node) after its most recent data update usually represents the latest state of the node, viewing the block number marked for each node reveals in which block the latest state of the node was generated. This mechanism of marking block numbers for nodes can therefore provide a basis for data migration between the levels of data storage in the above database.

Among them, in this specification, the "block status update thread" can specifically mark the above-mentioned block number for the node on the MPT state tree in the key of the Key-Value key-value pair corresponding to the node on the MPT state tree.

After determining the nodes whose data has been updated on the MPT state tree corresponding to the latest block, the "block status update thread" can construct Key-Value key-value pairs for these nodes, mark the block number for each node in the key of the constructed Key-Value key-value pair, and then store the constructed Key-Value key-value pairs in the aforementioned database.

When the data nodes on the MPT state tree are stored in the database in the form of Key-Value key-value pairs, the hash value of the content contained in a data node may no longer be used as the key of the Key-Value key-value pair; instead, the two-tuple composed of the node ID of the data node and the block number indicating the block in which the data of the data node was updated is used as the key. The Value of the Key-Value key-value pair can still be the data content contained in the data node. For example, the above two-tuple can be recorded as [nid, h], where nid represents the node ID and h represents the block number marked for the node.

That is, in this specification, the hash value of the data content contained in a data node is no longer used as the key of the data node; instead, the two-tuple composed of the node ID of the data node and the block number indicating the block in which the data of the data node was updated is used as the key of the data node.
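A minimal sketch of this keying scheme follows, assuming an in-memory dictionary stands in for the Key-Value database. The names `NodeKey` and `store_updated_nodes`, and the sample node IDs and values, are illustrative assumptions and do not appear in this specification.

```python
from typing import Dict, NamedTuple


class NodeKey(NamedTuple):
    """Two-tuple key [nid, h]: node ID plus the block number of the update."""
    nid: str
    h: int


def store_updated_nodes(db: Dict[NodeKey, bytes],
                        updated: Dict[str, bytes],
                        block_number: int) -> None:
    """Write every node updated in `block_number` under the key (nid, block_number)."""
    for nid, content in updated.items():
        db[NodeKey(nid, block_number)] = content


db: Dict[NodeKey, bytes] = {}
# Hypothetical updates produced by executing blocks 10 and 11.
store_updated_nodes(db, {"a7": b"state1", "a77d": b"state2"}, block_number=10)
store_updated_nodes(db, {"a7": b"state3"}, block_number=11)

# Both versions of node "a7" coexist, distinguished only by block number.
versions = sorted(key.h for key in db if key.nid == "a7")
```

Because the block number is part of the key, an update never overwrites an older version of the same node; the older entry simply remains as history until it is migrated.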

In this way, the addressing method used by the above-mentioned database also changes accordingly; the database no longer uses the hash value of the data content contained in a node as the key for content addressing; instead, it uses the node ID and the block number of the block in which the data of the data node was updated for addressing.

Among them, it should be noted that the node ID of the above data node specifically refers to the unique identifier of the data node in the MPT state tree;

For example, in practical applications, for a Merkle tree variant that incorporates the tree structure of a Trie dictionary tree, such as the MPT tree, the character prefix corresponding to the path from the root node of the MPT tree to a data node is usually used as the node ID of that data node. For example, for any leaf node shown in Figure 1 and Figure 2, the character prefix corresponding to the path from the root node to the leaf node is usually the account address corresponding to the leaf node; that is, the node ID of a leaf node on the MPT tree is the account address corresponding to that leaf node.

Of course, in actual applications, in addition to using the character prefix corresponding to the path from the root node of the MPT tree to the data node as the node ID of the data node, the node ID may also be a node number that is set for each data node when the MPT tree is constructed and that uniquely distinguishes each data node; this is not particularly limited in this specification.

In this specification, the node device can also start a "migration thread" locally, which is used to migrate the node data on the MPT state tree stored in the data storage at all levels in the database to the lower data storage.

The above-mentioned "migration thread" can specifically execute a timed task, which can periodically determine whether the data storage at all levels in the database meets the preset data migration conditions;

Among them, the data migration conditions of the various levels of data storage in the aforementioned database can be set based on actual data migration requirements, which are not particularly limited in this specification;

For example, in practical applications, the data migration condition of the data storage at all levels of the database may be specifically that the storage capacity of the data storage at all levels reaches the threshold; or it may also be the block corresponding to the data stored in the data storage at all levels The number reaches the threshold.

If the "migration thread" determines that any level of target data storage in the above-mentioned database meets the data migration condition, the "migration thread" can perform data migration processing for the target data storage, and migrate the MPT state trees of some of the blocks stored in the target data storage to the next level of data storage as historical data.

In implementation, when the target data storage meets the data migration condition, the "migration thread" can determine the block number interval corresponding to the nodes in the target data storage that need to be migrated to the next level of data storage;

The block number interval corresponding to the nodes in the target data storage that need to be migrated to the next level of data storage can be determined based on the block number of the next block after the previous data migration and the maximum number of blocks that the target data storage can migrate at one time; for example, suppose the next block after the previous data migration is Block N, and the number of blocks that the target data storage can migrate at one time is 30; the block number interval can then be [N, N+29], or equivalently [N, N+30).
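The interval in this example can be computed as in the following sketch. The function name `block_number_interval` is hypothetical, and the sketch assumes that block numbers increase by 1.

```python
from typing import Tuple


def block_number_interval(next_block: int, max_blocks: int,
                          right_closed: bool = True) -> Tuple[int, int]:
    """Return (left, right) end values of the block number interval to migrate.

    `next_block` is the block number following the previous migration and
    `max_blocks` is how many blocks may be migrated at one time. Block numbers
    are assumed to increase by 1.
    """
    if max_blocks < 1:
        raise ValueError("max_blocks must be at least 1")
    if right_closed:
        return (next_block, next_block + max_blocks - 1)  # [N, N+max_blocks-1]
    return (next_block, next_block + max_blocks)          # [N, N+max_blocks)


# With N = 100 and 30 blocks per migration:
closed = block_number_interval(100, 30, right_closed=True)      # [100, 129]
half_open = block_number_interval(100, 30, right_closed=False)  # [100, 130)
```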

After the "migration thread" determines the block number interval corresponding to the nodes in the target data storage that need to be migrated to the next level of data storage, it can determine the migration threshold of this data migration based on the block number interval; the migration threshold may specifically be a block number threshold greater than the block numbers in the block number interval;

It should be noted that in practical applications, the right end of the block number interval may be open or closed; the migration threshold determined based on the block number interval differs accordingly.

In the illustrated embodiment, if the right end of the block number interval is open, then when the migration threshold is determined based on the block number interval, the right end value of the block number interval may be determined as the migration threshold;

For example, assuming that the block number interval is [a, b), the migration threshold determined based on the block number interval is the right end value b of the block number interval.

In another embodiment shown, if the right end of the block number interval is closed, then when the migration threshold is determined based on the block number interval, the sum of the right end value of the block number interval and the increment step size of the block numbers of the blockchain may be determined as the migration threshold;

For example, suppose the block number interval is [a, b] and the increment step size of the block numbers of the blockchain is 1; the migration threshold determined based on the block number interval is then b+1, that is, the right end value b of the block number interval plus the step size.

The increment step size of the block numbers of the blockchain is usually 1; that is, the block numbers of the blockchain are densely incremented with a step size of 1; for example, the block numbers increase in the order 1, 2, 3, 4, and so on.

Of course, in practical applications, the increment step size of the block numbers of the blockchain can also be an integer greater than 1; for example, the block numbers can also increase in the order 1, 3, 5, 7, and so on, which is not particularly limited in this specification.
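The two rules for deriving the migration threshold, including a step size other than 1, can be summarized in the following sketch. The function name `migration_threshold` and its parameters are illustrative assumptions.

```python
def migration_threshold(right_end: int, right_closed: bool, step: int = 1) -> int:
    """Return the migration threshold for an interval with right end value `right_end`.

    Open right end  [a, b): the threshold is b.
    Closed right end [a, b]: the threshold is b + step, where `step` is the
    increment step size of the block numbers.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    return right_end + step if right_closed else right_end


t_open = migration_threshold(130, right_closed=False)        # [100, 130) gives 130
t_closed = migration_threshold(129, right_closed=True)       # [100, 129] gives 130
t_step2 = migration_threshold(7, right_closed=True, step=2)  # block numbers 1, 3, 5, 7 give 9
```

In every case the threshold is strictly greater than each block number in the interval, so a simple "block number less than threshold" test selects exactly the candidates for migration.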

Further, after the "migration thread" determines the migration threshold of this data migration based on the block number interval, it can, according to the migration threshold, migrate the MPT state trees of the blocks corresponding to the block number interval stored in the target data storage to the next level of data storage as historical data.

Specifically, the "migration thread" can iteratively traverse the keys of the nodes stored in the target data storage; that is, iteratively traverse the two-tuples, each composed of the node ID of a node and the block number marked for it, to find, among all the nodes stored in the target data storage, a target node whose block number in the two-tuple is less than the migration threshold.

After finding a target node whose block number in the two-tuple is less than the migration threshold, the "migration thread" can continue to perform addressing in the target data storage according to the two-tuple of the target node, to further find whether the target data storage stores a node that has the same node ID as the target node (that is, has the same character prefix) and whose block number in the two-tuple is greater than that of the target node and less than the migration threshold;

If such a node exists, the account state represented by the target node is not the latest account state; in this case, the target node can be treated as historical data to be migrated to the next level of data storage of the target data storage, and the "migration thread" can migrate the found target node to the next level of data storage of the target data storage;

For example, in implementation, the aforementioned "migration thread" can copy the target node and store the copied target node to the next level of data storage, and then after the copied target node is successfully stored to the next level of data storage, Then remove the target node from the above target data storage.

On the contrary, if it does not exist, it indicates that the account status represented by the target node is the latest account status; in this case, the target node will continue to be retained in the target data storage.

For example, suppose the block number interval is [a, b); the migration threshold determined based on the block number interval is then the right end value b of the block number interval. In this case, suppose the "migration thread" finds a nodeA in the target data storage whose two-tuple is (nid, i), where i<b; the "migration thread" can then perform addressing according to the two-tuple (nid, i) and further find whether a nodeB whose two-tuple is (nid, j), where b>j>i, is stored in the target data storage. If nodeB exists, nodeA can be migrated to the next level of data storage of the target data storage; otherwise, nodeA continues to be retained in the target data storage.

The "migration thread" can iteratively traverse the two-tuples of the data nodes stored in the target data storage and iteratively execute the above migration process. In the end, among the nodes with the same node ID whose block numbers in the two-tuples are less than the migration threshold, only the node corresponding to the largest block number is retained in the target data storage; all the other such nodes are migrated to the next level of data storage as historical data.

Since the node stored in the target data storage that corresponds to the largest block number less than the migration threshold usually represents the latest state of the node after its most recent update, this approach is equivalent to keeping only the latest state of each node in the target data storage.
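The migration pass described above can be sketched as follows. This is a simplified illustration under stated assumptions: two in-memory dictionaries stand in for the target data storage and the next level of data storage, the function name `migrate_level` is hypothetical, and the inner search is a plain linear scan rather than the addressing a real database would perform.

```python
from typing import Dict, Tuple

Key = Tuple[str, int]  # two-tuple (node ID, block number)


def migrate_level(target: Dict[Key, bytes], lower: Dict[Key, bytes],
                  threshold: int) -> None:
    """Move superseded node versions below `threshold` from `target` to `lower`."""
    for nid, h in sorted(target):  # sorted() snapshots the keys, so deleting is safe
        if h >= threshold:
            continue  # outside the range being migrated
        # Look for a newer version of the same node that is still below the threshold.
        superseded = any(
            other_nid == nid and h < other_h < threshold
            for other_nid, other_h in target
        )
        if superseded:
            lower[(nid, h)] = target[(nid, h)]  # write to the next level first
            del target[(nid, h)]                # then clear it from this level


target = {("a", 1): b"s1", ("a", 2): b"s2", ("a", 12): b"s3", ("b", 3): b"s4"}
lower: Dict[Key, bytes] = {}
migrate_level(target, lower, threshold=10)
# ("a", 1) is superseded by ("a", 2) and moves down; ("a", 2) and ("b", 3) are the
# newest versions below the threshold and stay; ("a", 12) is not below the threshold.
```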

It should be noted that, because in this specification the key of the Key-Value key-value pair of a node on the MPT state tree is no longer the hash value of the data content contained in the node, the "migration thread" no longer uses content addressing when it uses the two-tuple of the target node as the query index to perform addressing in the target data storage.

In an embodiment shown, the above-mentioned database can provide a function of "querying the maximum key less than a certain value". This function specifically refers to using a two-tuple key composed of [nid, h] as the query index for addressing, and finding the two-tuple key [nid, h_max] stored in the database; where h_max is the maximum block number among the block numbers smaller than h in the two-tuples of the nodes with node ID nid stored in the database. That is, when determining h_max, all the two-tuples with node ID nid and block numbers less than h stored in the database can first be found, and the maximum of the block numbers in these two-tuples is then taken as h_max.

In this case, when the "migration thread" continues to perform addressing in the target data storage according to the two-tuple of the target node, it can splice the node ID of the target node (denoted as nid) and the migration threshold (denoted as h) into a two-tuple [nid, h], and then use the spliced two-tuple [nid, h] as the query index to further find whether the two-tuple [nid, h_max] is stored in the target data storage.

When the two-tuple [nid, h_max] is found, it can be further determined whether h_max is greater than the block number in the two-tuple of the target node. If it is, it can be determined that the target data storage stores a node that has the same node ID as the target node and whose block number in the two-tuple is greater than that of the target node and less than the migration threshold. Otherwise, the block number in the two-tuple of the target node is itself the maximum block number less than the migration threshold; in this case, it can be determined that the target data storage does not store a node that has the same node ID as the target node and whose block number in the two-tuple is greater than that of the target node and less than the migration threshold.
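The "querying the maximum key less than a certain value" lookup and the comparison against h_max can be sketched as follows. Here a sorted Python list searched with the standard `bisect` module stands in for the ordered key space of the database; the names `max_key_below` and `is_superseded` are hypothetical, and how an actual database exposes such a query is not specified here.

```python
import bisect
from typing import List, Optional, Tuple

Key = Tuple[str, int]  # two-tuple (node ID, block number)


def max_key_below(sorted_keys: List[Key], nid: str, h: int) -> Optional[Key]:
    """Return the stored key (nid, h_max) with the largest h_max < h, or None."""
    # Tuples compare lexicographically, so this is the first position whose
    # key is >= (nid, h); the entry just before it is the largest smaller key.
    i = bisect.bisect_left(sorted_keys, (nid, h))
    if i == 0:
        return None
    candidate = sorted_keys[i - 1]
    return candidate if candidate[0] == nid else None


def is_superseded(sorted_keys: List[Key], target: Key, threshold: int) -> bool:
    """True if a newer version of the target node exists below `threshold`."""
    found = max_key_below(sorted_keys, target[0], threshold)
    return found is not None and found[1] > target[1]


keys = sorted([("a", 1), ("a", 2), ("a", 12), ("b", 3)])
h_max_key = max_key_below(keys, "a", 10)         # ("a", 2)
old_version = is_superseded(keys, ("a", 1), 10)  # True: ("a", 2) is newer
newest = is_superseded(keys, ("a", 2), 10)       # False: it is already h_max
```

With this lookup, one ordered search per target node replaces a scan over all stored versions of that node.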

In the above technical solution, for a given node ID, among the nodes stored in the target data storage whose block numbers in the two-tuples are less than the migration threshold, only the node corresponding to the largest block number continues to be retained in the target data storage, and the other nodes are migrated to the next level of data storage as historical data;

Therefore, completing the data migration in the above manner is equivalent to pruning the nodes on the MPT state tree stored in the database: the nodes that record historical state data are removed from the MPT state tree and migrated to the next level of data storage, while the nodes that record the latest state data continue to be stored and retained in the current level of data storage, thereby completing the hierarchical storage of the MPT state tree stored in the database.

It should be noted that the processing performed by the "block status update thread" and by the "migration thread" described above can be executed concurrently; that is, the two threads may contend for the same node on the MPT state tree. Therefore, in practical applications, to prevent processing errors caused by the "block status update thread" and the "migration thread" concurrently accessing the same node, technical means can be used to ensure that at any given time only one thread can access a given node on the MPT state tree.

The specific technical means used to guarantee that only one thread at a time can access the same node on the MPT state tree are not particularly limited in this specification; for example, in implementation, techniques such as a mutual exclusion lock or single-threaded execution can be used, which will not be detailed in this specification.
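As one possible illustration of the mutual exclusion lock option, the following Python sketch guards a store with a single lock, so that an update and a migration never touch a node at the same time. The class NodeStore and its methods are hypothetical, and a single store-wide lock is an assumption chosen for brevity; a finer-grained per-node lock would satisfy the same requirement.

```python
import threading


class NodeStore:
    """In-memory stand-in for one level of data storage, guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._nodes = {}

    def put(self, key, value):
        # Called by the thread that writes updated nodes.
        with self._lock:
            self._nodes[key] = value

    def move_to(self, key, other):
        # Called by the migration thread; the check and the removal happen
        # under the same lock, so no update can interleave between them.
        with self._lock:
            if key in self._nodes:
                other[key] = self._nodes.pop(key)
                return True
            return False

    def __len__(self):
        with self._lock:
            return len(self._nodes)


hot = NodeStore()
cold = {}
keys = [("a1", n) for n in range(100)]
writer = threading.Thread(target=lambda: [hot.put(k, "v") for k in keys])
mover = threading.Thread(target=lambda: [hot.move_to(k, cold) for k in keys])
writer.start()
mover.start()
writer.join()
mover.join()
print(len(hot) + len(cold))  # 100: every node ends up in exactly one level
```

How many nodes end up in each level depends on thread scheduling, but no node is lost or duplicated.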

Corresponding to the foregoing method embodiment, this specification also provides an embodiment of a blockchain-based hierarchical storage device.

The embodiment of the blockchain-based hierarchical storage device in this specification can be applied to an electronic device. The device embodiment can be implemented by software, by hardware, or by a combination of software and hardware. Taking a software implementation as an example, the device, as a logical device, is formed when the processor of the electronic device in which it is located reads the corresponding computer program instructions from non-volatile memory into memory.

From a hardware perspective, Figure 4 shows a hardware structure diagram of the electronic device in which the blockchain-based hierarchical storage device of this specification is located. In addition to the processor, memory, network interface, and non-volatile memory, the electronic device in which the device is located may also include other hardware according to its actual function, which will not be described further here.

Fig. 5 is a block diagram of a hierarchical storage device based on blockchain according to an exemplary embodiment of this specification.

Referring to FIG. 5, the blockchain-based hierarchical storage device 50 can be applied to the electronic device shown in FIG. 4. The account state data of the blockchain is organized into data nodes on a Merkle state tree, which are stored in a database in the form of Key-Value pairs; the key of each Key-Value pair is a two-tuple composed of the node ID of the data node and the block number marked for the data node; the database includes multi-level data storage; the block number indicates the block in which the data node was located when its data was updated. The device 50 includes:

The determining module 501, which, when the target data storage at any level in the database meets the data migration condition, determines the block number interval corresponding to the data nodes in the target data storage that are to be migrated to the next level of data storage, and determines a migration threshold based on the block number interval; wherein the migration threshold is greater than the block number threshold of the block number interval;

The searching module 502, which iteratively traverses the two-tuples of the data nodes stored in the target data storage and searches for a target data node whose block number in the two-tuple is less than the migration threshold;

The migration module 503, which, based on the two-tuple of the target data node, further searches whether the target data storage stores a data node that has the same node ID as the target data node and whose block number in the two-tuple is greater than that of the target data node and less than the migration threshold; if so, migrates the target data node to the next level of data storage below the target data storage; otherwise, keeps the target data node in the target data storage.

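In this specification, the determining module 501: if the right end of the block number interval is open, determines the right endpoint value of the block number interval as the migration threshold; if the right end of the block number interval is closed, determines the sum of the right endpoint value of the block number interval and the block number increment step of the blockchain as the migration threshold.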
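The rule stated in the claims for deriving the migration threshold from the right end of the block number interval can be sketched in Python as follows. The function name and parameters are hypothetical, and the default increment step of 1 is an assumption about how block numbers grow.

```python
def migration_threshold(right_endpoint, right_closed, step=1):
    """Derive the migration threshold from the right end of the interval.

    right_endpoint: right endpoint value of the block number interval.
    right_closed:   True if the interval includes its right endpoint.
    step:           block number increment step of the blockchain.
    """
    if right_closed:
        # Closed on the right: the endpoint itself must be migrated, so the
        # threshold is one increment step beyond it.
        return right_endpoint + step
    # Open on the right: the endpoint is already excluded.
    return right_endpoint


print(migration_threshold(100, right_closed=False))  # 100, for [a, 100)
print(migration_threshold(100, right_closed=True))   # 101, for [a, 100]
```

In both cases every block number in the interval is strictly less than the resulting threshold, which is what the later "less than the migration threshold" comparisons rely on.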

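In this specification, the migration module 503 further: searches the target data storage for the data node that has the same node ID as the target data node and whose block number in the two-tuple is the largest of the block numbers less than the migration threshold; and further determines whether the block number in the two-tuple of the found data node is greater than the block number in the two-tuple of the target data node.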

In this specification, the device 50 further includes:

The storage module (not shown in Figure 5), which determines the data nodes whose data has been updated on the Merkle state tree of the latest block; constructs Key-Value pairs for those data nodes; and stores the Key-Value pairs in the highest-level data storage in the database.
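One way to construct such keys for a store that orders keys bytewise, as LevelDB does with its default comparator, is sketched below in Python. The byte layout is an assumption made for illustration and is not specified in this document: the node ID is taken to be an ASCII string of path characters that never contains a zero byte, followed by a zero-byte separator and the block number as a fixed-width big-endian integer. The names encode_key, decode_key, and store_updates are hypothetical, and a dictionary stands in for the highest-level data storage.

```python
import struct


def encode_key(nid, block):
    """Encode the two-tuple (nid, block) so that bytewise order of the
    encoded keys matches the order of the tuples."""
    # Fixed-width big-endian keeps numeric order under bytewise comparison.
    return nid.encode("ascii") + b"\x00" + struct.pack(">Q", block)


def decode_key(raw):
    """Inverse of encode_key: the last 8 bytes are the block number and the
    byte before them is the separator."""
    block = struct.unpack(">Q", raw[-8:])[0]
    return raw[:-9].decode("ascii"), block


def store_updates(top_tier, block_number, updated_nodes):
    """Write every node updated in the latest block into the top tier."""
    for nid, content in updated_nodes.items():
        top_tier[encode_key(nid, block_number)] = content


top = {}
store_updates(top, 2, {"a": "v2"})
store_updates(top, 256, {"a": "v256", "ab": "w256"})
print([decode_key(k) for k in sorted(top)])
# [('a', 2), ('a', 256), ('ab', 256)]
```

Because all versions of one node ID are contiguous and ordered by block number, a seek to the encoded form of [nid, h] lands next to the candidate [nid, h_max].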

In this specification, the Merkle state tree is a Merkle tree variant that incorporates the tree structure of a Trie (dictionary tree); the node ID of the data node is the character prefix corresponding to the path from the root node of the Merkle state tree to the data node.

In this specification, the Merkle state tree is the Merkle Patricia Tree state tree.

In this specification, the database is a LevelDB database, or a database based on the LevelDB architecture.

In this specification, the database is a RocksDB database based on the LevelDB architecture.

In this specification, the storage media corresponding to the multi-level data storage differ in read and write performance; the read and write performance of the storage medium corresponding to a higher-level data storage is higher than that of the storage medium corresponding to a lower-level data storage.

The systems, devices, modules, or units illustrated in the above embodiments may be specifically implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer. The specific form of the computer can be a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email sending and receiving device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

In a typical configuration, the computer includes one or more processors (CPU), input/output interfaces, network interfaces, and memory.

The memory may include non-permanent memory among computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.

It should also be noted that the terms "comprise", "include", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, product, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, method, product, or device. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, product, or device that includes the element.

The foregoing describes specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps described in the claims may be performed in a different order than in the embodiments and still achieve desired results. In addition, the processes depicted in the drawings do not necessarily require the specific order or sequential order shown to achieve the desired result. In certain embodiments, multitasking and parallel processing are also possible or may be advantageous.

The terms used in one or more embodiments of this specification are only for the purpose of describing specific embodiments, and are not intended to limit one or more embodiments of this specification. The singular forms of "a", "said" and "the" used in one or more embodiments of this specification and the appended claims are also intended to include plural forms, unless the context clearly indicates other meanings. It should also be understood that the term "and/or" used herein refers to and includes any or all possible combinations of one or more associated listed items.

It should be understood that, although the terms first, second, third, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from each other. For example, without departing from the scope of one or more embodiments of this specification, the first information may also be referred to as second information, and similarly, the second information may also be referred to as first information. Depending on the context, the word "if" as used herein can be interpreted as "when", "upon", or "in response to determining".

The above descriptions are only preferred embodiments of one or more embodiments of this specification and are not intended to limit one or more embodiments of this specification. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of one or more embodiments of this specification should be included in the protection scope of one or more embodiments of this specification.

Claims

1. A blockchain-based hierarchical storage method, wherein account state data of the blockchain is organized into data nodes on a Merkle state tree, and the data nodes are stored in a database in the form of Key-Value pairs; wherein the key of each Key-Value pair is a two-tuple composed of the node ID of the data node and the block number marked for the data node; the database comprises multi-level data storage; the block number indicates the block in which the data node was located when its data was updated; the method comprising:

when the target data storage at any level in the database meets the data migration condition, determining the block number interval corresponding to the data nodes in the target data storage that are to be migrated to the next level of data storage;

determining a migration threshold based on the block number interval; wherein the migration threshold is greater than the block number threshold of the block number interval;

iteratively traversing the two-tuples of the data nodes stored in the target data storage, and searching for a target data node whose block number in the two-tuple is less than the migration threshold;

according to the two-tuple of the target data node, further searching whether the target data storage stores a data node that has the same node ID as the target data node and whose block number in the two-tuple is greater than that of the target data node and less than the migration threshold; if so, migrating the target data node to the next level of data storage below the target data storage; otherwise, keeping the target data node in the target data storage.
2. The method according to claim 1, wherein the determining a migration threshold based on the block number interval comprises:

if the right end of the block number interval is open, determining the right endpoint value of the block number interval as the migration threshold;

if the right end of the block number interval is closed, determining the sum of the right endpoint value of the block number interval and the block number increment step of the blockchain as the migration threshold.
3. The method according to claim 1, wherein the searching whether the target data storage stores a data node that has the same node ID as the target data node and whose block number in the two-tuple is greater than that of the target data node and less than the migration threshold comprises:

searching the target data storage for the data node that has the same node ID as the target data node and whose block number in the two-tuple is the largest of the block numbers less than the migration threshold;

further determining whether the block number in the two-tuple of the found data node is greater than the block number in the two-tuple of the target data node.
4. The method according to claim 1, further comprising:

determining the data nodes whose data has been updated on the Merkle state tree of the latest block; constructing Key-Value pairs for the data nodes whose data has been updated on the Merkle state tree of the latest block, and storing the Key-Value pairs in the highest-level data storage in the database;

wherein the key of each Key-Value pair is a two-tuple composed of the block number of the latest block and the node ID of the data node; the value of the Key-Value pair is the data content contained in the data node.
5. The method according to claim 1 or 4, wherein the Merkle state tree is a Merkle tree variant incorporating the tree structure of a Trie (dictionary tree); the node ID of the data node is the character prefix corresponding to the path from the root node of the Merkle state tree to the data node.
6. The method according to claim 5, wherein the Merkle state tree is a Merkle Patricia Tree state tree.
7. The method according to claim 1, wherein the database is a LevelDB database, or a database based on the LevelDB architecture.
8. The method according to claim 7, wherein the database is a RocksDB database based on the LevelDB architecture.
9. The method according to claim 1, wherein the storage media corresponding to the multi-level data storage differ in read and write performance; wherein the read and write performance of the storage medium corresponding to a higher-level data storage is higher than that of the storage medium corresponding to a lower-level data storage.
10. A blockchain-based hierarchical storage device, wherein account state data of the blockchain is organized into data nodes on a Merkle state tree, and the data nodes are stored in a database in the form of Key-Value pairs; wherein the key of each Key-Value pair is a two-tuple composed of the node ID of the data node and the block number marked for the data node; the database comprises multi-level data storage; the block number indicates the block in which the data node was located when its data was updated; the device comprising:

a determining module, which, when the target data storage at any level in the database meets the data migration condition, determines the block number interval corresponding to the data nodes in the target data storage that are to be migrated to the next level of data storage, and determines a migration threshold based on the block number interval; wherein the migration threshold is greater than the block number threshold of the block number interval;

a search module, which iteratively traverses the two-tuples of the data nodes stored in the target data storage and searches for a target data node whose block number in the two-tuple is less than the migration threshold;

a migration module, which, based on the two-tuple of the target data node, further searches whether the target data storage stores a data node that has the same node ID as the target data node and whose block number in the two-tuple is greater than that of the target data node and less than the migration threshold; if so, migrates the target data node to the next level of data storage below the target data storage; otherwise, keeps the target data node in the target data storage.
11. The device according to claim 10, wherein the determining module:

if the right end of the block number interval is open, determines the right endpoint value of the block number interval as the migration threshold;

if the right end of the block number interval is closed, determines the sum of the right endpoint value of the block number interval and the block number increment step of the blockchain as the migration threshold.
12. The device according to claim 10, wherein the migration module further:

searches the target data storage for the data node that has the same node ID as the target data node and whose block number in the two-tuple is the largest of the block numbers less than the migration threshold; and further determines whether the block number in the two-tuple of the found data node is greater than the block number in the two-tuple of the target data node.
13. The device according to claim 10, further comprising:

a storage module, which determines the data nodes whose data has been updated on the Merkle state tree of the latest block; constructs Key-Value pairs for the data nodes whose data has been updated on the Merkle state tree of the latest block, and stores the Key-Value pairs in the highest-level data storage in the database;

wherein the key of each Key-Value pair is a two-tuple composed of the block number of the latest block and the node ID of the data node; the value of the Key-Value pair is the data content contained in the data node.
14. The device according to claim 10 or 13, wherein the Merkle state tree is a Merkle tree variant incorporating the tree structure of a Trie (dictionary tree); the node ID of the data node is the character prefix corresponding to the path from the root node of the Merkle state tree to the data node.
15. The device according to claim 14, wherein the Merkle state tree is a Merkle Patricia Tree state tree.
16. The device according to claim 10, wherein the database is a LevelDB database, or a database based on the LevelDB architecture.
17. The device according to claim 16, wherein the database is a RocksDB database based on the LevelDB architecture.
18. The device according to claim 10, wherein the storage media corresponding to the multi-level data storage differ in read and write performance; wherein the read and write performance of the storage medium corresponding to a higher-level data storage is higher than that of the storage medium corresponding to a lower-level data storage.
19. An electronic device, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor executes the executable instructions to implement the method according to any one of claims 1-9.
20. A computer-readable storage medium having computer instructions stored thereon, wherein, when the instructions are executed by a processor, the steps of the method according to any one of claims 1-9 are implemented.