CN115221176A - Block chain data storage method and device and electronic equipment - Google Patents

Publication number
CN115221176A
Authority
CN
China
Prior art keywords
node
nodes
data
tree
key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210908663.8A
Other languages
Chinese (zh)
Inventor
陆钟豪
俞本权
卓海振
任充慧
田世坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ant Blockchain Technology Shanghai Co Ltd
Original Assignee
Ant Blockchain Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ant Blockchain Technology Shanghai Co Ltd
Priority to CN202210908663.8A
Publication of CN115221176A
Priority to PCT/CN2022/135539 (WO2024021419A1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G06F16/23 Updating
    • G06F16/2358 Change logging, detection, and notification
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

A blockchain data storage method, in which key-value pairs of blockchain data are stored in a database as root nodes, intermediate nodes, and leaf nodes of a logical tree structure; the root node and the intermediate nodes store characters of the keys of the blockchain data; the leaf nodes store the values of the blockchain data. The method comprises: acquiring a key-value pair of blockchain data to be stored; converting the key-value pair into a root node, intermediate nodes, and leaf nodes of the logical tree structure; caching at least some of the root node and the intermediate nodes in a storage medium that supports overwrite-style data writes; and generating a data record that records the details of the modifications and updates to the at least some nodes, and writing the data record, together with the nodes among the root node, intermediate nodes, and leaf nodes other than the at least some nodes, into the database for persistent storage.

Description

Block chain data storage method and device and electronic equipment
Technical Field
One or more embodiments of the present disclosure relate to the field of blockchain technologies, and in particular, to a method and an apparatus for storing blockchain data, and an electronic device.
Background
Blockchain technology, also called distributed ledger technology, is an emerging technology in which multiple node devices jointly participate in bookkeeping and jointly store and maintain a complete distributed database. For a node device of a blockchain, the blockchain data that needs to be stored and maintained generally includes block data and account state data corresponding to the blockchain accounts in the blockchain; the block data may further include block header data, the transaction data in the block, the transaction receipts corresponding to that transaction data, and so on.
When storing the various kinds of blockchain data described above, a blockchain node device typically organizes the data, in the form of key-value pairs, into a logical tree structure and stores it in a database. For example, in practical applications, the blockchain data may be organized as key-value pairs into a Merkle tree and stored in the database.
Disclosure of Invention
This specification proposes a blockchain data storage method, in which key-value pairs of blockchain data are stored in a database as root nodes, intermediate nodes, and leaf nodes of a logical tree structure; the root node and the intermediate nodes store characters of the keys of the blockchain data; the leaf nodes store the values of the blockchain data; each node of the tree structure is linked to a node in the layer above it through the node's hash value; the method comprises:
acquiring a key-value pair of blockchain data to be stored;
converting the key-value pair of the blockchain data into a root node, intermediate nodes, and leaf nodes of the logical tree structure;
caching at least some of the root node and the intermediate nodes in a storage medium that supports overwrite-style data writes, so that modifications and updates to the at least some nodes are performed in the storage medium; and generating a data record that records the details of the modifications and updates to the at least some nodes, and writing the data record, together with the nodes among the root node, intermediate nodes, and leaf nodes other than the at least some nodes, into the database for persistent storage.
This specification also provides a blockchain data storage method, in which key-value pairs of blockchain data are stored in a database as root nodes, intermediate nodes, and leaf nodes of a logical tree structure; the root node and the intermediate nodes store characters of the keys of the blockchain data; the leaf nodes store the values of the blockchain data; each node of the tree structure is linked to a node in the layer above it through the node's hash value; at least some of the root node and the intermediate nodes are cached in a storage medium that supports overwrite-style data writes and are modified and updated in that storage medium; a data record that records the details of the modifications and updates to the at least some nodes, together with the nodes among the root node, intermediate nodes, and leaf nodes other than the at least some nodes, is persistently stored in the database; the method comprises:
determining whether the at least some nodes cached in the storage medium satisfy a persistent storage condition;
if the at least some nodes cached in the storage medium satisfy the persistent storage condition, writing the at least some nodes cached in the storage medium into the database for persistent storage.
This specification also provides a blockchain data storage device, in which key-value pairs of blockchain data are stored in a database as root nodes, intermediate nodes, and leaf nodes of a logical tree structure; the root node and the intermediate nodes store characters of the keys of the blockchain data; the leaf nodes store the values of the blockchain data; each node of the tree structure is linked to a node in the layer above it through the node's hash value; the device comprises:
an acquisition module, which acquires a key-value pair of blockchain data to be stored;
a conversion module, which converts the key-value pair of the blockchain data into a root node, intermediate nodes, and leaf nodes of the logical tree structure;
a storage module, which caches at least some of the root node and the intermediate nodes in a storage medium that supports overwrite-style data writes, so that modifications and updates to the at least some nodes are performed in the storage medium; and which generates a data record that records the details of the modifications and updates to the at least some nodes, and writes the data record, together with the nodes among the root node, intermediate nodes, and leaf nodes other than the at least some nodes, into the database for persistent storage.
This specification also provides a blockchain data storage device, in which key-value pairs of blockchain data are stored in a database as root nodes, intermediate nodes, and leaf nodes of a logical tree structure; the root node and the intermediate nodes store characters of the keys of the blockchain data; the leaf nodes store the values of the blockchain data; each node of the tree structure is linked to a node in the layer above it through the node's hash value; at least some of the root node and the intermediate nodes are cached in a storage medium that supports overwrite-style data writes and are modified and updated in that storage medium; a data record that records the details of the modifications and updates to the at least some nodes, together with the nodes among the root node, intermediate nodes, and leaf nodes other than the at least some nodes, is persistently stored in the database; the device comprises:
a determination module, which determines whether the at least some nodes cached in the storage medium satisfy a persistent storage condition;
a writing module, which writes the at least some nodes cached in the storage medium into the database for persistent storage if those nodes satisfy the persistent storage condition.
The technical solutions above have the following technical effects:
When key-value pairs of blockchain data are stored in the database as root nodes, intermediate nodes, and leaf nodes of a logical tree structure, at least some of the root node and the intermediate nodes are cached in a storage medium that supports overwrite-style data writes, and those nodes are modified and updated in that storage medium. This mitigates the write amplification caused by repeatedly writing those root and intermediate nodes into the database, and thus improves the storage performance of the database.
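The effect described above can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the implementation of this specification: the class name TieredNodeStore, the depth threshold cache_depth, the shape of the data record, and the persistence trigger are all hypothetical, and plain Python containers stand in for the overwrite-capable storage medium and the database.

```python
from dataclasses import dataclass, field


@dataclass
class TieredNodeStore:
    """Toy model: upper tree nodes are overwritten in a cache, other nodes go to the database."""

    cache_depth: int = 2                          # assumption: nodes with depth < cache_depth are cached
    cache: dict = field(default_factory=dict)     # stands in for the overwrite-capable storage medium
    database: list = field(default_factory=list)  # stands in for the persistent database

    def update(self, node_id: str, depth: int, content: bytes, detail: str = "") -> None:
        if depth < self.cache_depth:
            # Root or intermediate node: overwrite it in place in the cache and
            # persist only a small record describing the modification.
            self.cache[node_id] = content
            self.database.append(("record", node_id, detail))
        else:
            # Remaining nodes (for example leaf nodes) are persisted directly.
            self.database.append(("node", node_id, content))

    def persist_cached(self, condition_met: bool) -> int:
        """Write the cached nodes to the database once a persistence condition holds."""
        if not condition_met:
            return 0
        for node_id, content in self.cache.items():
            self.database.append(("node", node_id, content))
        return len(self.cache)


store = TieredNodeStore()
store.update("root", 0, b"root-v1", "slot a now points to h1")
store.update("root", 0, b"root-v2", "slot a now points to h2")
store.update("leaf-1", 3, b"account state")
```

In this sketch the root node is rewritten twice but occupies a single cache entry; the database receives two short records and one leaf node, and a full copy of the root only when persist_cached is called with a satisfied condition.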
Drawings
FIG. 1 is a tree structure diagram of an MPT tree provided by an exemplary embodiment;
FIG. 2 is a schematic diagram, provided by an exemplary embodiment, of organizing the account state data of the blockchain accounts in a blockchain into an MPT state tree in the form of key-value pairs;
FIG. 3 is a diagram illustrating organization of contract data stored in storage corresponding to a contract account into an MPT storage tree, according to an illustrative embodiment;
FIG. 4 is a tree structure diagram of an FDMT tree provided by an exemplary embodiment;
FIG. 5 is a block diagram of a Tree node provided in an exemplary embodiment;
FIG. 6 is a block diagram of a bucket according to an exemplary embodiment;
FIG. 7 is a flowchart of a method for blockchain data storage in accordance with an illustrative embodiment;
FIG. 8 is a schematic diagram of an electronic device according to an exemplary embodiment;
FIG. 9 is a block diagram of a blockchain data storage device according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of the present specification. Rather, they are merely examples of apparatus and methods consistent with certain aspects of one or more embodiments of the specification, as detailed in the claims which follow.
It should be noted that: in other embodiments, the steps of the corresponding methods are not necessarily performed in the order shown and described herein. In some other embodiments, the method may include more or fewer steps than those described herein. Moreover, a single step described in this specification may be broken down into multiple steps for description in other embodiments; multiple steps described in this specification may be combined into a single step in other embodiments.
Blockchain is a new application model of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. In a blockchain system, data blocks are linked in chronological order into a chained data structure, and cryptography is used to guarantee a distributed ledger that cannot be tampered with or forged. Because blockchains are decentralized, tamper-resistant, and autonomous, they have attracted increasing attention and adoption.
Blockchains are generally divided into three types: public chains (Public Blockchain), private chains (Private Blockchain), and consortium chains (Consortium Blockchain). Combinations of these types are also possible, such as private chain + consortium chain or consortium chain + public chain.
Among them, the most decentralized is the public chain. Participants joining the public chain (also referred to as nodes in the blockchain) can read the data records on the chain, participate in transactions, compete for accounting rights for new blocks, and so on. Moreover, each node can freely join or leave the network and perform related operations.
Private chains are the opposite: write permission on the network is controlled by a single organization or institution, and read permission is specified by that organization. In short, a private chain can be regarded as a weakly centralized system, with strict restrictions on nodes and a small number of nodes. This type of blockchain is better suited to internal use within a particular organization.
A consortium chain lies between a public chain and a private chain and can achieve "partial decentralization". Each node in a consortium chain typically corresponds to a real-world organization or institution; nodes join the network by authorization, form an alliance of stakeholders, and jointly maintain the operation of the blockchain.
In the blockchain field, an important concept is the account (Account). For blockchain networks that support smart contracts, blockchain accounts can generally be divided into two types:
contract account (Contract account): stores the smart contract code to be executed and the values of the state in that code; the smart contract code can usually be activated only through a call from an external account;
external account (Externally owned account): an account directly controlled by the user, also referred to as a user account.
The design of the external account and the contract account is essentially a mapping from account addresses to account states. The state of an account is typically represented by a structure. When a transaction in a block is executed, the states of the accounts in the blockchain that are associated with the transaction typically change as well.
In one example, the structure of an account typically includes fields such as Balance, Nonce, Codehash, and StorageRoot, where:
the Balance field maintains the current balance of the account;
the Nonce field maintains the number of transactions of the account; it is a counter used to guarantee that each transaction is processed only once, which effectively prevents replay attacks;
the Codehash field maintains the contract code of the account; in practice, the Codehash field typically maintains only the hash value of the contract code;
the StorageRoot field maintains the storage content of the account. For a contract account, a separate persistent storage space is typically allocated to store the contract data corresponding to the contract account. This separate storage space is commonly referred to as the account storage of the contract account.
The storage content of a contract account is usually organized, in the form of key-value pairs, into a logical tree structure for storage. For example, the MPT (Merkle Patricia Trie) is a logical tree structure commonly used in the blockchain field to store and maintain blockchain data; it usually includes a root node, intermediate nodes, and leaf nodes.
The logical tree structure built from the storage content of a contract account is commonly referred to as the Storage tree, and the StorageRoot field typically maintains only the hash value of the root node of the Storage tree. For an external account, the Codehash and StorageRoot fields described above are both null.
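As a minimal sketch of the account structure described above (written in Python; the class name, attribute names, and sample values are illustrative and not taken from this specification), the four fields could be modeled as follows, with the Codehash and StorageRoot values left empty for an external account:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccountState:
    balance: int                          # Balance: current balance of the account
    nonce: int                            # Nonce: transaction counter, guards against replay
    code_hash: Optional[bytes] = None     # Codehash: hash of the contract code, None if external
    storage_root: Optional[bytes] = None  # StorageRoot: root hash of the Storage tree, None if external

    def is_contract(self) -> bool:
        # Per the text above, only contract accounts carry a code hash.
        return self.code_hash is not None


# Hypothetical sample accounts for illustration only.
external = AccountState(balance=100, nonce=3)
contract = AccountState(balance=0, nonce=1, code_hash=b"\x01" * 32, storage_root=b"\x02" * 32)
```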
In practical applications, most blockchain models use a Merkle tree, or a logical tree structure such as a Merkle tree variant designed on the basis of the Merkle tree data structure, to store and maintain data.
For example, the MPT tree is a Merkle tree variant, incorporating the tree structure of the Trie (dictionary tree), that is used to store and maintain blockchain data.
As another example, the FDMT (Fixed Depth Merkle Tree) is likewise a Merkle tree variant, incorporating the tree structure of the Trie, that is used to store and maintain blockchain data.
The following description takes storing blockchain data in an MPT tree as an example.
In one example, the blockchain data that needs to be stored and maintained in a blockchain typically includes account state data, transaction data, and receipt data. Therefore, in practical applications, the account state data, the transaction data, and the receipt data may be organized, in the form of key-value pairs, into three MPT trees, namely an MPT state tree (also referred to as the world state), an MPT transaction tree, and an MPT receipt tree, which are stored and maintained separately.
In addition to the three MPT trees above, the contract data stored in the storage space corresponding to a contract account is usually organized into an MPT storage tree (hereinafter, the Storage tree). The hash value of the root node of the Storage tree is added to the StorageRoot field in the structure of the corresponding contract account.
The MPT state tree is an MPT tree in which the account state data of all accounts in the blockchain (including external accounts and contract accounts) is organized in the form of key-value pairs.
The MPT transaction tree is an MPT tree in which the transaction data in the blockchain is organized in the form of key-value pairs.
The MPT receipt tree is an MPT tree in which the transaction receipts generated for each transaction after the transactions in a block are executed are organized in the form of key-value pairs.
The hash values of the root nodes of the MPT state tree, the MPT transaction tree, and the MPT receipt tree shown above are all added to the block header of the corresponding block.
The MPT transaction tree and the MPT receipt tree correspond to blocks; that is, each block has its own MPT transaction tree and MPT receipt tree. The MPT state tree is a global MPT tree: it does not correspond to one specific block, but covers the account state data of all accounts in the blockchain. Each time the blockchain generates a latest block, the account states of the blockchain accounts (external accounts or contract accounts) associated with the executed transactions usually change once the transactions in that block have been executed.
For example, when a "transfer transaction" in a block has been executed, the balances of the transferor account and the transferee account associated with the transaction (i.e., the values of the Balance fields of these accounts) usually change. After the transactions in the latest block generated by the blockchain have been executed, the account states in the blockchain have changed, so the node device needs to construct an MPT state tree from the current account state data of all accounts in the blockchain in order to maintain the latest state of all accounts.
That is, whenever a latest block is generated in the blockchain and the execution of its transactions changes the account states of some accounts, the node device needs to rebuild an MPT state tree based on the latest account state data of all accounts in the blockchain. In other words, each block in the blockchain has an MPT state tree corresponding to it, and that MPT state tree maintains the latest account states of all accounts in the blockchain after the transactions in the block have been executed.
Referring to fig. 1, fig. 1 is a tree structure diagram of an MPT tree shown in this specification.
It should be noted that the connection relationship of each node in fig. 1 is only schematic.
The MPT tree is a relatively traditional, improved variant of the Merkle tree that combines the advantages of two tree structures: the Merkle tree and the Trie (dictionary tree, also called a prefix tree).
The MPT tree typically includes three types of nodes: leaf nodes, extension nodes, and branch nodes. The root node of the MPT tree is typically an extension node; the intermediate nodes of the MPT tree are typically branch nodes or other extension nodes.
The extension node and the branch node may be collectively referred to as character nodes and are used to store the character prefix portion of the character string corresponding to the key (i.e., the account address) of the account state data. For MPT trees, this character prefix portion is usually referred to as the shared character prefix (shared nibbles). The shared character prefix is a prefix formed by one or more identical characters shared by the keys (i.e., the blockchain account addresses) of all the account state data. The leaf node is used to store the character suffix (key-end) of the character string corresponding to the key of the blockchain data, together with the value (the specific account state data).
The extension node stores one or more characters of the shared character prefix of the account address (i.e., the Shared nibbles shown in fig. 1) and the hash value of the next-layer node linked to the extension node (i.e., the Next node shown in fig. 1).
The branch node contains 17 slots. The first 16 slots correspond to the 16 possible hexadecimal characters in a key, where one character corresponds to one nibble; each of the first 16 slots represents one character in the shared character prefix of an account address and is filled with the hash value of the next-layer node linked to the branch node. The last slot is the value slot, which is typically empty.
The leaf node stores the character suffix of the account address (i.e., the key-end shown in fig. 1) and the value of the account state data (i.e., the account structure described above). The character suffix of the account address and the shared character prefix of the account address together form the complete account address. The character suffix is the suffix formed by the last one or more characters of the account address that remain after the shared character prefix.
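The three node types described above can be sketched as plain Python data classes. This is an illustrative layout only: the class and attribute names are assumptions, hashes are shown as raw bytes, and node encoding is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LeafNode:
    key_end: str   # character suffix of the account address
    value: bytes   # encoded account state data


@dataclass
class ExtensionNode:
    shared_nibbles: str  # one or more characters of the shared character prefix
    next_node: bytes     # hash value of the next-layer node


@dataclass
class BranchNode:
    # 16 child slots, one per hexadecimal character 0 to f.
    slots: List[Optional[bytes]] = field(default_factory=lambda: [None] * 16)
    # The 17th slot is the value slot, which is typically empty.
    value: Optional[bytes] = None

    def set_child(self, hex_char: str, child_hash: bytes) -> None:
        # Map the character to its slot index, for example "d" -> 13.
        self.slots[int(hex_char, 16)] = child_hash


branch = BranchNode()
branch.set_child("d", b"hash-of-child")
```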
Referring to fig. 2, fig. 2 is a schematic diagram, shown in this specification, of organizing the account state data of the blockchain accounts in a blockchain into an MPT state tree in the form of key-value pairs.
Assume that the key-value pairs of the account state data to be organized into an MPT state tree are as shown in table 1 below:
[Table image BDA0003773219470000071]
TABLE 1
It should be noted that, in table 1, the blockchain accounts corresponding to the account addresses in the first three rows are external accounts, whose Codehash and StorageRoot fields are null. The blockchain account corresponding to the account address in the 4th row is a contract account: its Codehash field maintains the hash value of the contract code corresponding to the contract account, and its StorageRoot field maintains the hash value of the root node of the Storage tree built from the storage content of the contract account.
The MPT state tree finally organized from the account state data in table 1 is shown in fig. 2.
The MPT state tree is composed of 4 leaf nodes, 2 branch nodes, and 2 extension nodes (one of which serves as a root node).
In fig. 2, the prefix field is a field that the extension node and the leaf node have in common. Different values of the prefix field indicate different node types.
For example, a prefix field value of 0 indicates an extension node containing an even number of nibbles. As mentioned above, a nibble is a half-byte consisting of 4 binary bits, and one nibble corresponds to one character of an account address. A prefix field value of 1 indicates an extension node containing an odd number of nibbles; a value of 2 indicates a leaf node containing an even number of nibbles; a value of 3 indicates a leaf node containing an odd number of nibbles. A branch node has no prefix field, because it is a character node that holds single nibbles in parallel.
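The four prefix values listed above follow a simple rule: the node type contributes 0 or 2, and an odd nibble count adds 1. A short Python sketch of that mapping is given below; the function names are hypothetical and the nibbles are represented as a string of hexadecimal characters.

```python
def prefix_value(is_leaf: bool, nibbles: str) -> int:
    """Return the prefix field value for an extension or leaf node."""
    odd = len(nibbles) % 2          # 1 when the number of nibbles is odd
    return (2 if is_leaf else 0) + odd


def describe_prefix(prefix: int) -> str:
    """Inverse view: recover node type and nibble parity from the prefix value."""
    kind = "leaf" if prefix >= 2 else "extension"
    parity = "odd" if prefix % 2 else "even"
    return f"{kind} node, {parity} number of nibbles"
```

For instance, an extension node holding the two nibbles "a7" gets the prefix value 0, while a leaf node holding the single nibble "7" gets 3.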
The Shared nibble(s) field in an extension node corresponds to the key of the key-value pair contained in the extension node and represents the common character prefix between account addresses; for example, all account addresses in the table above share the common character prefix a7. The Next Node field is filled with the hash value (hash pointer) of the next node.
The fields for the hexadecimal characters 0 to f in a branch node correspond to the keys of the key-value pairs contained in the branch node; if the branch node is an intermediate node on the search path of an account address in the MPT tree, the Value field of the branch node may be null. The 0 to f fields are filled with the hash values of the next-layer nodes.
The Key-end field in a leaf node corresponds to the key of the key-value pair contained in the leaf node and represents the last few characters of the account address (the character suffix of the account address). The keys of the nodes on the search path from the root node to a leaf node together form a complete account address. The Value field of the leaf node is filled with the account state data corresponding to the account address; for example, a structure composed of fields such as Balance, Nonce, Codehash, and StorageRoot may be encoded and then filled into the Value field of the leaf node.
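The way the keys along a search path combine into a complete account address can be sketched with a small lookup routine in Python. The tree below is a hypothetical example: only the address "a77d397" and the shared prefix "a7" come from this description, while the second address, the placeholder hash strings, and the dictionary-based node layout are assumptions made for illustration.

```python
def lookup(store: dict, root_hash: str, address: str):
    """Walk from the root to a leaf, consuming address characters along the way."""
    node = store[root_hash]
    remaining = address
    while True:
        if node["type"] == "extension":
            shared = node["shared_nibbles"]
            if not remaining.startswith(shared):
                return None                      # address leaves the shared prefix
            remaining = remaining[len(shared):]
            node = store[node["next_node"]]
        elif node["type"] == "branch":
            if not remaining:
                return node["value"]             # value slot of the branch node
            child = node["slots"].get(remaining[0])
            if child is None:
                return None                      # empty slot: no such address
            remaining = remaining[1:]
            node = store[child]
        else:                                    # leaf node
            return node["value"] if node["key_end"] == remaining else None


# Placeholder strings stand in for real node hashes.
store = {
    "h_root": {"type": "extension", "shared_nibbles": "a7", "next_node": "h_branch"},
    "h_branch": {"type": "branch", "slots": {"7": "h_leaf1", "f": "h_leaf2"}, "value": None},
    "h_leaf1": {"type": "leaf", "key_end": "d397", "value": "contract account state"},
    "h_leaf2": {"type": "leaf", "key_end": "9365", "value": "external account state"},
}
```

Here the address "a77d397" is consumed as "a7" (extension node), "7" (branch slot), and "d397" (leaf key-end).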
Referring to fig. 3, fig. 3 is a schematic diagram, shown in this specification, of organizing the contract data stored in the storage space corresponding to a contract account into an MPT storage tree.
With continued reference to table 1, the account with the account address "a77d397" shown in table 1 is a contract account, so the contract data stored in the storage space corresponding to this contract account is organized into a storage tree. The root node of the storage tree is linked, by its hash value, to the leaf node corresponding to the contract account in the MPT state tree shown in fig. 2. The hash value S1 of the root node of the storage tree is added to the StorageRoot field of the account state stored in that leaf node. In this case, the storage tree may be regarded as a subtree extending from the leaf node corresponding to the contract account in the MPT state tree shown in fig. 2.
Assume that the key-value pairs of the contract data stored in the storage space of the contract account are as shown in table 2 below:
[Table images BDA0003773219470000081 and BDA0003773219470000091]
TABLE 2
It should be noted that the contract data stored in the storage space of the contract account may take the form of state variables. When stored, the state variables can be organized, in the form of key-value pairs, into the storage tree shown in fig. 3. For example, in one example, a hash value of the account address of the contract account and the storage location of the state variable in the account storage of the contract account may be used as the key, and the value of the state variable may be used as the value.
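One possible reading of this key derivation is sketched below in Python. The exact construction is not specified in this passage, so the sketch makes loud assumptions: the address text and a 32-byte big-endian slot index are concatenated and hashed together, SHA-256 is used as a stand-in hash function, and the function name storage_key is hypothetical.

```python
import hashlib


def storage_key(contract_address: str, slot: int) -> bytes:
    """Derive a storage-tree key from a contract address and a storage location."""
    # Assumption: hash(address || slot); the real encoding and hash may differ.
    material = contract_address.encode("utf-8") + slot.to_bytes(32, "big")
    return hashlib.sha256(material).digest()


key_slot0 = storage_key("a77d397", 0)
key_slot1 = storage_key("a77d397", 1)
```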
The basic structure of the storage tree shown in fig. 3 is similar to that of the MPT state tree shown in fig. 2 and is not described again in this specification.
Further, either the node on the MPT state tree shown in fig. 2 or the node on the storage tree shown in fig. 3 may be stored in the database in the form of Key-Value Key Value pairs for persistent storage.
For example, the database may be generally stored in a persistent storage medium (such as a storage disk) mounted on the node device. The storage medium is a physical storage corresponding to the database.
The Key in the Key-Value pair corresponding to a node in the MPT state tree or the storage tree may be the hash value of the data content contained in the node; the Value in the Key-Value pair may be the data content contained in the node.
When a node in the MPT state tree or the storage tree is stored in the database, a hash Value of data content included in the node may be calculated (that is, the whole node is subjected to hash calculation), the calculated hash Value is used as a Key, the data content included in the node is used as a Value, and a Key-Value Key Value pair is generated. And then storing the generated Key-Value Key Value pair into a database. When a node on the MPT state tree or the storage tree needs to be queried, content addressing can be performed by using a hash value of data content contained in the node as a key.
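The content-addressing scheme described above can be sketched as follows. This is only a minimal illustration: the use of SHA-256, the JSON encoding of a node, the dictionary standing in for the database, and the names encode_node, put_node and get_node are all assumptions made for the example and are not specified in this description.

```python
import hashlib
import json


def encode_node(node: dict) -> bytes:
    """Serialize the whole node (JSON with sorted keys is an assumed encoding)."""
    return json.dumps(node, sort_keys=True).encode("utf-8")


def put_node(db: dict, node: dict) -> str:
    """Hash the whole node, then store hash -> node content as a Key-Value pair."""
    data = encode_node(node)
    key = hashlib.sha256(data).hexdigest()  # assumed hash function
    db[key] = data
    return key


def get_node(db: dict, key: str) -> dict:
    """Content addressing: look a node up by the hash of its content."""
    return json.loads(db[key])


db = {}  # a plain dict stands in for the key-value database
leaf_key = put_node(db, {"key_end": "d397", "value": {"balance": 10, "nonce": 1}})
# The parent links to the leaf by storing the leaf's hash in one of its slots.
parent_key = put_node(db, {"slots": {"7": leaf_key}})
leaf = get_node(db, get_node(db, parent_key)["slots"]["7"])
```

Because the key is derived from the node content, any change to a node yields a new key, which is why an updated node is written to the database again as a whole.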
Referring to fig. 4, fig. 4 is a tree structure diagram of an FDMT tree shown in the present specification.
The above-mentioned FDMT tree is also a Merkle tree variant that incorporates the tree structure of a Trie (dictionary tree).
In practical applications, the blockchain data may also be stored in the database in the form of key-value key value pairs organized into an FDMT tree.
As shown in fig. 4, in the Tree structure of the FDMT Tree, there may be Tree nodes of the first N layers (3 layers are shown in fig. 4, which is only schematic), and Leaf nodes (i.e., leaf nodes) of the last layer. Among the Tree nodes of the first N layers, the Tree node of the first layer will be the root node, and the Tree nodes of the other layers except the first layer will be the intermediate nodes.
Unlike the MPT Tree described above, the Tree nodes (i.e., the root node and the middle node) in the first N layers of the FDMT Tree described above adopt a unified data structure.
As shown in fig. 4, the Tree nodes in the first N layers of the FDMT Tree may each include a plurality of blocks, each representing a different character; a block is a "position" used to store a character in the key of the blockchain data. Each block may further comprise a plurality of slots, each representing a different character. A slot is likewise used to store a character in the key of the blockchain data.
For example, fig. 4 shows that N blocks are included for each Tree node. Each block further comprises N slots. Among the nodes in each layer of the FDMT tree, the nodes may still be linked by filling the nodes in the previous layer with the hash value (hash pointer) of the nodes in the next layer. That is, the nodes in the above FDMT tree are linked to the nodes of the upper layer by their own hash values. Correspondingly, the slot may be specifically used to fill a hash value of a node on a next layer linked by the current Tree node. The node at the next layer of the Tree node may be the Tree node or a Leaf node.
It should be noted that the link relationship between the nodes in each layer of the FDMT tree shown in fig. 4 is only an exemplary one, and is not a specific limitation on the link relationship between the nodes in each layer of the FDMT tree.
With continued reference to fig. 4, each Tree node on the FDMT Tree shown in fig. 4 may be used to store at least a portion of the characters in the key of the blockchain data.
The character string corresponding to the key of the block chain data may still include a character prefix and a character suffix. In this case, the Tree node may be used to store the characters in the character prefix of the key of the blockchain data. The leaf node may be configured to store a character suffix of the key of the blockchain data and a Value of the blockchain data.
For each Tree node on the FDMT Tree shown in fig. 4, the characters actually stored are the character strings formed by concatenating the character represented by a non-empty block in the Tree node (that is, a block with at least one slot filled with a hash value) with the character represented by each non-empty slot in that block (that is, each slot filled with a hash value).
It should be noted that, in practical applications, each block in the Tree node may represent only one character. That is, based on the storage format of the Tree node shown in fig. 4, each portion of the character prefix of the key of the blockchain data that is actually stored by the Tree node is a character string with a length of 2 characters.
For example, please refer to fig. 5, which is a structural diagram of a Tree node shown in the present specification.
As shown in fig. 5, the Tree node contains 16 blocks, each representing a different hexadecimal (base-16) character. Each block further comprises 16 slots, each representing a different hexadecimal character (only the 16 slots contained in block6 are shown in fig. 5). Assume that block6 (representing hexadecimal character 6) in the Tree node is a non-empty block, and that slot4 (representing hexadecimal character 4), slot6 (representing hexadecimal character 6) and slot9 (representing hexadecimal character 9) in the block are non-empty slots filled with hash values of the next-layer nodes linked by the Tree node. Then the portions of the character prefix of the key of the blockchain data stored by the Tree node are the hexadecimal character strings "64", "66" and "69", respectively.
The number of blocks included in the Tree node and the number of slots included in each block are not particularly limited in this specification. In practical applications, the number of blocks included in the Tree node, and the number of slots included in each block, may be determined based on the number of distinct character elements that can appear in the character string corresponding to the key of the blockchain data.
For example, assume that the key corresponding to the blockchain data is a hexadecimal (base-16) character string; in this case, the number of distinct character elements in the character string corresponding to the key of the blockchain data is 16, and the number of blocks contained in the Tree node and the number of slots contained in each block may both be 16.
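The block and slot layout of fig. 5 can be illustrated with a short sketch. The class name TreeNode, its methods, and the placeholder hash strings are hypothetical and chosen only for this example; the description does not prescribe a concrete in-memory representation.

```python
HEX = "0123456789abcdef"


class TreeNode:
    """A Tree node with 16 blocks of 16 slots each (hexadecimal keys assumed)."""

    def __init__(self):
        # slots[b][s] holds the hash of the linked next-layer node, or None if empty.
        self.slots = [[None] * 16 for _ in range(16)]

    def set_child(self, two_chars: str, child_hash: str) -> None:
        """Fill the slot addressed by a 2-character string such as '64'."""
        block, slot = HEX.index(two_chars[0]), HEX.index(two_chars[1])
        self.slots[block][slot] = child_hash

    def stored_strings(self) -> list:
        """Concatenate block character + slot character for every non-empty slot."""
        return [HEX[b] + HEX[s]
                for b in range(16) for s in range(16)
                if self.slots[b][s] is not None]


node = TreeNode()
for chars in ("64", "66", "69"):
    node.set_child(chars, "hash-of-child-" + chars)  # placeholder hash values
print(node.stored_strings())  # ['64', '66', '69']
```

Here block6 is the only non-empty block, and its slot4, slot6 and slot9 are the non-empty slots, matching the example of fig. 5.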
The number of layers of Tree nodes included in the FDMT Tree may be a fixed value; in practical applications, the value of N may be an integer greater than or equal to 1. That is, the FDMT Tree may be a Merkle Tree, which includes at least one layer of Tree nodes and includes a relatively fixed number of layers of Tree nodes.
For example, in an example, taking the key of the blockchain data as the blockchain account address, it is assumed that the blockchain account addresses supported by the blockchain system are designed such that the first 6 address characters may be the same. In this case, since each Tree node stores a character string of 2 characters, the above-described FDMT Tree can be designed to have a tree structure including three layers of Tree nodes.
Further, for the Tree node and Leaf node on the FDMT Tree shown in fig. 4, persistent storage in the form of Key-Value Key Value pairs may also be performed in the database. The Key in the Key-Value Key pair corresponding to the Tree node or the Leaf node in the FDMT Tree may specifically be a hash Value of data content contained in the Tree node or the Leaf node. The Value in the key Value pair of the Tree node or the Leaf node may specifically be data content contained in the Tree node or the Leaf node.
When the Tree node or Leaf node in the FDMT Tree is stored in the database, a hash Value of data content included in the Tree node or Leaf node may be calculated (that is, the whole node is subjected to hash calculation), the calculated hash Value is used as a Key, the data content included in the Tree node or Leaf node is used as a Value, and a Key-Value Key Value pair is generated. And then storing the generated Key-Value Key Value pair into a database. When the node in the FDMT tree needs to be queried, the content can be addressed by using the hash value of the data content contained in the node as a key.
It should be noted that the tree structure of the FDMT tree shown in fig. 4 may be used to store account status data of each blockchain account in a blockchain, and may also be used to store contract data stored in a storage space corresponding to a certain contract account.
The FDMT tree used for storing account status data of each blockchain account in the blockchain may be referred to as an FDMT status tree. The FDMT tree used for storing contract data stored in a storage space corresponding to a certain contract account may be referred to as an FDMT storage tree.
The root node of the FDMT storage tree may be specifically linked to a leaf node of the FDMT status tree corresponding to the contract account through a hash value of the root node. The hash value of the root node in the FDMT storage tree may also be added to the storage root field in the account state stored in the leaf node corresponding to the contract account in the FDMT state tree, which is not described in detail.
It should be noted that, in both the MPT tree shown in fig. 1 and the FDMT tree shown in fig. 4, the Leaf nodes are the nodes that actually store data, and therefore generally need a larger data capacity than other types of nodes. For example, the value of the blockchain data stored by a Leaf node is usually the original content of the blockchain data, which occupies more storage space than the character prefix of the key of the blockchain data. Therefore, to ensure that the Leaf nodes have a larger data capacity, the Leaf nodes on both the MPT tree shown in fig. 1 and the FDMT tree shown in fig. 4 are typically designed in the form of large data blocks.
The specific format and storage structure of the data block are not particularly limited in this specification.
For example, in practical applications, the leaf node may be in the form of a bucket. The bucket may be a container or a storage space for storing data.
For example, referring to fig. 6, fig. 6 is a structural diagram of a bucket shown in this specification.
As shown in fig. 6, in the above-mentioned bucket data (i.e., the bucket node shown in fig. 6), several data records may be included. Each data record corresponds to a piece of blockchain data, and is used for storing a character suffix (i.e., key-end shown in fig. 6) and a value of a key of the blockchain data. That is, a data record refers to a stored record including the value and the character suffix of the key of the above block chain data.
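A bucket of this kind can be sketched as a small container of data records. The class name Bucket, its methods, and the sample account states are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """A Leaf node holding several data records: key-end -> value."""
    records: dict = field(default_factory=dict)

    def put(self, key_end: str, value) -> None:
        """Store or overwrite the record for one piece of blockchain data."""
        self.records[key_end] = value

    def get(self, key_end: str):
        """Return the value for a key suffix, or None if no record exists."""
        return self.records.get(key_end)


bucket = Bucket()
bucket.put("7", {"balance": 10})   # hypothetical account state
bucket.put("a", {"balance": 25})
```

Several keys that share the same character prefix end up as separate records in the same bucket, distinguished only by their key-end.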
It should be noted that the structure of the bucket shown in fig. 6 is described taking a Leaf node on the FDMT tree shown in fig. 4 as an example. In practical applications, the structure of the bucket node shown in fig. 6 may also be used as a Leaf node of the MPT tree shown in fig. 1, which is not described in detail in this specification.
From the above description, it can be seen that nodes containing several slots exist in both the MPT tree shown in fig. 1 and the FDMT tree shown in fig. 4. For example, the branch node in the MPT tree shown in fig. 1 and the Tree nodes in the first N layers of the FDMT tree shown in fig. 4 all include slots for storing characters in keys of blockchain data.
In practical applications, for nodes including multiple slots in an MPT tree or an FDMT tree, if only a part of the contents stored in the slots in the node are updated, even if the contents stored in the slots other than the part of the slots are not updated, the entire node is usually rewritten into the database for persistent storage after the part of the slots are updated, thereby causing a significant write amplification effect.
Note that write amplification refers to the effect in which the amount of data actually written is larger than the amount of data that needed to be written, so that the write bandwidth consumed is enlarged.
For example, for a node including a plurality of slots in an MPT tree or an FDMT tree, if the contents stored in only some of the slots are updated, only the contents of those slots need to be written. However, since a node in the MPT tree or the FDMT tree is a complete, indivisible unit of data, the whole node has to be written into the database on disk even if only the contents of some slots need to be written. As a result, the data that needs to be written is smaller than the data actually written, which produces the write amplification effect described above and wastes write bandwidth.
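The size of this effect can be estimated with simple arithmetic. The figures below (32-byte hash values and a Tree node of 16 blocks with 16 slots each, ignoring any per-node overhead) are assumptions chosen for illustration and are not taken from this description.

```python
def write_amplification(bytes_needed: int, bytes_written: int) -> float:
    """Ratio of data actually written to data that needed to be written."""
    return bytes_written / bytes_needed


HASH_SIZE = 32                      # assumed size of one slot's hash value, in bytes
SLOTS_PER_NODE = 16 * 16            # 16 blocks x 16 slots, as in the fig. 5 example
node_size = HASH_SIZE * SLOTS_PER_NODE

# Only one slot changed, but the whole node is rewritten to the database.
ratio = write_amplification(HASH_SIZE, node_size)
print(node_size, ratio)  # 8192 256.0
```

Under these assumptions, updating a single slot costs 256 times the bytes that actually changed, and the same cost is paid again at every Tree layer on the path to the root.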
Moreover, because the Tree nodes of the first N layers of the FDMT Tree all adopt a data structure comprising a plurality of slot positions, the write amplification effect caused when the block chain data is stored in the FDMT Tree is particularly obvious.
In view of this, the present specification proposes a technical solution for optimizing a write amplification effect generated when storing block chain data by using a logical tree structure.
During implementation, the key-value key value pairs of the blockchain data can be stored in the database in the form of root nodes, intermediate nodes and leaf nodes on a logical tree structure; the root node and the intermediate node are used for storing characters in keys of the block chain data; the leaf node is used for storing the value of the block chain data; any node on the tree structure is linked with the node on the previous layer through the hash value of the node.
When the logical tree structure is used for storing the blockchain data, the blockchain data to be stored can be obtained and converted into a root node, intermediate nodes and leaf nodes on the logical tree structure.
On the one hand, at least some of the root node and the intermediate nodes may be cached in a storage medium that supports overwriting data in place (overlay writes), so that these nodes are modified and updated in the storage medium.
On the other hand, a data record for recording the modification update details of the at least some nodes may be generated, and the data record, together with the nodes among the root node, the intermediate nodes and the leaf nodes other than the at least some nodes, may be written into the database for persistent storage.
In the above technical solution, at least some of the root nodes and the intermediate nodes in the logical tree structure are cached in a storage medium that supports overlay write-in of data, and at least some of the nodes are modified and updated in the storage medium, so that a write amplification effect caused in a process of repeatedly writing at least some of the root nodes and the intermediate nodes in the logical tree structure into the database can be alleviated, and further, the storage performance of the database can be improved.
For example, in practical applications, if the root node and the intermediate nodes on the logical tree structure include nodes containing a plurality of slots, these nodes may be cached in a storage medium that supports overwriting data, and modified and updated in that storage medium by overwriting. The overwritten nodes no longer need to be written into the database on disk, so the write amplification effect caused by writing these nodes into the database on disk as a whole can be avoided.
Referring to fig. 7, fig. 7 is a flowchart illustrating a method for storing blockchain data according to an exemplary embodiment. The method is applied to block chain node equipment; storing the key-value key value pairs of the block chain data in a database in the form of a root node, a middle node and a leaf node on a logic tree structure; the root node and the intermediate node are used for storing characters in keys of the block chain data; the leaf node is used for storing the value of the block chain data; any node on the tree structure is linked with a node on the upper layer through a hash value of the node; the method comprises the following steps:
step 702, acquiring a key-value key value pair of block chain data to be stored;
the above-mentioned blockchain data to be stored may specifically include any type of data that needs to be persistently stored in a blockchain.
In an embodiment shown, the blockchain data to be stored may specifically include account status data corresponding to blockchain accounts on the blockchain.
For example, as described above, in practical applications, the blockchain accounts in the blockchain may generally include an external account and a contract account, and thus the account status data corresponding to the blockchain accounts on the blockchain may specifically include account status data (for example, account balance data) corresponding to user accounts on the blockchain, and status variable data (for example, memory data stored in the smart contract) stored in the contract accounts on the blockchain.
In practical applications, the blockchain data to be stored may also include transaction data issued to the blockchain network, and receipt data corresponding to the transaction data generated after the transaction data is executed.
In an example, when acquiring a key-value pair of blockchain data to be stored, a node device in the blockchain may, after acquiring the blockchain data to be stored, process the blockchain data locally into the key-value pair.
In another example, the step of processing the blockchain data to be stored into key-value key value pairs may also be performed by a third party, and the node device may directly obtain the key-value key value pairs of the blockchain data to be stored, which are processed by the third party, from the third party.
The key in the key-value key pair of the blockchain data may refer to a primary key of the blockchain data in the database. The primary key may specifically serve as a query index. The value in the key-value key value pair of the blockchain data specifically refers to the data content of the blockchain data.
It should be noted that, for different types of blockchain data, there may be some difference in key in the key-value key pair.
For example, if the blockchain data is account status data corresponding to a blockchain account in a blockchain, a key in a key-value key value pair of the account status data may be an account address of the blockchain account.
If the block chain data is transaction data in a block chain or receipt data corresponding to the transaction data, a key in a key-value key value pair of the transaction data or the receipt data corresponding to the transaction data may be a transaction identifier; for example, in practical applications, the transaction identifier may specifically be a hash value of the transaction, or may also be a transaction ID assigned to the transaction when the transaction is identified.
Step 704, converting the key-value key value pair of the block chain data into a root node, an intermediate node and a leaf node on a logical tree structure;
for the key-value key value pair of the block chain data to be stored, the key-value key value pair can be organized into a logical tree structure, and the logical tree structure is stored in a database in the form of nodes on the logical tree structure.
The logical tree structure is a tree structure constructed on a logical level based on nodes stored in a database and link relationships between the nodes.
For example, the logical tree structure may specifically include multiple layers of nodes, and the multiple layers of nodes may specifically be stored in the underlying physical storage (such as a disk) that carries the database in units of nodes. When the block chain data stored in the logical tree structure needs to be used, multiple layers of nodes stored in the database may be loaded into the memory, and the logical tree structure is restored in the memory according to the link relationship between the nodes.
In an illustrated embodiment, the logical tree structure may be a Merkle tree in which a tree structure of a dictionary tree is merged; for example, the MPT tree described above may be used, and the FDMT tree described above may be used.
In practical applications, the logical tree structure may include a root node, a middle node, and a leaf node. After the node device in the block chain acquires the key-value key value pair of the block chain data to be stored, the node device may convert the key-value key value pair of the block chain data to be stored into the root node, the intermediate node, and the leaf node on the logical tree structure.
When the key-value key value pairs of the blockchain data are converted into the root node, the intermediate node and the leaf node of the logical tree structure, the root node, the intermediate node and the leaf node used for storing the key-value key value pairs of the blockchain data are searched from the root node of the logical tree structure, and then the searched root node, the intermediate node and the leaf node are updated based on the key-value key value pairs of the blockchain data. In this case, the updated root node, intermediate node, and leaf node are the root node, intermediate node, and leaf node into which the key-value key pair of the above blockchain data is converted.
It should be noted that if, starting from the root node of the logical tree structure, no intermediate node or leaf node for storing the key-value pair of the blockchain data is found apart from the root node, the intermediate node and the leaf node for storing the key-value pair may be created in the logical tree structure based on the key-value pair of the blockchain data, and then the root node that was found and the newly created intermediate node and leaf node are updated based on the key-value pair of the blockchain data.
Of course, if the blockchain data is written into the logical tree structure for the first time, then when the key-value pair of the blockchain data is converted into the root node, the intermediate node, and the leaf node of the logical tree structure, the root node, the intermediate node, and the leaf node for storing the key-value pair of the blockchain data may be created by initialization. The root node, the intermediate node, and the leaf node created by this initialization are the root node, the intermediate node, and the leaf node into which the key-value pair of the blockchain data is converted.
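The find-or-create conversion described above can be sketched for an FDMT-like layout with three layers of Tree nodes, each consuming 2 characters of the key. Nested dictionaries stand in for the nodes, and the hash links between layers are deliberately omitted; the function names split_key and upsert and the sample values are hypothetical.

```python
def split_key(key: str, tree_layers: int = 3):
    """Split a key into 2-character chunks (one per Tree layer) plus a key-end."""
    prefix_len = 2 * tree_layers
    chunks = [key[i:i + 2] for i in range(0, prefix_len, 2)]
    return chunks, key[prefix_len:]


def upsert(root: dict, key: str, value, tree_layers: int = 3) -> None:
    """Walk from the root, creating missing Tree nodes and the leaf, then update."""
    chunks, key_end = split_key(key, tree_layers)
    node = root
    for chunk in chunks[:-1]:
        node = node.setdefault(chunk, {})      # find or create the next Tree node
    bucket = node.setdefault(chunks[-1], {})   # find or create the leaf (bucket)
    bucket[key_end] = value                    # store key-end and value in the leaf


root = {}                                      # first write: the tree is initialized
upsert(root, "a77d397", "state-1")             # creates every node on the path
upsert(root, "a77d3aa", "state-2")             # reuses the first two Tree layers
```

The second call finds the existing nodes for "a7" and "7d" and only creates a new slot entry and leaf for "3a", which mirrors the case where part of the path already exists.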
The nodes in the logical tree structure can still be linked with the nodes in the previous layer through the hash values of the nodes. The root node and the intermediate node are specifically used for storing at least one character in a key corresponding to the key-value key value of the block chain data. The leaf node is specifically used for storing the value of the blockchain data (i.e., the specific content of the blockchain data). The number of layers of the intermediate node may be one or more, and is not particularly limited in this specification.
For example, in one example, the key of the above blockchain data may still include a character prefix portion (Shared nibble) and a character suffix portion (key-end); in this case, the root node and the intermediate nodes may be used to store the characters in the character prefix described above. The leaf node may be used to store the character suffix and the value of the blockchain data.
On the one hand, the root node and the intermediate nodes in the logical tree structure store the characters in the keys of the blockchain data; therefore, the logical tree structure has the characteristics of a Trie (dictionary tree). On the other hand, the nodes in the logical tree structure are linked with the nodes in the previous layer through their own hash values; therefore, the logical tree structure also has the characteristics of a Merkle tree. It will be understood that the logical tree structure described in this specification may be a Merkle tree variant that incorporates the tree structure of a Trie, similar to an MPT tree or an FDMT tree. It should be added that, when the blockchain data is account state data corresponding to blockchain accounts on a blockchain, the logical tree structure may be a Merkle tree generated based on the key-value pairs of the account state data corresponding to the blockchain accounts in the blockchain. This Merkle tree may be referred to as a Merkle state tree.
In practical applications, in order to improve the access performance of the Merkle state tree, the Merkle state tree is usually split into a current Merkle state tree (Current State Tree) and a historical Merkle state tree (History State Tree). The current Merkle state tree is a Merkle state tree organized from the latest account state of each blockchain account; the historical Merkle state tree is a Merkle state tree organized from the historical account states of each blockchain account. Each block has a current Merkle state tree and a historical Merkle state tree corresponding to it.
In this scenario, because the current Merkle state tree maintains the latest account state of each blockchain account, the nodes in the current Merkle state tree are usually written and modified frequently. Based on this, in this specification, the logical tree structure described in steps 702 to 706 may specifically refer to the current Merkle state tree described above. That is, the technical solution described in steps 702 to 706 can be adopted for the current Merkle state tree, while the storage solution in the prior art can still be adopted for the historical Merkle state tree. Of course, in practical applications, the technical solution described in steps 702 to 706 may also be adopted for the historical Merkle state tree.
Step 706, caching at least part of the root nodes and the intermediate nodes to a storage medium supporting overlay write data, so as to perform modification update on the at least part of the nodes in the storage medium; and generating a data record for recording the modification update details of the at least part of nodes, and writing the data record and other nodes except the at least part of nodes in the root node, the middle node and the leaf nodes into the database for persistent storage.
After the key-value pairs of the blockchain data are converted into a root node, intermediate nodes and leaf nodes on the logical tree structure, the root node, the intermediate nodes and the leaf nodes may be stored in a database.
It should be noted that, in practical applications, the root node, the intermediate nodes, and the leaf nodes are usually all written into the database for persistent storage. In this specification, in order to alleviate the write amplification effect caused by writing the root node and the intermediate nodes into the database for persistent storage, a storage policy different from that of the leaf nodes may be adopted when storing the root node and the intermediate nodes.
On the one hand, at least some of the root node and the intermediate nodes may, by default, not be written into the database for persistent storage, but instead be cached in a storage medium that is mounted on the blockchain node device and supports overwriting data, and be modified and updated in that storage medium.
In an example, the storage medium supporting the overlay write data may be a memory mounted on a node device in a block chain. Of course, the storage medium may be a memory, or may be another type of storage medium supporting overlay writing of data, and is not particularly limited in this specification. For example, in practical applications, the storage medium may specifically be a solid state disk.
On the other hand, for other nodes except for the at least part of nodes in the root node, the intermediate nodes and the leaf nodes, the data is still stored in a mode of writing into the database by default for persistent storage. In addition, a data record for recording the modification update details of the at least part of the nodes can be generated, and then the data record and the other nodes are written into the database for persistent storage.
For example, in practical applications, after all transactions contained in the latest block are executed, the execution of the transactions usually changes the data content stored in some of the nodes in the logical tree structure. At this time, it is usually necessary to recalculate the hash value of the root node in the logical tree structure (i.e., the root hash), fill the hash value of the root node into the block header, and then write the updated nodes in the logical tree structure into the database for persistent storage. In the related art, the process of recalculating the hash value of the root node of the logical tree structure and writing the updated nodes into the database for persistent storage is called a commit operation on the logical tree structure. In this specification, when a commit operation is performed on the logical tree structure, the data record may be written into the database together with the other nodes for persistent storage.
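The storage policy described for step 706 can be sketched as follows. This is a simplified illustration under several assumptions: a dictionary in memory stands in for the storage medium that supports overwriting, another dictionary stands in for the on-disk database, the recalculation of the root hash is omitted, and the class name CachedTreeStore and the record layout (node id, slot, new hash) are hypothetical.

```python
class CachedTreeStore:
    """Keeps slot-bearing nodes in a cache and persists only records and leaves."""

    def __init__(self):
        self.cache = {}      # node_id -> {slot: child_hash}; overwritten in place
        self.pending = []    # modification records not yet persisted
        self.db = {"records": [], "leaves": {}}  # stands in for the database

    def update_slot(self, node_id: str, slot: str, child_hash: str) -> None:
        """Overwrite one slot in the cached node and note the change."""
        self.cache.setdefault(node_id, {})[slot] = child_hash
        self.pending.append((node_id, slot, child_hash))

    def commit(self, leaves: dict) -> None:
        """Persist the data records and the leaf nodes, not the cached nodes."""
        self.db["records"].extend(self.pending)
        self.db["leaves"].update(leaves)
        self.pending.clear()


store = CachedTreeStore()
store.update_slot("root", "64", "leaf-hash-v1")
store.update_slot("root", "64", "leaf-hash-v2")   # in-place overwrite in the cache
store.commit({"leaf-hash-v2": {"7": "state"}})
```

Each commit writes one small record per modified slot instead of rewriting the whole slot-bearing node, which is where the reduction in write amplification comes from in this sketch.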
In an embodiment shown in the present invention, the at least part of nodes may be nodes including a plurality of slots for storing characters in keys of the blockchain data, among the root node and the intermediate node. Of course, in practical applications, the at least part of nodes may also include all root nodes and intermediate nodes in the logical tree structure by default.
For example, in one example, if the logical tree structure is an MPT tree, the at least some of the nodes may be branch nodes (branch nodes) on the MPT tree. In this case, since only the branch node (branch node) on the MPT tree has a plurality of slots, only the branch node (branch node) on the MPT tree may be cached to the storage medium (i.e., only the branch nodes of the root node and the intermediate node of the MPT tree may be cached to the storage medium).
In another example, if the logical tree structure is an FDMT Tree, the at least some nodes may be the Tree nodes of the first N layers of the FDMT Tree. In this case, since the Tree nodes of the first N layers in the FDMT Tree each have a plurality of slots, only the Tree nodes of the first N layers in the FDMT Tree may be cached in the storage medium (i.e., the root node and the intermediate nodes of the FDMT tree are all cached in the storage medium).
Accordingly, the data record may specifically be a data record for recording details of modification update for each slot in at least some of the nodes.
The specific form of the data recording is not particularly limited in the present specification.
In one embodiment, the data record may be a WAL (write-ahead log). WAL is an efficient logging technique. In a storage system using WAL mode, every data modification to the storage system is first written into the WAL before being committed to the storage system; then, through a checkpoint event triggered periodically or manually by a user, the modifications recorded in the WAL file are applied and the modified data is written into the storage system.
In this case, when the at least part of the nodes cached in the storage medium is lost due to a device anomaly (such as a device outage), data recovery may be performed on the at least part of the nodes cached in the storage medium based on the WAL log persistently stored in the database.
For example, when the device exception causes data loss in the at least part of the cached nodes, the user may recover the data cached in the storage medium through an exception recovery instruction. And the node device in the blockchain may receive an exception recovery instruction of a user for the at least part of the nodes cached in the storage medium, and may then perform data recovery on the at least part of the nodes cached in the storage medium based on the WAL log stored in the database in response to the exception recovery instruction.
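The logging and recovery flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the specification: the JSON record layout, the function names, and the use of an in-memory list standing in for the WAL log stored in the database are all hypothetical.

```python
import json
from typing import Dict, List

# Hypothetical layout: node id -> list of slot values (hex strings).
NodeCache = Dict[str, List[str]]


def log_slot_update(wal: List[str], node_id: str, slot: int, value: str) -> None:
    """Append one slot-level record to the log before the cache is modified."""
    wal.append(json.dumps({"node": node_id, "slot": slot, "value": value}))


def apply_update(cache: NodeCache, node_id: str, slot: int, value: str,
                 slots_per_node: int = 16) -> None:
    """Overwrite a single slot of the cached node in place."""
    node = cache.setdefault(node_id, [""] * slots_per_node)
    node[slot] = value


def recover(wal: List[str], base: NodeCache, slots_per_node: int = 16) -> NodeCache:
    """Rebuild the cache by replaying the log in order on top of persisted nodes."""
    cache = {node_id: list(slots) for node_id, slots in base.items()}
    for line in wal:
        rec = json.loads(line)
        apply_update(cache, rec["node"], rec["slot"], rec["value"], slots_per_node)
    return cache


wal: List[str] = []
cache: NodeCache = {}
for node_id, slot, value in [("root", 3, "aa"), ("root", 3, "bb"), ("n1", 0, "cc")]:
    log_slot_update(wal, node_id, slot, value)  # log first
    apply_update(cache, node_id, slot, value)   # then modify the cache

# Simulate losing the cache and rebuilding it from the log alone.
recovered = recover(wal, base={})
```

Because each record describes a single slot, the log grows by a small record per update rather than by a full node per update.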
In an illustrated embodiment, a persistent storage condition may be set for the at least part of the nodes cached in the storage medium, where the persistent storage condition specifically refers to the condition under which the at least part of the nodes cached in the storage medium are written into the database for persistent storage.
In this case, it may be determined whether the persistent storage condition is satisfied by the at least part of the nodes cached in the storage medium; for example, in practical applications, it may be determined periodically based on a preset period whether the persistent storage condition is satisfied by the at least part of the nodes cached in the storage medium. If the at least part of the nodes cached in the storage medium meet the persistent storage condition, the at least part of the nodes cached in the storage medium can be further written into a database for persistent storage.
The persistent storage condition is not particularly limited in this specification, and may be flexibly set based on actual storage requirements in actual applications.
For example, in one illustrated embodiment, the persistent storage condition may specifically include any one or a combination of the following conditions:
Condition 1: the number of data records persistently stored in the database reaches a threshold.
In this case, when the number of data records persistently stored in the database reaches the threshold, writing the cached at least part of the nodes back to the database on disk for persistent storage may be triggered.
Condition 2: the storage capacity occupied by the data records persistently stored in the database reaches a threshold.
In this case, when the storage capacity occupied by the data records persistently stored in the database reaches the threshold, writing the cached at least part of the nodes back to the database on disk for persistent storage may be triggered.
Condition 3: a persistent storage instruction from a user is received for the at least part of the nodes stored in the storage medium.
In this case, when a persistent storage instruction input by a user for the at least part of the nodes stored in the storage medium is received, writing the cached at least part of the nodes back to the database on disk for persistent storage may be triggered.
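The three conditions above could be checked as in the following sketch. The `PersistPolicy` class, the threshold values, and the choice to combine the conditions with a logical OR are assumptions made for illustration; the specification leaves both the thresholds and the combination open.

```python
from dataclasses import dataclass


@dataclass
class PersistPolicy:
    """Hypothetical thresholds chosen by the operator."""
    max_records: int  # condition 1: number of persisted data records
    max_bytes: int    # condition 2: storage occupied by the data records


def should_persist(policy: PersistPolicy, record_count: int, record_bytes: int,
                   user_requested: bool = False) -> bool:
    """Return True if any configured condition is met (OR is an assumption)."""
    return (record_count >= policy.max_records      # condition 1
            or record_bytes >= policy.max_bytes     # condition 2
            or user_requested)                      # condition 3


policy = PersistPolicy(max_records=1000, max_bytes=4 * 1024 * 1024)
```

A node device could call `should_persist` on a preset period, matching the periodic determination mentioned earlier.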
In an illustrated embodiment, after the at least part of the nodes cached in the storage medium is successfully written into the database for persistent storage, in order to improve the storage space utilization of the storage medium, the at least part of the nodes cached in the storage medium may be further deleted. In addition, after the at least part of the nodes cached in the storage medium are successfully written into the database for persistent storage, in order to improve the storage space utilization rate of the database, the data records corresponding to the at least part of the nodes, which are persistently stored in the database, may be further deleted based on the same consideration.
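The write-back and cleanup described above can be sketched as a single checkpoint step. The function name and the use of plain dictionaries and a list to stand in for the storage medium, the database, and the persisted data records are assumptions for illustration only.

```python
from typing import Dict, List


def checkpoint(cache: Dict[str, bytes], db: Dict[str, bytes], wal: List[bytes]) -> int:
    """Write every cached node to the database, then free cache and log space.

    Returns the number of nodes written.
    """
    written = 0
    for node_id, payload in cache.items():
        # Each node is persisted once here, regardless of how many
        # slot updates it absorbed while it was cached.
        db[node_id] = payload
        written += 1
    # Cleanup runs only after all writes above have completed.
    cache.clear()  # release space in the storage medium
    wal.clear()    # the data records for these nodes are no longer needed
    return written


cache = {"root": b"root-node-bytes", "n1": b"n1-node-bytes"}
db: Dict[str, bytes] = {}
wal = [b"record-1", b"record-2", b"record-3"]
count = checkpoint(cache, db, wal)
```

Deleting the data records only after the nodes are safely in the database keeps recovery possible if a failure interrupts the write-back.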
In this specification, when storing the root node, the intermediate node, and the leaf node in the logical tree structure, a completely different storage policy from that in the related art is adopted; for example, as described above, in the present specification, unlike the related art, at least some nodes in the root node and the intermediate nodes on the logical tree structure are not stored in the database by default, but are cached in the storage medium; therefore, when reading the root node, the intermediate node, and the leaf node in the logical tree structure, a reading method of respectively reading from the storage medium and the database may be specifically adopted.
For example, in an illustrated embodiment, when a node device in a block chain receives a read instruction for a node on the logical tree structure, the node may be read from the storage medium by default, and when the node is not read from the storage medium, the node may be further read from the database.
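The two-level read path described above can be sketched as follows. The `read_node` function and the dictionaries standing in for the storage medium and the database are hypothetical.

```python
from typing import Dict, Optional


def read_node(node_hash: str, cache: Dict[str, bytes],
              db: Dict[str, bytes]) -> Optional[bytes]:
    """Read from the storage medium first; fall back to the database on a miss."""
    node = cache.get(node_hash)
    if node is not None:
        return node
    return db.get(node_hash)


cache = {"h-root": b"cached-root"}
db = {"h-leaf": b"persisted-leaf"}
```

Here `read_node("h-root", cache, db)` is served from the cache, `read_node("h-leaf", cache, db)` falls through to the database, and an unknown hash yields `None`.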
In the above technical solution, at least some of the root nodes and the intermediate nodes in the logical tree structure are cached in a storage medium that supports overlay (in-place overwrite) writing of data, and the at least some nodes are modified and updated in the storage medium. This alleviates the write amplification caused by repeatedly writing the at least some of the root nodes and intermediate nodes in the logical tree structure into the database, and in turn improves the storage performance of the database.
For example, take the case in which the logical tree structure is an FDMT tree. In a conventional MPT tree, only the branch nodes contain 16 slots for the characters in the keys of the blockchain data; therefore, although persistently storing nodes of the MPT tree in the database also causes write amplification, the effect is usually not pronounced.
The FDMT tree, however, differs significantly from the conventional MPT tree.
A Tree node in the first N layers of the FDMT tree may include a plurality of blocks for storing characters in keys of the blockchain data, and each block may in turn include a plurality of slots for storing characters in keys of the blockchain data. Because of this particular data structure, a Tree node in the FDMT tree typically has more slots than a branch node in the MPT tree. For example, if one Tree node includes 16 blocks and each block includes 16 slots, the Tree node includes 256 slots, far more than the 16 slots of a branch node on the MPT tree.
More slots mean that a Tree node on the FDMT tree has a larger data capacity than a branch node on the MPT tree. In this case, when a Tree node in the FDMT tree is written into the database for persistent storage, the write amplification it causes is more pronounced than that caused by the MPT tree. For example, if the data stored in any one of the 256 slots of a Tree node on the FDMT tree is updated and the whole Tree node with its 256 slots is rewritten into the database for persistent storage, the write bandwidth consumed is much greater than the write bandwidth consumed when a branch node with only 16 slots on the MPT tree is written into the database in its entirety.
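The comparison above can be made concrete with a small calculation. The 32-byte size of the hash stored in each slot is an assumption made here for illustration (the specification does not state a hash size), and node encoding overhead is ignored.

```python
HASH_BYTES = 32  # assumed size of the hash stored in one slot


def rewrite_cost(slots: int, hash_bytes: int = HASH_BYTES) -> int:
    """Bytes written when a whole node is rewritten after one slot changes."""
    return slots * hash_bytes


def amplification(slots: int, hash_bytes: int = HASH_BYTES) -> float:
    """Ratio of bytes written to bytes actually changed (one slot)."""
    return rewrite_cost(slots, hash_bytes) / hash_bytes


mpt_branch_cost = rewrite_cost(16)       # MPT branch node: 16 slots
fdmt_tree_cost = rewrite_cost(16 * 16)   # FDMT Tree node: 16 blocks x 16 slots
```

Under these assumptions a single slot update rewrites 512 bytes for an MPT branch node and 8192 bytes for an FDMT Tree node, so the per-update amplification is 16 times larger for the FDMT Tree node.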
If the storage scheme described in the above embodiment is adopted, the Tree nodes on the first N layers of the FDMT Tree are cached in the memory, and are modified and updated in the memory, so that the Tree nodes on the first N layers of the FDMT Tree are prevented from being repeatedly written into the database in the disk, and the write amplification effect caused by the repeated writing can be relieved.
Corresponding to the above method embodiments, the present specification also provides an embodiment of a blockchain data storage apparatus.
The embodiments of the blockchain data storage apparatus of the present specification can be applied to electronic devices. The apparatus embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking software implementation as an example, the apparatus, as a logical device, is formed when the processor of the electronic device in which it is located reads the corresponding computer program instructions from the nonvolatile memory into the memory and runs them.
In terms of hardware, fig. 8 is a hardware structure diagram of the electronic device in which the blockchain data storage apparatus of this specification is located. In addition to the processor, the memory, the network interface, and the nonvolatile memory shown in fig. 8, the electronic device in which the apparatus is located may also include other hardware according to the actual function of the electronic device, which is not described again.
Fig. 9 is a block diagram illustrating a blockchain data storage device in an exemplary embodiment of the present description.
Referring to fig. 9, the blockchain data storage device 90 may be applied to the electronic device shown in fig. 8, wherein key-value pairs of the blockchain data are stored in the database in the form of root nodes, intermediate nodes and leaf nodes on a logical tree structure; the root node and the intermediate nodes are used for storing characters in keys of the blockchain data; the leaf nodes are used for storing the values of the blockchain data; any node on the tree structure is linked with a node on the upper layer through the hash value of the node; the apparatus 90 comprises:
an obtaining module 901, configured to obtain a key-value pair of blockchain data to be stored;
a conversion module 902, configured to convert the key-value pair of the blockchain data into a root node, intermediate nodes, and a leaf node on a logical tree structure;
a storage module 903, configured to cache at least some of the root nodes and the intermediate nodes to a storage medium supporting overlay write data, so as to perform modification update on the at least some nodes in the storage medium; and to generate a data record for recording the modification update details of the at least some nodes, and write the data record and the nodes other than the at least some nodes among the root node, the intermediate nodes and the leaf nodes into the database for persistent storage.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
In a typical configuration, a computer includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer-readable medium does not include a transitory computer-readable medium such as a modulated data signal or a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein in one or more embodiments to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of one or more embodiments herein. The word "if," as used herein, may be interpreted as "when," "upon," or "in response to a determination," depending on the context.
The above description is only for the purpose of illustrating the preferred embodiments of the one or more embodiments of the present disclosure, and is not intended to limit the scope of the one or more embodiments of the present disclosure, and any modifications, equivalent substitutions, improvements, etc. made within the spirit and principle of the one or more embodiments of the present disclosure should be included in the scope of the one or more embodiments of the present disclosure.

Claims (17)

1. A method for storing blockchain data, wherein key-value pairs of the blockchain data are stored in a database in the form of root nodes, intermediate nodes and leaf nodes on a logical tree structure; the root node and the intermediate nodes are used for storing characters in keys of the blockchain data; the leaf nodes are used for storing the values of the blockchain data; any node on the tree structure is linked with the node on the upper layer through the hash value of the node; the method comprises the following steps:
acquiring a key-value pair of blockchain data to be stored;
converting the key-value pair of the blockchain data into a root node, intermediate nodes and a leaf node on a logical tree structure;
caching at least part of the root nodes and the intermediate nodes to a storage medium supporting overlay write data, so as to perform modification update on the at least part of the nodes in the storage medium; and generating a data record for recording the modification update details of the at least part of nodes, and writing the data record and the nodes other than the at least part of nodes among the root node, the intermediate nodes and the leaf nodes into the database for persistent storage.
2. The method of claim 1, the at least some nodes comprising a plurality of slots for depositing characters in keys of the blockchain data; the slot position is used for storing a hash value of a next layer node linked with the node; the data record is used for recording the modification update details of each slot in the at least part of nodes.
3. The method of claim 2, wherein the logical tree structure comprises a Merkle tree that incorporates the tree structure of a dictionary tree (trie).
4. The method of claim 3, the blockchain data comprising account status data corresponding to blockchain accounts on the blockchain;
the Merkle tree generated based on the key-value key value pairs of the account state data corresponding to each blockchain account in the blockchain comprises: a current Merkle status tree generated based on the latest account status data of each blockchain account; and a historical Merkle status tree organized based on historical account status data of the individual blockchain accounts; the logical tree structure is the current Merkle state tree.
5. The method of claim 3, the logical tree structure comprising an MPT tree;
the root node is an extension node on the MPT tree; the intermediate node is the extension node or branch node on the MPT tree; the at least part of nodes are branch nodes on the MPT tree.
6. The method of claim 3, the logical tree structure comprising an FDMT tree;
wherein, the root node and the middle node on the FDMT tree both comprise a plurality of positions for storing characters in keys of the block chain data; each position further comprises a plurality of slot positions for storing characters in keys of the block chain data; the slot position is used for storing the hash of the node which is linked to the next layer of the node; the at least some nodes are root nodes and intermediate nodes on the FDMT tree.
7. The method of claim 1, further comprising:
determining whether the at least a portion of the nodes cached in the storage medium satisfy a persistent storage condition;
if the at least part of the nodes cached in the storage medium meet the persistent storage condition, writing the at least part of the nodes cached in the storage medium into the database for persistent storage.
8. The method of claim 7, further comprising:
after the at least part of the nodes cached in the storage medium are successfully written into the database for persistent storage, further deleting the at least part of the nodes cached in the storage medium.
9. The method of claim 7, further comprising:
after the at least part of the nodes cached in the storage medium are successfully written into the database for persistent storage, further deleting the data records corresponding to the at least part of the nodes, which are persistently stored in the database.
10. The method of claim 1, the persistent storage condition comprising a combination of any one or more of:
the number of data records stored persistently in the database reaches a threshold value;
the storage capacity of the data records stored persistently in the database reaches a threshold value;
persistent storage instructions are received for the at least some of the nodes stored in the storage medium.
11. The method of claim 1, further comprising:
reading the node from the storage medium in response to a read instruction for a node on the logical tree structure, and further reading the node from the database when the node is not read from the storage medium.
12. The method of claim 1, the data records comprising WAL logs.
13. The method of claim 12, further comprising:
receiving an exception recovery instruction for the at least some nodes cached in the storage medium;
and in response to the exception recovery instruction, performing data recovery on the at least some nodes cached in the storage medium based on the WAL logs stored in the database.
14. The method of claim 1, the storage medium comprising a memory.
15. A blockchain data storage apparatus, wherein key-value pairs of the blockchain data are stored in a database in the form of root nodes, intermediate nodes and leaf nodes on a logical tree structure; the root node and the intermediate nodes are used for storing characters in keys of the blockchain data; the leaf nodes are used for storing the values of the blockchain data; any node on the tree structure is linked with the node on the upper layer through the hash value of the node; the apparatus comprises:
an acquisition module, configured to acquire a key-value pair of blockchain data to be stored;
a conversion module, configured to convert the key-value pair of the blockchain data into a root node, intermediate nodes and a leaf node on a logical tree structure;
a storage module, configured to cache at least part of the root nodes and the intermediate nodes to a storage medium supporting overlay write data, so as to perform modification update on the at least part of the nodes in the storage medium; and to generate a data record for recording the modification update details of the at least part of nodes, and write the data record and the nodes other than the at least part of nodes among the root node, the intermediate nodes and the leaf nodes into the database for persistent storage.
16. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor implements the steps of the method of any one of claims 1-14 by executing the executable instructions.
17. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 14.
CN202210908663.8A 2022-07-29 2022-07-29 Block chain data storage method and device and electronic equipment Pending CN115221176A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210908663.8A CN115221176A (en) 2022-07-29 2022-07-29 Block chain data storage method and device and electronic equipment
PCT/CN2022/135539 WO2024021419A1 (en) 2022-07-29 2022-11-30 Blockchain data storage method and apparatus, and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210908663.8A CN115221176A (en) 2022-07-29 2022-07-29 Block chain data storage method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN115221176A true CN115221176A (en) 2022-10-21

Family

ID=83614655

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210908663.8A Pending CN115221176A (en) 2022-07-29 2022-07-29 Block chain data storage method and device and electronic equipment

Country Status (2)

Country Link
CN (1) CN115221176A (en)
WO (1) WO2024021419A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024021419A1 (en) * 2022-07-29 2024-02-01 蚂蚁区块链科技 (上海) 有限公司 Blockchain data storage method and apparatus, and electronic device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175188B (en) * 2019-05-31 2021-05-11 杭州复杂美科技有限公司 Block chain state data caching and querying method, equipment and storage medium
CN110347660B (en) * 2019-06-28 2020-08-11 阿里巴巴集团控股有限公司 Block chain based hierarchical storage method and device and electronic equipment
US11489663B2 (en) * 2020-01-31 2022-11-01 International Business Machines Corporation Correlation-based hash tree verification
CN114706848A (en) * 2022-02-25 2022-07-05 蚂蚁区块链科技(上海)有限公司 Block chain data storage, updating and reading method and device and electronic equipment
CN115221176A (en) * 2022-07-29 2022-10-21 蚂蚁区块链科技(上海)有限公司 Block chain data storage method and device and electronic equipment


Also Published As

Publication number Publication date
WO2024021419A1 (en) 2024-02-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination