CN117056342B

CN117056342B - Blockchain-based data processing method and related equipment

Info

Publication number: CN117056342B
Application number: CN202311306070.5A
Authority: CN
Inventors: 廖志勇
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2023-10-10
Filing date: 2023-10-10
Publication date: 2024-01-26
Anticipated expiration: 2043-10-10
Also published as: CN117056342A

Abstract

The embodiments of the application provide a blockchain-based data processing method and related equipment. The blockchain comprises a first block and a second block; a first Merkle prefix tree is maintained in the first block, and a second Merkle prefix tree is maintained in the second block; the first Merkle prefix tree and the second Merkle prefix tree both contain a target tree node. The method comprises the following steps: acquiring a first index of the target tree node, the first index being constructed according to the association information of the target tree node in the first Merkle prefix tree; acquiring a second index of the target tree node, the second index being constructed according to the association information of the target tree node in the second Merkle prefix tree; and if the similarity between the first index and the second index meets a similarity condition, storing the first data indicated by the first index and the second data indicated by the second index into the same data block of a database. According to the embodiments of the application, the data in the blocks can be effectively stored and managed.

Description

Blockchain-based data processing method and related equipment

Technical Field

The present disclosure relates to the field of blockchain technologies, and in particular, to a blockchain-based data processing method and related devices, and more particularly, to a blockchain-based data processing method, a blockchain-based data processing apparatus, a node device, a computer readable storage medium, and a computer program product.

Background

A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. It combines blocks into a chained data structure by connecting them sequentially in time order, and uses cryptography to guarantee a distributed ledger that cannot be tampered with or forged.

Currently, the data in a block is organized into a Merkle tree and stored in the form of key-value (KV) pairs, where the key of each pair is the hash value of a tree node and the value is the data stored by that tree node. Over time the number of blocks grows, and so does the volume of stored block data; with this storage mode, the data in the blocks cannot be effectively stored and managed.

Disclosure of Invention

The embodiment of the application provides a data processing method based on a block chain and related equipment, which can effectively store and manage data in a block.

In one aspect, an embodiment of the present application provides a data processing method based on a blockchain, where the blockchain includes a first block and a second block, the first block maintains a first Merkle prefix tree, and the second block maintains a second Merkle prefix tree; the first Merkle prefix tree and the second Merkle prefix tree both contain a target tree node, but the version of the target tree node in the first Merkle prefix tree is different from the version in the second Merkle prefix tree; the method comprises the following steps:

acquiring a first index of the target tree node; the first index is constructed according to the association information of the target tree node in the first Merkle prefix tree; the first index is used for indicating first data stored in the target tree node in the first Merkle prefix tree;

acquiring a second index of the target tree node, wherein the second index is constructed according to the association information of the target tree node in the second Merkle prefix tree; the second index is used for indicating second data stored in the target tree node in the second Merkle prefix tree;

obtaining the similarity between the first index and the second index;

and if the similarity meets a similarity condition, storing the first data indicated by the first index and the second data indicated by the second index into the same data block of a database.

In one aspect, an embodiment of the present application provides a blockchain-based data processing apparatus, where the blockchain includes a first block and a second block, the first block maintains a first Merkle prefix tree, and the second block maintains a second Merkle prefix tree; the first and second Merkle prefix trees each contain a target tree node, but the version of the target tree node in the first Merkle prefix tree is different from the version in the second Merkle prefix tree; the apparatus comprises:

a processing unit, used for acquiring a first index of the target tree node; the first index is constructed according to the association information of the target tree node in the first Merkle prefix tree; the first index is used for indicating first data stored in the target tree node in the first Merkle prefix tree;

the processing unit is further used for acquiring a second index of the target tree node, and the second index is constructed according to the association information of the target tree node in the second Merkle prefix tree; the second index is used for indicating second data stored in the target tree node in the second Merkle prefix tree;

the processing unit is also used for acquiring the similarity between the first index and the second index;

and a storage unit, used for storing the first data indicated by the first index and the second data indicated by the second index into the same data block of a database if the similarity meets a similarity condition.

In one aspect, an embodiment of the present application provides a node device, including:

a processor adapted to execute a computer program;

a computer readable storage medium having a computer program stored therein, which when executed by a processor, implements a blockchain-based data processing method as described above.

In one aspect, the present embodiments provide a computer readable storage medium storing a computer program that is loaded by a processor and that performs a blockchain-based data processing method as described above.

In one aspect, embodiments of the present application provide a computer program product comprising a computer program or computer instructions which, when executed by a processor, implement the blockchain-based data processing method described above.

In an embodiment of the present application, the blockchain includes a first block, in which a first Merkle prefix tree (Merkle Patricia Trie, MPT) is maintained, and a second block, in which a second MPT tree is maintained; the first MPT tree and the second MPT tree each contain a target tree node, but the version of the target tree node in the first MPT tree is different from the version in the second MPT tree. A first index of the target tree node is acquired, the first index being constructed according to the association information of the target tree node in the first MPT tree and used for indicating first data stored in the target tree node in the first MPT tree. A second index of the target tree node is acquired, the second index being constructed according to the association information of the target tree node in the second MPT tree and used for indicating second data stored in the target tree node in the second MPT tree. The embodiment of the application therefore constructs different indexes for the different versions of the target tree node based on its association information in the different MPT trees, which provides a basis for storing the data stored by the target tree node in different MPT trees into the same data block. The similarity between the first index and the second index is then obtained, and if the similarity meets the similarity condition, the first data indicated by the first index and the second data indicated by the second index are stored into the same data block of the database. By making full use of the similarity between the first index and the second index, the first data and the second data can be stored in the same data block of the database, and the data stored by tree nodes of different versions is effectively managed, so that effective storage management of the data in the blocks is realized.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is an architecture diagram of a blockchain network provided in accordance with one exemplary embodiment of the present application;

FIG. 2 is an architecture diagram of a blockchain according to one exemplary embodiment of the present application;

FIG. 3 is a schematic diagram of a state tree provided by an exemplary embodiment of the present application;

FIG. 4 is a schematic diagram of a storage tree according to an exemplary embodiment of the present application;

FIG. 5a is a schematic diagram of an SSTable file according to an exemplary embodiment of the present application;

FIG. 5b is a schematic diagram of a data block structure provided in an exemplary embodiment of the present application;

FIG. 6 is a flowchart of a method for processing data based on a blockchain in accordance with an exemplary embodiment of the present application;

FIG. 7 is a schematic diagram of the locations of a target tree node and a next tree node in a state tree according to an exemplary embodiment of the present application;

FIG. 8 is a flowchart of a data processing method based on a blockchain according to another exemplary embodiment of the present application;

FIG. 9 is a flowchart of a method for processing data based on a blockchain in accordance with yet another exemplary embodiment of the present application;

FIG. 10 is a schematic structural diagram of a blockchain-based data processing apparatus according to one illustrative embodiment of the present application;

FIG. 11 is a schematic structural diagram of a node device according to an exemplary embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.

1. Blockchain networks and blockchains:

An embodiment of the present application relates to a blockchain network. Please refer to FIG. 1, which is an architecture diagram of a blockchain network provided in an exemplary embodiment of the present application. In FIG. 1, the blockchain network includes a plurality of node devices 101, each node device 101 being operable to receive input information (e.g., transaction data) and to maintain shared data within the blockchain network based on the received input information. To ensure that information is shared within the blockchain network, an information connection may exist between every two node devices, and the node devices may transmit information to each other through these connections. For example, when any node device in the blockchain network receives input information, the other node devices in the blockchain network acquire the input information according to a consensus algorithm and store it as part of the shared data, so that the data stored on all node devices in the blockchain network is consistent.

The node device 101 may be a server or a terminal device. The terminal device may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a vehicle-mounted terminal, etc. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), big data, and artificial intelligence platforms.

Each node device 101 maintains the same one or more blockchains; illustratively, as in FIG. 1, each node device 101 maintains the same blockchain 11. A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. The blockchain 11 may include a plurality of blocks; the blockchain combines the blocks into a chained data structure by connecting them sequentially in time order, and uses cryptography to guarantee a distributed ledger that cannot be tampered with or forged. Referring to FIG. 2, an architecture diagram of a blockchain is provided in accordance with an exemplary embodiment of the present application. A blockchain is made up of a plurality of blocks, and each block maintains a plurality of trees, which may include, but are not limited to, state trees, storage trees, and the like. Illustratively, as in FIG. 2, the blockchain includes at least a first block and a second block, and both the first block and the second block maintain a state tree and a storage tree. Any block may include a block header and a block body. The block body includes the input information, and the block header may include, but is not limited to: a parent-block hash field (previous_hash), a state root field (state_root), and the like. The parent-block hash field stores the hash value of the previous block, so that the previous_hash in the block header of the current block points to the previous block (i.e., the parent block). The state root field stores the hash of the root node of the state tree.
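As an illustration of how the parent-block hash field chains blocks together, the following is a minimal sketch. The `BlockHeader` class, its `height` field, the JSON serialization, and the choice of SHA-256 are assumptions made for the example; only the field names `previous_hash` and `state_root` come from the description above.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockHeader:
    """Hypothetical, simplified block header (not the actual on-chain layout)."""
    height: int
    previous_hash: str  # hash of the parent block's header
    state_root: str     # hash of the root node of the state tree

    def hash(self) -> str:
        # Assumption: the header is hashed as a JSON list with SHA-256.
        payload = json.dumps([self.height, self.previous_hash, self.state_root])
        return hashlib.sha256(payload.encode()).hexdigest()


def is_linked(parent: BlockHeader, child: BlockHeader) -> bool:
    """True if the child's previous_hash points at the given parent."""
    return child.previous_hash == parent.hash()


first = BlockHeader(0, "0" * 64, hashlib.sha256(b"state-v1").hexdigest())
second = BlockHeader(1, first.hash(), hashlib.sha256(b"state-v2").hexdigest())
```

Because `previous_hash` is derived from the whole parent header, changing the parent's `state_root` changes the parent's hash and breaks the link from the child.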

The state tree may be made up of accounts, where the accounts may include contract accounts and external accounts. A contract account is an account that is generated automatically when a smart contract is deployed. An external account is an account that is actually controlled by an object (e.g., a user) and that is capable of deploying a smart contract. Each block maintains a state tree, and the state tree maintained by a block records the addresses of all accounts in the block and the account information corresponding to each account. As shown in FIG. 3, a schematic diagram of a state tree according to an exemplary embodiment of the present application is provided. The state tree includes a plurality of tree nodes, and any tree node of the state tree may be one of: a root node, an intermediate node, and a leaf node.

In FIG. 3, the address of each account is distributed along the path from the root node to a leaf node of the state tree, and each leaf node stores the account information of one account; that is, the account information of an account can be determined by traversing from the root node to the leaf node. Illustratively, the address of account 1 is distributed from the root node to leaf node 31, and leaf node 31 stores the account information of account 1; the address of account 2 is distributed from the root node to leaf node 32, and leaf node 32 stores the account information of account 2. When a leaf node in the state tree stores the account information of a contract account, as leaf node 32 does in FIG. 3, the account information includes a random number (Nonce), a balance (Balance), the hash of the corresponding contract (CodeHash), and a storage root (storage_root). The storage root storage_root stores the hash of the root node of the storage tree corresponding to the contract account.

The storage tree is used to store all account status information in a given contract account; that is, one contract account corresponds to one storage tree. The account status information includes, but is not limited to, account state variables and the variable values of the account state variables. As shown in FIG. 4, a schematic diagram of a storage tree is provided for an exemplary embodiment of the present application. The storage tree also includes a plurality of tree nodes, and any tree node in the storage tree may be one of: a root node, an intermediate node, and a leaf node. A leaf node in the storage tree stores account status information; the position of the corresponding storage slot (storage unit) in the contract account is distributed along the path from the root node to the leaf node, so the corresponding account status information in the contract account can be determined by traversing from the root node to the leaf node.

It should be noted that, a route taken between a certain tree node and a root node in the state tree may be determined as a path of the tree node in the state tree, and a route taken between a certain tree node and a root node in the storage tree may be determined as a path of the tree node in the storage tree.

It should be appreciated that when account status information in a contract account changes, a latest block is generated in the blockchain. Each time a latest block is generated and the transactions in it are executed, the account status information in the blockchain changes, and the state tree and the storage tree need to be reconstructed based on the latest account status information of all accounts in the blockchain. Each time a new block is released, the data stored in some of the tree nodes in the state tree and the storage tree may change; however, the change is not made in place in the corresponding tree nodes of the original tree. Instead, some new branches are created, and the data stored in the original tree nodes is retained. Only the changed tree nodes need to be newly written; the unmodified tree nodes point directly to the corresponding tree nodes in the previous block. As shown in FIG. 2, the first block includes a variable value 33 of an account state variable in contract account 1. If a transaction in the second block calls the contract and the variable value of that account state variable is changed from 33 to 35, the hash of the tree node holding the variable value 35 changes, and the content and hash of every tree node on the whole path from that tree node to the root node of the storage tree also change; that is, the storage_root in contract account 1 changes, which in turn changes the state root. The data stored in the other tree nodes (i.e., the tree nodes outside the whole path from that tree node to the root node of the storage tree) does not change, and those nodes simply point to the corresponding tree nodes in the previous block. Finally, the second block maintains a new state tree, and the new state tree holds the latest account status information of all accounts.
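The update behaviour described above (new branches along the changed path, untouched nodes shared with the previous block) can be sketched as a copy-on-write tree. This is a simplified illustration that leaves out hashing; the `Node` class, the `put` function, and the one-character-per-level paths are assumptions made for the example.

```python
class Node:
    """Hypothetical tree node: an optional value plus children keyed by one character."""

    def __init__(self, value=None, children=None):
        self.value = value
        self.children = dict(children or {})


def put(node, path, value):
    """Return a new root with `value` stored at `path`.

    Only the nodes along `path` are recreated; every other subtree is
    shared (by reference) with the previous version of the tree.
    """
    node = node or Node()
    if not path:
        return Node(value, node.children)
    children = dict(node.children)
    children[path[0]] = put(node.children.get(path[0]), path[1:], value)
    return Node(node.value, children)


# Version maintained by the first block: variable value 33 at path "a1".
tree_v1 = put(put(None, "a1", 33), "b2", 7)
# Version maintained by the second block: the value changes from 33 to 35.
tree_v2 = put(tree_v1, "a1", 35)
```

After the update, `tree_v1` still holds 33, `tree_v2` holds 35, and the subtree under `"b"` is the same object in both versions, which mirrors unmodified tree nodes pointing to the corresponding tree nodes in the previous block.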

2. MPT (Merkle Patricia Trie, Merkle prefix) tree:

An MPT tree is a data structure combining a Merkle tree and a Patricia tree (a compressed prefix tree, i.e., a more space-efficient Trie or dictionary tree), and can be used to store key-value pair data of arbitrary length. In a Merkle tree, a hash value is calculated for each piece of data (such as a transaction), and the hashes are then concatenated two by two and hashed again, level by level, until the Merkle root at the top layer is obtained. A Trie, also called a prefix tree or dictionary tree, is an ordered multi-way tree used for storing and retrieving key-value pairs whose keys can be mapped to strings over a finite character set; each tree node in the Trie records one character of the string and points to the next character, so that a path through the tree forms a complete key, and keys with the same prefix share tree nodes. A Patricia tree is an improvement over the Trie in which a tree node that has only one child node is merged with that child node, which makes the compressed prefix tree more efficient for storing data with long identical prefixes.

The tree nodes of the MPT tree may include one of: expansion nodes, branch nodes, and leaf nodes. In the embodiment of the present application, the state tree and the storage tree may be MPT trees. Illustratively, the root node of the state tree (or the storage tree) may include an expansion node; the intermediate nodes of the state tree (or storage tree) may include expansion nodes or branch nodes, which is not limited in this application.
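The pairwise hashing that produces a Merkle root can be sketched as follows. This is a minimal illustration, assuming SHA-256 as the hash function and assuming that an odd node at any level is paired with a copy of itself (one common convention; the text does not specify either choice).

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(items: List[bytes]) -> bytes:
    """Hash every item, then hash adjacent pairs level by level up to a single root."""
    if not items:
        raise ValueError("at least one item is required")
    level = [sha256(item) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: duplicate the last hash when a level has an odd count.
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Changing any single item changes its leaf hash and therefore every hash on the path up to the root, which is why the root can stand in for the integrity of the whole data set.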

The expansion node may store up to 16 child nodes, each of which may correspond to an account. The data (value) stored by the expansion node may include the hash value of the next tree node; in addition, in the embodiment of the application, the data stored in the expansion node may further include a height value of the next tree node, where the height value of the next tree node may be represented by commitHeight.

In the blockchain protocol, the hash of the tree node or the address of the account is a 16-ary string, so the branch node is a list of 17 in length, and the first 16 elements correspond to 16 possible hexadecimal values in the key; in the embodiment of the present application, the data (value) stored by the branch node may include: in addition, in the embodiment of the present application, the data stored by the branch node may further include a height value of a next tree node.

The leaf node is used for storing the actual data. In this embodiment, a leaf node of the state tree stores the account information and the metadata of a contract account. The metadata includes the tree name of the storage tree and the height value of the root node of the storage tree, and does not participate in the hash calculation of the tree node; the tree name of the storage tree may be represented by a store ID, and the height value of the root node of the storage tree may be represented by commitHeight. A leaf node of the storage tree stores the account status information of the contract account.

In the embodiment of the application, an index (key) may be created for a tree node based on association information of the tree node in a state tree or a storage tree, and data (i.e., value) stored in the state tree or the storage tree by the tree node is stored in a database in a key-value form, where the key of the tree node is the index and the value is the data stored in the state tree or the storage tree by the tree node.

3. Database:

The database provided by the embodiment of the application may be an open-source key-value (KV) database; illustratively, the database may include, but is not limited to, LevelDB. The database may include one or more SSTable (Sorted String Table) files. An SSTable file is used to store an ordered set of key-value pairs, and is typically divided into blocks of a fixed size, yielding a plurality of data blocks of the same size. As shown in FIG. 5a, a schematic diagram of an SSTable file according to an exemplary embodiment of the present application is provided. In FIG. 5a, an SSTable file is divided into n data blocks (i.e., Data Block 1, Data Block 2 ... Data Block n) of 4 KiB each.

As shown in FIG. 5b, a schematic diagram of a data block according to an exemplary embodiment of the present application is provided. The data blocks are used to store data; in embodiments of the present application, a data block stores, in key-value form, the data stored by tree nodes in different state trees or storage trees. In one implementation, each data block is compressed using a compression algorithm (the Snappy algorithm). The more the data in a data block overlaps, the higher the compression ratio. In addition, each data block may also store two auxiliary fields: a compression type (compression type) and a check code (CRC). The compression type indicates whether the data in the data block has been compressed, and the CRC is used to check the data and the compression type in the data block.
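To make the division of an ordered key-value set into fixed-size data blocks concrete, the sketch below packs sorted pairs into blocks of at most 4 KiB. It is a deliberately simplified model: the function name `pack_blocks` is hypothetical, the size accounting counts only key and value bytes, and a real SSTable block also carries encoding overhead that is ignored here.

```python
from typing import List, Tuple

BLOCK_SIZE = 4 * 1024  # 4 KiB, as in the FIG. 5a example

Pair = Tuple[bytes, bytes]


def pack_blocks(pairs: List[Pair], block_size: int = BLOCK_SIZE) -> List[List[Pair]]:
    """Sort pairs by key and fill blocks in order, starting a new block when one is full."""
    blocks: List[List[Pair]] = []
    current: List[Pair] = []
    used = 0
    for key, value in sorted(pairs):
        size = len(key) + len(value)
        if current and used + size > block_size:
            blocks.append(current)
            current, used = [], 0
        current.append((key, value))
        used += size
    if current:
        blocks.append(current)
    return blocks


# Ten pairs of about 1 KB each: four fit into one 4 KiB block.
sample = [(f"k{i:04d}".encode(), b"x" * 1000) for i in range(10)]
sample_blocks = pack_blocks(sample)
```

Because the pairs are placed in key order, keys that sort next to each other tend to land in the same data block, which is the property the index design in this application relies on.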

As can be seen from the above description, a blockchain includes a plurality of blocks, each of which maintains an MPT tree, where the MPT tree is a state tree or a storage tree. When a tree node in the MPT tree is updated, it is highly probable that only the data stored in the other tree nodes associated with that tree node is updated. Therefore, for the same tree node in the MPT trees maintained by different blocks, the data stored in the different versions of the tree node overlaps to a large extent. For example, an expansion node of the MPT tree stores the hashes of 16 child nodes; when the expansion node is updated, it is highly probable that the hashes of only a few child nodes (such as the hash of one child node) are updated. That is, the data stored in different versions of the expansion node (such as the hashes of the 16 child nodes) overlaps to a large extent.
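The benefit of placing overlapping versions in the same data block can be illustrated with a small compression experiment. The sketch below is an assumption-laden stand-in: it uses `zlib` from the Python standard library instead of Snappy, uses SHA-256 digests as the 16 child hashes, and the helper `child_hashes` is hypothetical.

```python
import hashlib
import zlib


def child_hashes(version_tag: str, changed_slot=None) -> bytes:
    """Build the value of a node holding 16 child hashes (32 bytes each).

    Only the hash in `changed_slot` depends on `version_tag`; the other
    15 hashes are identical across versions.
    """
    hashes = []
    for slot in range(16):
        seed = f"child-{slot}"
        if slot == changed_slot:
            seed += f"-{version_tag}"
        hashes.append(hashlib.sha256(seed.encode()).digest())
    return b"".join(hashes)


node_v1 = child_hashes("v1")                  # version in the first block
node_v2 = child_hashes("v2", changed_slot=5)  # one child hash updated

# Compressing both versions together lets the compressor reuse the shared bytes.
same_block = zlib.compress(node_v1 + node_v2)
separate = len(zlib.compress(node_v1)) + len(zlib.compress(node_v2))
```

With the two versions in one compressed unit, the 15 unchanged hashes of the second version are encoded as back-references to the first, so `len(same_block)` is smaller than `separate`.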

Based on this, in order to store the data stored in tree nodes of different versions into the same data block, the embodiment of the present application provides a blockchain-based data processing scheme. The scheme takes a first block and a second block included in the blockchain as an example, where the first block maintains a first MPT tree and the second block maintains a second MPT tree; the first MPT tree and the second MPT tree both contain a target tree node, but the version of the target tree node in the first MPT tree is different from the version in the second MPT tree. A first index of the target tree node is acquired, the first index being constructed according to the association information of the target tree node in the first MPT tree and used for indicating first data stored in the target tree node in the first MPT tree. A second index of the target tree node is acquired, the second index being constructed according to the association information of the target tree node in the second MPT tree and used for indicating second data stored in the target tree node in the second MPT tree. The embodiment of the application therefore constructs different indexes for the different versions of the target tree node based on its association information in the different MPT trees, which provides a basis for storing the data stored by the target tree node in different MPT trees into the same data block. The similarity between the first index and the second index is then obtained, and if the similarity meets the similarity condition, the first data indicated by the first index and the second data indicated by the second index are stored into the same data block of the database.
By making full use of the similarity between the first index and the second index, the first data and the second data can be stored in the same data block of the database, and the data stored by tree nodes of different versions can be effectively managed, so that effective storage management of the data in the blocks is realized.

The blockchain-based data processing method provided in the embodiments of the present application is described below. The blockchain includes a plurality of blocks; in this embodiment, the blockchain includes a first block and a second block. In one implementation, the first block and the second block may be two consecutive blocks on the blockchain. In another implementation, the blocks on the blockchain are divided into a plurality of sections according to a preset index building factor. Illustratively, the blockchain comprises 100 blocks (block 0 to block 99) and the preset index building factor has a value of 10; the blocks on the blockchain are then divided into 10 sections according to the index building factor 10, that is, blocks 0-9 are assigned to section 1, blocks 10-19 to section 2, and so on, up to blocks 90-99 in section 10. The first block and the second block may be any two blocks located within the same section.
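The section assignment in the example above amounts to an integer division of the block height by the index building factor. A minimal sketch, assuming block heights start at 0 and sections are numbered from 1 as in the example; the function names are hypothetical:

```python
def section_of(block_height: int, index_building_factor: int) -> int:
    """Return the 1-based section number that a block height falls into."""
    if index_building_factor <= 0:
        raise ValueError("index building factor must be positive")
    return block_height // index_building_factor + 1


def in_same_section(height_a: int, height_b: int, index_building_factor: int) -> bool:
    """True if two blocks may serve as the first block and the second block."""
    return section_of(height_a, index_building_factor) == section_of(
        height_b, index_building_factor
    )
```

For example, with a factor of 10, blocks 3 and 7 fall into section 1, while blocks 9 and 10 fall into sections 1 and 2 respectively. With a factor of 1, every block forms its own section.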

It should be appreciated that the value of the preset index building factor is determined by the data storage requirement of the blockchain. The data storage requirement includes at least one of the following: the compression rate at the time of data storage, and the processing efficiency at the time of data storage. The processing efficiency at the time of data storage may refer to the efficiency with which data is written into the database when it is stored. When the data storage requirement includes the compression rate, the higher the required compression rate, the larger the value of the preset index building factor, but the lower the processing efficiency at the time of data storage. When the data storage requirement includes the processing efficiency, the higher the required processing efficiency, the smaller the value of the preset index building factor, but the lower the compression rate at the time of data storage.

In practical applications, a suitable index building factor can be set according to the data storage requirement. If the data storage requirement includes the processing efficiency and a higher processing efficiency is required, the preset index building factor can be set smaller (for example, to 1); if the data storage requirement includes the compression rate and a higher compression rate is required, the preset index building factor may be set larger (for example, to 100, 1000, etc.). When the data storage requirement includes both the compression rate and the processing efficiency, the preset index building factor can be given a value that balances the two.

Referring to fig. 6, a flowchart of a data processing method based on a blockchain is provided in an exemplary embodiment of the present application. The blockchain-based data processing method may be performed by a node device in the blockchain network. A first MPT tree is maintained in the first block, and a second MPT tree is maintained in the second block; the first MPT tree and the second MPT tree each contain a target tree node, but the version of the target tree node in the first MPT tree is different from the version in the second MPT tree. The data processing method based on the blockchain in the embodiment of the application comprises the following steps S601-S604:

S601, acquiring a first index of a target tree node; the first index is constructed according to the association information of the target tree node in the first MPT tree; the first index is used to indicate first data stored in a target tree node in the first MPT tree.

S602, acquiring a second index of the target tree node, wherein the second index is constructed according to the association information of the target tree node in a second MPT tree; the second index is used to indicate second data stored in a target tree node in a second MPT tree.

The indexes of the target tree node (i.e., the first index and the second index) may be obtained by applying an index construction rule to the association information of the target tree node in the corresponding MPT tree. The association information may include the block height, the path of the target tree node in the corresponding MPT tree, and the tree name of the MPT tree; that is, the index of the target tree node is constructed from the block height, the path of the target tree node in the corresponding MPT tree, and the tree name of the MPT tree. How the index of the target tree node is constructed from the association information according to the index construction rule is described in the subsequent embodiments and is not repeated here.

It should be understood that the first MPT tree and the second MPT tree according to the embodiments of the present application are MPT trees of the same type. For example, the first MPT tree and the second MPT tree may both be storage trees, or the first MPT tree and the second MPT tree may both be state trees.

(1) When the first MPT tree and the second MPT tree are both state trees, the target tree node may include one of: an expansion node, a branch node and a leaf node in the state tree. Wherein:

When the target tree node includes an expansion node, the first data or the second data stored in the target tree node includes a hash value of a next tree node of the target tree node and a height value of the next tree node. Specifically, the first data stored in the target tree node includes a hash value of a next tree node of the target tree node in the first MPT tree and a height value of the next tree node; the second data stored in the target tree node includes a hash value of a next tree node of the target tree node in the second MPT tree and a height value of the next tree node.

When the target tree node includes a branch node, the first data or the second data stored in the target tree node includes a hash value of a next tree node of the target tree node and a height value of the next tree node. Specifically, the first data stored in the target tree node includes a hash value of a next tree node of the target tree node in the first MPT tree and a height value of the next tree node; the second data stored in the target tree node includes a hash value of a next tree node of the target tree node in the second MPT tree and a height value of the next tree node.

In one implementation, the height value of the next tree node may be equal to the block height of the block in which the next tree node is located. Fig. 7 is a schematic diagram of the positions of a target tree node and a next tree node in a state tree according to an exemplary embodiment of the present application. When a child node is newly added to the target tree node 11 in the state tree maintained by block 1, the target tree node 11 stores commit height=1; when a child node is newly added to the target tree node 12 in the state tree maintained by block 2, the target tree node 12 stores commit height=2.

When the target tree node includes a leaf node, the first data or the second data stored in the target tree node includes account information and metadata, wherein the metadata comprises the tree name of the storage tree corresponding to the contract account to which the account information belongs and the height value of the root node of the storage tree. Specifically, the first data stored in the target tree node includes the account information and metadata corresponding to the target tree node in the first MPT tree, and the second data stored in the target tree node includes the account information and metadata corresponding to the target tree node in the second MPT tree.

(2) When the first MPT tree and the second MPT tree are both storage trees, the first MPT tree and the second MPT tree are both used for storing all account state information of the target contract account in the corresponding block.

Specifically, the first MPT tree is used to store all account status information of the target contract account in the first block, and the second MPT tree is used to store all account status information of the target contract account in the second block. The target tree node may comprise one of: expansion nodes, branch nodes and leaf nodes in the storage tree. Wherein:

When the target tree node includes a leaf node, the first data or the second data stored in the target tree node includes: account status information. Specifically, the first data stored in the target tree node includes account state information of the target contract account in the first MPT tree; the second data stored in the target tree node includes account state information of the target contract account in the second MPT tree.

It should be understood that when the target tree node is a branch node or an expansion node and has a plurality of next tree nodes, the first data stored in the target tree node or the second data stored in the target tree node includes the hash value and the height value of each next tree node of the target tree node.

S603, obtaining the similarity between the first index and the second index.

Wherein obtaining the similarity between the first index and the second index may include, but is not limited to, the following:

(1) The similarity between the first index and the second index represents the length of the common prefix shared by the first index and the second index. Illustratively, the first index is 00110 and the second index is 00111. The first index and the second index share the relatively long common prefix 0011, so the similarity between the first index and the second index is the length of this common prefix, namely 4.
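The common-prefix measure in (1) can be sketched as follows. This is a minimal illustration that treats each index as a character string; the function name `prefix_similarity` is hypothetical and not taken from the application.

```python
import os.path


def prefix_similarity(first_index: str, second_index: str) -> int:
    """Return the length of the common prefix shared by two indexes.

    os.path.commonprefix compares its arguments character by character,
    so it also works for strings that are not file-system paths.
    """
    return len(os.path.commonprefix([first_index, second_index]))


# Example from the text: 00110 and 00111 share the prefix 0011.
print(prefix_similarity("00110", "00111"))
```

With the indexes from the example above, the call evaluates to 4.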

(2) The similarity between the first index and the second index is calculated by a similarity calculation method, which may include, but is not limited to: a cosine similarity calculation method, a Euclidean distance similarity calculation method, and the like. Illustratively, when the similarity calculation method is the Euclidean distance similarity calculation method, vector conversion can be performed on the first index and the second index to obtain a vector of the first index and a vector of the second index, the Euclidean distance between the vector of the first index and the vector of the second index is calculated, and the calculated distance is used as the similarity between the first index and the second index.
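A minimal sketch of the Euclidean distance variant in (2) is given below. The application does not specify how the vector conversion is performed; mapping each character to its Unicode code point is an assumption made here purely for illustration, as are the function names. Under this reading a smaller distance indicates closer indexes.

```python
import math


def index_to_vector(index: str) -> list:
    """Assumed vector conversion: one coordinate per character (its code point)."""
    return [float(ord(ch)) for ch in index]


def euclidean_index_distance(first_index: str, second_index: str) -> float:
    """Euclidean distance between the vectors of two equal-length indexes."""
    if len(first_index) != len(second_index):
        raise ValueError("indexes must have the same number of digits")
    return math.dist(index_to_vector(first_index), index_to_vector(second_index))


# 00110 and 00111 differ only in the last digit ('0' vs '1').
print(euclidean_index_distance("00110", "00111"))
```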

(3) The first index comprises N digits which are sequentially arranged, and the second index comprises N digits which are sequentially arranged; obtaining the similarity between the first index and the second index comprises the following steps: S1, traversing and comparing, according to the arrangement sequence of the N digits, whether the values of the same digits of the first index and the second index are the same. S2, ending the traversal comparison when it first finds a digit whose value in the first index differs from its value in the second index. S3, determining the similarity between the first index and the second index according to the length M of the digits with the same value between the first index and the second index obtained through the traversal comparison; M is a positive integer and M is less than or equal to N.

Illustratively, the first index is 0SH1, which includes 4 sequentially arranged digits: the value of the first digit is 0, the value of the second digit is S, the value of the third digit is H, and the value of the fourth digit is 1. The second index is 0SH2, which also includes 4 sequentially arranged digits: the value of the first digit is 0, the value of the second digit is S, the value of the third digit is H, and the value of the fourth digit is 2. According to the arrangement sequence of the 4 digits, the value of the first digit of the first index is first compared with the value of the first digit of the second index. Since the two values are the same, the comparison continues with the second digit, and so on. When the traversal comparison reaches the fourth digit and finds that the value of the fourth digit of the first index differs from the value of the fourth digit of the second index, the traversal comparison ends. The length of the digits with the same value between the first index and the second index is therefore determined to be 3, and the similarity between the first index and the second index is then determined according to this length.

Determining the similarity between the first index and the second index according to the length M of the digits with the same value obtained by the traversal comparison comprises the following. In one implementation, the value of M is determined as the similarity between the first index and the second index. In another implementation, the ratio of M to N is determined as the similarity between the first index and the second index. Illustratively, if the length M is 3 and N is 4, the ratio of M to N is 3/4, and 3/4 can be determined as the similarity between the first index and the second index.
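Steps S1 to S3 and the two ways of turning M into a similarity can be sketched as follows. The function names are hypothetical, and the use of an exact fraction for the ratio M/N is a presentation choice made here.

```python
from fractions import Fraction


def matching_length(first_index: str, second_index: str) -> int:
    """S1/S2: compare digit by digit and stop at the first differing digit."""
    if len(first_index) != len(second_index) or not first_index:
        raise ValueError("indexes must be non-empty and have the same N digits")
    m = 0
    for a, b in zip(first_index, second_index):
        if a != b:  # S2: first difference ends the traversal comparison
            break
        m += 1
    return m


def similarity_as_ratio(first_index: str, second_index: str) -> Fraction:
    """S3 (second implementation): similarity is the ratio of M to N."""
    n = len(first_index)
    return Fraction(matching_length(first_index, second_index), n)


print(matching_length("0SH1", "0SH2"))      # M
print(similarity_as_ratio("0SH1", "0SH2"))  # M / N
```

For the indexes 0SH1 and 0SH2 this gives M = 3 and a ratio of 3/4, matching the example above.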

S604, if the similarity meets the similarity condition, storing the first data indicated by the first index and the second data indicated by the second index into the same data block of the database.

In the embodiment of the present application, the similarity between the first index and the second index satisfies the similarity condition, which means that the first index and the second index are relatively close, so that the first data indicated by the first index and the second data indicated by the second index can be guaranteed to be stored in the same data block with a high probability. Thus, a similarity condition may be preset, illustratively, the similarity condition includes a similarity threshold, which may be set according to data storage requirements. If the similarity between the first index and the second index is greater than or equal to the similarity threshold, determining that the similarity between the first index and the second index meets the similarity condition, and proceeding to step S604; if the similarity between the first index and the second index is smaller than the similarity threshold, it is determined that the similarity does not satisfy the similarity condition, and at this time, the first data indicated by the first index and the second data indicated by the second index may be stored in different data blocks in the database, respectively.
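The threshold check described above can be sketched as a small decision function. The names and the string return values are illustrative assumptions; the threshold itself would be set according to the data storage requirements.

```python
def meets_similarity_condition(similarity, similarity_threshold) -> bool:
    """The condition is met when the similarity reaches the threshold."""
    return similarity >= similarity_threshold


def choose_placement(similarity, similarity_threshold) -> str:
    """Decide whether the first data and the second data share a data block."""
    if meets_similarity_condition(similarity, similarity_threshold):
        return "same data block"
    return "different data blocks"


# With a common-prefix similarity of 3 and a threshold of 3, the data
# indicated by the two indexes would go into the same data block.
print(choose_placement(3, 3))
print(choose_placement(1, 3))
```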

In the embodiment of the application, a first index of the target tree node is obtained, the first index being constructed according to the association information of the target tree node in the first MPT tree and indicating the first data stored in the target tree node in the first MPT tree; a second index of the target tree node is obtained, the second index being constructed according to the association information of the target tree node in the second MPT tree and indicating the second data stored in the target tree node in the second MPT tree. The embodiment of the application thus constructs different indexes for the different versions of the target tree node based on the association information of the target tree node in the different MPT trees, which provides a basis for storing the data stored by the target tree node in the different MPT trees into the same data block. The similarity between the first index and the second index is then obtained, and if the similarity meets the similarity condition, the first data indicated by the first index and the second data indicated by the second index are stored into the same data block of the database. By making full use of the similarity between the first index and the second index to store the first data and the second data in the same data block of the database, the data stored by tree nodes of different versions can be effectively managed, so that effective storage management of the data in the block is realized.

Referring to fig. 8, a flowchart of a data processing method based on a blockchain is provided in another exemplary embodiment of the present application. The blockchain-based data processing method may be performed by a node device in the blockchain network. In this embodiment, the first block and the second block included in the blockchain are also described as an example. The data processing method based on the block chain comprises the following steps S801-S806:

S801, acquiring a first index of a target tree node; the first index is constructed according to the association information of the target tree node in the first MPT tree; the first index is used to indicate first data stored in a target tree node in a first MPT tree.

S802, acquiring a second index of the target tree node, wherein the second index is constructed according to the association information of the target tree node in a second MPT tree; the second index is used to indicate second data stored in a target tree node in a second MPT tree.

S803, obtaining the similarity between the first index and the second index.

For the specific implementation of steps S801 to S803, reference may be made to the specific implementation of steps S601 to S603, which is not repeated here.

S804, if the similarity meets the similarity condition, based on the first index and the second index, writing the first data indicated by the first index and the second data indicated by the second index into the same data block in the ordered string table file in sequence.

In a specific implementation, the relative order of the first index and the second index is determined. If the first index is smaller than the second index, the first data indicated by the first index is first written, based on the first index, into a target data block of the ordered string table file, and the second data indicated by the second index is then written, based on the second index, into the same target data block. When the second data is written, only the part of the second index other than the part it shares with the first index is stored in the target data block, so that content repeated between indexes is not stored twice.
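The ordered write with shared index parts omitted can be sketched as follows. This is an illustration under stated assumptions: indexes are compared lexicographically as strings, the shared part is taken to be the common prefix with the previously written index, and each entry records the shared length so the full index can be rebuilt. The function names and the tuple layout are hypothetical and do not describe the actual ordered string table file format.

```python
def write_block(entries: dict) -> list:
    """Write (index, data) pairs in index order, storing only the index suffix
    that differs from the previously written index.

    Returns a list of (shared_length, index_suffix, data) tuples.
    """
    block = []
    previous = ""
    for index in sorted(entries):  # smaller index is written first
        shared = 0
        for a, b in zip(previous, index):
            if a != b:
                break
            shared += 1
        block.append((shared, index[shared:], entries[index]))
        previous = index
    return block


def read_indexes(block: list) -> list:
    """Rebuild the full indexes from the shared lengths and suffixes."""
    indexes = []
    previous = ""
    for shared, suffix, _data in block:
        index = previous[:shared] + suffix
        indexes.append(index)
        previous = index
    return indexes


block = write_block({"0SH2": b"second data", "0SH1": b"first data"})
print(block)
print(read_indexes(block))
```

In this sketch the second entry stores only the suffix "2" together with a shared length of 3, rather than the full index 0SH2.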

It should be noted that the above takes only the target tree node in the MPT tree as an example; all other tree nodes in the MPT tree can store data in the same way as the target tree node. In addition, in a specific implementation, the blockchain includes a plurality of blocks, and each block maintains a corresponding MPT tree. The MPT tree maintained by each block may include the target tree node, and the versions of the target tree node in the MPT trees maintained by the respective blocks are different. In this case, each index of the target tree node is obtained by constructing the association information of the target tree node in the MPT tree maintained by the corresponding block; the similarity between the indexes of the target tree node is then obtained, it is determined whether the similarity between the indexes meets the similarity condition, and the data indicated by the indexes whose similarity meets the similarity condition is stored into the same data block. It should be understood that, since a data block has a fixed size, the data indicated by the indexes whose similarity meets the similarity condition may be stored in the same data block only as long as the size of the data block is not exceeded.

For example, take a blockchain including a first block, a second block, and a third block, where the first block, the second block, and the third block are three consecutive blocks in the same interval. The first block maintains a first MPT tree, the second block maintains a second MPT tree, and the third block maintains a third MPT tree, and the first MPT tree, the second MPT tree, and the third MPT tree each include the target tree node. The first index of the target tree node is obtained by constructing the association information of the target tree node in the first MPT tree, the second index of the target tree node is obtained by constructing the association information of the target tree node in the second MPT tree, and the third index of the target tree node is obtained by constructing the association information of the target tree node in the third MPT tree. If the similarity between the first index and the second index meets the similarity condition, the similarity between the second index and the third index meets the similarity condition, and the first data indicated by the first index, the second data indicated by the second index, and the third data indicated by the third index together occupy less than the fixed size of the data block, then the first data, the second data, and the third data may be stored in the same data block.
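The three-block example, together with the fixed data block size, can be sketched as a simple greedy grouping. This is only an illustration: the common-prefix length is used as the similarity, only the data bytes are counted against the block size, and the function names, threshold, and sizes are all assumptions.

```python
def shared_prefix_length(a: str, b: str) -> int:
    """Length of the common prefix of two indexes."""
    length = 0
    for x, y in zip(a, b):
        if x != y:
            break
        length += 1
    return length


def pack_into_blocks(entries: dict, threshold: int, block_size: int) -> list:
    """Group indexes (in sorted order) into data blocks.

    An index joins the current block only if it is similar enough to the
    previous index and its data still fits within the fixed block size.
    """
    blocks, current, used, previous = [], [], 0, None
    for index in sorted(entries):
        size = len(entries[index])
        similar = previous is not None and \
            shared_prefix_length(previous, index) >= threshold
        if current and similar and used + size <= block_size:
            current.append(index)
            used += size
        else:
            if current:
                blocks.append(current)
            current, used = [index], size
        previous = index
    if current:
        blocks.append(current)
    return blocks


entries = {"0SH1": b"a" * 10, "0SH2": b"b" * 10, "0SH3": b"c" * 10}
print(pack_into_blocks(entries, threshold=3, block_size=64))
print(pack_into_blocks(entries, threshold=3, block_size=25))
```

With a block size of 64 all three indexes land in one block; with a block size of 25 the third index starts a new block because its data no longer fits.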

Wherein the more indexes of tree nodes satisfy the similarity condition, the higher the degree of overlap between the data written into the same data block. As can be seen from fig. 2, each time a new block is released, the data stored in some tree nodes of the MPT tree may change; however, the change is not made in place at the corresponding tree node of the original state tree. Instead, some branches are newly created, and the data stored in the tree nodes of the original state tree is retained. Because the indexes of different versions of a tree node are relatively close, and the sstable file writes the corresponding data in sequence according to the index, the data stored by different versions of the tree node can be written into the same data block, which increases the degree of overlap of the data written into that data block. For example, in the above example, if the similarities between the first index, the second index, and the third index all satisfy the similarity condition, the first index, the second index, and the third index are relatively close. Based on the first index, the second index, and the third index, the first data indicated by the first index, the second data indicated by the second index, and the third data indicated by the third index are sequentially written into the same data block. The first data, the second data, and the third data written into the same data block then have a larger degree of data overlap, which benefits the subsequent compression of the data in the data block.

S805, compressing the data blocks written with the first data and the second data by adopting a compression algorithm to obtain compressed data blocks. Among them, compression algorithms include, but are not limited to: the snappy algorithm.

The higher the degree of overlap between data written in the same data block, the higher the compression rate at the time of data storage, but the lower the processing efficiency at the time of data storage.
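The effect of data overlap on the compressed size can be illustrated with a small experiment. Note the substitution: the text names the snappy algorithm, whereas this sketch uses `zlib` from the Python standard library as a stand-in so that it runs without extra packages. The node contents are synthetic bytes generated for the example, not real tree node data.

```python
import hashlib
import zlib


def pseudo_random_bytes(seed: bytes, length: int) -> bytes:
    """Deterministic, hard-to-compress bytes standing in for node contents."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


# Two versions of the same tree node: identical except for the last 4 bytes.
first_data = pseudo_random_bytes(b"target tree node", 512)
second_data = first_data[:-4] + b"\x00\x00\x00\x02"

# Compressed separately (as if stored in different data blocks) ...
separate = len(zlib.compress(first_data)) + len(zlib.compress(second_data))
# ... versus compressed together (as if stored in the same data block).
together = len(zlib.compress(first_data + second_data))

print(separate, together)
```

Because the second version largely repeats the first, compressing both in one block yields a smaller total than compressing them separately.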

S806, storing the compressed data blocks into an ordered string table file of a database.

There may be a plurality of ordered string table files, and the ordered string table files can be stored in the database in layers. When any ordered string table file is merged with the ordered string table file of the next layer, the smaller the preset index construction factor, the smaller the degree of overlap between the indexes stored in that ordered string table file and the indexes stored in the ordered string table file of the next layer, and the higher the processing efficiency at the time of data storage, but the lower the compression rate at the time of data storage. Conversely, the larger the degree of overlap between the indexes stored in that ordered string table file and the indexes stored in the ordered string table file of the next layer, the lower the processing efficiency at the time of data storage, but the higher the compression rate at the time of data storage. Illustratively, the indexes stored in the current ordered string table file include 0-100, and the indexes stored in the ordered string table file of the next layer include 50-200. The degree of overlap between the two is large, so the indexes in the current ordered string table file and the indexes in the ordered string table file of the next layer need to be merged, and the indexes 50-100 in the ordered string table file of the next layer are put into the current ordered string table file; as a result, the processing efficiency at the time of data storage becomes lower and the compression rate at the time of data storage becomes higher. If the indexes in the current ordered string table file do not overlap with the indexes in the ordered string table file of the next layer, the merging process may not be needed, so the processing efficiency at the time of data storage is improved, but the compression rate is lower.
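The overlap check between the index ranges of two layers can be sketched as follows. Representing each ordered string table file by an inclusive (lowest, highest) integer index range is a simplifying assumption, and the function names are hypothetical.

```python
def ranges_overlap(current: tuple, next_layer: tuple) -> bool:
    """True when two inclusive index ranges share at least one index."""
    cur_lo, cur_hi = current
    nxt_lo, nxt_hi = next_layer
    return cur_lo <= nxt_hi and nxt_lo <= cur_hi


def overlapping_range(current: tuple, next_layer: tuple):
    """Return the shared index range that a merge would touch, or None."""
    if not ranges_overlap(current, next_layer):
        return None
    return (max(current[0], next_layer[0]), min(current[1], next_layer[1]))


# Example from the text: 0-100 in the current file, 50-200 in the next layer.
print(overlapping_range((0, 100), (50, 200)))
# No shared indexes, so no merge would be needed in this sketch.
print(overlapping_range((0, 100), (101, 200)))
```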

In the embodiment of the application, a first index of the target tree node is obtained, the first index being constructed according to the association information of the target tree node in the first MPT tree and indicating the first data stored in the target tree node in the first MPT tree; a second index of the target tree node is obtained, the second index being constructed according to the association information of the target tree node in the second MPT tree and indicating the second data stored in the target tree node in the second MPT tree. The embodiment of the application thus constructs different indexes for the different versions of the target tree node based on the association information of the target tree node in the different MPT trees, which provides a basis for storing the data stored by the target tree node in the different MPT trees into the same data block. The similarity between the first index and the second index is then obtained; if the similarity meets the similarity condition, the first data indicated by the first index and the second data indicated by the second index are written, based on the first index and the second index, into the same data block in the ordered string table file in sequence, the data block written with the first data and the second data is compressed by a compression algorithm to obtain a compressed data block, and the compressed data block is stored into the ordered string table file of the database.
Each time a new block is released, the data stored by some tree nodes in the MPT tree may change; however, the change is not made in place at the corresponding tree node of the original state tree. Instead, some branches are newly created and the data stored by the original tree node is retained, so the data stored by different versions of the target tree node (such as different versions of an expansion node) may overlap to a large extent. In this embodiment, the indexes of the different versions of the target tree node are constructed based on the association information of those versions in the corresponding MPT trees, so that the indexes of the target tree node meeting the similarity condition are relatively close. When the first data and the second data are written into the data block in sequence, the first data and the second data with a large degree of data overlap can therefore be written into the same data block based on the relatively close indexes, which improves the data compression rate, reduces the size of the stored data, and improves the scalability of the node device to a certain extent.

The following describes the index construction manner of the target tree node provided in the embodiments of the present application. It should be understood that the index construction manner is described here by taking only the target tree node as an example; the other tree nodes in the first MPT tree and the second MPT tree can construct their corresponding indexes in the same manner as the target tree node.

In some alternative embodiments, the index of the target tree node may be constructed by constructing association information of the target tree node in the corresponding MPT tree according to an index construction rule. Wherein the association information includes block height, paths of the target tree nodes in the corresponding MPT tree, and tree names of the MPT tree.

Specifically, the first index of the target tree node may be obtained by constructing association information of the target tree node in the first MPT tree according to an index construction rule, and the second index of the target tree node may be obtained by constructing association information of the target tree node in the second MPT tree according to an index construction rule.

(1) The first index of the target tree node may be obtained by constructing association information of the target tree node in the first MPT tree according to an index construction rule.

The construction method of the first index of the target tree node comprises the following steps: constructing the first index of the target tree node according to the index construction rule based on the block height of the first block, the path of the target tree node in the first MPT tree, and the tree name of the first MPT tree. Specifically, according to the index construction rule, the block height of the first block is divided by the preset index construction factor and the quotient is rounded to obtain a first rounding result, and the remainder of dividing the block height of the first block by the index construction factor is taken to obtain a first remainder result; then, according to the index construction rule, the first rounding result, the tree name of the first MPT tree, the path of the target tree node in the first MPT tree, and the first remainder result are spliced to obtain the first index of the target tree node.

Illustratively, the preset index construction factor is denoted as factor, and the index construction rule is expressed as "block height / factor + tree name + path + block height % factor", where "block height / factor" denotes the rounded quotient of the block height divided by the preset index construction factor, "block height % factor" denotes the remainder of the block height divided by the preset index construction factor, and "+" denotes splicing. Suppose the block height of the first block is 1, the factor is 10, the tree name of the first MPT tree is S, and the path of the target tree node in the first MPT tree is H. According to the index construction rule, dividing the block height 1 by 10 and rounding the quotient gives a first rounding result of 0, and taking the remainder of the block height 1 divided by 10 gives a first remainder result of 1. The first rounding result 0, the tree name S of the first MPT tree, the path H, and the first remainder result 1 are then spliced according to the index construction rule to obtain a first index of 0SH1.

(2) The second index of the target tree node may be obtained by constructing association information of the target tree node in the second MPT tree according to an index construction rule.

The construction method of the second index of the target tree node comprises the following steps: according to the index construction rule, the block height of the second block is divided by the preset index construction factor and the quotient is rounded to obtain a second rounding result, and the remainder of dividing the block height of the second block by the index construction factor is taken to obtain a second remainder result; then, according to the index construction rule, the second rounding result, the tree name of the second MPT tree, the path of the target tree node in the second MPT tree, and the second remainder result are spliced to obtain the second index of the target tree node. Illustratively, the index construction rule is expressed as "block height / factor + tree name + path + block height % factor". Suppose the block height of the second block is 2, the factor is 10, the tree name of the second MPT tree is S, and the path of the target tree node in the second MPT tree is H. According to the index construction rule, dividing the block height 2 by 10 and rounding the quotient gives a second rounding result of 0, and taking the remainder of the block height 2 divided by 10 gives a second remainder result of 2. The second rounding result 0, the tree name S of the second MPT tree, the path H, and the second remainder result 2 are then spliced according to the index construction rule to obtain a second index of 0SH2.
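The index construction rule "block height / factor + tree name + path + block height % factor" can be sketched as follows. Two assumptions are made here: the rounding operation is read as rounding down (integer division), which agrees with the worked examples, and the parts are spliced as plain decimal strings without separators or padding, since the application does not specify an encoding. The function name `build_index` is hypothetical.

```python
def build_index(block_height: int, tree_name: str, path: str, factor: int) -> str:
    """Splice the rounded quotient, tree name, path, and remainder into an index."""
    if factor <= 0:
        raise ValueError("the index construction factor must be positive")
    rounding_result = block_height // factor   # block height / factor, rounded down
    remainder_result = block_height % factor   # block height % factor
    return f"{rounding_result}{tree_name}{path}{remainder_result}"


# First block: height 1; second block: height 2; factor 10, tree name S, path H.
first_index = build_index(1, "S", "H", 10)
second_index = build_index(2, "S", "H", 10)
print(first_index, second_index)
```

With these inputs the sketch reproduces the indexes 0SH1 and 0SH2 from the examples: the two versions differ only in the trailing remainder, so they share the prefix 0SH.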

Wherein the first MPT tree and the second MPT tree have the same tree name. As an implementation, when the first MPT tree and the second MPT tree are both state trees, the tree names of the first MPT tree and the second MPT tree are both a preset fixed string, such as "S", "SS", and so on. As another implementation, when the first MPT tree and the second MPT tree are both storage trees, the first MPT tree and the second MPT tree have the same tree name, and the tree name may be defined in either of the following two ways: (1) The first MPT tree and the second MPT tree are used for storing all account state information of the target contract account in the corresponding block, and the tree names of the first MPT tree and the second MPT tree are the address of the target contract account. (2) The tree names of the first MPT tree and the second MPT tree are determined jointly by the block height of the block in which the target contract account was first created and the order in which the target contract account was created in that block. It should be appreciated that the block in which the target contract account is first created may be the first block, the second block, or another block. Illustratively, if the target contract account is first created in block 1 (block 1 has a block height of 1) and its creation order in block 1 is 2 (i.e., the target contract account is the second account created in block 1), then the tree names of the first MPT tree and the second MPT tree may be 12.

It should be appreciated that the above index construction rules are defined based on index construction factors; in one implementation, as can be seen from the foregoing, the blocks on the blockchain are divided into a plurality of intervals according to a preset index construction factor, and the definition principle of the index construction rule includes: in the MPT tree maintained by different blocks located within the same interval, the similarity between different indexes of the same tree node should satisfy the similarity condition.

The larger the value of the index construction factor, the more blocks fall within the same interval and the more indexes satisfy the similarity condition; the smaller the value of the index construction factor, the fewer blocks fall within the same interval and the fewer indexes satisfy the similarity condition. Illustratively, when the index construction factor is 1000, the number of blocks in the same interval is 999; when the index construction factor is 100, the number of blocks in the same interval is 99.
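Illustratively, the division of blocks into intervals may be sketched as follows. The sketch assumes that the interval of a block is the integer quotient of its block height and the index construction factor, and that block heights start at 1; both are assumptions, and the names `interval_of` and `blocks_per_interval` are hypothetical.

```python
from collections import Counter


def interval_of(block_height: int, factor: int) -> int:
    """Interval number of a block (assumed: integer quotient of height by factor)."""
    return block_height // factor


def blocks_per_interval(max_height: int, factor: int, first_height: int = 1) -> Counter:
    """Count how many block heights fall into each interval."""
    return Counter(interval_of(h, factor) for h in range(first_height, max_height + 1))


counts = blocks_per_interval(max_height=2999, factor=1000)
print(counts[0], counts[1], counts[2])  # prints "999 1000 1000"
```

Under these assumptions the first interval holds heights 1 to 999, i.e., 999 blocks, which agrees with the illustrative figure above, while each later interval would hold 1000 blocks.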

In summary, by constructing different indexes of the target tree node from the association information of the target tree node in different MPT trees according to the above index construction rule, a certain similarity is ensured between the different indexes of the target tree node. This makes it possible to store the data indicated by different indexes whose similarity satisfies the similarity condition (for example, the first data indicated by the first index and the second data indicated by the second index) into the same data block of the database.

The following explains how the data stored by any tree node is read from the database in the embodiments of the present application. Referring to fig. 9, fig. 9 is a flowchart of a blockchain-based data processing method provided in an exemplary embodiment of the present application. The method may be performed by a node device in the blockchain network. This embodiment takes reading, from the database, the data stored by the target tree node in the first MPT tree as an example; the data stored by other tree nodes in other MPT trees may be read in the same way. In an embodiment of the present application, the blockchain-based data processing method may include the following steps S901 to S903:

S901, acquiring association information of a target tree node in a first MPT tree, wherein the association information comprises: the block height of the first block, the path of the target tree node in the first MPT tree, and the tree name of the first MPT tree.

In one implementation, when the first MPT tree is a state tree or a storage tree, the association information includes the block height of the first block, and obtaining the association information of the target tree node in the first MPT tree may specifically include: determining the previous tree node of the target tree node in the first MPT tree, and reading the height value of the target tree node in the first MPT tree from that previous tree node.

In another implementation, when the first MPT tree is a storage tree and the target tree node is the root node of the first MPT tree, obtaining the association information of the target tree node in the first MPT tree may specifically include: acquiring the state tree corresponding to the first MPT tree, reading the tree name of the first MPT tree and the height value of the target tree node in the first MPT tree from a leaf node of the state tree, and determining a null value as the path of the target tree node in the first MPT tree. Specifically, the data stored in the leaf node of the state tree corresponding to the first MPT tree includes metadata, the metadata includes the tree name of the first MPT tree and the height value of the root node of the first MPT tree, and the tree name of the first MPT tree and the height value of the target tree node in the first MPT tree are read from the metadata stored in the leaf node.

The height value of the target tree node in the first MPT tree may be equal to the block height of the first block.
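Illustratively, obtaining the association information of a storage tree root node from the metadata of a state tree leaf node may be sketched as follows. The class and field names are hypothetical, and representing the null path as an empty string is an assumption.

```python
from dataclasses import dataclass


@dataclass
class LeafMetadata:
    """Metadata stored in a state tree leaf node (field names are hypothetical)."""
    storage_tree_name: str
    storage_root_height: int


@dataclass
class AssociationInfo:
    block_height: int
    path: str
    tree_name: str


def association_info_for_storage_root(meta: LeafMetadata) -> AssociationInfo:
    """Build the association information of the root node of a storage tree.

    The height value of the root node is taken as the block height, and the
    path of a root node is a null value (assumed here to be the empty string).
    """
    return AssociationInfo(
        block_height=meta.storage_root_height,
        path="",
        tree_name=meta.storage_tree_name,
    )


info = association_info_for_storage_root(LeafMetadata("12", 1234))
print(info)
```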

S902, constructing, according to an index construction rule, a first index of the target tree node from the block height of the first block, the path of the target tree node in the first MPT tree, and the tree name of the first MPT tree.

For the specific implementation of step S902, reference may be made to the above process of constructing the first index of the target tree node according to the index construction rule, which is not repeated here.

S903, according to the first index, searching a data block where the first data indicated by the first index is located from the database, and reading the first data from the searched data block.

In a specific implementation, the ordered string table file in which the first index is located is determined; the data block in which the first data indicated by the first index is located is then searched for in the determined ordered string table file according to the first index, and the first data is read from the found data block.
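Illustratively, the lookup within one ordered string table file may be sketched as follows. This is a toy in-memory model, not the storage format of any particular database: the class `SortedStringTable`, its methods, and the sample index strings are all hypothetical, and the step of choosing among several files is omitted.

```python
import bisect


class SortedStringTable:
    """Toy in-memory stand-in for one ordered string table file."""

    def __init__(self, blocks):
        # blocks: non-empty data blocks, each a list of (index, data) pairs,
        # sorted by index both within a block and across blocks.
        self.blocks = blocks
        self.first_keys = [block[0][0] for block in blocks]

    def find_block(self, index):
        """Return the data block whose key range could contain the index."""
        pos = bisect.bisect_right(self.first_keys, index) - 1
        return self.blocks[pos] if pos >= 0 else None

    def get(self, index):
        """Read the data indicated by the index, or None if it is absent."""
        block = self.find_block(index)
        if block is None:
            return None
        keys = [key for key, _ in block]
        i = bisect.bisect_left(keys, index)
        if i < len(keys) and keys[i] == index:
            return block[i][1]
        return None


table = SortedStringTable([
    [("0S0a001", b"v1"), ("0S0a002", b"v2")],
    [("1S0a000", b"v3")],
])
print(table.get("0S0a002"))  # b'v2'
```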

In this embodiment, the first index of the target tree node is constructed from the acquired association information of the target tree node in the first MPT tree, so that the first data stored by the target tree node can be accurately read from the database.

The blockchain-based data processing apparatus provided in the embodiments of the present application is described below.

Referring to fig. 10, fig. 10 is a schematic structural diagram of a blockchain-based data processing apparatus provided in an embodiment of the present application, where the blockchain-based data processing apparatus may be a computer program (including program code) in a node device, for example, the blockchain-based data processing apparatus may be an application software in the node device; the blockchain-based data processing device may be used to perform some or all of the steps in the method embodiments shown in fig. 6, 8, and 9. Referring to fig. 10, the blockchain includes a first block in which a first MPT tree is maintained and a second block in which a second MPT tree is maintained; the first MPT tree and the second MPT tree each contain a target tree node, but the version of the target tree node in the first MPT tree is different from the version in the second MPT tree, the blockchain-based data processing apparatus comprising:

A processing unit 1001, configured to obtain a first index of a target tree node; the first index is constructed according to the association information of the target tree node in the first MPT tree; the first index is used for indicating first data stored in a target tree node in the first MPT tree;

the processing unit 1001 is further configured to obtain a second index of the target tree node, where the second index is constructed according to association information of the target tree node in a second MPT tree; the second index is used for indicating second data stored in a target tree node in a second MPT tree;

the processing unit 1001 is further configured to obtain a similarity between the first index and the second index;

the storage unit 1002 is configured to store the first data indicated by the first index and the second data indicated by the second index into the same data block of the database if the similarity satisfies the similarity condition.

The block chain comprises a plurality of blocks, and the blocks on the block chain are divided into a plurality of intervals according to a preset index construction factor; the first block and the second block refer to any two blocks located in the same interval;

the value of the index construction factor is determined by the data storage requirement of the blockchain; the data storage requirements include at least one of: compression rate at the time of data storage and processing efficiency at the time of data storage; the higher the compression rate is, the larger the value of the index construction factor is; the higher the processing efficiency, the smaller the value of the index building factor.

The index of the target tree node is obtained by constructing the association information of the target tree node in the corresponding MPT tree according to an index construction rule;

the index construction rule is defined based on an index construction factor; the definition principle of the index construction rule comprises: in MPT trees maintained by different blocks located in the same interval, the similarity between different indexes of the same tree node should satisfy a similarity condition;

the larger the value of the index construction factor, the more blocks fall within the same interval and the more indexes satisfy the similarity condition; the smaller the value of the index construction factor, the fewer blocks fall within the same interval and the fewer indexes satisfy the similarity condition.

Wherein, the association information includes: block height, path of the target tree node in the corresponding MPT tree, and tree name of the MPT tree;

the construction method of the first index comprises the following steps:

performing a rounding operation on the quotient between the block height of the first block and a preset index construction factor according to an index construction rule to obtain a first rounding result; performing a remainder operation on the quotient between the block height of the first block and the index construction factor to obtain a first remainder result;

according to the index construction rule, performing splicing processing on the first rounding result, the tree name of the first MPT tree, the path of the target tree node in the first MPT tree and the first remainder result to obtain a first index of the target tree node;

the construction method of the second index comprises the following steps:

performing a rounding operation on the quotient between the block height of the second block and the index construction factor according to the index construction rule to obtain a second rounding result; performing a remainder operation on the quotient between the block height of the second block and the index construction factor to obtain a second remainder result;

and according to the index construction rule, performing splicing processing on the second rounding result, the tree name of the second MPT tree, the path of the target tree node in the second MPT tree and the second remainder result to obtain a second index of the target tree node.
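Illustratively, the construction of an index by rounding, remainder and splicing may be sketched as follows. Only the order of the four spliced parts is taken from the above description; the function name `build_index`, the zero-padded decimal encoding, the field widths and the sample tree name and path are assumptions made for illustration.

```python
def build_index(block_height: int, factor: int, tree_name: str, path: str,
                quotient_width: int = 8) -> str:
    """Splice: rounding result + tree name + path + remainder result.

    Assumption: the quotient and remainder are written as zero-padded decimal
    strings so that indexes of the same node have the same number of digits.
    """
    quotient, remainder = divmod(block_height, factor)
    remainder_width = len(str(factor - 1))
    return (f"{quotient:0{quotient_width}d}{tree_name}{path}"
            f"{remainder:0{remainder_width}d}")


first_index = build_index(1234, 1000, "S", "0a1b")
second_index = build_index(1999, 1000, "S", "0a1b")
print(first_index)   # 00000001S0a1b234
print(second_index)  # 00000001S0a1b999
```

Because blocks 1234 and 1999 share the same rounding result, the two indexes differ only in their trailing remainder digits, which is what lets them sort next to each other.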

Wherein the first index comprises N digits which are sequentially arranged, and the second index comprises N digits which are sequentially arranged; the processing unit 1001 is specifically configured to:

traversing and comparing whether values of the same digits between the first index and the second index are the same according to the arrangement sequence of the N digits;

ending the traversal comparison when, for the first time, the values at the same digit position of the first index and the second index are found to be different; and

determining the similarity between the first index and the second index according to the length M of the digits with the same value between the first index and the second index obtained by the traversal comparison; M is a positive integer and M is less than or equal to N.

The processing unit 1001 is specifically configured to:

determining the value of M as the similarity between the first index and the second index; or,

determining the proportion of M in N (i.e., M/N) as the similarity between the first index and the second index;

the similarity condition comprises a similarity threshold, and if the similarity between the first index and the second index is greater than or equal to the similarity threshold, the similarity meets the similarity condition; if the similarity between the first index and the second index is smaller than the similarity threshold, the similarity does not satisfy the similarity condition.
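Illustratively, the traversal comparison and the similarity condition may be sketched as follows. The function names and the sample index strings are hypothetical; the indexes are treated as equal-length strings whose characters are the N digits.

```python
def common_prefix_length(first_index: str, second_index: str) -> int:
    """Length M of the leading digits that are equal in both indexes."""
    m = 0
    for a, b in zip(first_index, second_index):
        if a != b:  # first differing digit ends the traversal comparison
            break
        m += 1
    return m


def similarity(first_index: str, second_index: str, as_ratio: bool = False):
    """Similarity as M, or as the proportion M / N when as_ratio is True."""
    n = len(first_index)
    if n == 0 or n != len(second_index):
        raise ValueError("both indexes must have the same non-zero length N")
    m = common_prefix_length(first_index, second_index)
    return m / n if as_ratio else m


def meets_similarity_condition(value, threshold) -> bool:
    """The condition is met when the similarity reaches the threshold."""
    return value >= threshold


a, b = "00000001S0a1b234", "00000001S0a1b999"
print(similarity(a, b))                 # 13
print(similarity(a, b, as_ratio=True))  # 0.8125
```

The threshold must be expressed in the same unit as the chosen similarity measure, i.e., a digit count when M is used and a fraction when M/N is used.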

The first MPT tree and the second MPT tree are both state trees and have the same tree name, and the tree name is a preset fixed character string; the state tree is used for storing addresses of all contract accounts in the corresponding block and account information of each contract account; the target tree node includes one of: extension nodes, branch nodes and leaf nodes in the state tree;

when the target tree node comprises an extension node, the first data or the second data stored in the target tree node comprises a hash value of a next tree node of the target tree node and a height value of the next tree node;

when the target tree node comprises a branch node, the first data or the second data stored in the branch node comprises a hash value of a next tree node of the branch node and a height value of the next tree node;

when the target tree node includes a leaf node, the first data or the second data stored in the leaf node includes: account information and metadata, wherein the metadata comprises a tree name of a storage tree corresponding to a contract account where the account information is located and a height value of a root node of the storage tree.

The first MPT tree and the second MPT tree are both storage trees, and are used for storing all account state information of the target contract account in the corresponding block; the target tree node includes one of: extension nodes, branch nodes and leaf nodes in the storage tree;

when the target tree node comprises a branch node, the first data or the second data stored in the branch node comprises a hash value of a next tree node of the target tree node and a height value of the next tree node;

When the target tree node includes a leaf node, the first data or the second data stored in the leaf node includes: account status information;

wherein the first MPT tree and the second MPT tree have the same tree name, the tree name is the address of the target contract account, or the tree name is determined by the block height of the block in which the target contract account is first created and the order in which the target contract account is created in the block.
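Illustratively, the data stored by the different kinds of tree nodes described above may be modeled as follows. All class and field names are hypothetical, byte strings are an assumed encoding, and allowing a branch node to hold several child references is an assumption based on the usual structure of an MPT tree.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChildRef:
    """Reference to a next tree node: its hash value and its height value."""
    child_hash: bytes
    child_height: int


@dataclass
class ExtensionNode:
    next_node: ChildRef


@dataclass
class BranchNode:
    children: List[ChildRef] = field(default_factory=list)


@dataclass
class StateLeaf:
    """Leaf of a state tree: account information plus metadata."""
    account_info: bytes
    storage_tree_name: str
    storage_root_height: int


@dataclass
class StorageLeaf:
    """Leaf of a storage tree: account state information."""
    account_state: bytes


leaf = StateLeaf(account_info=b"acct", storage_tree_name="12", storage_root_height=7)
branch = BranchNode(children=[ChildRef(child_hash=b"\x01", child_height=7)])
print(leaf, branch)
```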

Wherein the database comprises an ordered character string table file; the ordered string table file includes data blocks; the storage unit 1002 is specifically configured to:

based on the first index and the second index, writing the first data indicated by the first index and the second data indicated by the second index into the same data block in the ordered string table file in sequence;

adopting a compression algorithm to compress the data blocks written with the first data and the second data to obtain compressed data blocks;

storing the compressed data blocks into an ordered character string table file of a database;

wherein, the more the indexes of the tree nodes satisfying the similar condition are, the higher the overlapping degree between the data written in the same data block is, the higher the compression rate at the time of data storage is, but the lower the processing efficiency at the time of data storage is.
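Illustratively, the effect of writing overlapping data into the same data block before compression may be sketched as follows. The present application does not name a compression algorithm, so `zlib` is used here only as a stand-in; the helper `fake_node_data` and the generated contents are hypothetical test data, not real node data.

```python
import hashlib
import zlib


def fake_node_data(seed: str, slots: int = 16) -> bytes:
    """Hypothetical node payload: a run of hex-encoded hash values."""
    return "".join(
        hashlib.sha256(f"{seed}-{i}".encode()).hexdigest() for i in range(slots)
    ).encode()


first_data = fake_node_data("v1")
# Second version of the same node: identical except for one changed hash slot.
second_data = (first_data[:64]
               + hashlib.sha256(b"changed").hexdigest().encode()
               + first_data[128:])

# Compressing each version on its own versus compressing one shared data block.
separate = len(zlib.compress(first_data)) + len(zlib.compress(second_data))
together = len(zlib.compress(first_data + second_data))
print(separate, together)
```

In this sketch the shared data block compresses to fewer bytes than the two versions compressed separately, because the second version can largely be encoded as references back to the first.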

Wherein, the processing unit 1001 is further configured to:

acquiring association information of a target tree node in a first MPT tree, wherein the association information comprises: the block height of the first block, the path of the target tree node in the first MPT tree and the tree name of the first MPT tree;

constructing, according to an index construction rule, a first index of the target tree node from the block height of the first block, the path of the target tree node in the first MPT tree and the tree name of the first MPT tree;

according to the first index, searching a data block where the first data indicated by the first index is located from the database, and reading the first data from the searched data block.

The first MPT tree is a storage tree, and the target tree node is a root node in the first MPT tree; the processing unit is specifically used for:

acquiring a state tree corresponding to the first MPT tree;

reading the tree name of the first MPT tree and the height value of the target tree node in the first MPT tree from the leaf nodes of the state tree, wherein the height value of the target tree node in the first MPT tree is equal to the block height of the first block;

a null value is determined as the path of the target tree node in the first MPT tree.

In an embodiment of the present application, a blockchain includes a first block in which a first MPT (Merkle Patricia Trie) tree is maintained and a second block in which a second MPT tree is maintained; the first MPT tree and the second MPT tree each contain a target tree node, but the version of the target tree node in the first MPT tree is different from the version in the second MPT tree. A first index of the target tree node is acquired, the first index being constructed according to the association information of the target tree node in the first MPT tree and being used for indicating first data stored in the target tree node in the first MPT tree; a second index of the target tree node is acquired, the second index being constructed according to the association information of the target tree node in the second MPT tree and being used for indicating second data stored in the target tree node in the second MPT tree. The embodiment of the application thus constructs different indexes for the different versions of the target tree node based on the association information of the target tree node in the different MPT trees, which provides a basis for storing the data stored by the target tree node in the different MPT trees into the same data block. The similarity between the first index and the second index is then obtained, and if the similarity meets the similarity condition, the first data indicated by the first index and the second data indicated by the second index are stored into the same data block of the database. In the embodiment of the application, the similarity between the first index and the second index is fully utilized so that the first data and the second data can be stored in the same data block of the database, and the data stored by tree nodes of different versions is effectively managed, thereby realizing effective storage management of the data in the blocks.

The node device provided in the embodiment of the present application is explained in the following.

Further, the embodiment of the application also provides a schematic structural diagram of the node device, and the schematic structural diagram of the node device can be seen in fig. 11; the node device may be the above server, and the node device may include: a processor 1101, an input device 1102, an output device 1103 and a memory 1104. The processor 1101, the input device 1102, the output device 1103, and the memory 1104 are connected by buses. The memory 1104 is used for storing a computer program comprising program instructions, and the processor 1101 is used for executing the program instructions stored in the memory 1104.

The processor 1101 performs the following operations by executing program instructions in the memory 1104:

acquiring a first index of a target tree node; the first index is constructed according to the association information of the target tree node in the first MPT tree; the first index is used for indicating first data stored in a target tree node in the first MPT tree;

acquiring a second index of the target tree node, wherein the second index is constructed according to the association information of the target tree node in a second MPT tree; the second index is used for indicating second data stored in a target tree node in a second MPT tree;

Obtaining similarity between the first index and the second index;

the construction method of the first index comprises the following steps:

the construction method of the second index comprises the following steps:

Wherein the first index comprises N digits which are sequentially arranged, and the second index comprises N digits which are sequentially arranged; the processor 1101 may specifically perform the following operations when acquiring the similarity between the first index and the second index:

The processor 1101 may specifically perform the following operations when determining the similarity between the first index and the second index according to the length M of the digits having the same value between the first index and the second index obtained by the traversal comparison:

when the target tree node comprises a branch node, the first data or the second data stored in the target tree node comprises a hash value of a next tree node of the branch node and a height value of the next tree node;

when the target tree node includes a leaf node, the first data or the second data stored in the target tree node includes: account information and metadata, wherein the metadata comprises a tree name of a storage tree corresponding to a contract account where the account information is located and a height value of a root node of the storage tree.

when the target tree node comprises a branch node, the first data or the second data stored in the target tree node comprises a hash value of a next tree node of the target tree node and a height value of the next tree node;

when the target tree node includes a leaf node, the first data or the second data stored in the target tree node includes: account status information;

Wherein the database comprises an ordered character string table file; the ordered string table file includes data blocks; the processor 1101 may specifically perform the following operations when storing the first data indicated by the first index and the second data indicated by the second index into the same data block of the database:

Wherein the processor 1101 is further configured to:

The first MPT tree is a storage tree, and the target tree node is a root node in the first MPT tree; the processor 1101 may specifically perform the following steps when acquiring the association information of the target tree node in the first MPT tree:

acquiring a state tree corresponding to the first MPT tree;

It should further be noted that the embodiments of the present application also provide a computer readable storage medium in which a computer program is stored; the computer program includes program instructions which, when executed by a processor, can perform the methods in the embodiments corresponding to fig. 6, fig. 8, and fig. 9, so a detailed description is not repeated here. For technical details not disclosed in the embodiments of the computer-readable storage medium according to the present application, please refer to the description of the method embodiments of the present application. As an example, the program instructions may be deployed to be executed on one node device, on multiple node devices located at one site, or on multiple node devices distributed across multiple sites and interconnected by a communication network.

According to one aspect of the present application, a computer program product is provided, the computer program product comprising a computer program stored in a computer readable storage medium. The processor of the node device reads the computer program from the computer readable storage medium, and the processor executes the computer program, so that the node device can perform the methods in the embodiments corresponding to fig. 6, 8 and 9, which are described above, and thus will not be described in detail herein.

Those skilled in the art will appreciate that the processes implementing all or part of the methods of the above embodiments may be implemented by a computer program for instructing relevant hardware, and the program may be stored in a computer readable storage medium, and the program may include the processes of the embodiments of the methods as above when executed. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a random-access Memory (Random Access Memory, RAM), or the like.

The foregoing discloses only preferred embodiments of the present application and is not intended to limit the scope of the claims; equivalent variations made in accordance with the claims of the present application still fall within the scope of the present application.

Claims

1. A data processing method based on a block chain, which is characterized in that the block chain comprises a first block and a second block, wherein a first Merkle prefix tree is maintained in the first block, and a second Merkle prefix tree is maintained in the second block; the first Merkle prefix tree and the second Merkle prefix tree both include a target tree node, but the version of the target tree node in the first Merkle prefix tree is different from the version in the second Merkle prefix tree; the method comprises the following steps:

acquiring a first index of the target tree node; the first index is constructed according to the association information of the target tree node in the first Merkle prefix tree; the first index is used for indicating first data stored in the target tree node in the first Merkle prefix tree;

acquiring a second index of the target tree node, wherein the second index is constructed according to the association information of the target tree node in the second Merkle prefix tree; the second index is used to indicate second data stored in the target tree node in the second Merkle prefix tree;

obtaining the similarity between the first index and the second index;

if the similarity meets the similarity condition, storing the first data indicated by the first index and the second data indicated by the second index into the same data block of a database;

the index of the target tree node is obtained by constructing the association information of the target tree node in the corresponding Merkle prefix tree according to an index construction rule; the association information of the target tree node in the corresponding Merkle prefix tree comprises: block height, path of the target tree node in the corresponding Merkle prefix tree, and tree name of the Merkle prefix tree; the construction method of the index of the target tree node comprises the following steps: according to the index construction rule, carrying out a rounding operation on the quotient between the block height and a preset index construction factor to obtain a rounding result; performing a remainder operation on the quotient between the block height and the preset index construction factor to obtain a remainder result; and according to the index construction rule, splicing the rounding result, the tree name of the Merkle prefix tree, the path of the target tree node in the corresponding Merkle prefix tree and the remainder result to obtain the index of the target tree node.

2. The method of claim 1, wherein the blockchain includes a plurality of blocks, the blocks on the blockchain being partitioned into a plurality of intervals according to a preset index construction factor; the first block and the second block refer to any two blocks located in the same interval;

wherein the value of the index construction factor is determined by the data storage requirement of the blockchain; the data storage requirements include at least one of: compression rate at the time of data storage and processing efficiency at the time of data storage; the higher the compression rate is, the larger the value of the index construction factor is; the higher the processing efficiency is, the smaller the value of the index construction factor is.

3. The method of claim 2, wherein the index construction rule is defined based on the index construction factor; the definition principle of the index construction rule comprises the following steps: in the Merkle prefix trees maintained by different blocks located in the same interval, the similarity between different indexes of the same tree node should satisfy the similarity condition;

the larger the value of the index construction factor, the more blocks fall within the same interval and the more indexes satisfy the similarity condition; the smaller the value of the index construction factor, the fewer blocks fall within the same interval and the fewer indexes satisfy the similarity condition.

4. The method of claim 1, wherein the method of constructing the first index comprises:

performing rounding operation on a quotient between the block height of the first block and a preset index construction factor according to the index construction rule to obtain a first rounding result; performing remainder operation on the quotient between the block height of the first block and the index construction factor to obtain a first remainder result;

according to the index construction rule, performing splicing processing on the first rounding result, the tree name of the first Merkle prefix tree, the path of the target tree node in the first Merkle prefix tree and the first remainder result to obtain a first index of the target tree node;

the construction method of the second index comprises the following steps:

performing rounding operation on the quotient between the block height of the second block and the index construction factor according to the index construction rule to obtain a second rounding result; performing remainder operation on the quotient between the block height of the second block and the index construction factor to obtain a second remainder result;

and according to the index construction rule, performing splicing processing on the second rounding result, the tree name of the second Merkle prefix tree, the path of the target tree node in the second Merkle prefix tree and the second remainder result to obtain a second index of the target tree node.

5. The method of claim 1, wherein the first index comprises N digits arranged in sequence, and the second index comprises N digits arranged in sequence;

the obtaining the similarity between the first index and the second index includes:

traversing and comparing whether the values of the same digits between the first index and the second index are the same according to the arrangement sequence of the N digits;

ending the traversal comparison when, for the first time, the values at the same digit position of the first index and the second index are found to be different; and

determining the similarity between the first index and the second index according to the length M of digits with the same value between the first index and the second index obtained through the traversal comparison; M is a positive integer and M is less than or equal to N.

6. The method of claim 5, wherein determining the similarity between the first index and the second index based on the length M of the digits having the same value between the first index and the second index obtained by the traversal comparison comprises:

the similarity condition comprises a similarity threshold value, and if the similarity between the first index and the second index is larger than or equal to the similarity threshold value, the similarity meets the similarity condition; and if the similarity between the first index and the second index is smaller than the similarity threshold value, the similarity does not meet the similarity condition.

7. The method of claim 4, wherein the first Merkle prefix tree and the second Merkle prefix tree are both state trees, and the first Merkle prefix tree and the second Merkle prefix tree have the same tree name, the tree name being a preset fixed string; the state tree is used for storing addresses of all contract accounts in the corresponding block and account information of each contract account; the target tree node comprises one of: an extension node, a branch node and a leaf node in the state tree;

when the target tree node comprises a leaf node, the first data or the second data stored in the target tree node comprises: account information and metadata, wherein the metadata comprises a tree name of a storage tree corresponding to a contract account where the account information is located and a height value of a root node of the storage tree.

8. The method of claim 4, wherein the first and second Merkle prefix trees are both storage trees, the first and second Merkle prefix trees each being used to store all account state information of a target contract account in a respective block; the target tree node comprises one of: an extension node, a branch node, or a leaf node in the storage tree;

when the target tree node comprises a leaf node, the first data or the second data stored in the target tree node comprises: account status information;

wherein the first and second Merkle prefix trees have the same tree name, which is an address of the target contract account, or which is determined by a block height of a block in which the target contract account is first created and an order in which the target contract account is created.

9. The method of claim 1, wherein the database includes ordered string table files therein; the ordered string table file comprises data blocks;

the storing the first data indicated by the first index and the second data indicated by the second index into the same data block of a database includes:

based on the first index and the second index, writing first data indicated by the first index and second data indicated by the second index into the same data block in the ordered string table file in sequence;

storing the compressed data blocks into the ordered string table file of the database.
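The write path of this claim can be sketched as follows. The length-prefixed record layout is a hypothetical one chosen for illustration, and `zlib` stands in for whatever compression the database actually applies; neither is specified by the claims.

```python
import zlib
from typing import Dict


def build_data_block(entries: Dict[str, bytes]) -> bytes:
    """Write entries into one block in index order, then compress the block."""
    parts = []
    for index in sorted(entries):  # sequential write ordered by index
        key = index.encode("utf-8")
        value = entries[index]
        # Hypothetical layout: 4-byte length prefix before each key and value.
        parts.append(len(key).to_bytes(4, "big") + key)
        parts.append(len(value).to_bytes(4, "big") + value)
    return zlib.compress(b"".join(parts))


def read_data_block(block: bytes) -> Dict[str, bytes]:
    """Decompress a block and recover its index -> data entries."""
    raw = zlib.decompress(block)
    entries, pos = {}, 0
    while pos < len(raw):
        key_len = int.from_bytes(raw[pos:pos + 4], "big")
        pos += 4
        key = raw[pos:pos + key_len].decode("utf-8")
        pos += key_len
        value_len = int.from_bytes(raw[pos:pos + 4], "big")
        pos += 4
        entries[key] = raw[pos:pos + value_len]
        pos += value_len
    return entries
```

Placing two versions of the same tree node next to each other in one block means the compressor sees largely repeated bytes, which is the usual motivation for grouping similar indexes together.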

10. The method of claim 1, wherein the method further comprises:

acquiring the association information of the target tree node in the first Merkle prefix tree, wherein the association information of the target tree node in the first Merkle prefix tree comprises: the block height of the first block, the path of the target tree node in the first Merkle prefix tree, and the tree name of the first Merkle prefix tree;

constructing, according to an index construction rule, the first index of the target tree node from the block height of the first block, the path of the target tree node in the first Merkle prefix tree, and the tree name of the first Merkle prefix tree;

and searching, according to the first index, the database for the data block in which the first data indicated by the first index is located, and reading the first data from the data block found.
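The lookup step can be sketched with an in-memory stand-in for the database. The class `BlockStore` and its layout are hypothetical; the sketch assumes each data block is non-empty, that blocks cover disjoint, ordered index ranges, and that the index has already been constructed.

```python
from bisect import bisect_right
from typing import Dict, List, Optional


class BlockStore:
    """Toy database: a list of data blocks, each mapping index -> data."""

    def __init__(self, blocks: List[Dict[str, bytes]]):
        # Order blocks by their smallest index so they can be binary searched.
        self._blocks = sorted(blocks, key=lambda block: min(block))
        self._first_keys = [min(block) for block in self._blocks]

    def read(self, index: str) -> Optional[bytes]:
        # Locate the last block whose smallest index is <= the requested index.
        pos = bisect_right(self._first_keys, index) - 1
        if pos < 0:
            return None  # the index sorts before every stored block
        # Read the data from the block that was found, if it is present there.
        return self._blocks[pos].get(index)
```

A call such as `BlockStore([{"a1": b"x", "a2": b"y"}, {"b1": b"z"}]).read("a2")` first selects the block starting at `"a1"` and then reads the entry from it.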

11. The method of claim 10, wherein the first Merkle prefix tree is a storage tree and the target tree node is a root node in the first Merkle prefix tree; the obtaining the association information of the target tree node in the first Merkle prefix tree includes:

acquiring a state tree corresponding to the first Merkle prefix tree;

reading the tree name of the first Merkle prefix tree and the height value of the target tree node in the first Merkle prefix tree from the leaf nodes of the state tree, wherein the height value of the target tree node in the first Merkle prefix tree is equal to the block height of the first block;

determining a null value as the path of the target tree node in the first Merkle prefix tree.
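A minimal sketch of how the association information of a storage tree root might be assembled is given below. The dictionary keys `tree_name` and `root_height` and the use of an empty string for the null path are assumptions made for illustration.

```python
from typing import NamedTuple


class AssociationInfo(NamedTuple):
    block_height: int
    path: str
    tree_name: str


def storage_root_association(leaf_metadata: dict) -> AssociationInfo:
    """Build the root node's association info from state-tree leaf metadata."""
    return AssociationInfo(
        # The root's height value equals the block height of the first block.
        block_height=leaf_metadata["root_height"],
        # Assumption: the null path of a root node is modelled as an empty string.
        path="",
        tree_name=leaf_metadata["tree_name"],
    )
```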

12. A blockchain-based data processing apparatus, wherein the blockchain includes a first block and a second block, wherein a first Merkle prefix tree is maintained in the first block, and a second Merkle prefix tree is maintained in the second block; the first and second Merkle prefix trees each include a target tree node, but the version of the target tree node in the first Merkle prefix tree is different from the version in the second Merkle prefix tree, the apparatus comprising:

a processing unit, configured to obtain a first index of the target tree node; the first index is constructed according to the association information of the target tree node in the first Merkle prefix tree; the first index is used for indicating first data stored in a target tree node in the first Merkle prefix tree;

the processing unit is further configured to obtain a second index of the target tree node, where the second index is constructed according to association information of the target tree node in the second Merkle prefix tree; the second index is used to indicate second data stored in a target tree node in the second Merkle prefix tree;

the processing unit is further configured to obtain a similarity between the first index and the second index;

a storage unit, configured to store, if the similarity satisfies a similarity condition, first data indicated by the first index and second data indicated by the second index into a same data block of a database;

the index of the target tree node is obtained by constructing the association information of the target tree node in the corresponding Merkle prefix tree according to an index construction rule; the association information of the target tree node in the corresponding Merkle prefix tree comprises: a block height, a path of the target tree node in the corresponding Merkle prefix tree, and a tree name of the Merkle prefix tree; the index of the target tree node is constructed by: performing, according to the index construction rule, a rounding operation on the quotient of the block height and a preset index construction factor to obtain a rounding result; performing a remainder operation on the block height with respect to the preset index construction factor to obtain a remainder result; and splicing, according to the index construction rule, the rounding result, the tree name of the Merkle prefix tree, the path of the target tree node in the corresponding Merkle prefix tree, and the remainder result to obtain the index of the target tree node.
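The index construction recited above can be sketched as follows. Several details are assumptions rather than part of the claim: the rounding is taken to be rounding down, the third spliced field is taken to be the node's path, the fields are encoded as fixed-width hexadecimal strings, and the factor value 256 and the name `build_index` are hypothetical.

```python
def build_index(block_height: int, tree_name: str, path: str, factor: int = 256) -> str:
    """Splice quotient, tree name, node path, and remainder into one index string."""
    if factor <= 0:
        raise ValueError("the index construction factor must be positive")
    # Rounding result (assumed to round down) and remainder result in one step.
    quotient, remainder = divmod(block_height, factor)
    # Assumption: fixed-width hex fields keep indexes comparable digit by digit.
    remainder_width = len(f"{factor - 1:x}")
    return f"{quotient:08x}{tree_name}{path}{remainder:0{remainder_width}x}"
```

Under these assumptions, `build_index(1000, "state", "0a1b")` and `build_index(1001, "state", "0a1b")` differ only in their final digit, because the fast-changing remainder is placed last. Versions of the same node in nearby blocks therefore share a long common prefix.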

13. A node device, comprising:

a processor adapted to execute a computer program;

a computer readable storage medium having a computer program stored therein, which when executed by the processor, performs the blockchain-based data processing method of any of claims 1-11.

14. A computer readable storage medium, characterized in that the computer storage medium stores a computer program which, when executed by a processor, performs the blockchain-based data processing method according to any of claims 1-11.

15. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the blockchain-based data processing method of any of claims 1-11.