CN115248811B

CN115248811B - Scalable collaborative blockchain block storage method and device

Info

Publication number: CN115248811B
Application number: CN202111504696.8A
Authority: CN
Inventors: 尹波; 李家齐
Original assignee: Changsha University of Science and Technology
Current assignee: Changsha University of Science and Technology
Priority date: 2021-12-10
Filing date: 2021-12-10
Publication date: 2023-05-12
Anticipated expiration: 2041-12-10
Also published as: CN115248811A

Abstract

The invention discloses a scalable collaborative blockchain block storage method and device. First, a response efficiency threshold is set for each block to be allocated, and a constraint requires that the sum of the response efficiencies provided for that block by all nodes in the shard reaches the threshold. Then, using the specific characteristics of the different nodes in the shard (storage capacity, response efficiency, and overhead), an optimization objective function is defined that minimizes the total cost subject to each block meeting its response efficiency threshold, so that blocks are allocated while the block response efficiency requirement is met and the cost is reduced as far as possible.

Description

Scalable collaborative blockchain block storage method and device

Technical Field

The invention relates to the technical field of blockchain, in particular to a scalable collaborative blockchain block storage method and device.

Background

Blockchain storage is a major challenge for the practical application of blockchain technology and directly affects the storage performance of a blockchain. In conventional blockchain systems, each node (blockchain node) holds a complete copy of the blockchain. Referring to fig. 1, the most basic structure in a blockchain system is the block, and a blockchain is formed by linking a plurality of blocks together like a linked list. A block consists of two parts: a block header and a block body. The block header records the meta-information of the current block, including the hash value of the parent block (Prev Hash), a timestamp (Timestamp), a nonce, and the Merkle root (Merkle Root); a block header without transaction data is about 80 bytes. Prev Hash links the block to its parent block, the nonce is a random number used by the proof-of-work algorithm, and Timestamp records the current time. The leaf nodes of the Merkle tree store the hash values of the transaction data (Tx), and each non-leaf node computes a new hash value from the hash values of its child nodes. If data in the block is changed, the hashes in the Merkle tree, and therefore the Merkle root, change as well, which makes the blockchain tamper-evident. The block body contains the transactions stored by the block.
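The tamper-evidence property of the Merkle root described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the single SHA-256 hash, the rule of repeating the last hash when a level has an odd number of entries, and the function name merkle_root are assumptions made for illustration only.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Plain SHA-256 digest (assumed hash function for this sketch)."""
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Return the Merkle root of a non-empty list of raw transactions."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256(tx) for tx in transactions]  # leaf nodes: hashes of Tx
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed rule: repeat the last hash
        # each non-leaf node hashes the concatenation of its two children
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


txs = [b"tx-a", b"tx-b", b"tx-c"]
root = merkle_root(txs)
tampered_root = merkle_root([b"tx-a", b"tx-X", b"tx-c"])
print(root.hex())
print(root != tampered_root)  # changing one transaction changes the root
```

A node that keeps only the block header can recompute the root from transaction data received from a peer and compare it with the stored Merkle root.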

As the amount of blockchain data increases, the storage pressure on users in the blockchain network increases. Taking Bitcoin as an example, the total data volume of a Bitcoin full node has reached 440.96GB and comprises 708438 blocks; a new block is generated about every ten minutes on average, and 3.41 transactions are generated per second on average. The total data volume of an Ethereum full node is 8897.8819GB and comprises 13560829 blocks; a new block is produced every 13.23 seconds, and 14.6 transactions are generated per second on average (as of November 6, 2021). In the Internet of Things era, Internet of Things devices such as personal computers and iPads become potential users of blockchain systems. When such devices join the blockchain network, the number of nodes grows rapidly while the storage capacity of the devices is insufficient, so under existing blockchain systems these devices cannot join the network as full nodes that verify new transactions.

To address the problem of data growth, both users and blockchain platforms want to be able to control the amount of data stored. From a technical point of view, existing solutions to the storage problem are mainly divided into off-chain storage and on-chain storage. Off-chain storage relieves storage pressure by moving blockchain data into a distributed storage system that runs alongside the blockchain. However, this approach requires maintaining the off-chain storage system; some data redundancy is also introduced for the sake of data security, and cloud-based off-chain storage must additionally consider how to reduce its dependence on cloud storage services. On-chain storage includes coding and sharding. Coding divides a block into a plurality of data packets, uses a coding technique to generate coded fragments of the original data packets, and sends the coded fragments to the nodes; when a node needs to verify a transaction, the data packets are recovered, but the computational cost is high. Sharding schemes need to reshuffle the shards periodically, which causes data migration work, and must also deal with data growth.

Latency and throughput are a pair of metrics for blockchain scalability, and a system architecture with low latency and high throughput is what is currently desired. Latency is the system response time perceived by a user, for example the few seconds a web page takes to open; the shorter the time, the lower the latency. Throughput indicates how many users can enjoy such low latency at the same time: if the number of concurrent users is large and users find that web pages open slowly, the throughput of the system architecture needs to be improved. The goal of scalability is to achieve maximum throughput with acceptable latency.

Related schemes mainly relieve the storage pressure on nodes through system architecture design. In practical applications, however, the characteristics of different nodes in the blockchain are inconsistent, and the related schemes do not allocate blocks to specific nodes according to the characteristics of those nodes, so the scalability and applicability of the blockchain system are poor.

Disclosure of Invention

The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, the invention provides a scalable collaborative blockchain block storage method and device, which allocate blocks while meeting the block response efficiency requirement and reduce the cost as far as possible; the storage and computing resources of the nodes are used effectively, and the scalability and applicability of the blockchain system are improved.

In a first aspect of the present invention, there is provided a scalable collaborative blockchain block storage method, comprising the steps of:

calculating the storage capacity, response efficiency and unit cost of each node in the current blockchain shard;

determining all blocks to be allocated, and assigning a corresponding response efficiency threshold to each block to be allocated, wherein the response efficiency threshold of a block to be allocated represents the minimum response efficiency at which the block to be allocated can be fully queried and answered;

constructing an optimization objective function and constraint conditions, wherein the optimization objective function is as follows:

$$\min \sum_{v_j \in V} \tau_j \, v_j \, c_j$$

wherein the objective represents minimizing the total cost under the condition that each block to be allocated satisfies its response efficiency threshold, v_j represents a storage capacity of the nodes, V represents the set of node capacities in the current blockchain shard, τ_j represents the number of allocations of nodes with storage capacity v_j, m_j represents the number of nodes with storage capacity v_j (so that τ_j ≤ m_j), and c_j represents the unit cost of a node with storage capacity v_j;

the constraint conditions are as follows:

$$E(b_i) = \sum_{v_j \in N(b_i)} n_j^i \, p_j^i \ge r_i, \quad \forall\, b_i \in B$$

wherein E(b_i) represents the response efficiency of the block to be allocated b_i, r_i represents the response efficiency threshold of b_i, B represents the set of blocks to be allocated, N(b_i) represents the set of capacities of the nodes storing b_i, p_j^i represents the response efficiency with which a node of capacity v_j storing b_i answers queries for b_i, and n_j^i represents the number of nodes of capacity v_j storing b_i;

and distributing the blocks to be distributed to the corresponding nodes according to the optimization objective function and the constraint conditions.

According to the embodiment of the invention, at least the following technical effects are achieved:

First, a response efficiency threshold is set for each block to be allocated, and the constraint condition requires that the sum of the response efficiencies provided for that block by all nodes in the shard reaches the threshold. This avoids the situation in which block data cannot be queried because of a single point of failure at a node, and guarantees the throughput of the blockchain system for user queries. Then, using the specific characteristics of the different nodes in the shard (namely storage capacity, response efficiency, and overhead), an optimization objective function is defined that minimizes the total cost subject to each block meeting its response efficiency threshold, so that blocks are allocated while the block response efficiency requirement is met and the cost is reduced as far as possible.

In a second aspect of the present invention, there is provided a scalable collaborative blockchain block storage device, comprising:

the first calculation unit is used for calculating the storage capacity, the response efficiency and the unit cost of each node in the current blockchain shard;

the second calculation unit is used for determining all blocks to be allocated, and assigning a corresponding response efficiency threshold to each block to be allocated, wherein the response efficiency threshold of a block to be allocated represents the minimum response efficiency at which the block to be allocated can be fully queried and answered;

the third calculation unit is used for constructing an optimization objective function and constraint conditions, wherein the optimization objective function is as follows:

$$\min \sum_{v_j \in V} \tau_j \, v_j \, c_j$$

wherein the objective represents minimizing the total cost under the condition that each block to be allocated satisfies its response efficiency threshold, v_j represents a storage capacity of the nodes, V represents the set of node capacities in the current blockchain shard, τ_j represents the number of allocations of nodes with storage capacity v_j, m_j represents the number of nodes with storage capacity v_j (so that τ_j ≤ m_j), and c_j represents the unit cost of a node with storage capacity v_j;

the constraint conditions are as follows:

$$E(b_i) = \sum_{v_j \in N(b_i)} n_j^i \, p_j^i \ge r_i, \quad \forall\, b_i \in B$$

wherein E(b_i) represents the response efficiency of the block to be allocated b_i, r_i represents the response efficiency threshold of b_i, B represents the set of blocks to be allocated, N(b_i) represents the set of capacities of the nodes storing b_i, p_j^i represents the response efficiency with which a node of capacity v_j storing b_i answers queries for b_i, and n_j^i represents the number of nodes of capacity v_j storing b_i;

and the block allocation unit is used for allocating the blocks to be allocated to the corresponding nodes according to the optimization objective function and the constraint conditions.

In a third aspect of the invention, an electronic device is provided comprising at least one control processor and a memory for communicatively coupling with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the scalable collaborative blockchain block storage method described above.

In a fourth aspect of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the scalable collaborative blockchain block storage method described above.

It should be noted that the advantages of the second to fourth aspects of the present invention over the prior art are the same as those of the scalable collaborative blockchain block storage method described above, and will not be described in detail herein.

Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:

FIG. 1 is a basic block diagram of a blockchain system;

FIG. 2 is a schematic diagram of a blockchain system according to an embodiment of the present invention;

FIG. 3 is a flowchart illustrating a scalable collaborative blockchain block storage method provided by an embodiment of the present invention;

fig. 4 is a schematic diagram of an allocation process of an allocation method based on unit overhead according to an embodiment of the present invention;

fig. 5 is a schematic diagram of an allocation process of an allocation method based on a gain ratio according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of an allocation process of an allocation method based on the bounded knapsack problem according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of a computer device according to an embodiment of the present invention;

fig. 8 is a schematic structural diagram of a computer readable storage medium according to an embodiment of the present invention.

Detailed Description

The following description of the technical solutions according to the embodiments of the present invention will be provided fully with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.

Before describing the method embodiments of the present application, the system embodiment is described with reference to fig. 2 (where b_1 to b_10 denote the blocks that make up the blockchain copy and the dots denote nodes). The blockchain system first shards all nodes in the blockchain at the network level according to a preset rule (random sharding following a fairness principle) to obtain a plurality of shards. Each shard contains a plurality of nodes, and the communication cost between nodes in the same shard is lower than that between nodes in different shards, which facilitates maintenance of the blockchain ledger. When the block data queried by a node is not stored locally, the node requests it from other nodes in the same shard that store the block data, so cross-shard queries are avoided and the storage scalability of the blockchain is improved. Each node stores the complete block headers of the blockchain, which solves the trust problem in node communication: when a node obtains transaction data (i.e., block data) from other nodes in the shard, it can verify the data against the root hash in the block header that it stores itself.

In the blockchain system embodiment above, although the nodes are sharded, all nodes of each shard jointly store a copy of the current blockchain, so cross-shard queries are avoided and the storage scalability of the blockchain is improved. However, on the one hand, because there is no full node in the blockchain system, block data cannot be fully queried if a node suffers a single point of failure. On the other hand, in practical applications different nodes have their own storage capacity, response capability, and cost. The storage capacities of different types of nodes are inconsistent. The performance of a node determines its capability to respond to block queries: the stronger the response capability of a node, the greater the throughput it provides for queries. Storing blocks consumes a node's storage capacity, and answering queries occupies resources such as network bandwidth, so a node incurs a certain cost when it stores a block.

Referring to fig. 3, in one embodiment of the present invention, there is provided a scalable collaborative blockchain block storage method, where the method embodiment of the present application is used in the blockchain system described above, the method includes the following steps:

Step S100, calculating the storage capacity, response efficiency and unit cost of each node in the current blockchain shard;

Step S200, determining all blocks to be allocated, and assigning a corresponding response efficiency threshold to each block to be allocated, wherein the response efficiency threshold of a block to be allocated represents the minimum response efficiency at which the block to be allocated can be fully queried and answered;

Step S300, constructing an optimization objective function and constraint conditions, wherein the optimization objective function is as follows:

$$\min \sum_{v_j \in V} \tau_j \, v_j \, c_j$$

wherein the objective represents minimizing the total cost under the condition that each block to be allocated satisfies its response efficiency threshold, v_j represents a storage capacity of the nodes, V represents the set of node capacities in the current blockchain shard, τ_j represents the number of allocations of nodes with storage capacity v_j, m_j represents the number of nodes with storage capacity v_j (so that τ_j ≤ m_j), and c_j represents the unit cost of a node with storage capacity v_j;

the constraint conditions are as follows:

$$E(b_i) = \sum_{v_j \in N(b_i)} n_j^i \, p_j^i \ge r_i, \quad \forall\, b_i \in B$$

wherein E(b_i) represents the response efficiency of the block to be allocated b_i, r_i represents the response efficiency threshold of b_i, B represents the set of blocks to be allocated, N(b_i) represents the set of capacities of the nodes storing b_i, p_j^i represents the response efficiency with which a node of capacity v_j storing b_i answers queries for b_i, and n_j^i represents the number of nodes of capacity v_j storing b_i;

Step S400, allocating the blocks to be allocated to the corresponding nodes according to the optimization objective function and the constraint conditions.

In steps S100 to S400, a response efficiency threshold is set for each block, and the sum of the efficiencies provided for the block by all nodes in the shard must reach the threshold, which avoids the situation in which block data cannot be queried because of a single point of failure at a node. An optimization objective function that minimizes the total cost subject to each block meeting its response efficiency threshold is then defined, so that blocks are allocated and the cost is reduced as far as possible while the response efficiency requirement of each block is met. Moreover, because block storage allocation takes the specific characteristics of the nodes into account, the storage and computing resources of the nodes are used effectively, and the scalability and applicability of the blockchain system are improved.

Steps S100 to S400 above are described in detail below:

first, for simplicity of explanation, the present application does not consider blocks that have already been allocated, and therefore, blocks to be allocated in the following steps are simply referred to as blocks. The following blockchain refers to the current blockchain shard.

Define N as the set of all nodes in the blockchain: N = {n_1, …, n_|N|}, where n_j represents a node in the blockchain. Each node has its own storage capacity. Because the nodes of the blockchain system jointly store one complete copy of the blockchain, a node that receives a user query for a block it does not store requests the block from a node that does store it.

The response efficiency of a node for a block is defined as follows: the higher the response efficiency, the more promptly the node can answer queries from other nodes, and the same node may have different response efficiencies for different blocks. A node also incurs a cost overhead (including the cost of storing and of answering queries). A node can therefore be characterized by its storage capacity, response efficiency, and unit overhead.

The storage capacity of a node is assumed to be a fixed value, so the nodes are divided into node sets according to capacity. Nodes with the same storage capacity are assumed to have the same response efficiency and cost overhead. Let the capacities of the nodes in N lie in the range [1, K], and let the node capacity set be V = {v_1, …, v_K}, with 1 ≤ v_j ≤ K. The number of nodes in N with capacity v_j is m_j. Define B as the set of all blocks in the blockchain: B = {b_1, …, b_|B|}, where b_i represents a block in the blockchain system. Since nodes are classified by storage capacity, the embodiments of the present application give a definition of the node of capacity v_j (abbreviated in this embodiment as "v_j-node").

Definition 1: a v_j-node is represented as

$$(v_j,\; p_j^i,\; c_j,\; m_j)$$

where v_j represents the maximum number of blocks that the node can store (i.e., the storage capacity of the node), p_j^i represents the efficiency with which the node answers queries for block b_i (i.e., the response efficiency of the node), c_j represents the cost overhead of the node for storing a single block (i.e., the unit overhead of the node), and m_j is the number of nodes of this type. For example, let |N| = 5, where nodes 1 and 2 have storage capacity 1, nodes 3 and 4 have storage capacity 2, and node 5 has storage capacity 3. The capacity range of the nodes in N is then [1, 3], and there are three types of nodes, i.e., three capacity nodes with storage capacities 1, 2, and 3, respectively.
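The five-node example above can be reproduced with a short Python sketch that groups nodes by storage capacity. The variable names (node_capacity, m, V, K) follow the symbols in the text; the dictionary layout is an assumption made for illustration.

```python
from collections import Counter

# Storage capacity of each node, keyed by node number (the example in the text).
node_capacity = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3}

# m[v] is the number of nodes whose storage capacity is v (the m_j of Definition 1).
m = Counter(node_capacity.values())

V = sorted(m)  # node capacity set
K = max(V)     # capacities lie in the range [1, K]

print(V, K, dict(m))
```

Each distinct capacity in V corresponds to one node type, so the example yields three node types.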

Definition 2: the response efficiency of block b_i is defined as E(b_i), the sum of the efficiencies with which all nodes storing b_i answer queries for b_i. If block b_i is stored by more nodes, then when multiple queries need to fetch b_i, different nodes storing b_i can be selected to answer them, so the overall response efficiency is higher. For example, if b_i is stored by nodes 1, 2, and 3, node 4 can fetch it from node 1, node 5 can fetch it from node 2, and node 6 can fetch it from node 3, so no queuing is needed and the query results are returned quickly.

Each block b_i has a response efficiency threshold r_i, which is the response efficiency the block should reach (namely the minimum response efficiency at which the block can be fully queried and answered). When the sum of the efficiencies provided for the block by all nodes in the shard reaches the threshold, the nodes in the shard can answer queries in time, which avoids the situation in which block data cannot be queried because of a single point of failure at a node and guarantees the throughput of the blockchain system for user queries. Thus, for the block set B, the set of response efficiency thresholds R = {r_1, …, r_|B|} is obtained. Setting the response efficiency threshold: the threshold is set per block, not per region. Newly generated blocks, or blocks containing important data, are given a higher response efficiency threshold because new blocks and important data are queried frequently. Old data is typically queried rarely, so its response efficiency threshold may be set lower.

E(b_i) is defined as follows:

Given a block b_i, let N(b_i) be the set of nodes storing b_i, which contains n_j^i v_j-nodes for each capacity v_j. Let E(b_i) be the sum of the efficiencies with which the nodes in N(b_i) answer queries for b_i:

$$E(b_i) = \sum_{v_j \in N(b_i)} n_j^i \, p_j^i \tag{1}$$
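As a small illustration of this definition, the following Python sketch sums the per-node efficiencies for one block. The function name block_efficiency and all numeric values are hypothetical and are not taken from the patent.

```python
def block_efficiency(stored_by: dict[int, int], p: dict[int, float]) -> float:
    """E(b_i): sum over capacities of (number of nodes storing b_i) * efficiency.

    stored_by maps a capacity v_j to the number of v_j-nodes storing the block;
    p maps a capacity v_j to the response efficiency of a v_j-node for the block.
    """
    return sum(count * p[v] for v, count in stored_by.items())


# Hypothetical case: two capacity-1 nodes and one capacity-2 node store b_i.
stored_by = {1: 2, 2: 1}
p = {1: 30.0, 2: 50.0}
r_i = 100.0  # hypothetical response efficiency threshold of b_i

E = block_efficiency(stored_by, p)
print(E, E >= r_i)
```

With these assumed values the block is stored by three nodes and its summed efficiency exceeds the threshold, so the block would count as satisfied.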

Specifically, based on the above method embodiment and the two definitions above, the optimization objective function and the constraint conditions are constructed as follows.

The optimization objective function is:

$$\min \sum_{v_j \in V} \tau_j \, v_j \, c_j \tag{2}$$

For equation (2): the block set B = {b_1, …, b_|B|}, the node capacity set V = {v_1, …, v_K}, and the set of response efficiency thresholds of the blocks R = {r_1, …, r_|B|} are given. The block allocation problem in the blockchain is to allocate the blocks in B to the capacities in V, i.e., to produce a set of block-capacity allocation pairs Λ = {(b_i, v_j)}. Let B_Λ be the set of blocks contained in the allocation pairs, and let τ_j be the number of times capacity v_j is allocated. The optimization objective is to minimize the total cost under the condition that each block satisfies its response efficiency threshold.

The constraint conditions are:

$$E(b_i) \ge r_i, \quad \forall\, b_i \in B \tag{3}$$

For equation (3): the response efficiency E(b_i) of each block b_i is not less than its threshold r_i. Table 1 summarizes the symbols used in the embodiments of the present application:

TABLE 1
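The objective and the constraint can be checked for a candidate allocation with a short Python sketch. This is an illustrative sketch only: the function name evaluate, the data layout (one entry per node used), and every number below are hypothetical, and the total cost is computed as the sum of τ_j v_j c_j as described in the text.

```python
def evaluate(alloc, c, m, p, r):
    """Return (total_cost, feasible) for a candidate allocation.

    alloc: list of (capacity v_j, list of blocks), one entry per node used
    c[v]: unit cost, m[v]: number of v-nodes available
    p[v][b]: response efficiency of a v-node for block b, r[b]: threshold of b
    """
    tau = {}                 # tau[v]: how many v-nodes the allocation uses
    E = {b: 0 for b in r}    # E[b]: response efficiency reached by block b
    for v, blocks in alloc:
        if len(blocks) > v:
            raise ValueError("a v_j-node stores at most v_j blocks")
        tau[v] = tau.get(v, 0) + 1
        for b in blocks:
            E[b] += p[v][b]
    total_cost = sum(t * v * c[v] for v, t in tau.items())
    within_limits = all(t <= m[v] for v, t in tau.items())
    thresholds_met = all(E[b] >= r[b] for b in r)
    return total_cost, within_limits and thresholds_met


# Hypothetical instance with two node types and two blocks.
c = {1: 4, 2: 3}
m = {1: 1, 2: 1}
p = {1: {"b1": 30, "b2": 30}, 2: {"b1": 50, "b2": 60}}
r = {"b1": 80, "b2": 60}

print(evaluate([(2, ["b1", "b2"]), (1, ["b1"])], c, m, p, r))
print(evaluate([(2, ["b1", "b2"])], c, m, p, r))
```

The first allocation meets both thresholds; the second is cheaper but leaves b1 below its threshold, so it is not a valid solution.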

After the optimization objective function and the constraint conditions are constructed in step S300, it is first shown that the block allocation problem posed by this method embodiment is NP-hard, which motivates the solution methods that follow.

The proof proceeds as follows: the bounded knapsack problem (Bounded Knapsack Problem, BKP) can be reduced to the block allocation problem. An instance of the bounded knapsack problem is given here: there are K items, the weights of the items are {w_1, …, w_K}, and the values of the items are {u_1, …, u_K}. Let the total weight threshold be W and the total value threshold be U, and let item i be usable at most k_i times. The bounded knapsack problem is to compute a set of usage counts α = {α_1, …, α_K} for the items that satisfies the total weight threshold and the total value threshold, i.e.:

$$\sum_{i=1}^{K} \alpha_i w_i \le W, \qquad \sum_{i=1}^{K} \alpha_i u_i \ge U, \qquad 0 \le \alpha_i \le k_i$$

An instance of the block allocation problem is constructed from the bounded knapsack instance as follows. The items in the BKP correspond to nodes of a given capacity. Let the response efficiency of a v_j-node in the allocation problem be the same for every block, so that p_j^i can be abbreviated as p_j. The value u_j in the BKP corresponds to the response efficiency p_j; the usage count α_j corresponds to the number of allocations τ_j; the weight w_j of an item corresponds to the cost v_j c_j of the corresponding node; and the upper limit k_j on the usage count corresponds to the upper limit m_j on the number of allocations. Assume that the block set B contains only one block b_i, whose response efficiency threshold r_i corresponds to U.

Let the set of allocation pairs Λ = {(b_i, v_j)} be an instance of block allocation. Clearly, Λ is a feasible allocation scheme for the block allocation problem if and only if {τ_1, …, τ_K} is a feasible solution to the BKP.

The above reduces the BKP, which is NP-complete, to a special case of the block allocation problem. The block allocation problem is therefore NP-hard.
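To make the decision form of the bounded knapsack instance concrete, the following Python sketch searches all usage-count vectors by brute force. It is an illustration only, is exponential in the number of items, and is not one of the patent's allocation methods; the function name bkp_feasible and the numbers are hypothetical.

```python
from itertools import product


def bkp_feasible(w, u, k, W, U):
    """Return a usage vector alpha with total weight <= W and total value >= U.

    w[i], u[i], k[i]: weight, value and usage limit of item i.
    Returns None when no such vector exists.
    """
    for alpha in product(*(range(ki + 1) for ki in k)):
        weight = sum(a * wi for a, wi in zip(alpha, w))
        value = sum(a * ui for a, ui in zip(alpha, u))
        if weight <= W and value >= U:
            return alpha
    return None


# Hypothetical instance. Under the correspondence in the text, w plays the role
# of the node costs, u of the response efficiencies p_j, k of the node counts
# m_j, and U of the threshold r_i of the single block.
w = [15, 20, 15]
u = [30, 50, 80]
k = [2, 2, 1]

print(bkp_feasible(w, u, k, W=35, U=120))  # → (0, 1, 1)
print(bkp_feasible(w, u, k, W=10, U=120))  # → None
```

Because the search space grows as the product of the usage limits, exhaustive search is only practical for tiny instances.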

Since the block allocation problem has been shown to be NP-hard, step S400 of this embodiment allocates the blocks to be allocated to the corresponding nodes by heuristic methods based on the optimization objective function and the constraint conditions.

Specifically, the embodiment of the application provides the following three heuristic methods for carrying out allocation calculation based on static environment:

The first method is the allocation method based on unit overhead:

To minimize the total cost, a straightforward idea is to preferentially select the capacity node with the smallest unit overhead and allocate blocks to it. The overhead v_j c_j of a single v_j-capacity node is proportional to its unit overhead. Let ω(b_i) = r_i - E(b_i); the larger this difference, the further the block is from its given threshold. Thus, for a given node capacity v_j, the blocks with the largest ω(b_i) values are preferentially allocated to the v_j-capacity node.

As shown in the pseudo code of Table 2, as long as the block set B contains blocks that do not meet their response efficiency threshold, an iteration is performed; each iteration selects one capacity node and allocates blocks to it. Let the set of blocks that currently do not meet the threshold condition be B', so the loop condition is B' ≠ ∅.

Each iteration first selects the v_j-capacity node with the smallest current unit overhead, and then selects the v_j blocks with the largest ω(b_i) values for allocation to the v_j-node. Let the set of blocks allocated to v_j be B_j. For every block b in B_j, the pair (b, v_j) is added to the allocation set Λ. If a block meets its threshold condition, it is deleted from B' (rows 6-8 in Table 2). In addition, the number of v_j-nodes is m_j, so when this upper limit is reached, v_j is deleted from V (rows 9-10 in Table 2). As Table 2 shows, the capacity node with the smallest unit overhead is always allocated first, and another capacity node is considered only once the nodes of that capacity are used up.


TABLE 2

Time complexity analysis: the capacity set is sorted first, with complexity O(K log K). Sorting the block set has complexity O(|B| log |B|). The loop is executed at most

$$\sum_{v_j \in V} m_j = |N|$$

times, i.e., the maximum number of iterations equals |N|. The time complexity is therefore O(|N| |B| log |B| + K log K).

TABLE 3

TABLE 4

For ease of understanding, a specific example is provided here. Six blocks are to be allocated to nodes of three different capacities; the response efficiency thresholds of the blocks and the attributes of the node capacity set are shown in Tables 3 and 4. As shown in fig. 4, a connection between a block and a node means that the block is allocated to that node; the value on the connecting line is the response efficiency the node provides for the block; a bolded block indicates that the block has met its response efficiency threshold. Under the unit-cost-based allocation method, all nodes are compared by unit cost, and the v₃-node with the smallest unit cost is selected (its unit cost is 5). Since the capacity of a v₃-node is 3, the three blocks with the largest ω(b_i) are allocated to the v₃-node. At this time E(b_i) of every block b_i is 0, so ω(b_i) equals r_i. The three blocks with the largest ω(b_i) are b₁, b₂, and b₃ (their ω(b_i) values are 100, 90, and 80, respectively), and they are assigned to the v₃-node (as shown in fig. 4(a)). The response efficiencies of the blocks are then updated: E(b₁) and E(b₃) are 100 and 90, respectively, so b₁ and b₃ have met their response efficiency thresholds, while E(b₂) is 80, which is less than its threshold of 90. There is only one v₃-node, so its count limit has been reached. Therefore b₁ and b₃ are deleted from B′, and v₃ is deleted from the node capacity set V. Similarly, a v₂-node is selected in the next step. The blocks with the largest ω(b_i) are now b₄ and b₅ (with ω(b_i) values of 80 and 70, respectively), so b₄ and b₅ are assigned to the v₂-node, after which b₅ reaches its response efficiency threshold. Since there are two v₂-nodes in total, a v₂-node is selected again. The ω(b_i) values of blocks b₂, b₄, and b₆ are 10, 10, and 60, respectively, and blocks b₄ and b₆ are assigned to this v₂-node. At this point only block b₂ has not reached its response efficiency threshold. Finally, a v₁-node is selected and block b₂ is allocated to it. All blocks now meet their response efficiency thresholds. The final allocation scheme is (v₃, b₁b₂b₃), (v₂, b₄b₅), (v₂, b₄b₆), (v₁, b₂), and its total cost is 70.
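The unit-cost-based procedure walked through above can be sketched in Python as follows. This is only an illustrative sketch, not the pseudocode of the embodiment: the function name, the data layout, and the toy numbers are assumptions, ω(b_i) is taken to be r_i − E(b_i) as the example suggests, and the cost of a node is assumed to be its unit cost multiplied by the number of blocks it stores.

```python
def unit_cost_allocate(thresholds, eff, capacity, unit_cost, count):
    """Greedy allocation by unit cost (illustrative sketch).

    thresholds[i]: response efficiency threshold r_i of block i
    eff[j][i]:     response efficiency a node of type j provides for block i
    capacity[j], unit_cost[j], count[j]: blocks per node, cost per stored
                   block, and number of available nodes of type j
    """
    efficiency = [0] * len(thresholds)        # E(b_i), initially 0
    pending = set(range(len(thresholds)))     # B': blocks still below threshold
    remaining = list(count)
    plan, total_cost = [], 0
    while pending:
        available = [j for j in range(len(capacity)) if remaining[j] > 0]
        if not available:
            raise ValueError("nodes exhausted before all thresholds were met")
        # Pick the node type with the smallest unit cost.
        j = min(available, key=lambda t: unit_cost[t])
        # Assumed omega(b_i) = r_i - E(b_i): largest remaining deficit first.
        ranked = sorted(pending, key=lambda i: (-(thresholds[i] - efficiency[i]), i))
        chosen = ranked[:capacity[j]]
        for i in chosen:
            efficiency[i] += eff[j][i]
        total_cost += unit_cost[j] * len(chosen)
        plan.append((j, chosen))
        remaining[j] -= 1
        pending = {i for i in pending if efficiency[i] < thresholds[i]}
    return plan, total_cost, efficiency


# Made-up toy instance (not the data of Tables 3 and 4).
plan, total_cost, efficiency = unit_cost_allocate(
    thresholds=[100, 90, 60],
    eff=[[100, 100, 100], [100, 50, 60]],
    capacity=[1, 2], unit_cost=[15, 5], count=[1, 2],
)
```

In this toy instance the cheaper node type is used twice, and block 1 is stored on both nodes because a single copy does not reach its threshold, which mirrors how b₂ and b₄ are stored more than once in the example.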

The second method, the gain-ratio-based allocation method, is as follows:

The optimization objective of the block allocation problem is to minimize the total cost while meeting the response efficiency threshold of every block. The unit-cost-based allocation method, however, considers only cost when selecting a node capacity and ignores response efficiency. For this reason, a gain-ratio-based method is proposed here:

let it be assumed that the capacity v is selected _j To allocate blocks, let allocation v _j The block set is B _j . Using deltac to represent allocation B _j Give v _j Total cost added:

the delta R is used to represent the increase in response efficiency of all blocks:

since the nodes provide different response efficiencies for different blocks, the change ΔE in the overall system response efficiency is the change in response efficiency provided by all nodes, which is virtually equivalent to B _j Assigned to v _j Response efficiency of the post contribution. The added total cost ΔC is also v _j Cost of node, i.e. v _j c _j 。

For a good allocation, the smaller ΔC is, the better, because the total cost is then minimized; the larger ΔR is, the better, because the required response efficiency thresholds are then reached sooner. The gain ratio is therefore defined as the ratio of ΔC to ΔR (ratio = ΔC/ΔR). Clearly, a smaller ratio is better, and nodes are selected according to their ratio values. Let the gain ratio of a v_j-node be ratio_j. According to formula (6) and formula (7), the specific formula for the gain ratio is as follows:

Based on the above discussion, preferentially selecting the node capacity with the smaller ratio_j helps reduce the total cost under the constraint that the response efficiency thresholds are guaranteed. Note that the block set B_j allocated to a v_j-node is not known in advance, so B_j must now be computed. For a given v_j-node the cost is determined (i.e., v_j c_j). Therefore, taking response efficiency into account, the v_j blocks with the largest response efficiency gain are selected from the set B′ of blocks that have not yet met their response efficiency thresholds, so that the threshold condition is satisfied as early as possible.

As shown in Table 5, the gain ratio of each capacity node in the current capacity set is calculated first (rows 3-5 in Table 5). Row 4 computes the block set B_j to be allocated to a v_j-node, and row 5 computes the gain ratio according to formula (8). The capacity node with the smallest gain ratio is then selected; denoting it v_j, the blocks in B_j are allocated to the v_j-node (rows 6-7 in Table 5). The steps for updating B′ and V are the same as in the unit-cost-based allocation method (rows 8-13 in Table 5).

TABLE 5

Time complexity analysis: the complexity of rows 3-5 is O(|B|K). The complexity of the sorting in row 6 is O(K log K). The complexity of rows 8-11 is O(|B|). The loop is executed at most |N| times, so the time complexity is O(|B|K|N| + |N|K log K).
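The gain-ratio selection described above can be sketched in Python as follows. This is a hedged illustration rather than the pseudocode of Table 5: the function name, the data layout, and the toy numbers are assumptions, ΔR is assumed to be the plain sum of the response efficiencies a node provides to its chosen blocks, ΔC is assumed to be the unit cost times the number of blocks stored, and all response efficiencies are assumed to be positive.

```python
def gain_ratio_allocate(thresholds, eff, capacity, unit_cost, count):
    """Greedy allocation by gain ratio = delta_C / delta_R (illustrative sketch)."""
    efficiency = [0] * len(thresholds)        # E(b_i)
    pending = set(range(len(thresholds)))     # B': blocks still below threshold
    remaining = list(count)
    plan, total_cost = [], 0
    while pending:
        best = None                           # (ratio, node type, chosen blocks)
        for j in range(len(capacity)):
            if remaining[j] == 0:
                continue
            # B_j: the blocks for which this node type gives the most efficiency.
            ranked = sorted(pending, key=lambda i: (-eff[j][i], i))
            chosen = ranked[:capacity[j]]
            delta_c = unit_cost[j] * len(chosen)
            delta_r = sum(eff[j][i] for i in chosen)
            ratio = delta_c / delta_r
            if best is None or ratio < best[0]:
                best = (ratio, j, chosen)
        if best is None:
            raise ValueError("nodes exhausted before all thresholds were met")
        _, j, chosen = best
        for i in chosen:
            efficiency[i] += eff[j][i]
        total_cost += unit_cost[j] * len(chosen)
        plan.append((j, chosen))
        remaining[j] -= 1
        pending = {i for i in pending if efficiency[i] < thresholds[i]}
    return plan, total_cost, efficiency


# Made-up toy instance (not the data of Tables 3 and 4).
plan, total_cost, efficiency = gain_ratio_allocate(
    thresholds=[100, 90, 80],
    eff=[[100, 100, 100], [100, 90, 80]],
    capacity=[1, 2], unit_cost=[15, 10], count=[1, 2],
)
```

In the first round of this toy run the two candidate ratios are 15/100 and 20/190, so the second node type is chosen, which is the same kind of comparison the worked example performs.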

For ease of understanding, a specific example is provided here, again using Tables 3 and 4 as input. The procedure is as follows. First, the gain ratio of each capacity node is calculated: ratio₁ = 15/100 = 0.15, ratio₂ = 20/190 = 0.105, ratio₃ = 15/280 = 0.054. Taking ratio₁ as an example, a v₁ capacity node can store only 1 block, so block b₅, which gives the largest response efficiency gain, is selected; at this time ΔR = 100 and ΔC = c₁ = 15. The v₃-node is therefore chosen first (ratio₃ is the smallest), and blocks b₁, b₃, and b₄ are determined while its gain ratio is calculated. After these three blocks are allocated to the v₃-node, the response efficiencies of the blocks and the node capacity set are updated (as shown in fig. 5(a)). Blocks b₁, b₃, and b₄ meet their response efficiency thresholds. There is only one v₃-node and it has been allocated, so only the v₁-node and the v₂-node are considered next. At this time ratio₁ = 15/100 = 0.15 and ratio₂ = 20/190 = 0.105. Taking ratio₂ as an example, a v₂ capacity node can store 2 blocks, so ΔR = 190 and ΔC = c₂·v₂ = 10×2 = 20. The v₂-node is therefore chosen (ratio₂ is the smallest), and blocks b₂ and b₅ are determined (as shown in fig. 5(b)). Blocks b₂ and b₅ reach their thresholds. Only b₆ in the block set remains unallocated. At this time ratio₁ = 15/90 = 0.167 and ratio₂ = 10/70 = 0.143. Taking ratio₂ as an example, a v₂ capacity node can store 2 blocks, but only b₆ remains unallocated, so ΔR = 70 and ΔC = c₂ = 10. The v₂-node is therefore selected last (ratio₂ is the smallest at this time) and stores block b₆, which has not yet met its response efficiency threshold (as shown in fig. 5(c)). All blocks now meet their response efficiency thresholds. The final allocation plan is (v₃, b₁b₃b₄), (v₂, b₂b₅), (v₂, b₆), and its total cost is 45.

The third method, the bounded-knapsack-based allocation method, is as follows:

Because the bounded knapsack problem has been reduced to the block allocation problem, classical methods for the bounded knapsack problem can be applied to the block allocation problem. The cost threshold W is regarded as the knapsack capacity, the node unit cost c_j as the weight of an item, and the number of nodes m_j as the number of items. In the bounded knapsack problem the value of an item is fixed, whereas in the block allocation problem a node has different response efficiencies for different blocks; here the minimum response efficiency p_j is used as the corresponding value u_j:

If the minimum response efficiency of a node is used in the bounded knapsack method, then during block allocation the node will more easily meet the response efficiency thresholds of the other blocks it stores. To ensure that the selected nodes can meet the response efficiency thresholds of all blocks, the sum of the response efficiencies contributed by the nodes selected by the bounded knapsack method must be no less than the sum of the response efficiency thresholds of all blocks, namely:

The bounded knapsack problem can be solved by dynamic programming. In the Improved Dynamic Programming (IDP) method, let f(d, k) be the optimal solution of the bounded knapsack problem (BKP) instance with knapsack capacity d and k item types. Starting from an initial capacity of 0, each item is added to the solution, at most m_j times. If the optimization objective of the dynamic programming array is improved, the solution is saved as a copy and compared with the other solutions to select the best one. If the optimization objective cannot be improved, the initial capacity is updated and the above steps are repeated until the combination of items with the maximum value that does not exceed the knapsack capacity, namely f(d, k), has been selected. The formula is as follows:

where d=0, …, W, k=0, …, K.
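As a point of reference, a textbook dynamic program for the bounded knapsack problem can be sketched in Python as follows. It is not the IDP method itself, whose recurrence is not reproduced in the text; the function name, the argument names, and the toy numbers are assumptions, with item weights playing the role of node costs, item values the role of u_j, and copy limits the role of m_j.

```python
def bounded_knapsack(budget, weights, values, limits):
    """Maximize total value within `budget`, using item k at most limits[k] times.

    Returns (best value, copies chosen per item type). Weights must be
    positive integers.
    """
    kinds = len(weights)
    # best[d] holds the best (value, counts) for capacity d over the item
    # types processed so far.
    best = [(0, [0] * kinds) for _ in range(budget + 1)]
    for k in range(kinds):
        updated = list(best)
        for d in range(budget + 1):
            for copies in range(1, limits[k] + 1):
                used = copies * weights[k]
                if used > d:
                    break
                base_value, base_counts = best[d - used]
                candidate = base_value + copies * values[k]
                if candidate > updated[d][0]:
                    counts = list(base_counts)
                    counts[k] = copies
                    updated[d] = (candidate, counts)
        best = updated
    return best[budget]


# Made-up toy instance: budget 8, two item types with copy limits 2 and 1.
value, counts = bounded_knapsack(8, weights=[2, 3], values=[3, 5], limits=[2, 1])
```

With the copy limits in place the best choice here is two copies of the first item and one of the second; without the limit on the second item a different combination would win, which is what distinguishes the bounded problem from the unbounded one.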

Using the bounded knapsack method, the selected capacity nodes and the number of times each type of capacity node is selected can be obtained. Let the resulting capacity node set be A, and let the number of v_j capacity nodes selected be m′_j. The capacity node set A is the node set that contributes most to query efficiency among all capacity nodes V, so using it for block allocation reaches the query efficiency thresholds of the blocks more quickly. Because the method of this embodiment must determine which blocks each node stores, the blocks of each node still need to be determined after the bounded knapsack method has been applied. The optimization objective of the block allocation problem is to minimize the total cost, so the nodes in A are sorted by unit storage cost, and nodes with low unit cost are preferred. The blocks are sorted by their query efficiency thresholds r_i, and blocks with large thresholds are allocated first. Note that, unlike in the unit-cost-based allocation method, the blocks here need to be sorted only once. In the worst case, after all nodes in A have been allocated blocks, some blocks may still not have reached their response efficiency thresholds. These blocks are then assigned to the nodes in V − A by the same procedure. In summary, bounded-knapsack-based block allocation consists of two phases: the first phase allocates blocks to the set of capacity nodes produced by the knapsack method and is referred to here as the master allocation phase; the second phase assigns the blocks that have not reached the threshold condition to the remaining capacity nodes and is referred to here as the supplemental allocation phase.
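The two phases described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pseudocode of the embodiment: the function name, the `selected` dictionary (node type to the number of nodes chosen by the knapsack step), and the toy numbers are all hypothetical.

```python
def two_phase_allocate(thresholds, eff, capacity, unit_cost, count, selected):
    """Master phase on the knapsack-selected nodes, then a supplemental phase.

    selected[j]: how many nodes of type j the knapsack step picked (set A).
    Returns (plan, efficiencies, blocks still below threshold).
    """
    efficiency = [0] * len(thresholds)
    # Blocks are ordered once, largest threshold first.
    pending = sorted(range(len(thresholds)), key=lambda i: (-thresholds[i], i))
    plan = []

    def run_phase(pool):
        nonlocal pending
        # Cheapest node types first.
        for j in sorted(pool, key=lambda t: unit_cost[t]):
            for _ in range(pool[j]):
                if not pending:
                    return
                chosen = pending[:capacity[j]]
                for i in chosen:
                    efficiency[i] += eff[j][i]
                plan.append((j, chosen))
                pending = [i for i in pending if efficiency[i] < thresholds[i]]

    run_phase(selected)                                   # master phase: nodes in A
    leftover = {j: count[j] - selected.get(j, 0) for j in range(len(count))}
    run_phase(leftover)                                   # supplemental phase: V - A
    return plan, efficiency, pending


# Made-up toy instance: the knapsack step is assumed to have picked one
# node of type 1.
plan, efficiency, unmet = two_phase_allocate(
    thresholds=[100, 90, 60],
    eff=[[100, 100, 100], [100, 50, 60]],
    capacity=[1, 2], unit_cost=[15, 5], count=[1, 2],
    selected={1: 1},
)
```

In this toy run the master phase leaves two blocks below their thresholds, so the supplemental phase draws one more node from the nodes outside A.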

W is a given value in the bounded knapsack problem but is not known in the block allocation problem, so a total cost W must be supplied for the block allocation problem before the IDP method can be run. To calculate W, the blocks are ordered by r_i, and for each block the node with the lowest unit storage cost among all nodes is preferentially selected until the block's response efficiency threshold is met. W is then calculated under this allocation scheme:

The procedure is shown in Table 6. First, the IDP method for the bounded knapsack problem is used to obtain the set A of capacity nodes to be allocated (rows 1-4 in Table 6). Rows 5-11 perform the first-phase block allocation, and rows 12-18 perform the second-phase block allocation. The loop conditions of the two phases differ: the first phase continues while there are blocks that have not reached their thresholds and the nodes in A have not all been allocated; the second phase continues while there are still blocks that have not reached their thresholds.

Time complexity analysis: the time complexity of the IDP method is O(WK). The complexity of both the first and second phases is O(|N| + K log K + |B| log |B|). The time complexity of the knapsack method is therefore O(|N| + |B| log |B| + WK).

TABLE 6

For ease of understanding, a specific example is provided here, again using Tables 3 and 4 as input. According to the bounded knapsack method, the v₁ capacity node is selected 2 times, the v₂ capacity node 2 times, and the v₃ capacity node 1 time. Blocks are then allocated as follows. The unit cost of the v₃ capacity node is 5, the lowest in node set A, so the v₃ capacity node is selected first. The method first selects the three blocks with the largest r_i, namely r₁ = 100, r₂ = 90, and r₃ = 80, so b₁, b₂, and b₃ are assigned to the v₃-node (as shown in fig. 6(a)). Blocks b₁ and b₃ reach their response efficiency thresholds, and the v₃ capacity node reaches its selection limit. Next, the v₂ capacity node is selected, because its unit cost is 10. After the previous allocation, block b₂ still does not meet its query efficiency threshold, so the four blocks b₂, b₄, b₅, and b₆ are considered; the two with the largest r_i are b₂ and b₄, so b₂ and b₄ are allocated to the v₂-node (as shown in fig. 6(b)). Block b₂ now meets its threshold. There are two v₂ capacity nodes in total, so the v₂ capacity node is selected again, and blocks b₄ and b₅ are allocated to it (as shown in fig. 6(c)). Finally, block b₆ is allocated to the v₁-node (as shown in fig. 6(d)). At this point no additional nodes from V − A are needed for block allocation. The final allocation plan is therefore (v₃, b₁b₂b₃), (v₂, b₂b₄), (v₂, b₄b₅), (v₁, b₆), and its total cost is 70.

The method embodiment provided by the application has the following beneficial effects:

The above method embodiment only considers the block allocation problem in the static environment, but in the actual production process, the blockchain system generates a new block at intervals, and the blocks in the blockchain system are continuously updated, so that the dynamic update problem of the blockchain system needs to be considered. For this reason, the embodiment of the application also provides an extensible collaborative blockchain block storage method to solve the problem of dynamic update of a blockchain system.

The present method embodiment provides two concepts:

The first concept is as follows: the heuristic methods for the static environment are executed to allocate the original blocks to nodes, filling the capacity of those nodes. When the number of newly added blocks reaches a certain value, the new blocks are allocated to nodes that have not yet been allocated.

In some embodiments, the number of newly added blocks in the blockchain is counted. While the number of newly added blocks has not reached a threshold, blocks are allocated to the corresponding nodes by a heuristic method for the static environment until the storage capacity of those nodes is full; when the number of newly added blocks reaches the threshold, the newly added blocks are allocated to the nodes in the blockchain that have not yet been allocated, using the remaining-node-based dynamic allocation method.

TABLE 7

In the pseudocode provided in Table 7, let the block set be B = {B₁, B₂, …}; each round allocates a new block set B_i (newly generated blocks appear in the new set). The static allocation method is called with the node capacity set V, the new block set B_i, and the set of response efficiency thresholds R as input. Let V_Λ be the set of node capacities allocated for subset Λ. After each allocation, V_Λ is deleted from V. When a new node joins the blockchain system, it is added directly to the node capacity set, so only the node capacity set V in Table 7 needs to be changed.
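The batch-wise loop described above can be sketched in Python as follows. This is an assumption-laden illustration, not the pseudocode of Table 7: `allocate_in_batches` and its arguments are hypothetical names, and `one_block_per_node` is a deliberately trivial stand-in for whichever static heuristic is plugged in.

```python
def allocate_in_batches(batches, node_counts, static_allocate):
    """Allocate each new block set with a static method, then retire used nodes.

    batches:         list of block sets B_1, B_2, ... in arrival order
    node_counts:     node type -> number of nodes not yet allocated (V)
    static_allocate: callable(batch, available) -> list of (node type, blocks)
    """
    available = dict(node_counts)
    history = []
    for batch in batches:
        plan = static_allocate(batch, available)
        for node_type, _blocks in plan:      # nodes used this round (V_Lambda)
            available[node_type] -= 1        # remove them from V
        history.append(plan)
    return history, available


def one_block_per_node(batch, available):
    """Trivial stand-in allocator: the k-th block goes to the k-th free node."""
    free = [t for t, n in available.items() for _ in range(n)]
    return [(free[k], [block]) for k, block in enumerate(batch)]


history, left = allocate_in_batches(
    [["b1", "b2"], ["b3"]], {"v1": 1, "v2": 2}, one_block_per_node
)
```

A node that joins later would only require increasing the corresponding entry of the node-count mapping before the next round, which matches the remark that only V needs to change.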

The second concept is: the node selects half of the current residual capacity to participate in the block allocation process each time, and dynamically updates the reserved space for the next block until the capacity of the node is full, and then no new block is allocated.

In some embodiments: each time a block is allocated to a corresponding node according to a heuristic under a static environment, half of the current remaining storage capacity of the node is utilized to participate in the allocation of the block.

TABLE 8

In the pseudocode provided in Table 8, let V_i(Λ) be the number of blocks allocated in allocation Λ to a node of capacity V_i; the remaining capacity of the node is then V_i − V_i(Λ). Since half of the remaining capacity is selected each time to participate in the next allocation, the capacity the node uses in the next allocation is as follows:

When a new node is added into the blockchain system, the new node is directly added into the node capacity set. Therefore, only the node capacity set V in table 8 needs to be changed.
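The half-of-remaining-capacity rule can be illustrated with a short Python sketch. The function names are hypothetical, and rounding up is an assumption made here so that a node with one free slot can still be filled; the text does not state how odd remainders are handled.

```python
def next_allocation_capacity(capacity, allocated):
    """Capacity a node offers in the next round: half of what is left.

    Rounds up (an assumption), so a remaining capacity of 1 is still offered.
    """
    remaining = capacity - allocated
    return (remaining + 1) // 2


def offers_until_full(capacity):
    """Offers made by one node if every offered slot is used in each round."""
    allocated, offers = 0, []
    while allocated < capacity:
        offer = next_allocation_capacity(capacity, allocated)
        offers.append(offer)
        allocated += offer
    return offers


offers = offers_until_full(8)   # a node with capacity for 8 blocks
```

Under these assumptions a node with capacity 8 offers 4, 2, 1, and 1 slots in successive rounds and then stops receiving blocks.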

Experimental results are provided below:

first, block allocation experimental data in static environment:

Experimental platform (environment): the hardware environment of the experiment is a PC with an Intel(R) Core(TM) i5-10400F CPU (2.90 GHz) and 16 GB of RAM; the operating system is Windows 10, and the programming language is C++.

Experimental data: |B| blocks are generated, with the number of blocks ranging from 400 to 1500. The response efficiency threshold of each block is a random value drawn from a preset interval. K types of nodes are generated, with |N| nodes in total, where |N| ranges from 450 to 1000. To simplify the experiment, the number of capacity nodes of each type is the same. Each type of capacity node provides different response efficiencies for different blocks, and each response efficiency is a random value in the interval [γ, δ]. The cost for a node to store a single block is a random value in the interval [5, 30]. All generated values are positive integers. Table 9 lists the values of the experimental parameters for the static block case, where the default values of the parameters are shown in bold.
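A synthetic instance of the kind described above could be generated along the following lines. This Python sketch is only an illustration (the experiments themselves were written in C++): the function name, the argument names, the fixed seed, and the concrete interval bounds used in the call are assumptions, since the parameter values of Table 9 are not reproduced in the text.

```python
import random


def generate_instance(num_blocks, num_types, nodes_per_type,
                      threshold_range, efficiency_range,
                      cost_range=(5, 30), seed=0):
    """Generate random integer test data for the block allocation problem."""
    rng = random.Random(seed)
    # One response efficiency threshold per block.
    thresholds = [rng.randint(*threshold_range) for _ in range(num_blocks)]
    # eff[j][i]: efficiency a node of type j provides for block i.
    eff = [[rng.randint(*efficiency_range) for _ in range(num_blocks)]
           for _ in range(num_types)]
    # Cost of storing a single block on a node of each type.
    unit_cost = [rng.randint(*cost_range) for _ in range(num_types)]
    # Same number of nodes for every type, as in the experiment.
    count = [nodes_per_type] * num_types
    return thresholds, eff, unit_cost, count


# Hypothetical bounds; the actual intervals come from the experiment tables.
thresholds, eff, unit_cost, count = generate_instance(
    num_blocks=400, num_types=5, nodes_per_type=90,
    threshold_range=(50, 100), efficiency_range=(20, 60),
)
```

`random.randint` includes both endpoints, so all generated values are integers within the closed intervals given.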

Performance index: total cost, run time, with the response efficiency threshold requirements for each block met.

TABLE 9

Second, block allocation experimental data in dynamic environment:

Experimental data: |B| blocks are generated, with the number of blocks ranging from 1500 to 3500, and the block set is divided into 5 block sets according to the timestamps at which the blocks were generated, i.e., B = {B₁, B₂, B₃, B₄, B₅}. Taking 2500 blocks as an example, every 500 generated blocks form a new block set. The response efficiency threshold of each block is a random value drawn from a preset interval. K types of nodes are generated, with |N| nodes in total, where |N| ranges from 1000 to 2000. To simplify the experiment, the number of capacity nodes of each type is the same. Each type of capacity node provides different response efficiencies for different blocks, and each response efficiency is a random value in the interval [γ, δ]. The cost for a node to store a single block is a random value in the interval [5, 30]. All generated values are positive integers. Table 10 lists the values of the experimental parameters for the dynamic block update case. Performance index: total cost and run time, with the response efficiency threshold of every block met.

Table 10

An embodiment of the present invention provides a scalable collaborative blockchain block storage device, including a first computing unit, a second computing unit, a third computing unit, and a block allocation unit, wherein: the first computing unit is used for executing step S100 in the method embodiment, the second computing unit is used for executing step S200 in the method embodiment, the third computing unit is used for executing step S300 in the method embodiment, and the block allocation unit is used for executing step S400 in the method embodiment.

It should be noted that the present apparatus embodiment and the above-described method embodiment are based on the same inventive concept, and thus the relevant content of the above-described method embodiment is also applicable to the present apparatus embodiment, and thus will not be described in detail herein.

Referring to fig. 7, the present application further provides a computer device 301, comprising a memory 310, a processor 320, and a computer program 311 stored in the memory 310 and executable on the processor 320; when executing the computer program 311, the processor 320 implements the scalable adaptive collaborative blockchain block storage method described above. The processor 320 and the memory 310 may be connected by a bus or other means. The memory 310, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer-executable programs. In addition, the memory 310 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory 310 may optionally include memory located remotely from the processor, and the remote memory may be connected to the processor via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The non-transitory software programs and instructions required to implement the scalable collaborative blockchain block storage method of the above embodiments are stored in the memory and, when executed by the processor, perform the scalable collaborative blockchain block storage method of the above embodiments, e.g., perform method steps S100 through S400 of fig. 3 described above.

Referring to fig. 8, the present application also provides a computer-readable storage medium 401 storing computer-executable instructions 410, the computer-executable instructions 410 being used to perform the scalable collaborative blockchain block storage method described above.

The computer-readable storage medium 401 stores computer-executable instructions 410, where the computer-executable instructions 410 are executed by a processor or controller, for example, by a processor in the above-described electronic device embodiment, and may cause the processor to perform the scalable collaborative blockchain block storage method in the above-described embodiment, for example, performing the method steps S100 to S400 in fig. 3 described above.

Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of data such as computer-readable instructions, data structures, program modules, or other data, as known to those skilled in the art. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired data and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and include any data delivery media.

While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

Claims

1. An expandable collaborative blockchain block storage method is characterized by comprising the following steps:

dividing the whole block chain into a plurality of block chain fragments according to a preset rule;

wherein the said

the constraint conditions are as follows:

the N (b) _i ) Representing storing said b _i Is said +.>

Representing storing said b _i Is responsive to the node of (a)Efficiency, said->

and according to the optimization objective function and the constraint condition, distributing the blocks to be distributed to the corresponding nodes, wherein the distributing the blocks to be distributed to the corresponding nodes comprises the following steps:

and calculating the number of newly added blocks to be allocated in the current blockchain partition, and when the number of the newly added blocks to be allocated reaches the threshold value, allocating the newly added blocks to be allocated to the nodes which are not allocated in the current blockchain partition based on a dynamic allocation method of the remaining nodes.

2. The scalable collaborative blockchain block storage method of claim 1, wherein the blocks to be allocated are allocated to the respective nodes by a static heuristic method according to the optimization objective function and the constraints.

3. The scalable collaborative blockchain block storage method of claim 2, wherein the static heuristic method includes any one of a unit-overhead-based allocation method, a gain-ratio-based allocation method, and a bounded-knapsack-based block allocation method.

4. The scalable collaborative blockchain block storage method of claim 1, wherein upon allocation of blocks to be allocated to respective said nodes, the scalable collaborative blockchain block storage method further comprises:

and using half of the current residual storage capacity of the node to participate in the allocation of the block to be allocated until the storage capacity of the node is full.

5. An expandable collaborative blockchain block storage device, comprising:

the first calculation unit is used for dividing the whole blockchain into a plurality of blockchain fragments according to a preset rule and calculating the storage capacity, response efficiency and unit cost of each node in the current blockchain fragments;

wherein the said

the constraint conditions are as follows:

wherein the E (b) _i ) Representing the block b to be allocated _i The response efficiency of r is as follows _i Representing said b _i Is a response efficiency threshold of (2)Said B representing a set of said blocks to be allocated,

the N (b) _i ) Representing storing said b _i Is said +. >

Representing storing said b _i Is said +.>

a block allocation unit, configured to allocate the block to be allocated to the corresponding node according to the optimization objective function and the constraint condition, where allocating the block to be allocated to the corresponding node includes:

6. An electronic device, characterized in that: comprising at least one control processor and a memory for communication connection with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the scalable collaborative blockchain block storage method of any of claims 1-4.

7. A computer-readable storage medium, characterized by: the computer-readable storage medium stores computer-executable instructions for causing a computer to perform the scalable collaborative blockchain block storage method of any of claims 1-4.