WO2021003985A1 - Blockchain data archiving storage method and apparatus, computer device and storage medium - Google Patents

Blockchain data archiving storage method and apparatus, computer device and storage medium

Info

Publication number
WO2021003985A1
WO2021003985A1 PCT/CN2019/123147 CN2019123147W WO2021003985A1 WO 2021003985 A1 WO2021003985 A1 WO 2021003985A1 CN 2019123147 W CN2019123147 W CN 2019123147W WO 2021003985 A1 WO2021003985 A1 WO 2021003985A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
storage
fragmented
blockchain
preset
Prior art date
Application number
PCT/CN2019/123147
Other languages
French (fr)
Chinese (zh)
Inventor
薄辰龙
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2021003985A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning

Definitions

  • This application relates to a blockchain data archiving and storage method, device, computer equipment and storage medium.
  • The inventor realized that with current or traditional blockchain data storage methods, the storage space used by a blockchain node grows linearly as blocks are added, while the storage space of the node is limited. The incremental block data gradually slows the response speed of the memory, which in turn reduces the efficiency of consensus, verification, and read and write operations when the node device is under high concurrency, so data storage efficiency is low.
  • a blockchain data archive storage method is provided.
  • a method for archiving and storing blockchain data includes:
  • the preset archiving conditions include height value conditions and access frequency conditions
  • periodically detecting data that meets preset archiving conditions in the blockchain data, and obtaining the data to be archived includes:
  • the cycle time is less than the time required to generate a data set that meets the preset total amount
  • the data sets that meet the preset total are sequentially obtained from the lowest height value
  • the data in the data set is determined as the data to be archived.
  • performing fragmentation processing on the data to be archived to obtain fragmented data includes: dividing the data to be archived into at least two groups to obtain at least two groups of grouped data, and adding each group of grouped data to at least two data fragments to obtain the fragmented data.
  • the distributed storage engine includes a private storage engine, a federated storage engine, and a public storage engine, identifying the data source type of sharded data, and obtaining the distributed storage engine corresponding to the data source type includes:
  • the entry parameters corresponding to the general public storage engine are modified to obtain the target public storage engine.
  • the storage node storing the fragmented data in the target distributed storage engine includes:
  • storing the fragmented data to the storage node in the target distributed storage engine includes: identifying the data volume or the data importance level identifier of the fragmented data, and allocating the fragmented data to the storage nodes in the target distributed storage engine according to the data volume or the data importance level identifier.
  • the storage node storing the fragmented data in the target distributed storage engine includes:
  • the fragmented data is stored to the storage node in the target distributed storage engine.
  • the data slice where the sliced data is located includes multiple data packets; after storing the sliced data to the storage node in the target distributed storage engine, it further includes:
  • each data packet constructs a Merkle tree corresponding to the data slice where each data packet is located.
  • it further includes:
  • a blockchain data archive storage device includes:
  • the data detection module is used to periodically detect the data that meets the preset archiving conditions in the blockchain data to obtain the data to be archived.
  • the preset archiving conditions include height value conditions and access frequency conditions;
  • the data fragmentation module is used for fragmentation processing of the data to be archived to obtain fragmented data
  • the distributed storage engine acquisition module is used to identify the data source type of the fragmented data, obtain the distributed storage engine corresponding to the data source type, and obtain the target distributed storage engine;
  • the data storage module is used to store the fragmented data to the storage node in the target distributed storage engine.
  • a computer device including a memory and one or more processors, where the memory stores computer-readable instructions and, when the computer-readable instructions are executed by the one or more processors, the one or more processors execute the following steps:
  • the preset archiving conditions include height value conditions and access frequency conditions
  • One or more non-volatile computer-readable storage media storing computer-readable instructions.
  • the one or more processors execute the following steps:
  • the preset archiving conditions include height value conditions and access frequency conditions
  • FIG. 1 is an application environment diagram of a method for archiving and storing blockchain data according to one or more embodiments.
  • FIG. 2 is a schematic flowchart of a method for archiving and storing blockchain data according to one or more embodiments.
  • FIG. 3 is a schematic diagram of a process of archiving and storing blockchain data according to one or more embodiments.
  • FIG. 4 is a schematic diagram of a hash ring in an embodiment.
  • FIG. 5 is a block diagram of a blockchain data archive storage device according to one or more embodiments.
  • FIG. 6 is a block diagram of a blockchain data archive storage device according to one or more embodiments.
  • FIG. 7 is a block diagram of a computer device according to one or more embodiments.
  • the blockchain data archiving storage method provided in this application can be applied to the application environment shown in FIG. 1.
  • the blockchain network includes multiple blockchain nodes 102, and the blockchain nodes 102, the control system 104, and the distributed storage system 106 realize pairwise communication through the network.
  • the blockchain node 102 and the control system 104 can be different computer devices, which include but are not limited to servers, computer hosts, smart phones, tablets, smart wearable devices, etc.
  • Each computer device serves as a node device of the blockchain and can participate in database recording, and data can be synchronized quickly between the computer devices.
  • Each blockchain node 102 has a corresponding control system 104, and the distributed storage system 106 may include multiple storage devices, where multiple storage devices are assembled through applications or software to provide data storage and access functions externally.
  • the distributed storage system 106 can provide blockchain block data storage services for at least one blockchain node.
  • Take the case where the control system 104 is a server as an example.
  • The server periodically detects the data in the blockchain data that meets the preset archiving conditions to obtain the data to be archived, fragments the data to be archived to obtain the fragmented data, identifies the data source type of the fragmented data, obtains the distributed storage engine corresponding to that data source type as the target distributed storage engine, and stores the fragmented data to a storage node in the target distributed storage engine.
  • the above scheme not only realizes the archive processing of the blockchain data, but also realizes the expansion of the blockchain data storage space through a distributed storage method, and meets the storage space requirements of the linear growth of the blockchain data.
  • the server can be implemented by an independent server or a server cluster composed of multiple servers.
  • a method for archiving and storing blockchain data is provided. Taking the method applied to the server in the control system 104 in FIG. 1 as an example for description, the method includes the following steps:
  • Step 200: Periodically detect data that meets preset archiving conditions in the blockchain data to obtain data to be archived.
  • the preset archiving conditions include height value conditions and access frequency conditions.
  • Blockchain, also known as distributed ledger technology, is a technology in which several computer devices participate in bookkeeping and jointly maintain a complete distributed database (which can be regarded as a ledger).
  • Blockchain data refers to data that is bound into block files (which can be regarded as account pages) in chronological order and strung together in the form of a chain.
  • Each piece of "block data" in the blockchain data is like an "account page" in the ledger. Each page records several data transaction records, and the pages, bound in chronological order, form a complete ledger, that is, a distributed database.
  • The user can initiate a data transaction request through a blockchain node 102 in the terminal and submit the data to be processed.
  • After the blockchain node receives the data processing request, it performs the corresponding data processing on the data to be processed, for example, reaching consensus on the data with other blockchain nodes and storing it.
  • Data archiving is the process of moving data that is no longer frequently used to a separate storage device for long-term preservation.
  • the height value of the block data in the blockchain and the frequency of access can intuitively reflect whether the block data of the blockchain node is frequently used data (hot data) or data that has not been used for a long time (cold data). Therefore, in this embodiment, the data height value condition and the access frequency condition are used as the preset archiving conditions for detecting the data to be archived in the blockchain data.
  • the data that meets the preset archiving condition in the blockchain data is detected according to the cycle time, and the data that meets the preset archiving condition is determined as the data to be archived.
  • the preset archiving condition includes a height value condition and an access frequency condition.
  • Step 400: Perform fragmentation processing on the data to be archived to obtain fragmented data.
  • The data to be archived can be divided into multiple (at least two) data groups to obtain at least two groups of grouped data, and each group is added to at least two data fragments to obtain the fragmented data. This ensures that every data group (that is, all of the data to be archived) exists in at least two data fragments and that no data fragment contains all of the data groups.
  • The data to be archived can be divided into groups containing the same amount of data by equal division, or into groups containing different amounts of data by random division; each resulting data group is then added to at least two data fragments. When one data fragment is maliciously damaged, the complete data is not lost, and the other data fragments can be used to repair it.
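  • The grouping and fragment placement described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the round-robin equal division, and the rule of placing each group in consecutive fragments are assumptions chosen so that every group lands in at least two fragments and no fragment holds every group.

```python
from typing import List, Sequence


def split_into_groups(records: Sequence[str], group_count: int) -> List[List[str]]:
    """Equal division: spread the records over group_count groups as evenly as possible."""
    if group_count < 2:
        raise ValueError("the data must be divided into at least two groups")
    groups: List[List[str]] = [[] for _ in range(group_count)]
    for index, record in enumerate(records):
        groups[index % group_count].append(record)
    return groups


def build_fragments(group_count: int, fragment_count: int, copies: int = 2) -> List[List[int]]:
    """Return, for each fragment, the indices of the groups it holds.

    Group g is placed in fragments g, g+1, ..., g+copies-1 (modulo fragment_count).
    With 2 <= copies < fragment_count <= group_count, every group has at least
    two copies and every fragment is missing at least one group.
    """
    if not 2 <= copies < fragment_count:
        raise ValueError("need 2 <= copies < fragment_count")
    if group_count < fragment_count:
        raise ValueError("need at least as many groups as fragments")
    fragments: List[List[int]] = [[] for _ in range(fragment_count)]
    for group_index in range(group_count):
        for offset in range(copies):
            fragments[(group_index + offset) % fragment_count].append(group_index)
    return fragments


records = [f"block-{i}" for i in range(6)]
groups = split_into_groups(records, 3)
fragments = build_fragments(len(groups), fragment_count=3)
```

  • With three groups and three fragments, each fragment holds two of the three groups, so losing any single fragment still leaves one copy of every group.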
  • Step 600: Identify the data source type of the fragmented data, and obtain the distributed storage engine corresponding to the data source type as the target distributed storage engine.
  • In MySQL (a relational database management system that uses Structured Query Language), data is stored in files or in memory using a variety of different techniques. Each technique uses a different storage mechanism, indexing technique, and locking level, and therefore offers different functions and capabilities. These techniques are called storage engines; choosing a suitable one provides additional speed or functionality and so improves the overall function of the application.
  • Blockchain data is divided by data source type into public chain data, private chain data, and alliance (consortium) chain data. Specifically, by identifying the data source type of the blockchain data, the distributed storage engine corresponding to that data source type can be obtained as the target distributed storage engine.
  • Step 800: Store the fragmented data to the storage node in the target distributed storage engine.
  • The blockchain data is stored in a distributed manner, that is, on multiple independent devices. Distributed storage adopts a scalable system structure and uses multiple storage servers to share the storage load. It improves the reliability, availability, and access efficiency of the system, and it is also easy to expand.
  • A storage node is an independent device (storage server) used to store data. After the corresponding distributed storage engine is obtained, the fragmented data is stored to a storage node in the target distributed storage engine. Specifically, the fragmented data may be stored in the storage node pointed to by the hash value of the fragmented data. Alternatively, the data volume or the data importance level identifier of the fragmented data may be identified, and the fragmented data allocated to the corresponding storage nodes in the target distributed storage engine according to the data volume or the data importance level identifier.
  • In the above blockchain data archiving storage method, periodically detecting the data to be archived and fragmenting it avoids data redundancy and avoids degrading computing speed; the storage method can be chosen flexibly according to the source of the fragmented data, which ensures data security and availability; and assigning the fragmented data to storage nodes in the corresponding distributed storage engine saves node storage space, keeps storage space and data availability high, and improves data processing efficiency.
  • periodically detecting data that meets preset archiving conditions in the blockchain data to obtain the data to be archived includes: step 220, counting, according to the cycle time, the total amount of data in the blockchain node between the data with the lowest height value and the data with the highest height value, where the cycle time is less than the time required to generate a data set that meets the preset total amount; when the total amount of data is greater than the preset total amount, obtaining a data set that meets the preset total amount in order, starting from the lowest height value; detecting the access frequency of each piece of data in the data set; and, when the access frequency of every piece of data in the data set is less than the preset access frequency, determining the data in the data set as the data to be archived.
  • The preset total amount can be n, which can be set freely according to the data processing capability of the server.
  • the time required to generate the preset total amount of data can be determined by the data flow in the business processing process. Set the detection cycle time to be less than the data generation time of the preset total amount, so that the data that has met the preset total amount can be archived before the preset total amount of data is generated again, so as to avoid the accumulation of data that meets the archiving conditions.
  • For example, suppose the lowest height value of the locally stored data is H0, the highest height value is H31, and the preset total amount is 20. The total amount of data exceeds the preset total amount, so 20 pieces of data are acquired starting from the lowest height value, that is, the data with height values H0-H19. Further, a list is used to record the data whose access frequency exceeded the preset access frequency in the most recent period, together with the corresponding height values.
  • If the preset access frequency is p, the data whose access frequency exceeds p can be {X1, ..., X30}. When none of the data H0-H19 is in {X1, ..., X30}, the data with height values H0-H19 is determined to be the data to be archived.
  • Because the detection cycle time is set to be less than the time required to generate the preset total amount of data, the data blocks meeting the preset total amount can be archived before the preset total amount of data is generated again. This avoids the accumulation of local data to be archived, which would result in data redundancy.
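  • A minimal sketch of one detection cycle under the two conditions above. The function and parameter names are hypothetical, heights are represented as plain integers, and the "hot list" is assumed to be a set of heights whose access frequency exceeded the preset access frequency p in the most recent period.

```python
from typing import List, Set


def select_blocks_to_archive(
    heights: List[int],
    hot_heights: Set[int],
    preset_total: int,
) -> List[int]:
    """Return the heights to archive in this cycle, or an empty list if none qualify."""
    # Height value condition: the node must hold more data than the preset total amount.
    if len(heights) <= preset_total:
        return []
    # Take the preset total amount of data, starting from the lowest height value.
    candidates = sorted(heights)[:preset_total]
    # Access frequency condition: no candidate may appear in the hot list.
    if any(height in hot_heights for height in candidates):
        return []
    return candidates


# The example from the text: heights H0..H31, preset total amount 20.
local_heights = list(range(32))
to_archive = select_blocks_to_archive(local_heights, hot_heights={25, 30}, preset_total=20)
```

  • Here `to_archive` contains the heights 0 through 19, because none of them is in the hot list; if any one of them were hot, this sketch would archive nothing in that cycle.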
  • performing fragmentation processing on the data to be archived to obtain the fragmented data includes: step 420, using a consistent hash algorithm to perform fragmentation processing on the data to be archived to obtain the fragmented data.
  • A consistent-hashing ring-cutting algorithm can be used to implement data fragmentation: the hash ring is cut into fragments of the same size, and these fragments are then handed over to different storage nodes, each of which is responsible for the fragments it receives.
  • The Key corresponding to the grouped data (K1 in Figure 4) is hashed into the circular hash space: a specific hash function is applied to the grouped data to obtain its hash value, and the hash value is mapped onto the hash ring (as shown in Figure 4) to form the fragmented data.
  • The fragments do not need to be merged clockwise and then handed over to the storage nodes (such as storage nodes a, b, and c in Figure 4); instead, each whole shard can be handed over to any storage node, which is more flexible.
  • One shard is often used as the smallest unit of data migration and backup. It can be understood that, in other embodiments, fragmentation may also be performed in other ways, such as alternate placement or interval division.
  • The data groups form at least 3 data fragments, and each data fragment is composed of part of the grouped data (not all of it). The number of data groups in each data fragment can be the same or different, but each data group must be added to at least two data fragments so that at least two stored copies exist. Even if a storage node in the storage system is attacked, the complete data is not leaked, and the missing data can be repaired from the copies held on other storage nodes. Sharding the data improves data processing speed and data throughput and prevents congestion caused by an excessive amount of data.
  • Consistent hashing handles the stability problem well: all the storage nodes can be arranged on a hash ring that is joined end to end.
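  • The ring-cutting idea can be illustrated with a short sketch. The ring size of 2^32, the use of SHA-256, and the slice-to-node table are assumptions made for the example; the text only specifies that the ring is cut into equal-sized fragments and that a whole fragment can be handed to any storage node.

```python
import hashlib
from typing import Dict, List

RING_SIZE = 2 ** 32  # assumed size of the circular hash space


def ring_position(key: str) -> int:
    """Hash a key onto the ring, giving a position in [0, RING_SIZE)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def slice_of(key: str, slice_count: int) -> int:
    """Index of the equal-sized ring slice (fragment) that the key falls into."""
    return ring_position(key) * slice_count // RING_SIZE


def shard_keys(keys: List[str], slice_count: int) -> Dict[int, List[str]]:
    """Group keys by the ring slice they hash into."""
    shards: Dict[int, List[str]] = {i: [] for i in range(slice_count)}
    for key in keys:
        shards[slice_of(key, slice_count)].append(key)
    return shards


def node_for_key(key: str, slice_owner: Dict[int, str]) -> str:
    """Look up the node that owns the key's slice; slice_owner maps 0..n-1 to node names."""
    return slice_owner[slice_of(key, len(slice_owner))]


keys = [f"group-{i}" for i in range(50)]
shards = shard_keys(keys, 4)
# Whole slices are assigned freely: here node "a" owns two non-adjacent slices.
owner = {0: "a", 1: "b", 2: "c", 3: "a"}
node = node_for_key("K1", owner)
```

  • Because ownership is a table from slice to node rather than a clockwise walk around the ring, moving one slice to another node only changes one table entry, which matches the role of a shard as the smallest migration and backup unit.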
  • the distributed storage engine includes a public storage engine, an alliance storage engine, and a private storage engine, and identifying the data source type of the fragmented data and obtaining the distributed storage engine corresponding to the data source type as the target distributed storage engine includes: step 620, when it is identified that the fragmented data comes from a private blockchain, modifying the entry parameters corresponding to the general private storage engine to obtain the private storage engine; when it is identified that the fragmented data comes from an alliance blockchain based on smart contracts, modifying the entry parameters corresponding to the general alliance storage engine to obtain the alliance storage engine; and when it is identified that the fragmented data comes from a public blockchain, modifying the entry parameters corresponding to the general public storage engine to obtain the public storage engine.
  • Distributed storage engines include public storage engines, alliance storage engines, and private storage engines.
  • Data source types can include private blockchain (private chain), alliance blockchain (consortium chain), and public blockchain (public chain). The private storage engine, alliance storage engine, and public storage engine are the storage engines used to store the data of the private blockchain, alliance blockchain, and public blockchain, respectively.
  • The private storage engine stores data that comes from the private blockchain, for the owner's own use. The alliance storage engine stores data that comes from the alliance blockchain, for use by specific multiple parties. The public storage engine stores data that comes from the public blockchain, for use by multiple parties.
  • Alternatively, a parameter modification request may be received, and a storage engine corresponding to the target modification parameter carried in the request is built according to that parameter.
  • the storage method is flexibly selected according to the data source type of the fragmented data to ensure the safety and availability of the data.
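  • The selection in step 620 can be pictured as starting from a generic parameter set and overriding it per data source type. The sketch below is illustrative only: the text does not say what the entry parameters are, so the `EngineParams` fields and their values are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class EngineParams:
    """Hypothetical entry parameters of a general storage engine."""
    engine: str = "generic"
    readers: str = "unset"


# Assumed overrides applied to the generic parameters for each data source type.
_OVERRIDES: Dict[str, Dict[str, str]] = {
    "private": {"engine": "private", "readers": "owner"},
    "consortium": {"engine": "consortium", "readers": "named-parties"},
    "public": {"engine": "public", "readers": "anyone"},
}


def select_engine(source_type: str) -> EngineParams:
    """Return the target engine parameters for the identified data source type."""
    try:
        overrides = _OVERRIDES[source_type]
    except KeyError:
        raise ValueError(f"unknown data source type: {source_type!r}") from None
    # Modify the generic entry parameters to obtain the target engine.
    return replace(EngineParams(), **overrides)


target = select_engine("consortium")
```

  • Keeping the overrides in a table means a new data source type can be supported by adding one entry, which is in the spirit of building an engine from a parameter modification request.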
  • storing the fragmented data to the storage node in the target distributed storage engine includes: step 820, calculating the hash value of the fragmented data according to a hash algorithm, and storing the fragmented data to the storage node pointed to by the hash value in the target distributed storage engine.
  • After the corresponding distributed storage engine is obtained, the hash value of each piece of fragmented data can be calculated according to the hash algorithm. The hash value points to the storage address of that piece of data, and the data is stored in the storage node in the target distributed storage engine that the hash value points to.
  • For example, when the target distributed storage engine is the private storage engine, the fragmented data is stored in the storage node corresponding to the storage address in the private storage engine.
  • An importance level can also be preset for the fragmented data and the importance level identifier of each piece of fragmented data identified; when the importance level of the fragmented data is high, a storage node with strong computing power and fast computation is chosen for storage.
  • The allocation of storage nodes for the fragmented data is completed according to the hash value of the fragmented data, which makes it easy to record the correspondence between the allocated data and the storage nodes.
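  • A minimal sketch of hash-directed placement. The text says only that the hash value "points to" a storage address; reducing the hash modulo the number of nodes is an assumption used here to make that concrete, and the function names are hypothetical.

```python
import hashlib
from typing import Dict, List


def node_for_fragment(fragment: bytes, nodes: List[str]) -> str:
    """Pick the node that the fragment's SHA-256 hash points to (assumed: hash mod node count)."""
    if not nodes:
        raise ValueError("at least one storage node is required")
    digest = hashlib.sha256(fragment).digest()
    return nodes[int.from_bytes(digest, "big") % len(nodes)]


def place_fragments(fragments: Dict[str, bytes], nodes: List[str]) -> Dict[str, str]:
    """Record which storage node each fragment is allocated to."""
    return {
        fragment_id: node_for_fragment(data, nodes)
        for fragment_id, data in fragments.items()
    }


nodes = ["node-a", "node-b", "node-c"]
fragments = {"frag-0": b"blocks 0-9", "frag-1": b"blocks 10-19"}
placement = place_fragments(fragments, nodes)
```

  • The returned mapping is the record of the correspondence between allocated data and storage nodes; since the placement depends only on the fragment contents and the node list, it can be recomputed later to locate a fragment.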
  • storing the fragmented data to the storage node in the target distributed storage engine includes: step 840, evaluating and classifying each storage node to determine the storage performance of each storage node, performing a weight distribution evaluation on the fragmented data to determine the storage requirements of the fragmented data, and storing the fragmented data to a storage node in the target distributed storage engine according to the storage performance of each storage node and the storage requirements of each piece of fragmented data.
  • Fragmented data can also be stored by evaluating and classifying the storage nodes in advance. Specifically, the storage nodes are evaluated according to their capacity, computing speed, and computing power to determine their storage performance; at the same time, the importance, data volume, and type of the fragmented data are evaluated by weight distribution to determine its storage requirements. Then, based on the storage performance of each candidate node and the storage requirements of each fragment, data and nodes are matched, and the fragmented data is stored to the appropriate storage node in the target distributed storage engine. The storage nodes are nodes that work independently.
  • Specifically, the degree of data confidentiality, the data volume, and the data type of the fragmented data can be identified, and each storage node can be numbered according to its computing power, computing speed, and capacity.
  • The storage nodes are numbered 1, 2, 3, ..., m in order, starting from the fastest or the largest in capacity, and the data that is important, large in volume, or highly confidential is sent to the storage nodes numbered 1, 2, 3, ..., m in that order for storage.
  • By fragmenting the data and sending each fragment to the corresponding storage node, high data throughput is achieved and the technical cost of the device is reduced.
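  • The numbering scheme can be sketched as two rankings paired in order. The fields, the sort keys (speed before capacity, importance before size), and the wrap-around when there are more fragments than nodes are assumptions; the text does not fix how the criteria are weighted.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StorageNode:
    name: str
    ops_per_second: float  # assumed measure of computing speed
    capacity_gb: float


@dataclass
class Fragment:
    name: str
    importance: int  # higher means more important
    size_mb: float


def allocate(fragments: List[Fragment], nodes: List[StorageNode]) -> Dict[str, str]:
    """Send the most demanding fragments to the best-ranked nodes, in order."""
    if not nodes:
        raise ValueError("at least one storage node is required")
    # Number the nodes 1..m: fastest first, larger capacity breaking ties.
    ranked_nodes = sorted(
        nodes, key=lambda n: (n.ops_per_second, n.capacity_gb), reverse=True
    )
    # Rank fragments: most important first, larger size breaking ties.
    ranked_fragments = sorted(
        fragments, key=lambda f: (f.importance, f.size_mb), reverse=True
    )
    return {
        fragment.name: ranked_nodes[i % len(ranked_nodes)].name
        for i, fragment in enumerate(ranked_fragments)
    }


nodes = [
    StorageNode("a", ops_per_second=100, capacity_gb=500),
    StorageNode("b", ops_per_second=300, capacity_gb=200),
    StorageNode("c", ops_per_second=200, capacity_gb=800),
]
fragments = [Fragment("x", 1, 10), Fragment("y", 3, 40), Fragment("z", 2, 25)]
allocation = allocate(fragments, nodes)
```

  • In this example the most important fragment `y` goes to the fastest node `b`, `z` to `c`, and `x` to `a`.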
  • the data fragment where the fragmented data is located includes multiple data groups; after storing the fragmented data to the storage node in the target distributed storage engine, the method further includes: step 900, calculating the hash value of each data group, constructing, according to the hash value of each data group, a Merkle tree corresponding to the data fragment where each data group is located, and recording the correspondence between each Merkle tree and the storage node where each data fragment is located.
  • A Merkle tree is also called a hash tree; each leaf node of the tree stores the hash value of a data block.
  • The data fragment where the fragmented data is located includes multiple data groups.
  • The Merkle tree corresponding to the data fragment where each data group is located can be established based on the hash value of each data group. For ease of understanding, it can be called a fragment Merkle tree.
  • Each data fragment corresponds to one fragment Merkle tree. After the fragment Merkle tree corresponding to each data fragment is obtained, the correspondence between each fragment Merkle tree and the storage node where the data fragment is located can be recorded.
  • the search and verification of data can be realized through the Merkle tree, which improves the privacy and security of stored data.
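  • A compact sketch of step 900: build one Merkle root per data fragment from the hashes of its data groups, then record which storage node holds the fragment with that root. SHA-256 and the rule of duplicating the last hash on odd-sized levels are assumed conventions, not details given in the text.

```python
import hashlib
from typing import Dict, List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(groups: List[bytes]) -> bytes:
    """Root of the Merkle tree whose leaves are the hashes of the data groups."""
    if not groups:
        raise ValueError("a data fragment must contain at least one data group")
    level = [sha256(group) for group in groups]  # leaf nodes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention: duplicate the last hash
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def record_roots(
    fragments: Dict[str, List[bytes]], placement: Dict[str, str]
) -> Dict[str, str]:
    """Map each fragment's Merkle root (hex) to the storage node holding the fragment."""
    return {
        merkle_root(groups).hex(): placement[fragment_id]
        for fragment_id, groups in fragments.items()
    }


fragments = {"frag-0": [b"g0", b"g1", b"g2"], "frag-1": [b"g1", b"g2"]}
placement = {"frag-0": "node-a", "frag-1": "node-b"}
root_index = record_roots(fragments, placement)
```

  • To verify a fragment read back from a node, recompute its root and check that it appears in `root_index` against the expected node; any change to a data group changes the root.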
  • the method further includes: step 950, performing capacity detection on each storage node, and sending a storage node expansion request when the capacity of a storage node reaches a preset storage capacity.
  • The method may also include: detecting the capacity of each storage node according to the cycle time, and sending a storage node expansion request when the capacity of a storage node reaches the preset storage capacity. Specifically, the storage node expansion request may be sent to the master node among the storage nodes; after receiving the request, the master node actively adds other storage nodes to expand the cluster. The master node can manage changes in the cluster, such as creating or deleting an index and adding or removing other nodes. Alternatively, the storage node expansion request may be sent to a management terminal, which, after receiving the request, adds other storage nodes to expand the storage capacity. In this embodiment, periodically detecting the capacity of the storage nodes and expanding the storage space avoids data loss caused by insufficient storage space on a storage node.
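  • One capacity detection pass can be sketched as follows. The function names and the callback used to deliver the expansion request are hypothetical; in the text the request goes to a master node or a management terminal, which is represented here by a plain function argument.

```python
from typing import Callable, Dict, List


def nodes_needing_expansion(used_bytes: Dict[str, int], preset_capacity: int) -> List[str]:
    """Names of the nodes whose used capacity has reached the preset storage capacity."""
    return sorted(
        name for name, used in used_bytes.items() if used >= preset_capacity
    )


def check_capacity(
    used_bytes: Dict[str, int],
    preset_capacity: int,
    send_expansion_request: Callable[[str], None],
) -> int:
    """Run one detection pass and send one expansion request per full node."""
    full_nodes = nodes_needing_expansion(used_bytes, preset_capacity)
    for name in full_nodes:
        send_expansion_request(name)
    return len(full_nodes)


sent: List[str] = []
usage = {"node-a": 950, "node-b": 400, "node-c": 1000}
request_count = check_capacity(usage, preset_capacity=900, send_expansion_request=sent.append)
```

  • A scheduler would call `check_capacity` once per cycle time; in this run, requests are sent for `node-a` and `node-c`.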
  • a blockchain data archive storage device including: a data detection module 410, a data fragmentation module 420, a distributed storage engine acquisition module 430, and a data storage module 440, wherein:
  • the data detection module 410 is configured to periodically detect data that meets preset archiving conditions in the blockchain data to obtain data to be archived.
  • the preset archiving conditions include height value conditions and access frequency conditions;
  • the data fragmentation module 420 is configured to perform fragmentation processing on the data to be archived to obtain fragmented data
  • the distributed storage engine obtaining module 430 is used to identify the data source type of the fragmented data, obtain the distributed storage engine corresponding to the data source type, and obtain the target distributed storage engine;
  • the data storage module 440 is used to store the fragmented data to the storage node in the target distributed storage engine.
  • the blockchain data archive storage device further includes a relationship recording module 450, which is used to calculate the hash value of each data group, construct, according to the hash value of each data group, the Merkle tree corresponding to the data fragment where each data group is located, and record the correspondence between each Merkle tree and the storage node where each data fragment is located.
  • the blockchain data archive storage device further includes a capacity detection module 460, which is used to perform capacity detection on each storage node and to send a storage node expansion request when the capacity of a storage node reaches the preset storage capacity.
  • the data detection module 410 is further configured to, when the total amount of data is greater than the preset total amount, obtain a data set meeting the preset total amount in order from the lowest height value, detect the access frequency of each piece of data in the data set, and, when the access frequency of every piece of data in the data set is less than the preset access frequency, determine the data in the data set as the data to be archived.
  • the data fragmentation module 420 is further configured to divide the data to be archived into at least two groups to obtain at least two groups of grouped data, and add each group of grouped data to at least two data fragments to obtain the fragmented data.
  • the distributed storage engine acquisition module 430 is also used to modify the entry parameters corresponding to the general private storage engine to obtain the private storage engine when it is identified that the shard data comes from the private blockchain.
  • the data storage module 440 is further configured to calculate the hash value of the fragmented data according to the hash algorithm, and store the fragmented data to the storage node pointed to by the hash value in the target distributed storage engine.
  • the data storage module 440 is also used to identify the data volume or data importance level identifier of the fragmented data, and allocate the fragmented data to a storage node in the target distributed storage engine according to the data volume or data importance level identifier.
  • the data storage module 440 is also used to evaluate and classify each storage node to determine the storage performance of each storage node, perform weight distribution evaluation on the fragmented data to determine the storage requirements of the fragmented data, and store the fragmented data to a storage node in the target distributed storage engine according to the storage performance of each storage node and the storage requirements of each piece of fragmented data.
  • Each module in the above-mentioned blockchain data archiving storage device can be implemented in whole or in part by software, hardware, or a combination thereof.
  • the foregoing modules may be embedded in, or independent of, the processor in the computer device in the form of hardware, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the foregoing modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 7.
  • the computer equipment includes a processor, a memory, a network interface and a database connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment is used to store the data to be archived in the blockchain.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by the processor to realize a blockchain data archiving and storage method.
  • the structure shown in the figure is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer equipment to which the solution of the present application is applied; a specific computer device may include more or fewer parts than shown in the figure, combine some parts, or have a different arrangement of parts.
  • a computer device including a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processor, implement the steps of the blockchain data archiving storage method provided in any embodiment of the present application.
  • One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to implement the steps of the blockchain data archiving storage method provided in any embodiment of the present application.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), SyncLink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A blockchain data archiving storage method, comprising: periodically detecting data that meets preset archiving conditions in blockchain data to obtain data to be archived, wherein the preset archiving conditions include a height value condition and an access frequency condition; performing fragmentation processing on the data to be archived to obtain fragmented data; identifying the data source type of the fragmented data, and acquiring a distributed storage engine corresponding to the data source type to obtain a target distributed storage engine; and storing the fragmented data in a storage node in the target distributed storage engine.

Description

Blockchain data archiving storage method, apparatus, computer device and storage medium
Cross-reference to related applications
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on July 8, 2019, with application number 2019106116447 and titled "Blockchain data archiving storage method, apparatus, computer device and storage medium", the entire contents of which are incorporated in this application by reference.
Technical field
This application relates to a blockchain data archiving storage method, apparatus, computer device and storage medium.
Background
With the development of data storage technology, blockchain technology has come to be used to record and store massive amounts of data. Storing data through blockchain technology ensures data security and storage space, and this has given rise to the current (traditional) blockchain data storage approaches.
However, the inventor realized that, with the current or traditional blockchain data storage approaches, the storage a blockchain node needs grows linearly as blocks are added while the node's storage space is limited. The accumulating block data gradually slows the response of the storage device, which in turn reduces the efficiency of the node device's highly concurrent consensus, verification, and read/write operations, so data storage efficiency is low.
Summary of the invention
According to various embodiments disclosed in this application, a blockchain data archiving storage method, apparatus, computer device and storage medium are provided.
A blockchain data archiving storage method, the method including:
periodically detecting data that meets preset archiving conditions in blockchain data to obtain data to be archived, the preset archiving conditions including a height value condition and an access frequency condition;
performing fragmentation processing on the data to be archived to obtain fragmented data;
identifying the data source type of the fragmented data, and acquiring the distributed storage engine corresponding to the data source type to obtain a target distributed storage engine; and
storing the fragmented data to a storage node in the target distributed storage engine.
In one of the embodiments, periodically detecting data that meets preset archiving conditions in the blockchain data to obtain the data to be archived includes:
counting, at each cycle time, the total amount of data between the data with the lowest height value and the data with the highest height value in the blockchain node, the cycle time being less than the time required to generate a data set that meets the preset total amount;
when the total amount of data is greater than the preset total amount, sequentially acquiring, starting from the lowest height value, a data set that meets the preset total amount;
detecting the access frequency of each piece of data in the data set; and
when the access frequency of every piece of data in the data set is less than the preset access frequency, determining the data in the data set as the data to be archived.
In one of the embodiments, performing fragmentation processing on the data to be archived to obtain fragmented data includes: dividing the data to be archived into at least two groups to obtain at least two sets of grouped data, and adding each set of grouped data to at least two data fragments to obtain the fragmented data.
In one of the embodiments, the distributed storage engines include a private storage engine, a consortium (alliance) storage engine, and a public storage engine, and identifying the data source type of the fragmented data and acquiring the distributed storage engine corresponding to the data source type includes:
when it is identified that the fragmented data comes from a private blockchain, modifying the entry parameters corresponding to the general private storage engine to obtain the private storage engine;
when it is identified that the fragmented data comes, based on a smart contract, from a consortium blockchain, modifying the entry parameters corresponding to the general consortium storage engine to obtain the consortium storage engine; and
when it is identified that the fragmented data comes from a public blockchain, modifying the entry parameters corresponding to the general public storage engine to obtain the target public storage engine.
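The engine selection described above can be pictured as a lookup followed by a parameter override. The following is only a minimal sketch under stated assumptions: the `StorageEngine` class, the `GENERAL_ENGINES` table, and every entry-parameter name and value are hypothetical, since the application does not say what the entry parameters are.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StorageEngine:
    kind: str            # "private", "consortium" or "public"
    entry_params: dict   # hypothetical entry parameters of the engine


# Hypothetical general-purpose engines, one per data source type.
GENERAL_ENGINES = {
    "private": StorageEngine("private", {"replicas": 2}),
    "consortium": StorageEngine("consortium", {"replicas": 3}),
    "public": StorageEngine("public", {"replicas": 5}),
}


def get_target_engine(source_type: str, overrides: dict) -> StorageEngine:
    """Pick the general engine for the source type and apply modified entry parameters."""
    try:
        general = GENERAL_ENGINES[source_type]
    except KeyError:
        raise ValueError(f"unknown data source type: {source_type!r}") from None
    # Build a new engine object so the general engine itself stays unchanged.
    return replace(general, entry_params={**general.entry_params, **overrides})


engine = get_target_engine("consortium", {"endpoint": "storage.example"})
print(engine.kind, engine.entry_params)
```

Returning a modified copy rather than mutating the general engine keeps one general engine reusable for several chains of the same type.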
In one of the embodiments, storing the fragmented data to a storage node in the target distributed storage engine includes:
calculating the hash value of the fragmented data according to a hash algorithm; and
storing the fragmented data to the storage node pointed to by the hash value in the target distributed storage engine.
In one of the embodiments, storing the fragmented data to a storage node in the target distributed storage engine includes: identifying the data volume or data importance level identifier of the fragmented data, and allocating the fragmented data to a storage node in the target distributed storage engine according to the data volume or data importance level identifier.
In one of the embodiments, storing the fragmented data to a storage node in the target distributed storage engine includes:
evaluating and classifying each storage node to determine the storage performance of each storage node;
performing weight distribution evaluation on the fragmented data to determine the storage requirements of the fragmented data; and
storing the fragmented data to a storage node in the target distributed storage engine according to the storage performance of each storage node and the storage requirements of each piece of fragmented data.
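The application does not specify how storage performance and storage requirements are scored or matched. As one possible reading, the sketch below assumes each node has a numeric performance score and each fragment a numeric weight (both hypothetical), and sends the most demanding fragments to the best-performing nodes in turn.

```python
def assign_by_performance(node_scores: dict, fragment_weights: dict) -> dict:
    """Map fragment id -> node id, pairing high-weight fragments with high-score nodes."""
    if not node_scores:
        raise ValueError("at least one storage node is required")
    # Rank nodes by (assumed) performance score, best first.
    nodes = sorted(node_scores, key=node_scores.get, reverse=True)
    # Rank fragments by (assumed) storage-requirement weight, most demanding first.
    fragments = sorted(fragment_weights, key=fragment_weights.get, reverse=True)
    # Walk the ranked fragments and cycle through the ranked nodes.
    return {frag: nodes[i % len(nodes)] for i, frag in enumerate(fragments)}


placement = assign_by_performance(
    {"ssd-node": 0.9, "hdd-node": 0.4},
    {"f1": 0.2, "f2": 0.8, "f3": 0.5},
)
print(placement)
```

Cycling through the ranked nodes is a deliberately simple policy; a real system would probably also take remaining capacity into account.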
In one of the embodiments, the data fragment where the fragmented data is located includes multiple data groups; after storing the fragmented data to a storage node in the target distributed storage engine, the method further includes:
calculating the hash value of each data group;
constructing, according to the hash value of each data group, a Merkle tree corresponding to the data fragment where each data group is located; and
recording the correspondence between each Merkle tree and the storage node where each data fragment is located.
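The three steps above can be sketched as follows. This is an illustration under assumptions, not the application's implementation: SHA-256 is chosen arbitrarily, an odd level is padded by duplicating its last hash (a common convention), and the `tree_to_node` dictionary and `record_fragment` helper are hypothetical names.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(groups: list) -> bytes:
    """Compute the Merkle root over the data groups of one data fragment."""
    if not groups:
        raise ValueError("a data fragment must contain at least one data group")
    level = [sha256(g) for g in groups]       # leaf level: hash of each data group
    while len(level) > 1:
        if len(level) % 2:                    # odd count: duplicate the last hash
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical record of which storage node holds the fragment behind each tree.
tree_to_node = {}


def record_fragment(fragment_groups: list, node_id: str) -> str:
    root = merkle_root(fragment_groups).hex()
    tree_to_node[root] = node_id
    return root


root = record_fragment([b"group-1", b"group-2", b"group-3"], "node-a")
print(root, "->", tree_to_node[root])
```

Recomputing the root from a stored fragment and comparing it with the recorded root is one way to detect that a data group on that node has been altered.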
In one of the embodiments, the method further includes:
performing capacity detection on each storage node; and
when the capacity of each storage node reaches the preset storage capacity, sending a storage node expansion request.
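A minimal sketch of this capacity check, assuming that usage is reported per node in the same unit as the preset capacity and that the expansion request is passed to a caller-supplied `send_request` callback; the request format is hypothetical.

```python
def needs_expansion(used: dict, preset_capacity: int) -> bool:
    """True only when every storage node has reached the preset storage capacity."""
    return bool(used) and all(u >= preset_capacity for u in used.values())


def check_capacity(used: dict, preset_capacity: int, send_request) -> bool:
    """Run one capacity check and send an expansion request if it is needed."""
    if needs_expansion(used, preset_capacity):
        # Hypothetical request payload; the application does not define its format.
        send_request({"type": "storage_node_expansion", "nodes": sorted(used)})
        return True
    return False


requests = []
check_capacity({"n1": 95, "n2": 90}, preset_capacity=90, send_request=requests.append)
print(requests)
```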
A blockchain data archiving storage apparatus, the apparatus including:
a data detection module, configured to periodically detect data that meets preset archiving conditions in blockchain data to obtain data to be archived, the preset archiving conditions including a height value condition and an access frequency condition;
a data fragmentation module, configured to perform fragmentation processing on the data to be archived to obtain fragmented data;
a distributed storage engine acquisition module, configured to identify the data source type of the fragmented data and acquire the distributed storage engine corresponding to the data source type to obtain a target distributed storage engine; and
a data storage module, configured to store the fragmented data to a storage node in the target distributed storage engine.
A computer device, including a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processor, cause the one or more processors to perform the following steps:
periodically detecting data that meets preset archiving conditions in blockchain data to obtain data to be archived, the preset archiving conditions including a height value condition and an access frequency condition;
performing fragmentation processing on the data to be archived to obtain fragmented data;
identifying the data source type of the fragmented data, and acquiring the distributed storage engine corresponding to the data source type to obtain a target distributed storage engine; and
storing the fragmented data to a storage node in the target distributed storage engine.
One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
periodically detecting data that meets preset archiving conditions in blockchain data to obtain data to be archived, the preset archiving conditions including a height value condition and an access frequency condition;
performing fragmentation processing on the data to be archived to obtain fragmented data;
identifying the data source type of the fragmented data, and acquiring the distributed storage engine corresponding to the data source type to obtain a target distributed storage engine; and
storing the fragmented data to a storage node in the target distributed storage engine.
The details of one or more embodiments of this application are set forth in the following drawings and description. Other features and advantages of this application will become apparent from the description, the drawings, and the claims.
Description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative work.
FIG. 1 is an application environment diagram of a blockchain data archiving storage method according to one or more embodiments.
FIG. 2 is a schematic flowchart of a blockchain data archiving storage method according to one or more embodiments.
FIG. 3 is a schematic flowchart of blockchain data archiving storage according to one or more embodiments.
FIG. 4 is a schematic diagram of a hash ring in an embodiment.
FIG. 5 is a block diagram of a blockchain data archiving storage apparatus according to one or more embodiments.
FIG. 6 is a block diagram of a blockchain data archiving storage apparatus according to one or more embodiments.
FIG. 7 is a block diagram of a computer device according to one or more embodiments.
Detailed description
To make the technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
The blockchain data archiving storage method provided in this application can be applied to the application environment shown in FIG. 1. The blockchain network includes multiple blockchain nodes 102; the blockchain nodes 102, the control system 104, and the distributed storage system 106 communicate with one another pairwise over the network. The blockchain node 102 and the control system 104 may be different computer devices, including but not limited to servers, desktop computers, smartphones, tablet computers, and smart wearable devices. Each computer device, as a node device of the blockchain, can participate in database recording, and data can be synchronized quickly between the computer devices. Each blockchain node 102 has a corresponding control system 104. The distributed storage system 106 may include multiple storage devices that are brought together by an application or software to jointly provide data storage and access functions externally. The distributed storage system 106 can provide blockchain block data storage services for at least one blockchain node. Taking a control system 104 made up of multiple servers as an example: the server periodically detects data that meets preset archiving conditions in the blockchain data to obtain the data to be archived, performs fragmentation processing on the data to be archived to obtain fragmented data, identifies the data source type of the fragmented data, acquires the distributed storage engine corresponding to the data source type to obtain the target distributed storage engine, and stores the fragmented data to a storage node in the target distributed storage engine. This scheme archives the blockchain data and, through distributed storage, also expands the blockchain data storage space, meeting the storage demand created by the linear growth of blockchain data. The server may be implemented as an independent server or as a server cluster composed of multiple servers.
In one of the embodiments, as shown in FIG. 2, a blockchain data archiving storage method is provided. Taking the method as applied to the server in the control system 104 in FIG. 1 as an example, it includes the following steps:
Step 200: periodically detect data that meets preset archiving conditions in the blockchain data to obtain data to be archived. The preset archiving conditions include a height value condition and an access frequency condition.
Blockchain, also known as distributed ledger technology, is a technology in which several computer devices jointly participate in bookkeeping and jointly maintain a complete distributed database (which can be regarded as a ledger). Blockchain data refers to data bound into block files (which can be regarded as ledger pages) in chronological order and strung together into a chain. Specifically, each "block" of blockchain data is like a page in a ledger: each page records several data transaction records, and binding the pages together in chronological order forms a complete ledger, that is, the distributed database. A user can initiate a data transaction request through one of the terminal blockchain nodes 102 and submit the data to be processed; after receiving the data processing request, that blockchain node processes the data accordingly, for example by performing consensus processing and storage with other blockchain nodes. Data archiving is the process of moving data that is no longer frequently used to a separate storage device for long-term preservation. The height value and access frequency of block data in the blockchain directly reflect whether the block data of a blockchain node is frequently used data (hot data) or data that has long been unused (cold data). Therefore, in this embodiment, the height value condition and the access frequency condition are used as the preset archiving conditions for detecting data to be archived in the blockchain data. Specifically, the blockchain data is checked at each cycle time for data that meets the preset archiving conditions, and the data that meets them is determined as the data to be archived.
Step 400: perform fragmentation processing on the data to be archived to obtain fragmented data.
After the data to be archived is obtained, it can be divided into multiple (at least two) data groups to obtain at least two sets of grouped data, and each set of grouped data is added to at least two data fragments to obtain the fragmented data. This ensures that all grouped data (the data to be archived) exists in at least two data fragments, while no single data fragment includes all of the grouped data. Specifically, the data to be archived can be grouped by equal division, giving groups with the same amount of data, or by random division, giving groups with different amounts of data; each resulting data group is then added to at least two data fragments. When one data fragment is maliciously damaged, the complete data is not lost, and the other data fragments can be used to repair the file.
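The grouping and fragment-building rule above can be sketched as follows. The placement scheme is an assumption chosen for illustration: group i is copied into `copies` consecutive fragments, and the number of groups is required to be at least the number of fragments so that no fragment ends up holding every group. The function names and default values are hypothetical.

```python
def split_into_groups(items: list, num_groups: int) -> list:
    """Evenly divide items into contiguous groups whose sizes differ by at most one."""
    if not 2 <= num_groups <= len(items):
        raise ValueError("need at least two groups and at least one item per group")
    size, extra = divmod(len(items), num_groups)
    groups, start = [], 0
    for i in range(num_groups):
        end = start + size + (1 if i < extra else 0)
        groups.append(items[start:end])
        start = end
    return groups


def build_fragments(groups: list, num_fragments: int = 3, copies: int = 2) -> list:
    """Place each group in `copies` fragments; returns a list of {group_id: group}."""
    if not 2 <= copies < num_fragments <= len(groups):
        raise ValueError("require 2 <= copies < num_fragments <= number of groups")
    fragments = [dict() for _ in range(num_fragments)]
    for gid, group in enumerate(groups):
        for k in range(copies):
            # Consecutive fragments (wrapping around) each receive a copy of the group.
            fragments[(gid + k) % num_fragments][gid] = group
    return fragments


blocks = [f"H{i}" for i in range(10)]          # stand-ins for blocks to be archived
groups = split_into_groups(blocks, 4)
for index, fragment in enumerate(build_fragments(groups)):
    print(index, sorted(fragment))
```

Because every group lives in two fragments, losing any single fragment still leaves a copy of each group elsewhere, which is the repair property the paragraph describes.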
Step 600: identify the data source type of the fragmented data, and acquire the distributed storage engine corresponding to the data source type to obtain the target distributed storage engine.
A storage engine, in the MySQL sense (MySQL being a relational database queried with SQL, Structured Query Language), stores data in files or in memory using one of several different techniques. Each technique uses different storage mechanisms, indexing techniques, and locking levels, and ultimately offers a different range of functions and capabilities; choosing a different technique can yield additional speed or functionality and so improve the application as a whole. Blockchain data is divided by data source type into public chain data, private chain data, and consortium chain data. Specifically, the data source type of the blockchain data can be identified and the distributed storage engine corresponding to that type acquired, giving the target distributed storage engine.
Step 800: store the fragmented data to a storage node in the target distributed storage engine.
In this embodiment, the blockchain data is stored in a distributed manner: it is spread across multiple independent devices, using a scalable system structure in which multiple storage servers share the storage load. This improves the reliability, availability, and access efficiency of the system and also makes it easy to expand. A storage node is an independent device (storage server) used to store data. After the corresponding distributed storage engine is acquired, the fragmented data is stored to a storage node in the target distributed storage engine. Specifically, the fragmented data may be stored to the storage node pointed to by its hash value. Alternatively, the data volume or data importance level identifier of the fragmented data may be identified, and the fragmented data allocated to the corresponding storage nodes in the target distributed storage engine according to that data volume or identifier.
With the above blockchain data archiving storage method, periodically detecting the data to be archived and fragmenting it avoids data redundancy and reduces the computing load; the storage method can be chosen flexibly according to the source of the fragmented data, preserving the security and availability of the data; and allocating the fragmented data to storage nodes in the corresponding distributed storage engine saves node storage space, keeps the storage space in good condition, keeps the data highly available, and improves data processing efficiency.
如图3所示,在其中一个实施例中,周期性检测区块链数据中满足预设归档条件的数据,得到待归档数据包括:步骤220,按照周期时间统计区块链节点中最低高度值的数据与最高高度值的数据之间的数据总量,周期时间小于生成满足预设总量的数据集的所需时间,当数据总量大于预设总量时,则从最低高度值依次获取满足预设总量的数据集,检测数据集中每个数据的访问频率,当数据集中每个数据均小于预设访问频率时,则将数据集中的数据确定为待归档数据。As shown in FIG. 3, in one of the embodiments, periodically detecting data that meets preset archiving conditions in the blockchain data to obtain the data to be archived includes: Step 220: Counting the lowest height value in the blockchain node according to the cycle time The total amount of data between the data and the data with the highest height value, the cycle time is less than the time required to generate a data set that meets the preset total amount. When the total amount of data is greater than the preset total amount, the lowest height value is obtained in turn For data sets that meet the preset total amount, the access frequency of each data in the data set is detected, and when each data in the data set is less than the preset access frequency, the data in the data set is determined as the data to be archived.
In a specific implementation, the number of items stored locally on the blockchain node is checked, that is, the total amount of data (the total number of items) between the data with the lowest height value and the data with the highest height value. If the total amount of data is greater than the preset total amount, a batch of the preset total amount is obtained starting from the data with the lowest height value. If the access frequency of every item in that batch is below the preset access frequency, the batch is determined to be data that meets the archiving condition. The cycle time is shorter than the time needed to generate data of the preset total amount. The preset total amount may be n items and can be set freely according to the data processing capability of the server; the time required to generate the preset total amount of data can be determined from the data traffic during business processing. Only when the detection cycle time is set shorter than the generation time of the preset total amount can the data that already meets the preset total amount be archived before another preset total amount of data is generated, which avoids the accumulation of data that meets the archiving condition. For example, if the lowest height value of the locally stored data is H0 and the highest height value is H31, the total amount of locally stored data is H31-H0+1=32 items, that is, there are 32 items between the lowest and the highest height value. When the preset total amount is 20, the total amount of locally stored data (32 items) is greater than the preset total amount (20 items), so 20 items are obtained starting from the data with the lowest height value, that is, the data with height values H0-H19. Further, a list is used to record the data whose access frequency has exceeded the preset access frequency within a recent period, together with the corresponding height values. With a preset access frequency p, the data whose access frequency exceeds p may be {X1,...,X30}; when none of the data H0-H19 is in {X1,...,X30}, the data with height values H0-H19 is determined to be the data to be archived. In this embodiment, setting the detection cycle time shorter than the time required to generate the preset total amount of data allows the data blocks that already meet the preset total amount to be archived before another preset total amount is generated, which avoids the accumulation of local data awaiting archiving and the resulting data redundancy.
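The detection rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the applicant's implementation: the function name `select_archive_candidates`, the dictionary that maps height values to access frequencies, and the choice to treat unlisted heights as never accessed are all assumptions made for the example.

```python
from typing import Dict, List


def select_archive_candidates(
    lowest_height: int,
    highest_height: int,
    preset_total: int,
    access_frequency: Dict[int, float],
    preset_frequency: float,
) -> List[int]:
    """Return the height values to archive, or an empty list if none qualify."""
    # Heights are inclusive on both ends.
    total = highest_height - lowest_height + 1
    if total <= preset_total:
        return []
    # Take the oldest `preset_total` items, starting at the lowest height.
    candidates = list(range(lowest_height, lowest_height + preset_total))
    # Assumption: heights missing from the map have not been accessed recently.
    if all(access_frequency.get(h, 0.0) < preset_frequency for h in candidates):
        return candidates
    return []


# Heights H0..H31 are stored locally, the preset total is 20,
# and only heights 25 and 30 were accessed often in the recent period.
hot = {25: 9.0, 30: 12.0}
batch = select_archive_candidates(0, 31, 20, hot, preset_frequency=5.0)
print(batch[0], batch[-1], len(batch))
```

In a deployment this check would run on a timer whose period is shorter than the time needed to produce another `preset_total` items, as the embodiment requires.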
As shown in FIG. 3, in one embodiment, performing fragmentation processing on the data to be archived to obtain the fragmented data includes step 420: performing fragmentation processing on the data to be archived using a consistent hashing algorithm to obtain the fragmented data.
Taking the consistent hashing (Hash) algorithm as an example, data fragmentation can be implemented with the ring-cutting variant of consistent hashing: the hash ring is cut into fragments of equal size, and these fragments are then assigned to different storage nodes. Specifically, the corresponding key (for example K1 in FIG. 4) is hashed into the circular hash space: a hash value is computed for the grouped data with a specific hash function, and the hash value is mapped onto the hash ring (the ring shown in FIG. 4) to form the fragmented data. When a storage node (such as storage nodes a, b and c in FIG. 4) exits, the fragments it was responsible for do not need to be merged clockwise and handed to the next storage node; instead, each fragment can be handed over as a whole to any storage node, which is more flexible. In practice, a fragment usually serves as the smallest unit of data migration and backup. It can be understood that, in other embodiments, fragmentation may also be performed by round-robin placement, range partitioning or similar methods. In this solution, the data groups form at least three data fragments. Each data fragment consists of part of the grouped data (it need not include all of it), and the number of data groups in each data fragment may be the same or different. It must also be ensured that each data group is added to at least two data fragments, so that at least two stored copies exist. Even if one storage node in the storage system is attacked, the complete data is not leaked, and lost data can be repaired from the copies held by other storage nodes. Fragmenting the data improves processing speed and throughput and prevents blocking caused by an excessive amount of data. Consistent hashing also handles the stability problem well, because all storage nodes can be arranged on a hash ring whose ends are joined.
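A minimal Python sketch of this ring-cutting scheme follows. It is an illustration under stated assumptions rather than the method of the application: the ring size of 2^32, the use of SHA-256, and the rule of copying each data group into the fragment it hashes to plus the next fragment on the ring are choices made here, and the names `ring_position`, `fragment_index` and `build_fragments` are hypothetical. The sketch guarantees that every group lands in at least two fragments, but it does not by itself guarantee that no fragment ends up holding every group.

```python
import hashlib
from typing import Dict, List

RING_SIZE = 2 ** 32  # size of the circular hash space (assumed)


def ring_position(key: str) -> int:
    """Hash a key onto the ring, giving a position in [0, RING_SIZE)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def fragment_index(key: str, num_fragments: int) -> int:
    """Index of the equal-sized ring segment that contains the key."""
    return ring_position(key) * num_fragments // RING_SIZE


def build_fragments(
    group_keys: List[str], num_fragments: int = 3, copies: int = 2
) -> Dict[int, List[str]]:
    """Place every data group into `copies` consecutive fragments on the ring."""
    if num_fragments < 3 or not 2 <= copies < num_fragments:
        raise ValueError("need at least 3 fragments and 2 <= copies < fragments")
    fragments: Dict[int, List[str]] = {i: [] for i in range(num_fragments)}
    for key in group_keys:
        first = fragment_index(key, num_fragments)
        for offset in range(copies):
            fragments[(first + offset) % num_fragments].append(key)
    return fragments


groups = [f"group-{i}" for i in range(10)]
fragments = build_fragments(groups)
for index, members in fragments.items():
    print(index, len(members))
```

Because each fragment is a fixed segment of the ring, a fragment can be reassigned to any storage node as a unit when a node leaves, without recomputing where the individual groups belong.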
As shown in FIG. 3, in one embodiment, the distributed storage engines include a public storage engine, a consortium storage engine and a private storage engine, and identifying the data source type of the fragmented data and obtaining the distributed storage engine corresponding to the data source type to obtain the target distributed storage engine includes step 620: when the fragmented data is identified as originating from a private blockchain, modifying the entry parameters of the generic private storage engine to obtain the private storage engine; when the fragmented data is identified as originating, based on a smart contract, from a consortium blockchain, modifying the entry parameters of the generic consortium storage engine to obtain the consortium storage engine; and when the fragmented data is identified as originating from a public blockchain, modifying the entry parameters of the generic public storage engine to obtain the public storage engine.
The distributed storage engines include a public storage engine, a consortium storage engine and a private storage engine. The data source types may include a private blockchain (private chain), a consortium blockchain (consortium chain) and a public blockchain (public chain). The private, consortium and public storage engines are the storage engines used to store data from the private blockchain, the consortium blockchain and the public blockchain, respectively. After the data to be archived has been fragmented and the fragmented data obtained, the data source type of the fragmented data is identified and the distributed storage engine corresponding to that data source type is obtained. Specifically, when the fragmented data is identified as originating from a private blockchain, the entry parameters of the generic private storage engine are modified to obtain the private storage engine, which stores data originating from the private blockchain for the owner's own use. When the fragmented data is identified as originating, based on a smart contract, from a consortium blockchain, the entry parameters of the generic consortium storage engine are modified to obtain the consortium storage engine, which stores data originating from the consortium blockchain for use by specific parties. When the fragmented data is identified as originating from a public blockchain, the entry parameters of the generic public storage engine are modified to obtain the public storage engine, which stores data originating from the public blockchain for use by multiple parties. Alternatively, a parameter modification request may be received, and a storage engine corresponding to the target modification parameters carried in the request may be built. In this embodiment, the storage mode is selected flexibly according to the data source type of the fragmented data, which ensures the security and availability of the data.
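The selection logic can be illustrated with a small dispatch table. The sketch below is hypothetical throughout: the application does not say what the "entry parameters" of a generic engine are, so the `StorageEngine` fields `audience` and `endpoint`, the `GENERIC_ENGINES` table and the function `get_target_engine` are invented purely to show the shape of the step.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Dict


class ChainType(Enum):
    PRIVATE = "private"
    CONSORTIUM = "consortium"
    PUBLIC = "public"


@dataclass(frozen=True)
class StorageEngine:
    kind: ChainType
    audience: str       # who may use the stored data
    endpoint: str = ""  # stand-in for an entry parameter (assumed)


# One generic engine per data source type; entry parameters are still unset.
GENERIC_ENGINES: Dict[ChainType, StorageEngine] = {
    ChainType.PRIVATE: StorageEngine(ChainType.PRIVATE, "owner only"),
    ChainType.CONSORTIUM: StorageEngine(ChainType.CONSORTIUM, "specific parties"),
    ChainType.PUBLIC: StorageEngine(ChainType.PUBLIC, "multiple parties"),
}


def get_target_engine(source: ChainType, **entry_params: str) -> StorageEngine:
    """Copy the generic engine for `source` with modified entry parameters."""
    return replace(GENERIC_ENGINES[source], **entry_params)


engine = get_target_engine(ChainType.CONSORTIUM, endpoint="archive-cluster-a")
print(engine.kind.value, engine.audience, engine.endpoint)
```

Using `dataclasses.replace` on a frozen dataclass keeps the generic templates unchanged, so each request obtains its own configured engine.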
As shown in FIG. 3, in one embodiment, storing the fragmented data to a storage node in the target distributed storage engine includes step 820: calculating the hash value of the fragmented data according to a hash algorithm, and storing the fragmented data to the storage node in the target distributed storage engine that the hash value points to.
After the corresponding distributed storage engine is obtained, the hash value of each piece of fragmented data can be calculated according to a hash algorithm. The hash value points to the storage address of the fragmented data, and the fragmented data is stored to the storage node in the target distributed storage engine that the hash value points to. Taking the private storage engine as an example, the fragmented data is stored, according to the storage address its hash value points to, in the storage node of the private storage engine that corresponds to that address. In other embodiments, importance levels may be preset for the fragmented data and the importance level identifier of each piece of fragmented data identified; when the importance level of a piece of fragmented data is high, a storage node with strong computing power and high speed is selected to store it. Alternatively, a data volume threshold may be preset and the data volume of each piece of fragmented data identified; when the data volume of a piece of fragmented data is above the preset threshold, it is allocated to a storage node with a large capacity. In this embodiment, storage nodes are allocated to the fragmented data according to its hash value, which makes it easy to record the correspondence between the allocated data and the storage nodes.
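As a hedged illustration of hash-directed placement, the sketch below maps each fragment's SHA-256 digest to a node with a modulo operation and records the result. The application only states that the hash value points to a storage address; the modulo rule, the node names and the function `place_fragment` are assumptions made for this example.

```python
import hashlib
from typing import Dict, List, Tuple


def place_fragment(fragment: bytes, nodes: List[str]) -> Tuple[str, str]:
    """Return (hex digest, node name) for one fragment."""
    if not nodes:
        raise ValueError("at least one storage node is required")
    digest = hashlib.sha256(fragment).hexdigest()
    # Assumed addressing rule: the digest, read as an integer, selects the node.
    node = nodes[int(digest, 16) % len(nodes)]
    return digest, node


nodes = ["node-a", "node-b", "node-c"]
placement: Dict[str, str] = {}  # digest -> node, the recorded correspondence
for fragment in (b"fragment-0", b"fragment-1", b"fragment-2"):
    digest, node = place_fragment(fragment, nodes)
    placement[digest] = node

for digest, node in placement.items():
    print(digest[:12], node)
```

A plain modulo rule moves most fragments when the number of nodes changes, so a ring-based mapping would be the more natural choice where nodes join and leave often.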
In one embodiment, storing the fragmented data to a storage node in the target distributed storage engine includes step 840: evaluating and classifying the storage nodes to determine the storage performance of each storage node; performing a weight assignment evaluation on the fragmented data to determine the storage requirements of the fragmented data; and storing the fragmented data to a storage node in the target distributed storage engine according to the storage performance of each storage node and the storage requirements of each piece of fragmented data.
The fragmented data may also be stored by evaluating and classifying the storage nodes in advance. Specifically, each storage node is evaluated according to its capacity, speed, computing power and the like to determine its storage performance. At the same time, a weight assignment evaluation is performed on the importance, data volume, type and the like of the fragmented data to determine its storage requirements. The candidate nodes and the fragmented data are then matched like for like according to the storage performance of each candidate node and the storage requirements of each piece of fragmented data, and the fragmented data is stored to a suitable storage node in the target distributed storage engine. The storage nodes work independently of one another. Specifically, the confidentiality level, data volume, data type and the like of the fragmented data may be identified, and the storage nodes may be numbered according to their computing power, speed, capacity and other performance characteristics. For example, the storage nodes with the strongest computing power, the highest speed or the largest capacity are numbered 1, 2, 3…m in order, and important data, large-volume data or highly confidential data is sent in order to the storage nodes numbered 1, 2, 3…m for storage. In this embodiment, fragmenting the data and sending it to the corresponding storage nodes achieves high data throughput and reduces the technical cost of the equipment.
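The like-for-like matching can be sketched as two rankings that are paired position by position. The scoring below is an assumption made for illustration: the application names the criteria (capacity, speed and computing power for nodes; importance, data volume and confidentiality for fragments) but not how they are weighted, so this sketch takes every criterion as a score between 0 and 1 and weights them equally. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StorageNode:
    name: str
    capacity: float  # each criterion is a normalized score in [0, 1] (assumed)
    speed: float
    compute: float


@dataclass
class Fragment:
    fragment_id: str
    importance: float
    size: float
    confidentiality: float


def performance(node: StorageNode) -> float:
    """Storage performance of a node, with equal weights (assumed)."""
    return node.capacity + node.speed + node.compute


def demand(fragment: Fragment) -> float:
    """Storage requirement of a fragment, with equal weights (assumed)."""
    return fragment.importance + fragment.size + fragment.confidentiality


def match(fragments: List[Fragment], nodes: List[StorageNode]) -> Dict[str, str]:
    """Pair the most demanding fragments with the best-performing nodes."""
    if not nodes:
        raise ValueError("at least one storage node is required")
    ranked_nodes = sorted(nodes, key=performance, reverse=True)      # numbered 1..m
    ranked_fragments = sorted(fragments, key=demand, reverse=True)
    return {
        fragment.fragment_id: ranked_nodes[i % len(ranked_nodes)].name
        for i, fragment in enumerate(ranked_fragments)
    }


nodes = [StorageNode("a", 0.2, 0.2, 0.2), StorageNode("b", 0.9, 0.9, 0.9)]
fragments = [Fragment("s1", 0.1, 0.1, 0.1), Fragment("s2", 0.8, 0.9, 0.7)]
assignment = match(fragments, nodes)
print(assignment)
```

When there are more fragments than nodes, the sketch wraps around to the best node again; a real scheduler would also need to track the remaining capacity of each node.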
As shown in FIG. 3, in one embodiment, the data fragment in which the fragmented data is located includes a plurality of data groups, and after storing the fragmented data to a storage node in the target distributed storage engine, the method further includes step 900: calculating the hash value of each data group, constructing, from the hash values of the data groups, a Merkle tree corresponding to the data fragment in which the data groups are located, and recording the correspondence between each Merkle tree and the storage node in which each data fragment is located.
A Merkle tree, also called a hash tree, is a tree-shaped data structure in which each leaf node stores the hash value of a data block. The data fragment in which the fragmented data is located includes a plurality of data groups. After the hash value of each data group has been calculated, a Merkle tree corresponding to the data fragment that contains those data groups can be built from the hash values. For ease of understanding it is called a fragment Merkle tree, and each data fragment corresponds to one fragment Merkle tree. After the fragment Merkle tree of each data fragment has been obtained, the correspondence between each fragment Merkle tree and the storage node in which each data fragment is located can be recorded. In this embodiment, the Merkle tree makes it possible to look up and verify the data, which improves the privacy and security of the stored data.
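A fragment Merkle tree of this kind can be sketched as follows. The sketch computes only the root and records which storage node holds the fragment. The use of SHA-256 and the rule of duplicating the last hash on a level with an odd number of entries are assumptions (the latter is a common convention, not something the application specifies), and `merkle_root` and the node name are hypothetical.

```python
import hashlib
from typing import Dict, List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(groups: List[bytes]) -> bytes:
    """Root hash of the Merkle tree whose leaves are the hashes of the data groups."""
    if not groups:
        raise ValueError("a fragment must contain at least one data group")
    level = [sha256(group) for group in groups]  # leaf nodes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention: duplicate the last hash
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Record which storage node holds the fragment behind each Merkle root.
fragment_groups = [b"group-0", b"group-1", b"group-2"]
root = merkle_root(fragment_groups)
root_to_node: Dict[str, str] = {root.hex(): "node-a"}
print(root.hex()[:16], root_to_node[root.hex()])
```

To verify a fragment later, the root is recomputed from the data groups returned by the storage node and compared with the recorded root; any altered group changes the result.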
As shown in FIG. 3, in one embodiment, the method further includes step 950: performing capacity detection on each storage node, and sending a storage node expansion request when the capacity of a storage node reaches a preset storage capacity.
A specific implementation may further include detecting the capacity of each storage node at each cycle time and sending a storage node expansion request when the capacity of a storage node reaches the preset storage capacity. Specifically, the storage node expansion request may be sent to the master node among the storage nodes; after receiving the request, the master node adds further storage nodes to expand the cluster. The master node is able to manage certain changes in the cluster, such as creating or deleting an index and adding or removing other nodes. Alternatively, the storage node expansion request may be sent to a management terminal; after receiving the request, the management terminal adds further storage nodes to expand the storage capacity. In this embodiment, periodically detecting the capacity of the storage nodes and expanding the storage space avoids the data loss that insufficient storage space on a storage node would cause.
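The capacity check itself is simple, as the following sketch suggests. The request format is invented for illustration: the application does not define the fields of a storage node expansion request, and the names `nodes_needing_expansion` and `build_expansion_request` as well as the idea of expressing usage as a fraction of capacity are assumptions.

```python
from typing import Dict, List, Optional


def nodes_needing_expansion(usage: Dict[str, float], preset_capacity: float) -> List[str]:
    """Names of the storage nodes whose used capacity has reached the preset level."""
    return sorted(name for name, used in usage.items() if used >= preset_capacity)


def build_expansion_request(usage: Dict[str, float], preset_capacity: float) -> Optional[dict]:
    """Return a request for the master node or management terminal, or None."""
    full_nodes = nodes_needing_expansion(usage, preset_capacity)
    if not full_nodes:
        return None
    # Hypothetical request layout; the real message format is not specified.
    return {"type": "storage_node_expansion", "full_nodes": full_nodes}


# Used capacity as a fraction of total capacity (assumed unit).
usage = {"node-a": 0.95, "node-b": 0.40, "node-c": 0.90}
request = build_expansion_request(usage, preset_capacity=0.90)
print(request)
```

The function would be called once per detection cycle, and the returned request forwarded to whichever party is responsible for adding nodes.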
It should be understood that although the steps in the flowcharts of FIGS. 2-3 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in that sequence. Unless explicitly stated herein, there is no strict restriction on the order of execution, and the steps may be executed in other orders. Moreover, at least some of the steps in FIGS. 2-3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and they are not necessarily executed one after another; they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 5, a blockchain data archiving storage apparatus is provided, including a data detection module 410, a data fragmentation module 420, a distributed storage engine acquisition module 430 and a data storage module 440, wherein:
the data detection module 410 is configured to periodically detect data in the blockchain data that meets a preset archiving condition to obtain data to be archived, the preset archiving condition including a height value condition and an access frequency condition;
the data fragmentation module 420 is configured to perform fragmentation processing on the data to be archived to obtain fragmented data;
the distributed storage engine acquisition module 430 is configured to identify the data source type of the fragmented data and obtain the distributed storage engine corresponding to the data source type, to obtain a target distributed storage engine; and
the data storage module 440 is configured to store the fragmented data to a storage node in the target distributed storage engine.
As shown in FIG. 6, in one embodiment, the blockchain data archiving storage apparatus further includes a relationship recording module 450, configured to calculate the hash value of each data group, construct, from the hash values of the data groups, a Merkle tree corresponding to the data fragment in which the data groups are located, and record the correspondence between each Merkle tree and the storage node in which each data fragment is located.
As shown in FIG. 6, in one embodiment, the blockchain data archiving storage apparatus further includes a capacity detection module 460, configured to perform capacity detection on each storage node and to send a storage node expansion request when the capacity of a storage node reaches the preset storage capacity.
In one embodiment, the data detection module 410 is further configured to: when the total amount of data is greater than the preset total amount, obtain, in order starting from the lowest height value, a data set of the preset total amount; check the access frequency of each item of data in the data set; and when the access frequency of every item in the data set is below the preset access frequency, determine the data in the data set to be the data to be archived.
In one embodiment, the data fragmentation module 420 is further configured to divide the data to be archived into at least two groups to obtain at least two sets of grouped data, and to add each set of grouped data to at least two data fragments to obtain the fragmented data.
In one embodiment, the distributed storage engine acquisition module 430 is further configured to: when the fragmented data is identified as originating from a private blockchain, modify the entry parameters of the generic private storage engine to obtain the private storage engine; when the fragmented data is identified as originating, based on a smart contract, from a consortium blockchain, modify the entry parameters of the generic consortium storage engine to obtain the consortium storage engine; and when the fragmented data is identified as originating from a public blockchain, modify the entry parameters of the generic public storage engine to obtain the target public storage engine.
In one embodiment, the data storage module 440 is further configured to calculate the hash value of the fragmented data according to a hash algorithm and store the fragmented data to the storage node in the target distributed storage engine that the hash value points to.
In one embodiment, the data storage module 440 is further configured to identify the data volume or the importance level identifier of the fragmented data and, according to the data volume or the importance level identifier, allocate the fragmented data to a storage node in the target distributed storage engine.
In one embodiment, the data storage module 440 is further configured to evaluate and classify the storage nodes to determine the storage performance of each storage node, perform a weight assignment evaluation on the fragmented data to determine the storage requirements of the fragmented data, and store the fragmented data to a storage node in the target distributed storage engine according to the storage performance of each storage node and the storage requirements of each piece of fragmented data.
For the specific limitations of the blockchain data archiving storage apparatus, refer to the limitations of the blockchain data archiving storage method above, which are not repeated here. Each module of the above blockchain data archiving storage apparatus may be implemented in whole or in part by software, hardware or a combination of the two. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call them and perform the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 7. The computer device includes a processor, a memory, a network interface and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device is used to store the data to be archived in the blockchain and the like. The network interface of the computer device is used to communicate with an external terminal over a network connection. When executed by the processor, the computer-readable instructions implement a blockchain data archiving storage method.
Those skilled in the art can understand that the structure shown in FIG. 7 is only a block diagram of the part of the structure that is relevant to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
A computer device includes a memory and one or more processors. The memory stores computer-readable instructions which, when executed by the processor, implement the steps of the blockchain data archiving storage method provided in any embodiment of the present application.
One or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the blockchain data archiving storage method provided in any embodiment of the present application.
A person of ordinary skill in the art can understand that all or part of the processes of the methods in the above embodiments can be completed by computer-readable instructions that instruct the relevant hardware. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments is described. However, any combination of these technical features that involves no contradiction should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be understood as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art can make several variations and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of the patent of this application shall be subject to the appended claims.

Claims (20)

  1. A blockchain data archiving storage method, the method comprising:
    periodically detecting data in the blockchain data that meets a preset archiving condition to obtain data to be archived, the preset archiving condition comprising a height value condition and an access frequency condition;
    performing fragmentation processing on the data to be archived to obtain fragmented data;
    identifying the data source type of the fragmented data and obtaining the distributed storage engine corresponding to the data source type, to obtain a target distributed storage engine; and
    storing the fragmented data to a storage node in the target distributed storage engine.
  2. The blockchain data archiving storage method according to claim 1, wherein periodically detecting data in the blockchain data that meets the preset archiving condition to obtain the data to be archived comprises:
    at each cycle time, counting the total amount of data between the data with the lowest height value and the data with the highest height value in the blockchain node, the cycle time being shorter than the time required to generate a data set of the preset total amount;
    when the total amount of data is greater than the preset total amount, obtaining, in order starting from the lowest height value, a data set of the preset total amount;
    checking the access frequency of each item of data in the data set; and
    when the access frequency of every item in the data set is below the preset access frequency, determining the data in the data set to be the data to be archived.
  3. The blockchain data archiving storage method according to claim 1, wherein performing fragmentation processing on the data to be archived to obtain the fragmented data comprises:
    dividing the data to be archived into at least two groups to obtain at least two sets of grouped data; and
    adding each set of grouped data to at least two data fragments to obtain the fragmented data.
  4. The blockchain data archiving and storage method according to claim 1, wherein the distributed storage engine comprises a private storage engine, a consortium storage engine, and a public storage engine; and
    identifying the data source type of the fragmented data and acquiring the distributed storage engine corresponding to the data source type comprises:
    when the fragmented data is identified as originating from a private blockchain, modifying the entry parameters of a generic private storage engine to obtain the private storage engine;
    when the fragmented data is identified as originating, on the basis of a smart contract, from a consortium blockchain, modifying the entry parameters of a generic consortium storage engine to obtain the consortium storage engine; and
    when the fragmented data is identified as originating from a public blockchain, modifying the entry parameters of a generic public storage engine to obtain the public storage engine.
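The engine selection of claim 4 might be sketched as follows. The parameter names (`endpoint`, `access`), their default values, and the `StorageEngine` type are hypothetical; the claim states only that the entry parameters of a generic engine are modified to obtain the private, consortium (alliance), or public storage engine.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class SourceType(Enum):
    PRIVATE = "private"
    CONSORTIUM = "consortium"
    PUBLIC = "public"


@dataclass(frozen=True)
class StorageEngine:
    kind: SourceType
    entry_params: Dict[str, str]


# Entry parameters of the generic engines (placeholder values, assumed for illustration).
GENERIC_ENTRY_PARAMS = {
    SourceType.PRIVATE: {"endpoint": "generic-private", "access": "single-owner"},
    SourceType.CONSORTIUM: {"endpoint": "generic-consortium", "access": "members"},
    SourceType.PUBLIC: {"endpoint": "generic-public", "access": "open"},
}


def get_target_engine(source_type: SourceType, overrides: Dict[str, str]) -> StorageEngine:
    """Copy the generic engine's entry parameters and apply the modifications."""
    params = dict(GENERIC_ENTRY_PARAMS[source_type])  # copy so the generic template is untouched
    params.update(overrides)
    return StorageEngine(kind=source_type, entry_params=params)
```

Copying the template before updating it keeps the generic engines reusable for later fragments of the same source type.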
  5. The blockchain data archiving and storage method according to claim 1, wherein storing the fragmented data in a storage node of the target distributed storage engine comprises:
    calculating a hash value of the fragmented data according to a hash algorithm; and
    storing the fragmented data in the storage node, within the target distributed storage engine, to which the hash value points.
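A minimal sketch of the hash-based placement in claim 5 follows. The choice of SHA-256 and of a modulo mapping from hash value to node are assumptions; the claim names neither a specific hash algorithm nor how the hash value "points to" a node.

```python
import hashlib
from typing import List


def node_for_fragment(fragment: bytes, nodes: List[str]) -> str:
    """Map a fragment to a storage node by hashing its content."""
    if not nodes:
        raise ValueError("no storage nodes available")
    digest = hashlib.sha256(fragment).digest()
    # Interpret the digest as an integer and reduce it to a node index.
    index = int.from_bytes(digest, "big") % len(nodes)
    return nodes[index]
```

The mapping is deterministic, so the same fragment can later be located by recomputing its hash. A plain modulo remaps many fragments when the node list changes; consistent hashing is a common alternative in that situation.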
  6. The blockchain data archiving and storage method according to claim 1, wherein storing the fragmented data in a storage node of the target distributed storage engine comprises:
    identifying the data volume or the data importance level identifier of the fragmented data; and
    allocating the fragmented data to a storage node of the target distributed storage engine according to the data volume or the data importance level identifier.
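The allocation in claim 6 could look like the sketch below. The `Node` fields, the integer reliability tier, and the "most free space wins" tie-break are all assumptions; the claim only says that data volume or an importance level identifier drives the allocation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_bytes: int
    tier: int  # higher means more reliable (hypothetical scale)


def allocate(size: int, importance: Optional[int], nodes: List[Node]) -> Node:
    """Pick a node that fits the fragment and, if given, meets its importance level."""
    # Data volume: the node must have room for the fragment.
    candidates = [n for n in nodes if n.free_bytes >= size]
    # Importance level: restrict to nodes whose tier is at least the requested level.
    if importance is not None:
        candidates = [n for n in candidates if n.tier >= importance]
    if not candidates:
        raise LookupError("no storage node satisfies the request")
    # Assumed policy: prefer the candidate with the most free space.
    return max(candidates, key=lambda n: n.free_bytes)
```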
  7. The blockchain data archiving and storage method according to claim 1, wherein storing the fragmented data in a storage node of the target distributed storage engine comprises:
    evaluating and classifying the storage nodes to determine the storage performance of each storage node;
    performing a weight assignment evaluation on the fragmented data to determine the storage requirement of each piece of fragmented data; and
    storing the fragmented data in storage nodes of the target distributed storage engine according to the storage performance of each storage node and the storage requirement of each piece of fragmented data.
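A sketch of the matching step in claim 7, assuming that node evaluation and fragment weighting have already produced numeric scores. The scoring scales and the rank-based pairing (heaviest fragment to best-performing node, wrapping around when fragments outnumber nodes) are assumptions, not the claimed evaluation method.

```python
from typing import Dict


def match_fragments_to_nodes(node_scores: Dict[str, float],
                             fragment_weights: Dict[str, float]) -> Dict[str, str]:
    """Assign higher-weight fragments to higher-performance nodes."""
    if not node_scores:
        raise ValueError("no storage nodes available")
    # Rank nodes by storage performance and fragments by storage requirement.
    ranked_nodes = sorted(node_scores, key=node_scores.get, reverse=True)
    ranked_fragments = sorted(fragment_weights, key=fragment_weights.get, reverse=True)
    placement = {}
    for position, fragment_id in enumerate(ranked_fragments):
        # Wrap around the node ranking when there are more fragments than nodes.
        placement[fragment_id] = ranked_nodes[position % len(ranked_nodes)]
    return placement
```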
  8. The blockchain data archiving and storage method according to claim 1, wherein the data fragment containing the fragmented data comprises a plurality of data groups; and
    after storing the fragmented data in a storage node of the target distributed storage engine, the method further comprises:
    calculating a hash value of each data group;
    constructing, from the hash values of the data groups, a Merkle tree corresponding to the data fragment in which the data groups are located; and
    recording the correspondence between each Merkle tree and the storage node where the corresponding data fragment is located.
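The Merkle tree construction of claim 8 can be sketched as below. SHA-256, the duplication of the last hash on odd-sized levels, and keying the recorded correspondence by the hex-encoded root are assumptions; the claim specifies none of them. The helper names are hypothetical.

```python
import hashlib
from typing import Dict, List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(groups: List[bytes]) -> bytes:
    """Build a Merkle tree over the data groups of one fragment and return its root."""
    if not groups:
        raise ValueError("a data fragment must contain at least one data group")
    level = [sha256(group) for group in groups]  # leaf hashes, one per data group
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention: duplicate the last hash
        # Hash each adjacent pair to form the next level up.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def record_mapping(index: Dict[str, str], groups: List[bytes], node: str) -> str:
    """Record which storage node holds the fragment identified by this Merkle root."""
    root_hex = merkle_root(groups).hex()
    index[root_hex] = node
    return root_hex
```

With the root recorded against its storage node, a later read can recompute the root from the retrieved groups and compare it with the recorded value to detect tampering.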
  9. The blockchain data archiving and storage method according to claim 1, further comprising:
    performing capacity detection on each storage node; and
    when the capacity of a storage node is detected to have reached a preset storage capacity, sending a storage node expansion request.
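The capacity check of claim 9 is sketched below. The byte-based usage map and the `send_expansion_request` callback are hypothetical stand-ins for whatever monitoring and request mechanism a real deployment would use.

```python
from typing import Callable, Dict, List


def check_capacity(used_bytes: Dict[str, int],
                   preset_capacity: int,
                   send_expansion_request: Callable[[str], None]) -> List[str]:
    """Send an expansion request for every node that has reached the preset capacity."""
    full_nodes = [name for name, used in used_bytes.items() if used >= preset_capacity]
    for name in full_nodes:
        send_expansion_request(name)
    return full_nodes
```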
  10. A blockchain data archiving and storage apparatus, comprising:
    a data detection module, configured to periodically detect data in the blockchain data that meets a preset archiving condition to obtain data to be archived, the preset archiving condition comprising a height value condition and an access frequency condition;
    a data fragmentation module, configured to perform fragmentation processing on the data to be archived to obtain fragmented data;
    a distributed storage engine acquisition module, configured to identify a data source type of the fragmented data and acquire the distributed storage engine corresponding to the data source type to obtain a target distributed storage engine; and
    a data storage module, configured to store the fragmented data in a storage node of the target distributed storage engine.
  11. The apparatus according to claim 10, wherein the data detection module is further configured to: count, at a cycle time, the total amount of data between the data with the lowest height value and the data with the highest height value in a blockchain node, the cycle time being shorter than the time required to generate a data set that reaches a preset total amount; when the total amount of data is greater than the preset total amount, acquire, in order starting from the lowest height value, a data set that reaches the preset total amount; check the access frequency of each data item in the data set; and when the access frequency of every data item in the data set is lower than a preset access frequency, determine the data in the data set as the data to be archived.
  12. The apparatus according to claim 10, wherein the data fragmentation module is further configured to divide the data to be archived into at least two groups to obtain at least two sets of grouped data, and to add the sets of grouped data to at least two data fragments to obtain the fragmented data.
  13. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    periodically detecting data in the blockchain data that meets a preset archiving condition to obtain data to be archived, the preset archiving condition comprising a height value condition and an access frequency condition;
    performing fragmentation processing on the data to be archived to obtain fragmented data;
    identifying a data source type of the fragmented data and acquiring the distributed storage engine corresponding to the data source type to obtain a target distributed storage engine; and
    storing the fragmented data in a storage node of the target distributed storage engine.
  14. The computer device according to claim 13, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    counting, at a cycle time, the total amount of data between the data with the lowest height value and the data with the highest height value in a blockchain node, the cycle time being shorter than the time required to generate a data set that reaches a preset total amount;
    when the total amount of data is greater than the preset total amount, acquiring, in order starting from the lowest height value, a data set that reaches the preset total amount;
    checking the access frequency of each data item in the data set; and
    when the access frequency of every data item in the data set is lower than a preset access frequency, determining the data in the data set as the data to be archived.
  15. The computer device according to claim 13, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    calculating a hash value of the fragmented data according to a hash algorithm; and
    storing the fragmented data in the storage node, within the target distributed storage engine, to which the hash value points.
  16. The computer device according to claim 13, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    evaluating and classifying the storage nodes to determine the storage performance of each storage node;
    performing a weight assignment evaluation on the fragmented data to determine the storage requirement of each piece of fragmented data; and
    storing the fragmented data in storage nodes of the target distributed storage engine according to the storage performance of each storage node and the storage requirement of each piece of fragmented data.
  17. One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
    periodically detecting data in the blockchain data that meets a preset archiving condition to obtain data to be archived, the preset archiving condition comprising a height value condition and an access frequency condition;
    performing fragmentation processing on the data to be archived to obtain fragmented data;
    identifying a data source type of the fragmented data and acquiring the distributed storage engine corresponding to the data source type to obtain a target distributed storage engine; and
    storing the fragmented data in a storage node of the target distributed storage engine.
  18. The storage medium according to claim 17, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed:
    counting, at a cycle time, the total amount of data between the data with the lowest height value and the data with the highest height value in a blockchain node, the cycle time being shorter than the time required to generate a data set that reaches a preset total amount;
    when the total amount of data is greater than the preset total amount, acquiring, in order starting from the lowest height value, a data set that reaches the preset total amount;
    checking the access frequency of each data item in the data set; and
    when the access frequency of every data item in the data set is lower than a preset access frequency, determining the data in the data set as the data to be archived.
  19. The storage medium according to claim 17, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed:
    calculating a hash value of the fragmented data according to a hash algorithm; and
    storing the fragmented data in the storage node, within the target distributed storage engine, to which the hash value points.
  20. The storage medium according to claim 17, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed:
    evaluating and classifying the storage nodes to determine the storage performance of each storage node;
    performing a weight assignment evaluation on the fragmented data to determine the storage requirement of each piece of fragmented data; and
    storing the fragmented data in storage nodes of the target distributed storage engine according to the storage performance of each storage node and the storage requirement of each piece of fragmented data.
PCT/CN2019/123147 2019-07-08 2019-12-05 Blockchain data archiving storage method and apparatus, computer device and storage medium WO2021003985A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910611644.7A CN110442644A (en) 2019-07-08 2019-07-08 Block chain data filing storage method, device, computer equipment and storage medium
CN201910611644.7 2019-07-08

Publications (1)

Publication Number Publication Date
WO2021003985A1 true WO2021003985A1 (en) 2021-01-14

Family

ID=68429864

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/123147 WO2021003985A1 (en) 2019-07-08 2019-12-05 Blockchain data archiving storage method and apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN110442644A (en)
WO (1) WO2021003985A1 (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442644A (en) * 2019-07-08 2019-11-12 深圳壹账通智能科技有限公司 Block chain data filing storage method, device, computer equipment and storage medium
CN111163151A (en) * 2019-12-26 2020-05-15 联想(北京)有限公司 Information processing method and device and computer readable storage medium
CN111209336B (en) * 2019-12-30 2020-09-15 广州博士信息技术研究院有限公司 Data distribution method and device based on block chain and server
CN111221910A (en) * 2019-12-31 2020-06-02 杭州趣链科技有限公司 Fragment storage method for improving block chain read-write performance
CN111522660B (en) * 2020-04-16 2024-05-24 武汉有牛科技有限公司 Big data monitoring solution based on block chain technology
CN111524006A (en) * 2020-04-16 2020-08-11 武汉有牛科技有限公司 Cross-chain payment solution based on block chain technology
CN111611319A (en) * 2020-06-08 2020-09-01 杭州复杂美科技有限公司 Distributed data storage method, device and storage medium
CN111770149B (en) * 2020-06-23 2023-02-14 江苏荣泽信息科技股份有限公司 Novel alliance chain system based on distributed storage
CN111984735A (en) * 2020-09-03 2020-11-24 深圳壹账通智能科技有限公司 Data archiving method and device, electronic equipment and storage medium
CN112231398B (en) * 2020-09-25 2024-07-23 北京金山云网络技术有限公司 Data storage method, device, equipment and storage medium
CN112181945B (en) * 2020-09-28 2023-11-21 中国平安人寿保险股份有限公司 Data archiving processing method, device, computer equipment and storage medium
CN114500553B (en) * 2020-10-23 2024-07-02 中移(苏州)软件技术有限公司 Processing method, system, electronic equipment and storage medium of blockchain network
CN112558872A (en) * 2020-12-10 2021-03-26 东软集团股份有限公司 Data processing method and device, storage medium and electronic equipment
CN112631833A (en) * 2020-12-25 2021-04-09 苏州浪潮智能科技有限公司 Data archiving and querying method, system, storage medium and equipment
CN112685420A (en) * 2020-12-31 2021-04-20 北京存金所贵金属有限公司 Method, device, scheduling controller and system for expanding block chain data
CN112291376B (en) * 2020-12-31 2021-04-09 腾讯科技(深圳)有限公司 Data processing method and related equipment in block chain system
CN112783440B (en) * 2020-12-31 2021-11-30 深圳大学 Data storage method and device for user node of block chain
CN112948350B (en) * 2021-02-02 2023-08-01 中央财经大学 Distributed ledger model cold data archiving and migration storage method based on MPT verification
CN113342418B (en) * 2021-06-24 2022-11-22 国网黑龙江省电力有限公司 Distributed machine learning task unloading method based on block chain
CN114546980B (en) * 2022-04-25 2022-07-08 成都云祺科技有限公司 Backup method, system and storage medium of NAS file system
CN114595279B (en) * 2022-05-06 2022-08-12 中国信息通信研究院 Block chain data processing method and device
CN116755640B (en) * 2023-08-21 2024-02-09 腾讯科技(深圳)有限公司 Data processing method, device, computer equipment and storage medium of alliance chain

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109274752A (en) * 2018-10-10 2019-01-25 腾讯科技(深圳)有限公司 The access method and device, electronic equipment, storage medium of block chain data
CN109871366A (en) * 2019-01-17 2019-06-11 华东师范大学 A kind of storage of block chain fragment and querying method based on correcting and eleting codes
CN109885256A (en) * 2019-01-23 2019-06-14 平安科技(深圳)有限公司 A kind of date storage method based on data fragmentation, equipment and medium
CN109885619A (en) * 2019-02-25 2019-06-14 篱笆墙网络科技有限公司 Data write-in and read method and device based on distributed data base
CN110442644A (en) * 2019-07-08 2019-11-12 深圳壹账通智能科技有限公司 Block chain data filing storage method, device, computer equipment and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102368261A (en) * 2011-10-14 2012-03-07 成都勤智数码科技有限公司 Expandable running maintenance report generation method
CN107766147A (en) * 2016-08-23 2018-03-06 上海宝信软件股份有限公司 Distributed data analysis task scheduling system
KR101849912B1 (en) * 2017-05-25 2018-04-19 주식회사 코인플러그 Method for providing certificate service based on smart contract and server using the same
CN108664223B (en) * 2018-05-18 2021-07-02 百度在线网络技术(北京)有限公司 Distributed storage method and device, computer equipment and storage medium
CN108900314B (en) * 2018-07-19 2022-03-01 网宿科技股份有限公司 Request number charging method and device for network acceleration service
CN109376122A (en) * 2018-09-25 2019-02-22 深圳市元征科技股份有限公司 A kind of file management method, system and block chain node device and storage medium
CN109522362B (en) * 2018-10-17 2020-09-15 北京瑞卓喜投科技发展有限公司 Incomplete data synchronization method, system and equipment based on block chain data
CN109815209A (en) * 2019-03-20 2019-05-28 上海电力学院 A kind of distributed memory system for Hospital Logistic lean management

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112988664B (en) * 2021-03-11 2023-05-30 中国平安财产保险股份有限公司 Data archiving method, device, equipment and storage medium
CN112988664A (en) * 2021-03-11 2021-06-18 中国平安财产保险股份有限公司 Data archiving method, device, equipment and storage medium
CN113190555A (en) * 2021-04-30 2021-07-30 北京沃东天骏信息技术有限公司 Data import method and device
US12074981B2 (en) 2021-05-19 2024-08-27 Micro Focus Llc Blockchain consolidation with active archiving
CN113312663A (en) * 2021-05-31 2021-08-27 尧领有限公司 Distributed data storage method and system, and computer readable storage medium
CN113312663B (en) * 2021-05-31 2024-05-28 尧领有限公司 Distributed data storage method and system and computer readable storage medium
CN113535849A (en) * 2021-07-08 2021-10-22 电子科技大学 Extensible consensus method for block chain
CN113691581A (en) * 2021-07-08 2021-11-23 杭州又拍云科技有限公司 Efficient CDN (content delivery network) fragment refreshing method
CN113535849B (en) * 2021-07-08 2023-03-07 电子科技大学 Extensible consensus method for block chain
CN113538152A (en) * 2021-08-02 2021-10-22 浙江数秦科技有限公司 Data transaction platform for protecting data privacy
CN113538152B (en) * 2021-08-02 2024-01-05 浙江数秦科技有限公司 Data transaction platform for protecting data privacy
CN113867649B (en) * 2021-10-20 2024-05-10 上海万向区块链股份公司 System and method for adaptive blockchain data storage plugin
CN113867649A (en) * 2021-10-20 2021-12-31 上海万向区块链股份公司 Adaptive blockchain data storage plugins
CN114218304A (en) * 2021-12-30 2022-03-22 杭州趣链科技有限公司 Data archiving method and device, computing equipment and medium
CN117473020A (en) * 2023-12-27 2024-01-30 湖南天河国云科技有限公司 Data access method, system, computer storage medium and terminal device
CN117473020B (en) * 2023-12-27 2024-03-22 湖南天河国云科技有限公司 Data access method, system, computer storage medium and terminal device
CN118331967A (en) * 2024-06-13 2024-07-12 长江三峡集团实业发展(北京)有限公司 Merkle tree generation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN110442644A (en) 2019-11-12

Similar Documents

Publication Publication Date Title
WO2021003985A1 (en) Blockchain data archiving storage method and apparatus, computer device and storage medium
US11288144B2 (en) Query optimized distributed ledger system
US10331641B2 (en) Hash database configuration method and apparatus
US11935015B2 (en) Data processing method and apparatus, computer device, and storage medium
EP3695303B1 (en) Log-structured storage systems
WO2019075978A1 (en) Data transmission method and apparatus, computer device, and storage medium
WO2021003935A1 (en) Data cluster storage method and apparatus, and computer device
US10938961B1 (en) Systems and methods for data deduplication by generating similarity metrics using sketch computation
TW202111520A (en) Log-structured storage systems
TW202113580A (en) Log-structured storage systems
WO2021057253A1 (en) Data separation and storage method and apparatus, computer device and storage medium
US10942852B1 (en) Log-structured storage systems
WO2017020576A1 (en) Method and apparatus for file compaction in key-value storage system
CN110908589B (en) Data file processing method, device, system and storage medium
AU2018355092A1 (en) Witness blocks in blockchain applications
US20210191911A1 (en) Systems and methods for sketch computation
CN105376277A (en) Data synchronization method and device
WO2019057081A1 (en) Data storage method, data query method, computer device, and storage medium
WO2016029441A1 (en) File scanning method and apparatus
US10503737B1 (en) Bloom filter partitioning
WO2021196463A1 (en) Blockchain data synchronization method and apparatus, and electronic device and storage medium
CN114969061A (en) Distributed storage method and device for industrial time sequence data
CN111026711A (en) Block chain based data storage method and device, computer equipment and storage medium
US10853892B2 (en) Social networking relationships processing method, system, and storage medium
Zhou et al. Hysteresis re-chunking based metadata harnessing deduplication of disk images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19937281

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 160522)

122 Ep: pct application non-entry in european phase

Ref document number: 19937281

Country of ref document: EP

Kind code of ref document: A1