WO2022095346A1 - Blockchain data storage method, system, device and readable storage medium - Google Patents

Blockchain data storage method, system, device and readable storage medium

Info

Publication number
WO2022095346A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
block
directory
target block
area
Prior art date
Application number
PCT/CN2021/089875
Other languages
English (en)
French (fr)
Inventor
林楷智
蔡志恺
黄柏学
Original Assignee
苏州浪潮智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州浪潮智能科技有限公司
Priority to US18/246,659 (US20240045853A1)
Publication of WO2022095346A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23: Updating
    • G06F16/2365: Ensuring data consistency and integrity
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/18: File system types
    • G06F16/182: Distributed file systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/17: Details of further file system functions
    • G06F16/172: Caching, prefetching or hoarding of files

Definitions

  • The present invention relates to the technical field of blockchain, and in particular to a blockchain data storage method, system, device and readable storage medium.
  • Blockchain storage technology essentially manages each blockchain storage system (from the simplest desktop computer to a mid-to-high-end server) as the smallest storage unit. The aim is to use the characteristics of blockchain and distributed storage to achieve confidentiality and fault-tolerant data backup at the protocol layer or the software application layer.
  • When blockchain storage technology is improved on current computer systems, one must consider not only how data is stored by the software-layer file system and the storage device controller, but also program execution and compatibility with earlier systems. This limits the development of new techniques such as storage optimization for blockchain storage. As the volume of stored data and the demand for fast storage grow, current blockchain storage technology therefore still suffers from slow data storage.
  • The purpose of the present invention is to provide a blockchain data storage method, system, device and readable storage medium that speed up blockchain data storage.
  • The present invention provides the following technical solutions:
  • A blockchain data storage method, comprising:
  • the block file system obtains the target block serial number and the target block content of a target block, wherein the block file system includes a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping relationship between block serial numbers and cluster addresses;
  • sequentially assigning a target cluster address to the target block, and recording, in the directory area, the target mapping relationship between the target block serial number and the target cluster address;
  • sequentially writing the target block content into the data area according to the target cluster address.
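  • The method may also include: receiving a read request for the target block; querying the directory area for the target cluster address corresponding to the target block serial number; reading the target block content from the data area using the target cluster address; and outputting the target block content.
  • The method may also include: reading the directory in the directory area and writing it into a cache to obtain a cache directory; correspondingly, recording the target mapping relationship in the directory area includes recording the target mapping relationship in the cache directory.
  • The method may also include: receiving a read request for the target block; querying the cache directory for the target cluster address corresponding to the target block serial number; reading the target block content from the data area using the target cluster address; and outputting the target block content.
  • The method may also include: reading the directory in the directory area and copying it to a CAM to obtain a CAM directory; correspondingly, recording the target mapping relationship in the directory area includes recording the target mapping relationship in the CAM directory.
  • The method may also include: receiving a read request for the target block; querying the CAM directory for the target cluster address corresponding to the target block serial number; reading the target block content from the data area using the target cluster address; and outputting the target block content.
  • Sequentially writing the target block content into the data area may include: writing the target block content into the consecutive hard disk sectors corresponding to the data area.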
  • A block file system, including:
  • a to-be-stored block acquisition module, used to obtain the target block serial number and the target block content of a target block, wherein the block file system includes a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping relationship between block serial numbers and cluster addresses;
  • a mapping relationship recording module, used to sequentially assign a target cluster address to the target block and to record, in the directory area, the target mapping relationship between the target block serial number and the target cluster address;
  • a data writing module, used to sequentially write the target block content into the data area according to the target cluster address.
  • An electronic device, comprising:
  • a processor configured to implement the steps of the above blockchain data storage method when executing a computer program.
  • In this method, the block file system obtains the target block serial number and the target block content of the target block. The block file system includes a directory area and a data area; the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping relationship between block serial numbers and cluster addresses. A target cluster address is sequentially assigned to the target block, the target mapping relationship between the target block serial number and the target cluster address is recorded in the directory area, and the target block content is sequentially written into the data area according to the target cluster address.
  • The block file system exploits two properties of a blockchain, namely that each block has a unique serial number (the block serial number) and that blocks have a fixed size, and sets the cluster size of the data area to the block size. The mapping recorded in the directory area is therefore a mapping between block serial numbers and cluster addresses: one block corresponds to one cluster, and the directory area uses the block serial number directly as the cluster index. This gives a simple, clear directory structure and a simple mapping to the data area.
  • When a target block is stored, its content can be written directly into the data area in sequence. Because the directory is simple and data is written sequentially, much search time and write time is saved, which speeds up data writing in the blockchain.
  • Correspondingly, the embodiments of the present invention also provide a block file system, a device and a readable storage medium corresponding to the above blockchain data storage method. They have the same technical effects, which are not repeated here.
  • FIG. 1 is an implementation flowchart of a blockchain data storage method in an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of the format of a block file system according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of the format of a FAT file system.
  • FIG. 4 is a schematic diagram of a data access in an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of data access based on a cache directory in an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of data access based on a CAM directory in an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a CAM directory in an embodiment of the present invention.
  • FIG. 8 is another schematic diagram of data access based on a CAM directory in an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a block file system according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of an electronic device in an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of a specific structure of an electronic device in an embodiment of the present invention.
  • Blockchain refers to a series of written records (also known as blocks) that are connected and protected by cryptography.
  • InterPlanetary File System (IPFS): the IPFS protocol combines the advantages of blockchain technology and various network protocols to store immutable data, remove duplicate files on the network, and obtain the address information of storage nodes in order to search for files in the network. IPFS divides files into fixed-size blocks, and each block is indexed by a unique serial number (CID/ID). Block content is stored in a distributed file system, which reads all blocks based on their CIDs and restores them to the original file.
  • File system: a module of current operating systems, responsible for managing and storing data in the form of files in the computer system.
  • File Allocation Table (FAT): a file system invented and partially patented by Microsoft, used by MS-DOS and Windows systems.
  • Ext2: the most traditional Linux disk file system, followed by Ext3/4 and others. Its basic principle is the same as that of FAT, and it can be regarded as an improvement on FAT.
  • Content-addressable memory (CAM), also known as associative memory: a special type of computer memory used in certain very high-speed search applications.
  • A CAM has a data comparator and an address encoder in addition to the data access capability of ordinary memory, and so provides high-speed data search.
  • A common application is searching for specific network packet content in semiconductor integrated circuits.
  • If the CAM uses static memory (SRAM), the delay of each query can be as low as 5 nanoseconds. If it uses dynamic memory (DRAM) because of large storage space requirements, each query takes about 100 to 150 nanoseconds.
  • System call: one of a set of functions provided by the kernel. System calls are implemented in the kernel and exposed to the user in a defined way; they are the interface between user programs and the kernel.
  • Device driver: a special program that enables the computer and a device to communicate with each other. It acts as the interface to the hardware; only through this interface can the operating system control the hardware device.
  • Logical Block Address (LBA): a general mechanism for representing the location of data on a computer system's data storage device.
  • Physical Block Address: the address of a data sector on a mechanical hard disk or of a data page on a solid-state drive.
  • When reading and writing files, a software program uses logical block addresses to request the file from the system; the system's file system and storage device controller convert them into physical block addresses to find the real location of the data and perform the read or write.
  • Sector: the smallest storage unit of a traditional mechanical hard disk.
  • The basic unit of disk reads and writes is the sector.
  • The basic unit of a solid-state drive is the page.
  • Cluster: a group of consecutive sectors.
  • Each cluster can contain 2, 4, 8, 16, 32, 64, ... sectors, that is, 2 to the nth power sectors. The cluster is the smallest unit in which the file system operates on files.
  • File Directory Table (FDT).
  • Linked list: a data structure.
  • A linked list is a storage structure that is non-contiguous and non-sequential on the physical storage unit. The logical order of its data elements is given by the order in which pointers link them.
  • A linked list consists of a series of nodes (each element of the list is called a node), and nodes can be created dynamically at runtime. Each node has two parts: a data field that stores the data element and a pointer field that stores the address of the next node. Operations on it are more complicated than on a sequentially stored linear list.
  • A linked list achieves O(1) complexity for insertion, which is much faster than a sequential list, but finding a node or accessing the node at a given position takes O(n) time, whereas the corresponding time complexities for a linear list and a sequential list are O(log n) and O(1), respectively.
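  • The complexity claims for linked lists can be illustrated with a minimal Python sketch. The names `Node`, `push_front` and `nth` are hypothetical and are not part of the patent; the sketch only shows that inserting at the head needs no traversal, while reaching the n-th node means following n pointers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None  # pointer field: the next node, or None


def push_front(head: Optional[Node], value: int) -> Node:
    """Insert at the head of the list: O(1), no traversal needed."""
    return Node(value, head)


def nth(head: Optional[Node], n: int) -> int:
    """Return the n-th value by following n pointers: O(n)."""
    node = head
    for _ in range(n):
        if node is None:
            raise IndexError(n)
        node = node.next
    if node is None:
        raise IndexError(n)
    return node.value


# Build the list 10 -> 20 -> 30 by inserting at the head in reverse order.
head = None
for value in (30, 20, 10):
    head = push_front(head, value)

sequential = [10, 20, 30]  # a sequential list (array): O(1) indexed access
```

  • Here `nth(head, 2)` has to walk past two nodes before it reaches the third, whereas `sequential[2]` is a single indexed access.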
  • FIG. 1 is a flowchart of a blockchain data storage method in an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of data access in an embodiment of the present invention.
  • Writing and reading blockchain blocks passes through the software program, system calls and the file system, and finally reaches the physical hard disk.
  • The file system (together with the storage device controller) plays a key role: it converts the logical block addresses used by the software program and the operating system kernel layer into physical block addresses on the hard disk.
  • The method can be applied to the block file system in the data access architecture shown in FIG. 4, and includes the following steps:
  • S101: The block file system obtains the target block serial number and the target block content of the target block.
  • The block file system includes a directory area and a data area.
  • The size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping relationship between block serial numbers and cluster addresses.
  • The target block can be any block to be stored.
  • FIG. 2 is a schematic diagram of the format of a block file system according to an embodiment of the present invention. The mapping relationship between CIDs and cluster addresses is stored in the directory area, and the size of each cluster in the data area equals the block size. That is, the block file system has the following characteristics:
  • The cluster size in the block file system is the block size, so block data is found faster;
  • The block file system simplifies the directory structure and its mapping to the data area;
  • Each serial number (CID) in the directory area represents one block;
  • Each cluster address in the directory area is the address of the corresponding cluster in the data area.
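  • A minimal Python sketch of this layout follows. It is an illustration only: the class name `BlockFileSystemLayout`, the use of a dictionary for the directory area and the placeholder CID are assumptions, and the 256 KB block size is taken from the performance comparison at the end of this document.

```python
from dataclasses import dataclass, field
from typing import Dict

BLOCK_SIZE = 256 * 1024  # assumed block size in bytes; cluster size equals it


@dataclass
class BlockFileSystemLayout:
    # Cluster size of the data area, set equal to the blockchain block size.
    cluster_size: int = BLOCK_SIZE
    # Directory area: block serial number (CID) -> cluster address.
    directory: Dict[str, int] = field(default_factory=dict)

    def cluster_offset(self, cluster_address: int) -> int:
        """Byte offset of a cluster from the start of the data area."""
        return cluster_address * self.cluster_size


layout = BlockFileSystemLayout()
layout.directory["cid-example"] = 3  # hypothetical CID stored in cluster 3
offset = layout.cluster_offset(layout.directory["cid-example"])
```

  • Because every cluster holds exactly one block, a directory entry is enough to compute where the block starts; no per-file chain of clusters has to be followed.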
  • FIG. 3 is a schematic diagram of the format of a FAT file system. As shown in FIG. 3, assuming that the hard disk space is arranged linearly, the simplified FAT file system consists of, from left to right:
  • Boot sector: located at the very beginning; it mainly records important information about system startup and the file system;
  • FAT1/FAT2: two file allocation tables, kept for redundancy; the FAT indicates how clusters are stored;
  • Root directory area: the directory table that stores file and directory information;
  • Data area: the area where data is actually stored.
  • Compared with FAT, the directory structure of the block file system in the embodiment of the present invention is simpler and clearer.
  • The directory simply records where each block's data is actually read and written.
  • Because the block size is used as the cluster size, one block corresponds to one cluster, and the block serial number can be looked up directly in the directory, which shortens the search process. In particular, the directory and the data are written sequentially (see S102 and S103 for details), which speeds up data writing.
  • In step S102, once the target block serial number and the target block content have been obtained, the target cluster address is allocated to the target block sequentially.
  • Specifically, sequential allocation means that target block serial numbers are written into the directory area in order, following the order of the cluster addresses of the data area (each directory entry corresponds to the address of one cluster).
  • For example, the first blockchain block received is assigned cluster address 0 and is written to hard disk cluster address 0;
  • the second blockchain block received is assigned cluster address 1 and is written to hard disk cluster address 1; and so on, each block being written to the next available space.
  • There is therefore no need to search for available space on the hard disk, and sequential writing saves disk seek time.
  • Because target cluster addresses are allocated sequentially and one block corresponds to exactly one cluster, only one cluster address needs to be allocated per target block.
  • Once the target cluster address is allocated, the mapping between that address and the target block is determined; that is, there is a mapping relationship between the target block serial number and the target cluster address.
  • In this document, the mapping relationship between the target block serial number and the target cluster address is called the target mapping relationship.
  • The target mapping relationship is stored in the directory area. In this embodiment the directory area only needs to store the mapping between the block serial number and the cluster address, so the storage location of a block's content can be retrieved directly from its block serial number.
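  • The sequential allocation and the recording of the target mapping relationship can be sketched as follows. This is a minimal illustration, not the patented implementation: the class `Directory`, its `allocate` method and the block identifiers are hypothetical, and a Python dictionary stands in for the directory area.

```python
class Directory:
    """In-memory stand-in for the directory area (hypothetical name)."""

    def __init__(self):
        self.mapping = {}      # block serial number -> cluster address
        self.next_cluster = 0  # next free cluster; grows by one per block

    def allocate(self, block_id: str) -> int:
        """Assign the next cluster address in order and record the mapping."""
        if block_id in self.mapping:
            raise ValueError(f"block {block_id!r} is already stored")
        cluster = self.next_cluster
        self.mapping[block_id] = cluster
        self.next_cluster += 1  # no search for free space is needed
        return cluster


directory = Directory()
first = directory.allocate("block-a")   # first block received
second = directory.allocate("block-b")  # second block received
```

  • The first block receives cluster address 0 and the second receives cluster address 1, matching the example in the text; allocation is a counter increment rather than a scan for free space.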
  • In step S103, after the target cluster address has been allocated, the target block content is written into the data area sequentially.
  • Sequential writing of the data works in the same way as the sequential writing of block serial numbers and is not repeated here.
  • Specifically, sequentially writing the target block content into the data area includes: writing the target block content into the consecutive hard disk sectors corresponding to the data area.
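  • The relationship between a cluster address and the consecutive sectors it occupies can be sketched as below. The function names are hypothetical, an in-memory `io.BytesIO` buffer stands in for the data area, padding short content with zero bytes is an assumption, and the tiny 8-byte block size is used only to keep the demonstration readable.

```python
import io


def write_block(data_area, cluster_address: int, content: bytes, block_size: int) -> None:
    """Write one block at the byte offset implied by its cluster address."""
    if len(content) > block_size:
        raise ValueError("content is larger than one cluster")
    data_area.seek(cluster_address * block_size)
    # Assumption: short content is zero-padded so each block fills its cluster.
    data_area.write(content.ljust(block_size, b"\x00"))


def sectors_for_cluster(cluster_address: int, block_size: int, sector_size: int) -> range:
    """Consecutive sector numbers occupied by one cluster."""
    per_cluster = block_size // sector_size
    start = cluster_address * per_cluster
    return range(start, start + per_cluster)


data_area = io.BytesIO()
write_block(data_area, 0, b"first", 8)
write_block(data_area, 1, b"second", 8)
```

  • With the 256 KB block size and 4 KB sector size assumed elsewhere in this document, `sectors_for_cluster(1, 256 * 1024, 4 * 1024)` covers sectors 64 to 127, that is, 64 consecutive sectors.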
  • To summarize the method provided by this embodiment: the block file system obtains the target block serial number and the target block content of the target block. The block file system includes a directory area and a data area; the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping relationship between block serial numbers and cluster addresses. A target cluster address is sequentially assigned to the target block, the target mapping relationship between the target block serial number and the target cluster address is recorded in the directory area, and the target block content is sequentially written into the data area according to the target cluster address.
  • The block file system exploits the unique serial number of each block (the block serial number) and the fixed block size, and sets the cluster size of the data area to the block size. The mapping recorded in the directory area is therefore a mapping between block serial numbers and cluster addresses: one block corresponds to one cluster, and the directory area uses the block serial number directly as the cluster index, which keeps the directory structure and its mapping to the data area simple and clear.
  • When a target block is stored, its content can be written directly into the data area in sequence. Because the directory is simple and data is written sequentially, much search time and write time is saved, which speeds up data writing in the blockchain.
  • Based on the above embodiment, the embodiments of the present invention also provide corresponding improvements.
  • Steps in the preferred/improved embodiments that are the same as or correspond to steps in the above embodiment, and the corresponding beneficial effects, can be cross-referenced and are not repeated here.
  • A processing flow for quickly reading the target block is also proposed. The specific implementation process includes:
  • Step 1: receive a read request for the target block;
  • Step 2: query the directory area for the target cluster address corresponding to the target block serial number;
  • Step 3: use the target cluster address to read the target block content from the data area;
  • Step 4: output the target block content.
  • Because the directory area records the target mapping relationship between the target block serial number and the target cluster address that stores the target block content, the target cluster address corresponding to the target block serial number can be queried directly in the directory area once a read request for the target block is received.
  • After the target cluster address is obtained, the target block content can be read directly from the data area and then output. In other words, after a request to read the target block is received, it is only necessary to look up the target cluster address corresponding to the target block serial number in the directory area; the block content can then be fetched from the hard disk data area and returned.
  • This saves the time otherwise spent searching the directory and waiting for the read.
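  • The four read steps can be sketched together with the write path in a few lines of Python. The class `BlockStore` and the identifiers `cid-a` and `cid-b` are hypothetical; a dictionary stands in for the directory area and an `io.BytesIO` buffer for the data area, and zero-padding of short content is an assumption of the sketch.

```python
import io


class BlockStore:
    """Toy block file system: directory area plus data area (hypothetical)."""

    def __init__(self, block_size: int):
        self.block_size = block_size
        self.directory = {}           # block serial number -> cluster address
        self.data_area = io.BytesIO()

    def write(self, block_id: str, content: bytes) -> None:
        cluster = len(self.directory)  # sequential allocation
        self.directory[block_id] = cluster
        self.data_area.seek(cluster * self.block_size)
        self.data_area.write(content.ljust(self.block_size, b"\x00"))

    def read(self, block_id: str) -> bytes:
        # Step 1 is the call itself: a read request for block_id arrives.
        cluster = self.directory[block_id]              # Step 2: directory lookup
        self.data_area.seek(cluster * self.block_size)  # Step 3: go to the cluster
        return self.data_area.read(self.block_size)     # Step 4: output the content


store = BlockStore(block_size=8)
store.write("cid-a", b"AAAA")
store.write("cid-b", b"BBBBBBBB")
content_b = store.read("cid-b")
```

  • A read is one dictionary lookup followed by one seek and one fixed-size read, with no chain of clusters to follow.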
  • Data access speed can also be improved by caching the directory.
  • The specific implementation includes: reading the directory in the directory area and writing it into a cache to obtain a cache directory. Correspondingly, recording the target mapping relationship between the target block serial number and the target cluster address in the directory area includes recording the target mapping relationship in the cache directory.
  • The data reading process may specifically include:
  • Step 1: receive a read request for the target block;
  • Step 2: query the cache directory for the target cluster address corresponding to the target block serial number;
  • Step 3: use the target cluster address to read the target block content from the data area;
  • Step 4: output the target block content.
  • FIG. 5 is a schematic diagram of data access based on a cache directory according to an embodiment of the present invention.
  • The hard disk directory table (that is, the directory table stored in the directory area) can be copied to the cache directory completely or partially.
  • If the cache is large enough, or system volatile memory is used together with it, all directory entries can be copied to the cache directory; alternatively, only the directory entries for hot data can be copied.
  • Writing a blockchain block: obtain the target block serial number and the target block content; first update the cache directory (sequential writing can also be used for this update), then write the target block serial number and the target block content sequentially into the hard disk directory area and data area according to the available hard disk space. Further, the hard disk directory need not be updated synchronously on every cache directory update; it can instead be updated only when the system is shutting down, is about to be powered off, or is idle (that is, there are no requests to write or read blockchain blocks). Note that using the cache directory to speed up data access requires keeping the cache directory and the hard disk directory area synchronized.
  • Reading a blockchain block: once the serial number of the target block to be read is determined, it is only necessary to look up the corresponding target cluster address in the cache directory; the target block content can then be fetched from the hard disk and output. Because lookups in the cache directory are faster than lookups in the hard disk directory, read speed improves.
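  • The deferred update of the hard disk directory can be sketched as a write-back cache. This is a minimal illustration under stated assumptions: the class `CachedDirectory`, its method names and the `dirty` set are hypothetical, a dictionary stands in for the on-disk directory area, and `flush` is meant to be called at the moments named in the text (shutdown, imminent power-off or idle time).

```python
class CachedDirectory:
    """Write-back cache in front of the on-disk directory (hypothetical)."""

    def __init__(self, disk_directory: dict):
        self.disk_directory = disk_directory  # stands in for the directory area
        self.cache = dict(disk_directory)     # full copy made at start-up
        self.dirty = set()                    # entries not yet written to disk

    def record(self, block_id: str, cluster: int) -> None:
        """Record a target mapping in the cache only."""
        self.cache[block_id] = cluster
        self.dirty.add(block_id)

    def lookup(self, block_id: str) -> int:
        """Serve lookups from the cache, never from the disk directory."""
        return self.cache[block_id]

    def flush(self) -> None:
        """Bring the disk directory back in sync with the cache."""
        for block_id in self.dirty:
            self.disk_directory[block_id] = self.cache[block_id]
        self.dirty.clear()


disk = {"cid-a": 0}
cached = CachedDirectory(disk)
cached.record("cid-b", 1)
pending = "cid-b" not in disk  # the disk directory has not been updated yet
cached.flush()
```

  • Until `flush` runs, the new mapping exists only in the cache, which is why the text stresses that the cache directory and the hard disk directory area must be kept synchronized.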
  • A CAM directory can also be used to speed up the storage and reading of blockchain data.
  • The specific implementation includes: reading the directory in the directory area and copying it to the CAM to obtain a CAM directory. Correspondingly, recording the target mapping relationship between the target block serial number and the target cluster address in the directory area includes recording the target mapping relationship in the CAM directory. Accordingly, the data reading process includes:
  • Step 1: receive a read request for the target block;
  • Step 2: query the CAM directory for the target cluster address corresponding to the target block serial number;
  • Step 3: use the target cluster address to read the target block content from the data area;
  • Step 4: output the target block content.
  • FIG. 6 is a schematic diagram of data access based on a CAM directory according to an embodiment of the present invention.
  • Working with the block file directory, the CAM takes the CID as the content to search for and quickly finds its address in the CAM directory; that address is the cluster address.
  • FIG. 7 is a schematic diagram of a CAM directory in an embodiment of the present invention.
  • The upper part of FIG. 7 is a schematic block diagram of the CAM directory described here, and the lower part is a schematic diagram of a general CAM design in a semiconductor integrated circuit.
  • The CAM includes a data comparator, memory cells (CELL) and an address encoder.
  • The example shown in FIG. 7 looks up the address at which the content Qm345678... is stored.
  • Adding the CAM directory accelerates block reads and writes at the hardware layer.
  • When the computer system is powered on, the directory table on the hard disk is copied in full to the CAM directory of the storage device controller.
  • Writing a blockchain block: the program receives the target block serial number and the target block content, first updates the CAM directory (using a sequential write), and then writes the target block serial number and the target block content sequentially into the hard disk directory area and data area according to the available hard disk space. Because the storage device controller and the CAM directory are highly integrated hardware circuits, data reads and writes are fast, and updating the CAM directory has little impact on the performance of writing blockchain blocks. In practical applications, the hard disk directory need not be updated synchronously on every CAM directory update; it can be updated only when the system is shutting down, is about to be powered off, or is idle (for example, there are no requests to write or read blockchain blocks).
  • Reading a blockchain block: once the target block serial number to be read is determined, it is only necessary to look up the corresponding target cluster address in the CAM directory; the block content can then be fetched from the hard disk and returned. Because lookups in the CAM directory are faster than in both the hard disk directory and the cache directory, read speed is maximized.
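  • The lookup behaviour of a CAM directory can be modelled in software as follows. This is only a behavioural sketch: the class `CamDirectory` and its methods are hypothetical, the CIDs are placeholders, and the Python loop compares cells one after another, whereas a hardware CAM compares all cells at the same time.

```python
from typing import List, Optional


class CamDirectory:
    """Software model of a CAM: search by content, get back an address."""

    def __init__(self, size: int):
        # One memory cell per cluster; the cell index is the cluster address.
        self.cells: List[Optional[str]] = [None] * size

    def store(self, cluster_address: int, cid: str) -> None:
        self.cells[cluster_address] = cid

    def search(self, cid: str) -> Optional[int]:
        # Data comparator: one match line per cell.
        match_lines = [cell is not None and cell == cid for cell in self.cells]
        # Address encoder: turn the active match line into an address.
        for address, hit in enumerate(match_lines):
            if hit:
                return address
        return None  # no cell holds this CID


cam = CamDirectory(size=4)
cam.store(0, "cid-a")
cam.store(1, "cid-b")
found = cam.search("cid-b")
missing = cam.search("cid-z")
```

  • The returned address is used directly as the cluster address of the block, so no separate value needs to be stored next to each CID.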
  • In practice, the directory can be held in CAM memory, or alternatively in dynamic memory (DRAM, Dynamic Random Access Memory) or static memory (SRAM, Static Random Access Memory).
  • FIG. 8 shows the system of the embodiment of the present invention: the block file system sits in the software kernel layer, and the storage device controller with its CAM directory can be designed and manufactured with current semiconductor integrated circuit technology. The software, the driver and the storage device controller with CAM directory work together; each storage device controller with CAM directory can access multiple hard disks, and at the system level multiple storage device controllers with CAM directories can also be combined.
  • Corresponding to the above method embodiment, the embodiments of the present invention further provide a block file system; the block file system described below and the blockchain data storage method described above may be cross-referenced.
  • The system includes the following modules:
  • a to-be-stored block acquisition module 101, used to obtain the target block serial number and the target block content of the target block, wherein the block file system includes a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping relationship between block serial numbers and cluster addresses;
  • a mapping relationship recording module 102, used to sequentially allocate the target cluster address to the target block and to record, in the directory area, the target mapping relationship between the target block serial number and the target cluster address;
  • a data writing module 103, used to sequentially write the target block content into the data area according to the target cluster address.
  • With this system, the block file system obtains the target block serial number and the target block content of the target block. The block file system includes a directory area and a data area; the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping relationship between block serial numbers and cluster addresses. A target cluster address is sequentially assigned to the target block, the target mapping relationship between the target block serial number and the target cluster address is recorded in the directory area, and the target block content is sequentially written into the data area according to the target cluster address.
  • The block file system exploits the unique serial number of each block (the block serial number) and the fixed block size, and sets the cluster size of the data area to the block size. The mapping recorded in the directory area is therefore a mapping between block serial numbers and cluster addresses: one block corresponds to one cluster, and the directory area uses the block serial number directly as the cluster index.
  • When a target block is stored, its content can be written directly into the data area in sequence. Because the directory is simple and data is written sequentially, much search time and write time is saved, which speeds up data writing in the blockchain.
  • the method further includes: a data read module for receiving a read request for reading a target block; querying the target cluster address corresponding to the sequence number of the target block in the directory area; using the target cluster address , read the content of the target block from the data area; output the content of the target block.
  • the cache acceleration module is used to read the directory in the directory area, and write the directory into the cache to obtain the cache directory;
  • mapping relationship recording module 102 is specifically configured to record the target mapping relationship in the cache directory.
  • a data read module configured to receive a read request for reading the target block; query the target cluster address corresponding to the sequence number of the target block in the cache directory; use the target cluster address , read the content of the target block from the data area; output the content of the target block.
  • a CAM acceleration module configured to read the directory in the directory area, and copy the directory to the CAM to obtain the CAM directory;
  • mapping relationship recording module 102 is specifically configured to record the target mapping relationship in the CAM directory.
  • a data read module configured to receive a read request for reading the target block; query the CAM directory for the target cluster address corresponding to the target block serial number; use the target cluster address to read the content of the target block from the data area; and output the content of the target block.
  • the data writing module 103 is specifically configured to write the content of the target block into the continuous hard disk sector corresponding to the data area.
  • the block size is 256KB
  • FAT cluster sizes of 4KB (1 sector) and 64KB (16 sectors) are used.
  • Block file system cluster size: 256KB (64 sectors).
  • SAS, i.e. a SAS hard disk, is a hard disk using SCSI technology; SAS stands for Serial Attached SCSI and SCSI for Small Computer System Interface.
  • For a 15000rpm SAS disk, 4KB/64KB random reads and writes are assumed to reach 12MB/s, and reads and writes of 256KB or more, or sequential reads and writes, are assumed to reach 120MB/s.
  • Apart from the special case of damaged sectors, the block file system writes sequentially, so the hard disk sectors are also contiguous. The FAT file system cannot guarantee contiguous sectors, so the 4KB/64KB random read/write performance is used for it here.
  • Table 2 and Table 3 show the performance comparison results between the block file system in the embodiment of the present invention and the file system based on solid-state hard disks:
  • the block size is 256KB
  • the size of each sector of the hard disk 4KB
  • FAT uses 4KB and 64KB for each cluster size
  • the average access time (ie seek time) of SSDs is about 100us
  • M.2 PCIe SSD: reads and writes of 256KB or more, or sequential reads and writes, are assumed to reach more than 3000MB/s; 4KB random read and write are assumed to be about 50MB/s and 150MB/s respectively.
  • Apart from the special case of damaged sectors, the block file system writes sequentially, so the hard disk sectors are contiguous, whereas the FAT file system cannot guarantee contiguous sectors.
  • On mechanical hard disks, the read and write performance of the block file system is 10.1 to 22.95 times faster.
  • On solid-state drives, the read performance of the block file system is 6.22 to 26.93 times faster and its write performance is 4.86 to 17.85 times faster.
  • The above performance comparison analyzes a single 256KB blockchain block stored on a mechanical hard disk and on a solid-state drive, yielding quantitative comparison data.
  • In actual operation, the total time saved by accelerating the storage and lookup of thousands of blocks can still be quantified from this data.
  • the embodiments of the present invention further provide an electronic device, and an electronic device described below and a blockchain data storage method described above can refer to each other correspondingly.
  • the electronic device includes:
  • memory 332 for storing computer programs
  • the processor 322 is configured to implement the steps of the blockchain data storage method of the above method embodiments when executing the computer program.
  • FIG. 11 is a schematic diagram of a specific structure of an electronic device provided in this embodiment.
  • the electronic device may vary greatly with configuration or performance, and may include one or more processors (central processing units, CPU) 322 and a memory 332 that stores one or more computer applications 342 or data 344.
  • the memory 332 may be transient storage or persistent storage.
  • the programs stored in memory 332 may include one or more modules (not shown), each of which may include a series of instructions to operate on a data processing device.
  • the central processing unit 322 may be configured to communicate with the memory 332 to execute a series of instruction operations in the memory 332 on the electronic device 301 .
  • Electronic device 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input and output interfaces 358, and/or, one or more operating systems 341.
  • the steps in the blockchain data storage method described above can be implemented by the structure of the electronic device.
  • the embodiments of the present invention further provide a readable storage medium, and a readable storage medium described below and a blockchain data storage method described above can be referred to each other correspondingly.
  • the readable storage medium may specifically be a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or any other readable storage medium capable of storing program code.

Abstract

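A blockchain data storage method, system, device, and readable storage medium. The method includes: a block file system obtains the target block serial number and target block content of a target block, wherein the block file system includes a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping between block serial numbers and cluster addresses; a target cluster address is sequentially allocated to the target block, and the target mapping between the target block serial number and the target cluster address is recorded in the directory area; and the target block content is written sequentially into the data area according to the target cluster address. In this method, once the target cluster address has been determined while storing a block, the block file system can write the target block content directly and sequentially into the data area. Because both the directory and the data are written sequentially, a large amount of lookup time and data write time is saved, which speeds up data writing in the blockchain.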

Description

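Blockchain data storage method, system, device, and readable storage medium
This application claims priority to Chinese patent application No. 202011229981.9, entitled "Blockchain data storage method, system, device, and readable storage medium", filed with the China Patent Office on November 6, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of blockchain technology, and in particular to a blockchain data storage method, system, device, and readable storage medium.
Background
Blockchain storage technology essentially manages each blockchain storage system (from the simplest desktop computer to a mid-range or high-end server) as the smallest unit of storage. The aim is to use the properties of the blockchain, combined with distributed storage, to provide confidentiality, fault-tolerant data backup, and the like at the protocol layer or the software application layer.
When improving blockchain storage, current computer systems must consider not only data storage in the software-layer file system and the storage device controller, but also program execution and compatibility with earlier systems. This constrains the development of new techniques, such as storage optimization, for blockchain storage. As a result, under growing data volumes and the demand for fast storage, current blockchain storage technology still suffers from slow data storage.
In summary, how to effectively solve problems such as slow data storage in a blockchain is a technical problem that those skilled in the art urgently need to solve.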
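Summary of the Invention
The object of the present invention is to provide a blockchain data storage method, system, device, and readable storage medium that speed up data storage in a blockchain.
To solve the above technical problem, the present invention provides the following technical solutions:
A blockchain data storage method, including:
a block file system obtaining the target block serial number and target block content of a target block, wherein the block file system includes a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping between block serial numbers and cluster addresses;
sequentially allocating a target cluster address to the target block, and recording, in the directory area, the target mapping between the target block serial number and the target cluster address;
writing the target block content sequentially into the data area according to the target cluster address.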
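Preferably, the method further includes:
receiving a read request for reading the target block;
querying the directory area for the target cluster address corresponding to the target block serial number;
reading the target block content from the data area using the target cluster address;
outputting the target block content.
Preferably, the method further includes:
reading the directory in the directory area and writing the directory into a cache to obtain a cache directory;
correspondingly, recording, in the directory area, the target mapping between the target block serial number and the target cluster address includes:
recording the target mapping in the cache directory.
Preferably, the method further includes:
receiving a read request for reading the target block;
querying the cache directory for the target cluster address corresponding to the target block serial number;
reading the target block content from the data area using the target cluster address;
outputting the target block content.
Preferably, the method further includes:
reading the directory in the directory area and copying the directory into a CAM to obtain a CAM directory;
correspondingly, recording, in the directory area, the target mapping between the target block serial number and the target cluster address includes:
recording the target mapping in the CAM directory.
Preferably, the method further includes:
receiving a read request for reading the target block;
querying the CAM directory for the target cluster address corresponding to the target block serial number;
reading the target block content from the data area using the target cluster address;
outputting the target block content.
Preferably, writing the target block content sequentially into the data area includes:
writing the target block content into the contiguous hard disk sectors corresponding to the data area.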
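A block file system, including:
a pending block acquisition module, configured to obtain the target block serial number and target block content of a target block, wherein the block file system includes a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping between block serial numbers and cluster addresses;
a mapping recording module, configured to sequentially allocate a target cluster address to the target block and to record, in the directory area, the target mapping between the target block serial number and the target cluster address;
a data write module, configured to write the target block content sequentially into the data area according to the target cluster address.
An electronic device, including:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the above blockchain data storage method when executing the computer program.
A readable storage medium on which a computer program is stored, the computer program implementing the steps of the above blockchain data storage method when executed by a processor.
With the method provided by the embodiments of the present invention, the block file system obtains the target block serial number and target block content of the target block, wherein the block file system includes a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping between block serial numbers and cluster addresses; a target cluster address is sequentially allocated to the target block, and the target mapping between the target block serial number and the target cluster address is recorded in the directory area; and the target block content is written sequentially into the data area according to the target cluster address.
In this method, the block file system relies on the fact that each block of the blockchain has a unique serial number (the block serial number) and a fixed size, and sets the cluster size of the data area to the block size. The mapping recorded in the directory area is therefore the mapping between block serial number and cluster address. That is, blocks correspond to clusters, and the directory area uses the block serial number directly as the cluster index. The result is a simple, clear blockchain directory structure with a direct mapping to the data area, in which one block corresponds to one cluster. When storing a block, once the target cluster address has been determined, the target block content can be written directly and sequentially into the data area. Because the directory is simple and data can be written sequentially, a large amount of lookup time and data write time is saved, which speeds up data writing in the blockchain.
Correspondingly, the embodiments of the present invention also provide a block file system, a device, and a readable storage medium corresponding to the above blockchain data storage method. They have the technical effects described above, which are not repeated here.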
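Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention or in the related art more clearly, the drawings needed for describing the embodiments or the related art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is an implementation flowchart of a blockchain data storage method in an embodiment of the present invention;
FIG. 2 is a schematic diagram of the format of a block file system in an embodiment of the present invention;
FIG. 3 is a schematic diagram of the format of a FAT file system;
FIG. 4 is a schematic diagram of data access in an embodiment of the present invention;
FIG. 5 is a schematic diagram of data access based on a cache directory in an embodiment of the present invention;
FIG. 6 is a schematic diagram of data access based on a CAM directory in an embodiment of the present invention;
FIG. 7 is a schematic diagram of a CAM directory in an embodiment of the present invention;
FIG. 8 is another schematic diagram of data access based on a CAM directory in an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a block file system in an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an electronic device in an embodiment of the present invention;
FIG. 11 is a schematic diagram of the specific structure of an electronic device in an embodiment of the present invention.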
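Detailed Description
To help those skilled in the art better understand the solution of the present invention, the present invention is described in further detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
To make the technical solutions provided by the embodiments easier to understand, the relevant technical terms are briefly explained below:
Blockchain (Block Chain): a chain of text records (also called blocks) that are linked together and whose contents are protected by cryptography.
The InterPlanetary File System (IPFS) is a distributed file system that attempts to connect all computing devices with the same file system. The IPFS protocol combines the strengths of blockchain technology and various network protocols to store immutable data, remove duplicate files on the network, and obtain the address information of storage nodes in order to search for files on the network. IPFS splits a file into fixed-size blocks; each block is stored in the distributed file system as a unique serial number (CID/ID) index together with the block content. The distributed file system reads all blocks by CID and restores the original file.
File system (File System): a module of current operating systems that is responsible for managing and storing data in the computer system in the form of files.
File allocation table (FAT, File Allocation Table): a file system invented by Microsoft, which holds some patents on it, and used by MS-DOS and all Windows systems.
Ext2: the most traditional Linux disk file system, followed by Ext3/4 and others. Its basic principle is the same as that of FAT, but it can be regarded as an improvement on FAT.
Associative memory (CAM, Content Addressable Memory), also called content-addressable memory, is a special type of computer memory used in certain very high-speed search applications. In addition to storing and retrieving data like ordinary memory, a CAM has data comparators and an address encoder, which give it high-speed data search capability. A common application is searching the contents of specific network packets in semiconductor integrated circuits; with current semiconductor technology, lookup performance can exceed 200 M/s (200 million lookups per second). When the CAM is built on static memory (SRAM), the latency of each lookup can be as low as 5 ns; when larger storage requirements lead the CAM to be built on dynamic memory (DRAM), the latency of each lookup is roughly 100 to 150 ns.
System call (System Call): one of a set of functions provided by the kernel. System calls are implemented in the kernel and then exposed to users in a defined way. They are the interface through which user programs interact with the kernel.
Device driver (Device Driver): a special program that enables the computer and a device to communicate with each other. It acts as the interface to the hardware; the operating system can control the device only through this interface.
Logical block address (LBA, Logical Block Address): a general mechanism used on computer data storage devices to indicate where data is located.
Physical block address (PBA, Physical Block Address): a data sector of a mechanical hard disk or a data page of a solid-state drive. For compatibility, a software program requests a file from the system by logical block address when reading or writing; the system's file system and storage device controller translate it into a physical block address, locate the actual address of the data, and perform the read or write. This is the LBA-to-PBA translation for hard disk data sectors or solid-state drive data pages.
Sector (Sector): the smallest storage unit of a traditional mechanical hard disk. The basic unit of disk reads and writes is the sector; for a solid-state drive the basic unit is the page.
Cluster (Cluster): a group of contiguous sectors. Each cluster may contain 2, 4, 8, 16, 32, 64, ... sectors, that is, 2 to the power n sectors. The cluster is the smallest unit in which a file system operates on files.
File directory table (FDT, File Directory Table).
Linked list (linked list): a data structure whose physical storage units are non-contiguous and non-sequential; the logical order of the data elements is given by the order in which the pointers in the list are linked. A linked list consists of a series of nodes (each element of the list is called a node), and nodes can be created dynamically at run time. Each node has two parts: a data field that stores the data element and a pointer field that stores the address of the next node. A linked list is more complex to operate on than a sequential (array-based) list. Because it does not have to be stored in order, a linked list can insert in O(1), much faster than a sequential list, but finding a node or accessing the node with a given index takes O(n), whereas for a sequential list the corresponding costs are O(log n) (searching, when the list is sorted) and O(1) (access by index).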
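Please refer to FIG. 1 and FIG. 4. FIG. 1 is a flowchart of a blockchain data storage method in an embodiment of the present invention, and FIG. 4 is a schematic diagram of data access in an embodiment of the present invention.
As FIG. 4 shows, writing and reading a blockchain block passes through the software program, the system call, and the file system before reaching the physical hard disk. The file system (together with the storage device controller) plays a key role: it translates the logical block addresses used by the software program and the operating system kernel layer into physical block addresses on the hard disk, and the hard disk is then actually read or written to obtain the block content for the requested file (one file corresponds to one or more blocks). The method can be applied to the block file system in the data access architecture shown in FIG. 4 and includes the following steps:
S101. The block file system obtains the target block serial number and target block content of the target block.
The block file system includes a directory area and a data area; the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping between block serial numbers and cluster addresses.
The target block may be any block to be stored.
For the format of the block file system, refer to FIG. 2, a schematic diagram of the format of a block file system in an embodiment of the present invention. As can be seen, the directory area stores the mapping between CIDs and cluster addresses, and each cluster in the data area is the same size as a block. That is, the block file system has the following characteristics:
the cluster size in the block file system equals the block size, so block data is found faster;
the block file system simplifies the directory structure and its mapping to the data area;
each serial number (CID) in the directory area represents one block;
each cluster address in the directory area = the cluster address in the data area.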
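To make the characteristics of the block file system provided by the embodiments easier to understand, compare FIG. 3, a schematic diagram of the format of a FAT file system. As shown in FIG. 3, assuming the hard disk space is laid out linearly, the simplified FAT file system consists, from left to right, of:
Boot sector: located at the very beginning; it mainly records important information for system boot and for the file system;
FAT1/FAT2: two copies of the file allocation table, kept for redundancy; the FAT indicates how clusters are stored;
Root directory area: the directory table that stores file and directory information;
Data area: the area where data is actually stored.
Compared with the FAT file system, the directory structure of the block file system in the embodiments of the present invention is simpler and clearer.
The following example illustrates how directory simplicity and block size relate to actual data reads and writes.
Suppose the file system needs to read or write the file named FILE1. It first searches the root directory area and finds from the FDT (file directory table) that FILE1 starts at cluster 3. The linked list in FAT1 then gives the next cluster of FILE1, and so on until the final cluster; the following clusters are 5, 7, and 8. In other words, FILE1 needs 4 clusters of storage space. Assuming each hard disk sector is 512 bytes and the file system uses 8 sectors per cluster, each cluster is 4096 bytes = 4 KB, so FILE1 occupies 4 clusters, 16 KB in total, in the data area. Reading or writing a file in this file system therefore requires repeatedly spending time searching the root directory area and FAT1 to access the data, a process that involves coordinated work among the software program, the file system, the hardware controller, and the hard disk.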
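The FILE1 example can be traced with a short sketch. This is a simplified model for illustration only, not the on-disk FAT format: the dictionaries `fdt` and `fat`, the end-of-chain marker `EOC`, and the function `cluster_chain` are hypothetical names introduced here.

```python
# Simplified, hypothetical model of the FILE1 example (not the real FAT layout).
EOC = -1                      # assumed end-of-chain marker
SECTOR_BYTES = 512            # sector size from the example
SECTORS_PER_CLUSTER = 8       # cluster size from the example
CLUSTER_BYTES = SECTOR_BYTES * SECTORS_PER_CLUSTER  # 4096 bytes = 4 KB


def cluster_chain(fdt, fat, name):
    """Return the clusters occupied by `name`, in file order."""
    chain = []
    cluster = fdt[name]       # lookup 1: root directory (FDT) gives the first cluster
    while cluster != EOC:     # then one FAT lookup per cluster in the chain
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


fdt = {"FILE1": 3}                    # FILE1 starts at cluster 3
fat = {3: 5, 5: 7, 7: 8, 8: EOC}      # FAT1 linked list: 3 -> 5 -> 7 -> 8
chain = cluster_chain(fdt, fat, "FILE1")
total_bytes = len(chain) * CLUSTER_BYTES
print(chain, total_bytes)             # [3, 5, 7, 8] 16384
```

Each iteration of the loop stands for one more lookup in FAT1, which is the repeated searching the paragraph above describes.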
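In contrast, the block file system in the embodiments of the present invention uses the block size as the cluster size, so one block corresponds to one cluster, and the directory can be searched by block serial number, which eliminates the lookup process. In particular, writing the directory and the data sequentially (see S102 and S103) speeds up data writes.
S102. Sequentially allocate a target cluster address to the target block, and record, in the directory area, the target mapping between the target block serial number and the target cluster address.
After the target block serial number and target block content of the target block are obtained, a target cluster address can be sequentially allocated to the target block.
Sequential allocation of cluster addresses may specifically work as follows: after the target serial number and target block content are obtained, the target block serial number is written sequentially into the directory area according to the cluster addresses of the directory area and the data area (each cluster address in the directory area maps to one in the data area). That is, the first blockchain block received is assigned cluster address 0 and written to hard disk cluster address 0, the second is assigned cluster address 1 and written to hard disk cluster address 1, and so on, each block being written to the next available space. In contrast to commonly used file systems, this embodiment does not need to search for free space on the hard disk, and sequential writing saves disk seek time.
Note that in this embodiment only one cluster address needs to be allocated when sequentially allocating a target cluster address to the target block; that is, one cluster corresponds to one block.
Once the target cluster address has been allocated, the mapping between the address and the target block is known, namely the mapping between the target block serial number and the target cluster address. For ease of distinction, this embodiment calls the mapping between the target block serial number and the target cluster address the target mapping.
Once the target mapping is known, it can be stored in the directory area. In this embodiment, therefore, the directory area only needs to store the mapping between the target block serial number and the target cluster address, and the storage location of the block content can be found directly from the block serial number.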
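S103. Write the target block content sequentially into the data area according to the target cluster address.
According to the target cluster address, the target block content is written into the data area in a sequential manner.
The sequential writing of data follows the same pattern as the sequential writing of block serial numbers described above and is not repeated here.
Because one block corresponds to one cluster, writing the target block content into the data area writes it into contiguous disk space, which improves data read and write speed. That is, writing the target block content sequentially into the data area includes writing the target block content into the contiguous hard disk sectors corresponding to the data area.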
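Steps S101 to S103 can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the class `BlockFileSystem`, its attribute names, and the use of an `io.BytesIO` buffer in place of a hard disk are all hypothetical, and the 256 KB block size is borrowed from the performance comparison later in the description. Zero-padding short content and ignoring repeated serial numbers are simplifications of this sketch.

```python
import io

BLOCK_SIZE = 256 * 1024  # cluster size == block size (256 KB, as in the later comparison)


class BlockFileSystem:
    """Toy model: a directory area (serial number -> cluster) plus a data area."""

    def __init__(self, data_area):
        self.directory = {}         # directory area: block serial number -> cluster address
        self.data_area = data_area  # any seekable binary file object stands in for the disk
        self.next_cluster = 0       # next cluster to hand out; it only ever grows

    def write_block(self, cid, content):
        if len(content) > BLOCK_SIZE:
            raise ValueError("content does not fit in one cluster")
        cluster = self.next_cluster          # S102: sequential allocation, no free-space search
        self.next_cluster += 1
        self.directory[cid] = cluster        # S102: record the target mapping
        self.data_area.seek(cluster * BLOCK_SIZE)
        # S103: one contiguous write of exactly one cluster (zero-padded in this sketch)
        self.data_area.write(content.ljust(BLOCK_SIZE, b"\0"))
        return cluster


fs = BlockFileSystem(io.BytesIO())
first = fs.write_block("Qm-a", b"block A")   # hypothetical serial numbers
second = fs.write_block("Qm-b", b"block B")
print(first, second, fs.directory)
```

The first block lands in cluster 0 and the second in cluster 1, so the byte offset of any block is simply its cluster address multiplied by the block size.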
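With the method provided by the embodiments of the present invention, the block file system obtains the target block serial number and target block content of the target block, wherein the block file system includes a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping between block serial numbers and cluster addresses; a target cluster address is sequentially allocated to the target block, and the target mapping between the target block serial number and the target cluster address is recorded in the directory area; and the target block content is written sequentially into the data area according to the target cluster address.
In this method, the block file system relies on the fact that each block of the blockchain has a unique serial number (the block serial number) and a fixed size, and sets the cluster size of the data area to the block size. The mapping recorded in the directory area is therefore the mapping between block serial number and cluster address. That is, blocks correspond to clusters, and the directory area uses the block serial number directly as the cluster index. The result is a simple, clear blockchain directory structure with a direct mapping to the data area, in which one block corresponds to one cluster. When storing a block, once the target cluster address has been determined, the target block content can be written directly and sequentially into the data area. Because the directory is simple and data can be written sequentially, a large amount of lookup time and data write time is saved, which speeds up data writing in the blockchain.
It should be noted that, based on the above embodiment, the embodiments of the present invention also provide corresponding improvements. Steps in the preferred/improved embodiments that are the same as or correspond to steps in the above embodiment may be cross-referenced, as may the corresponding beneficial effects; they are not repeated in the preferred/improved embodiments herein.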
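In a specific embodiment of the present invention, a processing flow for quickly reading the target block is also proposed. The implementation includes:
Step 1. Receive a read request for reading the target block;
Step 2. Query the directory area for the target cluster address corresponding to the target block serial number;
Step 3. Read the target block content from the data area using the target cluster address;
Step 4. Output the target block content.
For ease of description, the four steps above are explained together below.
Because the directory area records the target mapping between the target block serial number and the target cluster address where the target block content is stored, the target cluster address corresponding to the target block serial number can be queried directly in the directory area after a read request for the target block is received. Once the target cluster address is obtained, the target block content can be read directly from the data area and then output. In other words, after a request to read the target block is received, it is only necessary to look up the target cluster address mapped to the target block in the directory area; the block content can then be fetched from the hard disk data area and returned.
Compared with commonly used file systems (FAT, for example), this saves the time spent searching the directory and waiting for reads.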
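The four read steps can be sketched as follows. This is a minimal illustration, assuming a plain dictionary for the directory area and an `io.BytesIO` buffer for the data area; the function `read_block` and the sample serial numbers are hypothetical.

```python
import io

BLOCK_SIZE = 256 * 1024  # cluster size == block size (assumed 256 KB)


def read_block(directory, data_area, cid):
    """Steps 2 to 4: one directory lookup, one seek, one cluster-sized read."""
    cluster = directory[cid]               # step 2: serial number -> cluster address
    data_area.seek(cluster * BLOCK_SIZE)   # step 3: plain address arithmetic, no chain to follow
    return data_area.read(BLOCK_SIZE)      # step 4: hand back the block content


# A tiny data area holding two clusters, and the matching directory.
data_area = io.BytesIO(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
directory = {"Qm-a": 0, "Qm-b": 1}
content = read_block(directory, data_area, "Qm-b")
print(len(content), content[:4])
```

Unlike the FAT case, the number of lookups does not grow with the size of the stored item, because each block occupies exactly one cluster.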
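In a specific embodiment of the present invention, a cache directory can also be used to increase data access speed. The implementation includes: reading the directory in the directory area and writing it into a cache to obtain a cache directory; correspondingly, recording, in the directory area, the target mapping between the target block serial number and the target cluster address includes recording the target mapping in the cache directory. Correspondingly, the data read process may specifically include:
Step 1. Receive a read request for reading the target block;
Step 2. Query the cache directory for the target cluster address corresponding to the target block serial number;
Step 3. Read the target block content from the data area using the target cluster address;
Step 4. Output the target block content.
For ease of description, data storage and data reading based on the cache directory are explained together below.
Please refer to FIG. 5, a schematic diagram of data access based on a cache directory in an embodiment of the present invention.
Specifically, when the computer system is powered on, the hard disk directory table (the directory table stored in the directory area) may be copied fully or partially into the cache directory. If the cache is large enough, or is used together with system volatile memory, the entire directory may be copied into the cache directory; alternatively, only the directory entries for hot data may be copied.
Writing a blockchain block: after the target block serial number and target block content are obtained, the cache directory is updated first (sequential writes may likewise be used for the update), and then the target block serial number and target block content are written sequentially into the hard disk directory area and data area according to the available hard disk space. Further, the hard disk directory need not be updated synchronously each time the cache directory is updated; it may be updated only when, for example, the system shuts down, is about to lose power, or is currently idle (that is, there are no requests to write or read blockchain blocks). Note that when a cache directory is used to speed up data access, the cache directory and the hard disk directory area must be kept in sync.
Reading a blockchain block: once the target block serial number to be read is determined, it is only necessary to look up the target cluster address mapped to it in the cache directory, after which the target block content can be fetched from the hard disk and output. Because lookups in the cache directory are faster than in the hard disk directory, read speed improves.
Thus, for data storage and reading as a whole, using a cache directory improves the overall performance of writing and reading blockchain blocks.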
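The deferred directory update described above resembles a write-back cache. The sketch below illustrates that idea under stated assumptions: the class `CachedDirectory`, its `dirty` set, and the `flush` method are hypothetical names, and a plain dictionary stands in for the on-disk directory area.

```python
class CachedDirectory:
    """Toy write-back cache in front of the on-disk directory area."""

    def __init__(self, disk_directory):
        self.disk = disk_directory         # stands in for the directory area on the hard disk
        self.cache = dict(disk_directory)  # full copy made at power-on (could be hot entries only)
        self.dirty = set()                 # serial numbers not yet written back to disk

    def record(self, cid, cluster):
        """Record a target mapping in the cache directory only."""
        self.cache[cid] = cluster
        self.dirty.add(cid)

    def lookup(self, cid):
        """Reads are served from the cache directory."""
        return self.cache[cid]

    def flush(self):
        """Write pending entries back, e.g. at shutdown, before power loss, or when idle."""
        for cid in self.dirty:
            self.disk[cid] = self.cache[cid]
        self.dirty.clear()


disk = {"Qm-a": 0}                 # hypothetical existing directory entry
directory = CachedDirectory(disk)
directory.record("Qm-b", 1)
before_flush = dict(disk)          # the disk directory has not changed yet
directory.flush()
print(before_flush, disk)
```

Until `flush` runs, the new mapping exists only in the cache, which is why the description stresses that the cache directory and the hard disk directory area must be kept in sync.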
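In a specific embodiment of the present invention, a CAM directory can also be used to speed up the storage and reading of blockchain data. The implementation includes: reading the directory in the directory area and copying it into a CAM to obtain a CAM directory; correspondingly, recording, in the directory area, the target mapping between the target block serial number and the target cluster address includes recording the target mapping in the CAM directory. Correspondingly, the data read process includes:
Step 1. Receive a read request for reading the target block;
Step 2. Query the CAM directory for the target cluster address corresponding to the target block serial number;
Step 3. Read the target block content from the data area using the target cluster address;
Step 4. Output the target block content.
For ease of description, data storage and data reading based on the CAM directory are explained together below.
Please refer to FIG. 6, a schematic diagram of data access based on a CAM directory in an embodiment of the present invention.
With its structure for fast addressing by content, a CAM fits the block file directory well: in the CAM directory, the CID is the content used to quickly find an address, namely the cluster address.
For the CAM directory, refer to FIG. 7, a schematic diagram of a CAM directory in an embodiment of the present invention. The upper part of FIG. 7 is a block diagram of the CAM directory described herein, and the lower part is a schematic of a generic CAM design in a semiconductor integrated circuit. As shown in FIG. 7, the CAM includes data comparators, memory cells (CELL), and an address encoder; the example in FIG. 7 looks up the address at which the content Qm345678… is stored.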
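A software analogy may help picture the three CAM components. The sketch below is only a model: a real CAM compares all cells in parallel in hardware, whereas this loop is sequential, and the class `SoftwareCAM`, its methods, and the sample CIDs are hypothetical.

```python
class SoftwareCAM:
    """Software analogy of a CAM: search by content, get back an address."""

    def __init__(self, size):
        self.cells = [None] * size   # memory cells; the cell index plays the role of cluster address

    def store(self, address, cid):
        self.cells[address] = cid

    def search(self, query):
        if query is None:            # empty cells hold None, so never match on it
            return None
        # Data comparators: hardware checks every cell at once; a list comprehension here.
        match_lines = [cell == query for cell in self.cells]
        # Address encoder: turn the first asserted match line into an address.
        for address, hit in enumerate(match_lines):
            if hit:
                return address
        return None


cam = SoftwareCAM(4)
cam.store(0, "QmAAAA")               # hypothetical CIDs
cam.store(1, "Qm345678")
print(cam.search("Qm345678"))        # address of the cell holding that CID
```

The lookup direction is the reverse of ordinary memory: the caller supplies the content (the CID) and receives the address where it is stored.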
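Adding a CAM directory to the storage device controller makes it possible to accelerate block reads and writes at the hardware layer. When the computer system is powered on, the directory table is copied in full from the hard disk into the CAM directory of the storage device controller.
Writing a blockchain block: when the program receives the target block serial number and target block content, the CAM directory is updated first (using sequential writes), and then the target block serial number and target block content are written sequentially into the hard disk directory area and data area according to the available hard disk space. Because the storage device controller and the CAM directory are highly integrated hardware circuits, data reads and writes are faster, and updating the CAM directory has less impact on the performance of writing blockchain blocks. In practice, the hard disk directory need not be updated synchronously each time the CAM directory is updated; it may be updated only when, for example, the system shuts down, is about to lose power, or is currently idle (for example, there are no requests to write or read blockchain blocks).
It should be noted that, however the directory is updated, the CAM directory and the hard disk directory area must be kept in sync.
Reading a blockchain block: once the target block serial number to be read is determined, it is only necessary to look up the target cluster address mapped to it in the CAM directory, after which the block content can be fetched from the hard disk and returned. Because lookups in the CAM directory are faster than in the hard disk directory, and also faster than in the cache directory, read speed is improved to the greatest extent.
In this embodiment, CAM memory may be used, or dynamic memory (DRAM, Dynamic Random Access Memory) or static memory (SRAM, Static Random Access Memory) may be used.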
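Further, refer to FIG. 8, which shows a system in an embodiment of the present invention that pairs the block file system at the software kernel layer with a storage device controller plus CAM directory. The storage device controller plus CAM directory can be designed and manufactured with current semiconductor integrated circuit technology and is used together with the corresponding software and driver. Each storage device controller plus CAM directory can access multiple hard disks, and at the system level multiple storage device controllers plus CAM directories can also be deployed.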
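Corresponding to the above method embodiments, the embodiments of the present invention also provide a block file system. The block file system described below and the blockchain data storage method described above may be cross-referenced.
As shown in FIG. 9, the system includes the following modules:
a pending block acquisition module 101, configured to obtain the target block serial number and target block content of a target block, wherein the block file system includes a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping between block serial numbers and cluster addresses;
a mapping recording module 102, configured to sequentially allocate a target cluster address to the target block and to record, in the directory area, the target mapping between the target block serial number and the target cluster address;
a data write module 103, configured to write the target block content sequentially into the data area according to the target cluster address.
With the system provided by the embodiments of the present invention, the block file system obtains the target block serial number and target block content of the target block, wherein the block file system includes a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping between block serial numbers and cluster addresses; a target cluster address is sequentially allocated to the target block, and the target mapping between the target block serial number and the target cluster address is recorded in the directory area; and the target block content is written sequentially into the data area according to the target cluster address.
In this system, the block file system relies on the fact that each block of the blockchain has a unique serial number (the block serial number) and a fixed size, and sets the cluster size of the data area to the block size. The mapping recorded in the directory area is therefore the mapping between block serial number and cluster address. That is, blocks correspond to clusters, and the directory area uses the block serial number directly as the cluster index. The result is a simple, clear blockchain directory structure with a direct mapping to the data area, in which one block corresponds to one cluster. When storing a block, once the target cluster address has been determined, the system can write the target block content directly and sequentially into the data area. Because the directory is simple and data can be written sequentially, a large amount of lookup time and data write time is saved, which speeds up data writing in the blockchain.
In a specific embodiment of the present invention, the system further includes a data read module, configured to receive a read request for reading the target block; query the directory area for the target cluster address corresponding to the target block serial number; read the target block content from the data area using the target cluster address; and output the target block content.
In a specific embodiment of the present invention, the system further includes:
a cache acceleration module, configured to read the directory in the directory area and write it into a cache to obtain a cache directory;
correspondingly, the mapping recording module 102 is specifically configured to record the target mapping in the cache directory.
In a specific embodiment of the present invention, the system further includes a data read module, configured to receive a read request for reading the target block; query the cache directory for the target cluster address corresponding to the target block serial number; read the target block content from the data area using the target cluster address; and output the target block content.
In a specific embodiment of the present invention, the system further includes a CAM acceleration module, configured to read the directory in the directory area and copy it into a CAM to obtain a CAM directory;
correspondingly, the mapping recording module 102 is specifically configured to record the target mapping in the CAM directory.
In a specific embodiment of the present invention, the system further includes a data read module, configured to receive a read request for reading the target block; query the CAM directory for the target cluster address corresponding to the target block serial number; read the target block content from the data area using the target cluster address; and output the target block content.
In a specific embodiment of the present invention, the data write module 103 is specifically configured to write the target block content into the contiguous hard disk sectors corresponding to the data area.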
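To help those skilled in the art better understand the technical effects of the blockchain data storage method and block file system provided by the embodiments of the present invention, the block file system proposed in the embodiments is described in detail below in relation to the related art.
The performance of the block file system in the embodiments of the present invention, compared with a conventional file system on a mechanical hard disk, is shown in Table 1:
Figure PCTCN2021089875-appb-000001
Table 1
The comparison conditions list only the time required in relation to the file system and the hard disk.
The FAT directory and the FAT1/2 linked lists must all be read from the hard disk;
one blockchain block is read or written, with a block size of 256 KB;
each sector of the mechanical hard disk is 4 KB;
FAT cluster sizes of 4 KB (1 sector) and 64 KB (16 sectors) are used;
the cluster size of the block file system is 256 KB (64 sectors);
the SAS disk (a SAS hard disk is a hard disk using SCSI technology; SAS stands for Serial Attached SCSI, and SCSI for Small Computer System Interface) runs at 15000 rpm, so the average hard disk access time = rotational latency + head seek time = approximately 2 + 3.5 = 5.5 ms;
for the 15000 rpm SAS disk, 4 KB/64 KB random reads and writes are assumed to reach 12 MB/s, and reads and writes of 256 KB or more, or sequential reads and writes, are assumed to reach 120 MB/s;
apart from the special case of damaged sectors, the sequential writes of the block file system keep the hard disk sectors contiguous as well, whereas the FAT file system cannot guarantee contiguous sectors, so the 4 KB/64 KB random read/write performance is used for it here.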
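The access-time figure in the conditions above can be checked with a few lines of arithmetic. This sketch only reproduces the stated assumptions (15000 rpm, 3.5 ms seek, 120 MB/s); it does not attempt to reproduce Table 1, whose contents are not available here. The helper `transfer_ms` and the convention 1 MB = 1024 KB are assumptions of this example.

```python
RPM = 15000
SEEK_MS = 3.5                          # head seek time assumed in the description

rotation_ms = 60_000 / RPM             # one full revolution: 4.0 ms
rotational_latency_ms = rotation_ms / 2  # on average half a revolution: 2.0 ms
access_ms = rotational_latency_ms + SEEK_MS  # 2 + 3.5 = 5.5 ms


def transfer_ms(size_kb, mb_per_s):
    """Time to move `size_kb` at `mb_per_s`, taking 1 MB = 1024 KB (an assumption)."""
    return size_kb / 1024 / mb_per_s * 1000


seq_256k_ms = transfer_ms(256, 120)    # one 256 KB block at the assumed 120 MB/s
print(access_ms, round(seq_256k_ms, 2))
```

Under these assumptions a single 256 KB transfer takes roughly 2 ms, which is of the same order as one average access, so the number of separate accesses has a large effect on total time.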
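The performance of the block file system in the embodiments of the present invention, compared with a conventional file system on a solid-state drive, is shown in Table 2 and Table 3:
Figure PCTCN2021089875-appb-000002
Table 2
Figure PCTCN2021089875-appb-000003
Table 3
The comparison conditions list only the time required in relation to the file system and the hard disk.
The FAT directory and the FAT1/2 linked lists must all be read from the hard disk;
one blockchain block is read or written, with a block size of 256 KB;
each sector of the disk is 4 KB;
FAT cluster sizes of 4 KB and 64 KB are used;
the cluster size of the block file system is 256 KB;
the average access time (seek time) of the solid-state drive is about 100 us;
M.2 PCIe solid-state drive: reads and writes of 256 KB or more, or sequential reads and writes, are assumed to reach 3000 MB/s or more; 4 KB random read and write are assumed to be about 50 MB/s and 150 MB/s respectively;
64 KB random read and write are assumed to be about 250 MB/s and 500 MB/s respectively;
apart from the special case of damaged sectors, the sequential writes of the block file system keep the disk sectors contiguous as well, whereas the FAT file system cannot guarantee contiguous sectors.
In summary, the block file system in the embodiments of the present invention improves performance over a conventional file system as follows:
On a mechanical hard disk: the read and write performance of the block file system is 10.1 to 22.95 times faster;
On a solid-state drive: the read performance of the block file system is 6.22 to 26.93 times faster, and its write performance is 4.86 to 17.85 times faster.
The above performance comparison analyzes a single 256 KB blockchain block stored on a mechanical hard disk and on a solid-state drive, yielding quantitative comparison data. In actual operation, the total time saved by accelerating the storage and lookup of thousands of blocks can still be quantified from this data.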
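Corresponding to the above method embodiments, the embodiments of the present invention also provide an electronic device. The electronic device described below and the blockchain data storage method described above may be cross-referenced.
As shown in FIG. 10, the electronic device includes:
a memory 332, configured to store a computer program;
a processor 322, configured to implement the steps of the blockchain data storage method of the above method embodiments when executing the computer program.
Specifically, refer to FIG. 11, a schematic diagram of the specific structure of an electronic device provided in this embodiment. The electronic device may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPU) 322 and a memory 332 that stores one or more computer applications 342 or data 344. The memory 332 may be transient storage or persistent storage. A program stored in the memory 332 may include one or more modules (not shown), each of which may include a series of instruction operations on the data processing device. Further, the central processing unit 322 may be configured to communicate with the memory 332 and to execute, on the electronic device 301, the series of instruction operations in the memory 332.
The electronic device 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341.
The steps of the blockchain data storage method described above can be implemented by the structure of the electronic device.
Corresponding to the above method embodiments, the embodiments of the present invention also provide a readable storage medium. The readable storage medium described below and the blockchain data storage method described above may be cross-referenced.
A readable storage medium stores a computer program that, when executed by a processor, implements the steps of the blockchain data storage method of the above method embodiments.
The readable storage medium may specifically be a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or any other readable storage medium capable of storing program code.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.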

Claims (10)

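  1. A blockchain data storage method, characterized by comprising:
    a block file system obtaining a target block serial number and target block content of a target block, wherein the block file system comprises a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping between block serial numbers and cluster addresses;
    sequentially allocating a target cluster address to the target block, and recording, in the directory area, a target mapping between the target block serial number and the target cluster address;
    writing the target block content sequentially into the data area according to the target cluster address.
  2. The blockchain data storage method according to claim 1, characterized by further comprising:
    receiving a read request for reading the target block;
    querying the directory area for the target cluster address corresponding to the target block serial number;
    reading the target block content from the data area using the target cluster address;
    outputting the target block content.
  3. The blockchain data storage method according to claim 1, characterized by further comprising:
    reading the directory in the directory area and writing the directory into a cache to obtain a cache directory;
    correspondingly, recording, in the directory area, the target mapping between the target block serial number and the target cluster address comprises:
    recording the target mapping in the cache directory.
  4. The blockchain data storage method according to claim 3, characterized by further comprising:
    receiving a read request for reading the target block;
    querying the cache directory for the target cluster address corresponding to the target block serial number;
    reading the target block content from the data area using the target cluster address;
    outputting the target block content.
  5. The blockchain data storage method according to claim 1, characterized by further comprising:
    reading the directory in the directory area and copying the directory into a CAM to obtain a CAM directory;
    correspondingly, recording, in the directory area, the target mapping between the target block serial number and the target cluster address comprises:
    recording the target mapping in the CAM directory.
  6. The blockchain data storage method according to claim 5, characterized by further comprising:
    receiving a read request for reading the target block;
    querying the CAM directory for the target cluster address corresponding to the target block serial number;
    reading the target block content from the data area using the target cluster address;
    outputting the target block content.
  7. The blockchain data storage method according to any one of claims 1 to 6, characterized in that writing the target block content sequentially into the data area comprises:
    writing the target block content into the contiguous hard disk sectors corresponding to the data area.
  8. A block file system, characterized by comprising:
    a pending block acquisition module, configured to obtain a target block serial number and target block content of a target block, wherein the block file system comprises a directory area and a data area, the size of each cluster in the data area is the same as the block size of the blockchain, and the directory area stores the mapping between block serial numbers and cluster addresses;
    a mapping recording module, configured to sequentially allocate a target cluster address to the target block and to record, in the directory area, a target mapping between the target block serial number and the target cluster address;
    a data write module, configured to write the target block content sequentially into the data area according to the target cluster address.
  9. An electronic device, characterized by comprising:
    a memory, configured to store a computer program;
    a processor, configured to implement the steps of the blockchain data storage method according to any one of claims 1 to 7 when executing the computer program.
  10. A readable storage medium, characterized in that a computer program is stored on the readable storage medium, and the computer program, when executed by a processor, implements the steps of the blockchain data storage method according to any one of claims 1 to 7.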
PCT/CN2021/089875 2020-11-06 2021-04-26 一种区块链数据存储方法、系统、设备及可读存储介质 WO2022095346A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/246,659 US20240045853A1 (en) 2020-11-06 2021-04-26 Blockchain data storage method, system, device, and readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011229981.9A CN112463753B (zh) 2020-11-06 2020-11-06 一种区块链数据存储方法、系统、设备及可读存储介质
CN202011229981.9 2020-11-06

Publications (1)

Publication Number Publication Date
WO2022095346A1 true WO2022095346A1 (zh) 2022-05-12

Family

ID=74825887

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/089875 WO2022095346A1 (zh) 2020-11-06 2021-04-26 一种区块链数据存储方法、系统、设备及可读存储介质

Country Status (3)

Country Link
US (1) US20240045853A1 (zh)
CN (1) CN112463753B (zh)
WO (1) WO2022095346A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114925401A (zh) * 2022-06-14 2022-08-19 北京师范大学 一种基于区块链及分布式存储的学情记录系统及方法

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112463753B (zh) * 2020-11-06 2023-01-10 苏州浪潮智能科技有限公司 一种区块链数据存储方法、系统、设备及可读存储介质
CN113420083B (zh) * 2021-06-02 2024-03-19 湖南大学 一种具有可拓展分布式账本的异构并行区块链结构的系统
CN113806803B (zh) * 2021-09-17 2023-06-02 厦门服云信息科技有限公司 一种数据存储方法、系统、终端设备及存储介质
CN116893787B (zh) * 2023-09-06 2023-12-05 四川易利数字城市科技有限公司 一种基于区块链大数据应用的磁盘存储方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008206A (zh) * 2019-03-22 2019-07-12 深圳前海微众银行股份有限公司 一种基于区块链系统的数据处理方法及装置
CN111782656A (zh) * 2020-06-30 2020-10-16 北京海益同展信息科技有限公司 数据读写方法及装置
US20200344044A1 (en) * 2019-04-25 2020-10-29 WAY2BIT Co. Ltd. Method and Device for Providing Blockchain Network Capable of Optimizing Storages of Respective Nodes Included Therein
CN112463753A (zh) * 2020-11-06 2021-03-09 苏州浪潮智能科技有限公司 一种区块链数据存储方法、系统、设备及可读存储介质

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104794024A (zh) * 2015-04-15 2015-07-22 四川神琥科技有限公司 一种数据恢复方法
CN107644056B (zh) * 2017-08-04 2021-02-12 武汉烽火众智数字技术有限责任公司 一种文件存储方法、装置及系统
CN110286859B (zh) * 2019-06-28 2020-04-14 中国海洋大学 基于fat文件系统的数据存储方法及装置

Also Published As

Publication number Publication date
CN112463753B (zh) 2023-01-10
US20240045853A1 (en) 2024-02-08
CN112463753A (zh) 2021-03-09

Similar Documents

Publication Publication Date Title
WO2022095346A1 (zh) 一种区块链数据存储方法、系统、设备及可读存储介质
EP3217294B1 (en) File access method and apparatus and storage device
US9213721B1 (en) File server system having tiered storage including solid-state drive primary storage and magnetic disk drive secondary storage
US7676628B1 (en) Methods, systems, and computer program products for providing access to shared storage by computing grids and clusters with large numbers of nodes
US8966476B2 (en) Providing object-level input/output requests between virtual machines to access a storage subsystem
JP5530863B2 (ja) ストレージシステムのためのi/o変換方法及び装置
CN114860163B (zh) 一种存储系统、内存管理方法和管理节点
US20150067001A1 (en) Cache management in a computerized system
US8694563B1 (en) Space recovery for thin-provisioned storage volumes
JP2013242908A (ja) ソリッドステートメモリ、それを含むコンピュータシステム及びその動作方法
US20130111182A1 (en) Storing a small file with a reduced storage and memory footprint
US11144508B2 (en) Region-integrated data deduplication implementing a multi-lifetime duplicate finder
WO2019127135A1 (zh) 文件页表管理技术
EP4105770A1 (en) B+ tree access method and apparatus, and computer-readable storage medium
JP2004127295A (ja) 仮想記憶システムおよびその動作方法
US10936243B2 (en) Storage system and data transfer control method
US11366609B2 (en) Technique for encoding deferred reference count increments and decrements
Son et al. Design and evaluation of a user-level file system for fast storage devices
WO2022262381A1 (zh) 一种数据压缩方法及装置
EP4307129A1 (en) Method for writing data into solid-state hard disk
US8918621B1 (en) Block address isolation for file systems
US11875152B2 (en) Methods and systems for optimizing file system usage
Zhou et al. A file system bypassing volatile main memory: Towards a single-level persistent store
US11847100B2 (en) Distributed file system servicing random-access operations
TW201610853A (zh) 用於儲存虛擬化的系統和方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21888050

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21888050

Country of ref document: EP

Kind code of ref document: A1