WO2023197937A1 - Data processing method and apparatus therefor, storage medium, and computer program product - Google Patents

Data processing method and apparatus therefor, storage medium, and computer program product

Info

Publication number
WO2023197937A1
WO2023197937A1 (PCT/CN2023/086720)
Authority
WO
WIPO (PCT)
Prior art keywords
data
node
backup
local database
write
Prior art date
Application number
PCT/CN2023/086720
Other languages
English (en)
French (fr)
Inventor
肖蓉
陈正华
屠要峰
韩银俊
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2023197937A1


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 - Error detection; Error correction; Monitoring
    • G06F 11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/14 - Error detection or correction of the data by redundancy in operation
    • G06F 11/1402 - Saving, restoring, recovering or retrying
    • G06F 11/1446 - Point-in-time backing up or restoration of persistent data
    • G06F 11/1448 - Management of the data involved in backup or backup restore
    • G06F 11/1451 - Management of the data involved in backup or backup restore by selection of backup contents
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 - Error detection; Error correction; Monitoring
    • G06F 11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/08 - Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F 11/10 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F 11/1008 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F 11/1044 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices with specific ECC/EDC distribution
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/23 - Updating
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/27 - Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 - Interfaces specially adapted for storage systems
    • G06F 3/0602 - Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/061 - Improving I/O performance

Definitions

  • The present application relates to the field of storage technology, and in particular to a data processing method and apparatus therefor, a storage medium, and a computer program product.
  • In a distributed storage system, a data redundancy mode can be used to store data and keep it safe.
  • Commonly used data redundancy modes include the replica redundancy mode and the erasure coding (EC) redundancy mode.
  • The replica redundancy mode simply stores multiple copies of the data on different nodes.
  • The EC redundancy mode divides the original data into N original data blocks, generates M check blocks from them according to the EC algorithm to form an EC stripe of N+M blocks, and stores these N+M blocks on N+M storage nodes in the cluster.
  • When the number of lost blocks is at most M, the lost blocks can be recovered from the remaining blocks in the EC stripe using the EC algorithm. Compared with the replica mode, the EC mode achieves better storage efficiency at the same redundancy ratio.
  • Embodiments of the present application provide a data processing method and apparatus therefor, a storage medium, and a computer program product.
  • Embodiments of the present application provide a data processing method, including: receiving first data sent by a data node, where the first data is data to be written to the data node; after the first data has been received, obtaining backup data corresponding to the first data according to the first data, where the backup data is data waiting to be updated; and obtaining target data according to the first data and the backup data, where the target data is used to calculate redundant data.
  • Embodiments of the present application also provide a data processing device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the above data processing method is implemented.
  • Embodiments of the present application also provide a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are used to execute the above data processing method.
  • Embodiments of the present application further provide a computer program product including a computer program or computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer program or computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the data processing method described above.
  • Figure 1 is an architectural diagram of an implementation environment of a data processing method provided by an embodiment of the present application
  • Figure 2 is a schematic diagram of a scenario in which data cannot be recovered due to concurrent updates by multiple clients involved in the embodiment of this application;
  • Figure 3 is an architectural diagram of a distributed storage system of a data processing method provided by an embodiment of the present application
  • Figure 4 is a flow chart of a data processing method provided by an embodiment of the present application.
  • Figure 5 is a flow chart of a data processing method provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of the data storage mode in the data processing method provided by an embodiment of the present application.
  • Figure 7 is a flow chart of a data processing method provided by an embodiment of the present application.
  • Figure 8 is a flow chart of a data processing method provided by an embodiment of the present application.
  • Figures 9a, 9b, 9c, and 9d are schematic diagrams of data redundancy in the data processing method provided by an embodiment of the present application.
  • Figure 10 is a schematic diagram of a data processing device provided by an embodiment of the present application.
  • Figure 1 is an architectural diagram of an implementation environment of a data processing method provided by an embodiment of the present application.
  • the implementation environment includes a computing device 101, a storage device 102 and a client 103.
  • the computing device 101 may be a server or a storage array controller, or the like.
  • The storage device 102 may be a solid-state drive (SSD), a hard disk drive (HDD), or the like.
  • the client 103 can be any data access device, such as an application server, a host or a terminal.
  • the computing device 101 and the storage device 102 can provide data storage services for the client 103.
  • the client 103 can provide data to be stored to the computing device 101, and the computing device 101 can obtain the data from the client 103 and store the data in the storage device 102.
  • The client 103 can send an input/output (I/O) request to the computing device 101.
  • The I/O request carries the data to be stored; the computing device 101 receives the I/O request from the client 103, obtains the data from the request, and stores the data.
  • storage device 102 may be provided to users as a cloud storage service.
  • the storage device 102 may run in a cloud environment, such as a public cloud, a private cloud, or a hybrid cloud.
  • the user can use the terminal 103 to apply for a certain amount of storage space in the cloud storage service.
  • the computing device 101 may allocate storage space of a corresponding capacity to the user, for example, allocate one or more storage devices 102 to the user, thereby storing data in the allocated storage space.
  • the storage device 102 may be provided as an object storage service, a cloud hard disk, a cloud database, or the like.
  • The numbers of computing devices 101, storage devices 102, and clients 103 in the above implementation environment may be larger or smaller. For example, there may be only one computing device 101, or dozens, hundreds, or more, in which case the implementation environment also includes other computing devices 101.
  • The number of storage devices 102 connected to each computing device 101 may be greater than or equal to N+M, where N is the number of data blocks and M is the number of check blocks, and each block is placed on a corresponding node.
  • The node holding a data block is a data node, and the node holding a check block is a check node.
  • Multiple computing devices 101 and multiple storage devices 102 can form a storage cluster and jointly provide storage services through coordinated operation.
  • Erasure Code is a technology for redundant storage of data.
  • The original data is encoded by the erasure coding algorithm to obtain redundant check blocks, and the data blocks and check blocks are stored separately on different storage nodes.
  • In one embodiment, the data to be stored is divided into N data blocks, EC encoding is performed on them with a redundancy algorithm, and M check blocks are generated.
  • The N data blocks and the M check blocks together form an EC stripe.
  • Each data block or check block can be called a block of the EC stripe, and each block can be distributed to a different storage node for storage.
  • Each EC stripe can tolerate the loss of at most M blocks.
  • If any storage nodes fail, as long as the number of failed nodes does not exceed M, the blocks stored on the failed nodes can be recovered from the blocks on the non-faulty nodes; a distributed storage system using EC technology therefore has higher security and reliability.
  • For an EC stripe, at most M blocks may be lost when there are no updates; update operations can reduce recoverability. When any data block in the stripe is updated, that data block and all check blocks need to be updated.
  • Taking M=2 as an example: if, while data block D1 is being updated to D1', the check block on the first check node is updated according to D1' but the second check node misses D1', the redundancy of D1' drops. If the data node holding D1' and the first check node then fail at the same time, the lost block D1' cannot be recovered through the EC algorithm. Concurrent updates of multiple data blocks may reduce recoverability further.
  • The check blocks can be updated in one of two ways: the full-stripe (reconstruct-write) update, or the incremental update.
  • The full-stripe update recomputes the M check blocks from the current versions of the N data blocks in the EC stripe.
  • That is, the system must obtain the data blocks updated by the write operation as well as the data blocks not involved in the write request (also called the old data blocks), and recompute the M check blocks from all N data blocks. Concurrent updates may modify several of the N data blocks; the update operations must then be serialized, and every data block update requires an EC computation to keep the stripe consistent. In this case, update performance degrades greatly.
  • The incremental update computes a difference block from the current version of the data block involved in the write request (i.e., the data after the write completes) and the old data, and computes the new check block from the difference block and the old data on the check block.
  • Concurrent update operations under this method can make data hard to recover.
  • Figure 2 shows a scenario in which data cannot be recovered because of concurrent updates by multiple clients. As shown, client 1 and client 2 update data block 1 and data block 2 at the same time; the differences between the updated and old data are denoted Diff D1 and Diff D2. Because check blocks R1 and R2 must also be updated as the data blocks change, each check node needs to obtain both Diff D1 and Diff D2.
  • Under concurrency, the check nodes may receive Diff D1 and Diff D2 in different orders: check block R1 obtains Diff D1 first and then Diff D2, so R1 is updated first with Diff D1 and then, on top of that result, with Diff D2, while check block R2 is updated in the opposite order, first with Diff D2 and then, on top of that result, with Diff D1. Since the update involves some nonlinear operations, the two orders produce different results on R1 and R2, which makes it difficult to restore the pre-update data from the check blocks.
  • Embodiments of the present application therefore provide a data processing method: after obtaining first data to be written, the data node first sends it to the check node; after receiving the first data, the check node obtains the backup data corresponding to it (the data waiting to be updated), derives target data from the first data and the backup data, and stores the target data as its updated data, which can be used to calculate redundant data.
  • With this method, in scenarios where data nodes and check nodes must update data, the check node is guaranteed to hold backup data before performing the update, and the target data used to update the check node is derived from the backup data and the newly written first data. Because the check node stores backup data, in systems with frequent concurrent updates fetching the backup data takes less time than waiting for EC computations to execute in sequence, so the data processing method provided by the embodiments of the present application has lower latency.
  • Meanwhile, both the check node and the data node can immediately serve the most recently updated data, giving the system better read performance, and EC stripes remain recoverable during updates, improving the overall security of data storage.
  • Figure 3 provides an application scenario of the data processing method involved in the embodiment of the present application.
  • the distributed block storage system architecture includes a metadata service (MetaData Service, MDS) 201, a client interface (Client Interface, CLI) 202 and multiple block storage services (Chunk Storage Daemon, CSD) 203.
  • the metadata service 201 is mainly used for cluster configuration and metadata management.
  • the client interface 202 provides an access interface for block storage, which can be in user mode or kernel mode.
  • the block storage service 203 provides the actual block data storage function.
  • Based on this storage system architecture, a virtual block device interface is provided to the application layer.
  • The application layer sees a unified virtual data volume.
  • The virtual data volume is divided into multiple data blocks stored across different block storage services 203, and each data block corresponds to at least one physical CSD device.
  • A CSD device here may be one or more of the storage devices in the embodiment of Figure 1.
  • A chunk group (ChunkGroup, CG) implements the group mapping, from which the physical storage location of a data block can be computed.
  • The physical storage of a data block is actually handled by a group of CSDs, which store multiple copies of the block.
  • Figure 4 is a flow chart of a data processing method provided by an embodiment of the present application.
  • the data processing method may include but is not limited to step S100, step S200 and step S300.
  • Step S100: Receive first data sent by a data node, where the first data is data to be written to the data node.
  • When a client needs to write new data, it calculates, from the routing configuration, the chunk group CG to which the current write request belongs and the data nodes in that CG, and sends a new-write request to the corresponding data node.
  • When the data node receives the new-write request, it forwards the request via routing to all check nodes in the same EC stripe; the request carries the newly written first data.
  • Step S200: After the first data has been received, obtain backup data corresponding to the first data according to the first data, where the backup data is data waiting to be updated.
  • In one embodiment, step S200 is described further. As shown in Figure 5, step S200 may include, but is not limited to, the following steps.
  • Step S210: Perform a query on the local database according to the first data, to determine whether the local database contains backup data corresponding to the first data.
  • The backup data is data waiting to be updated, which can be understood as the original data before the update.
  • The check node first queries whether its local database contains backup data corresponding to the first data in order to know the backup status of the original data in its own local database and to ensure that, before updating, it holds at least one copy of the pre-update original data; the check data is then updated in one pass after all the update data has been received, which guarantees update consistency.
  • In one embodiment, the query can be implemented through the database key.
  • Each data block stored in the database has a unique key, and the keys of the first data and its backup data are related.
  • Their low-order bits are identical and their high-order bits differ; during query processing, a lookup on the low-order bits retrieves the two related entries, and comparing the high-order bits then determines whether they stand in the backup-data/first-data relationship.
  • In some embodiments, multiple clients concurrently update data blocks stored on multiple data nodes.
  • In this case, the data processing method provided by the embodiments of the present application can still be executed for each data node separately: the check node receives the first data sent by each data node and performs the subsequent update only after the first data from all data nodes has been received. The method can therefore resolve inconsistent check-data updates under concurrent updates from multiple clients.
  • If the query result is yes, that is, the local database contains backup data corresponding to the first data, proceed to step S211.
  • Step S211: Obtain the backup data from the local database.
  • If the local database already holds backup data corresponding to the first data, the backup data is obtained directly, without requesting it from the data node.
  • The benefit is that under frequent writes, as long as the backup data exists, it need not be read from the data node every time, which saves network overhead and improves storage update performance.
  • If the query result is no, that is, the local database does not contain backup data corresponding to the first data, proceed to step S212.
  • Step S212: Send a data acquisition request for the backup data to the data node, and receive the backup data the data node sends in response.
  • If the local database does not hold the backup data, a data acquisition request must be sent to the data node to obtain it.
  • Step S300: Obtain target data according to the first data and the backup data, where the target data is used to calculate redundant data.
  • The check node writes the first data to local storage and merges the first data with the backup data to obtain the target data.
  • The target data must also be written to the check node. After writing the target data, the check node notifies the data node that its write has completed, and the data node then writes the first data.
  • The data processing method provided by the embodiments of the present application sacrifices part of the check node's storage space to keep the original version of the updated data as backup data; the check node can therefore decide on its own when to compute redundant data. Before the redundancy computation, a data block exists as copies on 1 data node and M check nodes, equivalent to M+1 redundancy. After the computation, the blocks are stored as an EC stripe, and the redundancy is still M+1.
  • The redundancy computation can be triggered when the check node has received the updated data for all data blocks of the entire EC stripe, or when the check node's storage capacity reaches a threshold.
  • Figure 6 is a schematic diagram of the data storage modes in the data processing method provided by one embodiment of the present application, showing the method applied across two different data storage modes.
  • The figure shows the transition from the replica storage mode for hot data to the erasure-coded (EC) storage mode for cold data.
  • Taking N=3, M=2 as an example, hot data is stored in replica mode: the three data nodes 301 store data block a, data block b, and data block c respectively.
  • Each of the two check nodes 302 stores data blocks a, b, and c.
  • The data processing method provided by the embodiments of the present application applies this replica storage mode on the check nodes, ensuring that a check node holds complete backup data.
  • When the hot data meets preset conditions, i.e., qualifies as cold data, the data on the check nodes is treated as cold data, and the EC storage mode is used for it: the data stored on the check node is run through the erasure coding algorithm to compute redundancy and obtain new check data blocks, corresponding to the target data used for redundancy computation in the embodiments of the present application. The combination of the hot-data replica mode and the cold-data EC mode is thus one application scenario of the embodiments of the present application.
  • Hot data is generally online data that computing nodes access frequently.
  • Cold data is generally offline data that is rarely accessed, such as enterprise backups, business and operation logs, call records, and statistics.
  • When the amount of data stored in the local database reaches a preset threshold, the data pool is considered full and must be processed: EC computation can start on the data stored on the current node, and more storage space becomes available after the EC computation.
  • If the data stored in the local database has not been accessed within a preset access window, it can likewise be considered cold, and EC computation is performed.
  • Any preset condition may be used to decide whether data has turned from hot to cold.
  • That is, any preset condition can serve as the trigger for running the erasure coding algorithm's redundancy computation over the data stored in the local database.
  • Figures 7 and 8 illustrate two writing and updating situations in detail.
  • FIG. 7 is a flow chart of a data processing method provided by an embodiment of the present application.
  • the data processing method may include but is not limited to steps S701 to S713.
  • Step S701: Send a first data update request.
  • When a client needs to write new data, it calculates, from the routing configuration, the chunk group CG to which the current write request belongs and the data nodes in that CG, and sends a new-write request to the corresponding data node.
  • When the data node receives the new-write request, it forwards the request via routing to all check nodes; the request carries the newly written first data.
  • Step S702: Forward the first data update request.
  • After receiving the first data update request from the client, the data node forwards it to all check nodes; the forwarded request also carries the first data.
  • Step S703: Create the data node's write-ahead log.
  • After forwarding the first data update request, the data node creates a write-ahead log (WAL) on its own node.
  • Step S704: Create the check node's write-ahead log.
  • After receiving the first data update request, the check node likewise creates a write-ahead log (WAL) on its own node.
  • Step S705: Query whether backup data is present.
  • To guarantee that the pre-update original data, i.e., the backup data, is still retained after the update, the check node queries whether the local database on its own node contains the backup data.
  • The backup data is the data waiting to be updated, i.e., the original data corresponding to the first data.
  • Step S706: Obtain the backup data.
  • When the query shows that the local database already contains the backup data, the check node obtains it directly from the local database for the write and update in the subsequent steps.
  • Step S707: Merge the first data with the backup data to obtain the target data.
  • Since a data block stored on the data node may be 1 MB while an individual write or update is often only 4 KB or 8 KB, the newly written first data must be merged with the backup data to obtain a 1 MB target data block for storage.
  • Step S708: Write the target data.
  • The check node writes the merged target data to the local database, completing the check node's write and update.
  • Step S709: Send a write-completion notification.
  • Because the data node does not perform its own write until the check node's write has completed, the check node sends a write-completion notification to the data node, telling it to proceed with its write.
  • Step S710: Write the first data.
  • After receiving the write-completion notification from the check node, the data node writes the first data into its own local database, completing the data node's write and update.
  • Step S711: Send a write-completion notification.
  • The data node sends a write-completion notification to the client, informing it that the data node has completed the write and update.
  • Step S712: Delete the data node's write-ahead log.
  • The data node deletes the WAL on its own node, indicating that all of the data node's writes have completed.
  • Step S713: Send a WAL-deletion notification.
  • The data node sends the WAL-deletion notification to the check node; on receiving it, the check node learns that all of the data node's writes have completed.
  • For cases involving concurrent updates from multiple data nodes, the data processing method provided by the embodiments of the present application still applies: only after the check node has received WAL-deletion notifications from all data nodes does it consider the updates of all data nodes complete and perform the check node's redundancy computation over all the updated data.
  • With this data processing method, in scenarios where data nodes and check nodes must update data, the check node is guaranteed to hold backup data before performing the update, and the target data used to update the check node is derived from the backup data and the newly written first data. Because the check node stores backup data, the EC stripe remains recoverable during updates, improving the security of data storage.
  • FIG. 8 is a flow chart of a data processing method provided by an embodiment of the present application.
  • the data processing method may include but is not limited to steps S801 to S815.
  • Step S801: Send a first data update request.
  • When a client needs to write new data, it calculates, from the routing configuration, the chunk group CG to which the current write request belongs and the data nodes in that CG, and sends a new-write request to the corresponding data node.
  • When the data node receives the new-write request, it forwards the request via routing to all check nodes; the request carries the newly written first data.
  • Step S802: Forward the first data update request.
  • After receiving the first data update request from the client, the data node forwards it to all check nodes; the forwarded request also carries the first data.
  • Step S803: Create the data node's write-ahead log.
  • After forwarding the first data update request, the data node creates a write-ahead log (WAL) on its own node.
  • Step S804: Create the check node's write-ahead log.
  • After receiving the first data update request, the check node likewise creates a write-ahead log (WAL) on its own node.
  • Step S805: Query whether backup data is present.
  • To guarantee that the pre-update original data, i.e., the backup data, is still retained after the update, the check node queries whether the local database on its own node contains the backup data.
  • The backup data is the data waiting to be updated, i.e., the original data corresponding to the first data.
  • Step S806: Send a backup data acquisition request.
  • When the check node's query shows that its own local database does not contain the backup data, it must request the backup data from the data node.
  • Notably, the check node needs to fetch the backup data from the data node only once, because it stores the fetched backup data in its local database.
  • The benefit is that under frequent writes, as long as the backup data exists, it need not be read from the data node every time, which saves network overhead and improves storage update performance.
  • Step S807: Send the backup data.
  • After receiving the backup data acquisition request from the check node, the data node sends the backup data to the check node.
  • Step S808: Write the backup data.
  • The check node stores the fetched backup data in its local database, i.e., writes the backup data.
  • Step S809: Merge the first data with the backup data to obtain the target data.
  • Since a data block stored on the data node may be 1 MB while an individual write or update is often only 4 KB or 8 KB, the newly written first data must be merged with the backup data to obtain a 1 MB target data block for storage.
  • Step S810: Write the target data.
  • The check node writes the merged target data to the local database, completing the check node's write and update.
  • Step S811: Send a write-completion notification.
  • Because the data node does not perform its own write until the check node's write has completed, the check node sends a write-completion notification to the data node, telling it to proceed with its write.
  • Step S812: Write the first data.
  • After receiving the write-completion notification from the check node, the data node writes the first data into its own local database, completing the data node's write and update.
  • Step S813: Send a write-completion notification.
  • The data node sends a write-completion notification to the client, informing it that the data node has completed the write and update.
  • Step S814: Delete the data node's write-ahead log.
  • The data node deletes the WAL on its own node, indicating that all of the data node's writes have completed.
  • Step S815: Send a WAL-deletion notification.
  • The data node sends the WAL-deletion notification to the check node; on receiving it, the check node learns that all of the data node's writes have completed.
  • After the check node finishes writing the target data, the erasure coding algorithm is applied to the data stored in the check node's local database to compute redundancy and obtain check data blocks; an erasure code stripe is then formed from the data stored in the local database and the check data blocks, and the stripe is stored in the local database.
  • With this data processing method, in scenarios where data nodes and check nodes must update data, the check node is guaranteed to hold backup data before performing the update, and the target data used to update the check node is derived from the backup data and the newly written first data. Because the check node stores backup data, the EC stripe remains recoverable during updates, improving the security of data storage.
  • Referring to Figures 9a-9d, one embodiment of the present application further provides a schematic view of the data processing method, showing the data redundancy of data nodes and check nodes under concurrent updates.
  • The embodiment corresponding to Figure 9a includes a first data node 911, a second data node 912, a third data node 913, a first check node 921, and a second check node 922.
  • The three data nodes store data blocks D1, D2, and D3 respectively, and the two check nodes store copies of the data held by the corresponding data nodes, so the data redundancy is 3.
  • As shown in Figure 9b, EC computation over the current data of the data nodes produces check data blocks P1 and P2, which are stored on the two check nodes respectively.
  • When two clients, Client1 and Client2, both update data blocks D1 and D2, the updated blocks D1' and D2' are produced, and the updated blocks arrive at the two check nodes in different orders.
  • As shown in Figure 9c, check node 921 backs up D1' first and then D2', whereas check node 922 backs up D2' first and then D1'.
  • Check node 921 has performed the EC computation and obtained the updated check block P1', while check node 922 has not yet been updated, as shown in Figure 9d.
  • At this point, data block D1' has 2 copies plus the check block P1'; data block D2' likewise has 2 copies plus P1'; and data block D3 has one copy, one P1' (encoded from D1', D2', and D3), and one P2 (encoded from D1, D2, and D3). The redundancy of the three blocks D1', D2', and D3 therefore remains 3, so the data can be recovered even if any 2 nodes lose their data or fail.
  • Compared with related techniques that must serialize multiple data block updates, the data block updates in the data processing method provided by the embodiments of the present application need not execute sequentially, and the check nodes need not update in any fixed order; the data redundancy remains unchanged throughout the update and EC computation, ensuring recoverability.
  • Referring to Figure 10, an embodiment of the present application further provides a data processing device; the data processing device 400 includes a memory 410, a processor 420, and a computer program stored on the memory 410 and runnable on the processor 420.
  • Processor 420 and memory 410 may be connected via bus 430 or other means.
  • the memory 410 can be used to store non-transitory software programs and non-transitory computer executable programs.
  • memory 410 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device.
  • the memory 410 may include memory located remotely relative to the processor 420, and these remote memories may be connected to the processor 420 through a network. Examples of the above-mentioned networks include but are not limited to the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
  • It should be noted that the data processing device 400 in this embodiment can be used to implement, or can form part of, the implementation environment in the embodiment shown in Figure 1.
  • These embodiments belong to the same inventive concept and therefore share the same implementation principles and technical effects, which are not detailed again here.
  • The non-transitory software programs and instructions required to implement the data processing method of the above embodiments are stored in the memory 410.
  • When executed by the processor 420, they perform the data processing method of the above embodiments, for example, method steps S100 to S300 in Figure 4, S210 to S212 in Figure 5, S701 to S713 in Figure 7, and S801 to S815 in Figure 8.
  • An embodiment of the present application also provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor or controller, for example by a processor in the above network element embodiment, cause the processor to perform the data processing method of the above embodiments, for example, method steps S100 to S300 in Figure 4, S210 to S212 in Figure 5, S701 to S713 in Figure 7, and S801 to S815 in Figure 8.
  • Embodiments of the present application also provide a computer program product, including a computer program or computer instructions.
  • The computer program or computer instructions are stored in a computer-readable storage medium.
  • A processor of a computer device reads the computer program or computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the data processing method described above, for example, method steps S100 to S300 in Figure 4, S210 to S212 in Figure 5, S701 to S713 in Figure 7, and S801 to S815 in Figure 8.
  • Embodiments of the present application provide a data processing method and apparatus therefor, a storage medium, and a computer program product, which can effectively improve the consistency and recoverability of EC stripes under concurrent data updates.
  • Embodiments of this application include: after obtaining first data to be written, the data node first sends the first data to the check node; after receiving it, the check node obtains the backup data corresponding to the first data (the data waiting to be updated), derives target data from the first data and the backup data, and stores the target data as its updated data, usable for computing redundant data.
  • In scenarios where data nodes and check nodes must update data, the check node is thus guaranteed to hold backup data before updating, and the target data used to update the check node is derived from the backup data and the newly written first data. Because the check node stores backup data, in systems with frequent concurrent updates fetching it takes less time than waiting for serialized EC computation, so the data processing method provided by the embodiments has lower latency.
  • Meanwhile, both the check node and the data node can immediately serve the most recently updated data, giving the system better read performance.
  • EC stripes remain recoverable during updates, which improves the overall security of data storage.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disk (DVD) or other optical disk storage, magnetic cassettes, tapes, disk storage or other magnetic storage devices, or may Any other medium used to store the desired information and that can be accessed by a computer.
  • Communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application provides a data processing method, an apparatus, a computer storage medium, and a computer program product. The data processing method includes: receiving first data sent by a data node, the first data being data to be written to the data node (S100); after the first data has been received, obtaining backup data corresponding to the first data according to the first data, the backup data being data waiting to be updated (S200); and obtaining target data according to the first data and the backup data, the target data being used to calculate redundant data (S300).

Description

Data processing method and apparatus therefor, storage medium, and computer program product
Cross-reference to related applications
This application is based on and claims priority to Chinese patent application No. 202210395783.2, filed on April 15, 2022, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of storage technology, and in particular to a data processing method and apparatus therefor, a storage medium, and a computer program product.
Background
In a distributed storage system, a data redundancy mode can be used to store data and ensure its security. Commonly used data redundancy modes include the replica redundancy mode and the erasure coding (EC) redundancy mode. The replica redundancy mode simply stores multiple copies of the data on different nodes. The EC redundancy mode divides the original data into N original data blocks, generates M check blocks from them according to the EC algorithm, forms an EC stripe of N+M blocks, and stores these N+M blocks on N+M storage nodes in the cluster. When the number of lost blocks is at most M, the lost blocks can be recovered from the remaining blocks in the stripe using the EC algorithm. Compared with the replica mode, the EC mode achieves better storage efficiency at the same redundancy ratio.
However, in the EC redundancy mode, when any data block in an EC stripe is updated, that block and all check blocks in the stripe should be updated synchronously to keep the data consistent. If data consistency is broken during an update, for example by a network or node failure, the recoverability of the EC stripe is reduced. Taking M=2 as an example: if, while data block D1 is being updated to D1', the check block on the first check node is updated according to D1' but the second check node misses D1', the redundancy of D1' drops; if the data node holding D1' and the first check node then fail at the same time, the lost block D1' cannot be recovered through the EC algorithm. This problem becomes more pronounced in distributed systems with highly concurrent updates. Common data processing methods fall into two classes. The first converts concurrent updates into sequential updates through locking at the primary node or at each node, which lowers update performance. The second backs up the update data through extra logs or caches; this preserves recoverability and update performance, but reads must merge the original data with the update data in the log or cache, which lowers read performance. How to balance read/write performance and recoverability effectively in a distributed EC system with highly concurrent updates is therefore a pressing problem.
Summary
The following is an overview of the subject matter described in detail herein; this overview is not intended to limit the scope of protection of the claims.
Embodiments of the present application provide a data processing method and apparatus therefor, a storage medium, and a computer program product.
In a first aspect, embodiments of the present application provide a data processing method, including: receiving first data sent by a data node, the first data being data to be written to the data node; after the first data has been received, obtaining backup data corresponding to the first data according to the first data, the backup data being data waiting to be updated; and obtaining target data according to the first data and the backup data, the target data being used to calculate redundant data.
In a second aspect, embodiments of the present application further provide a data processing device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the above data processing method is implemented.
In a third aspect, embodiments of the present application further provide a computer-readable storage medium storing computer-executable instructions for executing the above data processing method.
In a fourth aspect, embodiments of the present application further provide a computer program product including a computer program or computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer program or computer instructions from the storage medium and executes them, causing the computer device to perform the data processing method described above.
Other features and advantages of the present application will be set forth in the following description, and in part will become apparent from the description or be understood by practicing the present application. The objectives and other advantages of the present application can be realized and obtained through the structures particularly pointed out in the description, the claims, and the drawings.
Brief description of the drawings
The drawings are provided for a further understanding of the technical solution of the present application and constitute a part of the description; together with the embodiments of the present application, they serve to explain the technical solution and do not limit it.
Figure 1 is an architectural diagram of an implementation environment of a data processing method provided by an embodiment of the present application;
Figure 2 is a schematic diagram of a scenario, involved in embodiments of the present application, in which data cannot be recovered because of concurrent updates by multiple clients;
Figure 3 is an architectural diagram of a distributed storage system for a data processing method provided by an embodiment of the present application;
Figure 4 is a flowchart of a data processing method provided by one embodiment of the present application;
Figure 5 is a flowchart of a data processing method provided by one embodiment of the present application;
Figure 6 is a schematic diagram of data storage modes in a data processing method provided by one embodiment of the present application;
Figure 7 is a flowchart of a data processing method provided by one embodiment of the present application;
Figure 8 is a flowchart of a data processing method provided by one embodiment of the present application;
Figures 9a, 9b, 9c, and 9d are schematic diagrams of data redundancy in a data processing method provided by one embodiment of the present application;
Figure 10 is a schematic diagram of a data processing device provided by one embodiment of the present application.
Detailed description
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is further described in detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here are only intended to explain the present application and are not intended to limit it.
It should be noted that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that in the flowcharts. The terms "first", "second", and the like in the description, the claims, and the drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence.
Figure 1 is an architectural diagram of an implementation environment of a data processing method provided by an embodiment of the present application. The implementation environment includes a computing device 101, a storage device 102, and a client 103.
In one embodiment, the computing device 101 may be a server, a storage array controller, or the like. The storage device 102 may be a solid-state drive (SSD), a hard disk drive (HDD), or the like. The client 103 may be any data access device, such as an application server, a host, or a terminal.
The computing device 101 and the storage device 102 can provide data storage services for the client 103. In one embodiment, the client 103 provides data to be stored to the computing device 101, which obtains the data from the client 103 and stores it in the storage device 102. In one embodiment, the client 103 sends an input/output (I/O) request carrying the data to be stored to the computing device 101; the computing device 101 receives the I/O request from the client 103, obtains the data from the request, and stores it.
In some embodiments, the storage device 102 may be provided to users as a cloud storage service. In one embodiment, the storage device 102 may run in a cloud environment, such as a public cloud, a private cloud, or a hybrid cloud. A user can use the terminal 103 to request a certain amount of storage space from the cloud storage service, and the computing device 101 allocates storage space of the corresponding capacity to the user, for example by allocating one or more storage devices 102, so that data is stored in the allocated space. As an example, the storage device 102 may be provided as an object storage service, a cloud disk, a cloud database, or the like.
Those skilled in the art will appreciate that the numbers of computing devices 101, storage devices 102, and clients 103 in the above implementation environment may be larger or smaller. For example, there may be only one computing device 101, or dozens, hundreds, or more, in which case the implementation environment also includes other computing devices 101.
The number of storage devices 102 connected to each computing device 101 may be greater than or equal to N+M, where N is the number of data blocks and M is the number of check blocks. Each block is placed on a corresponding node: the node holding a data block is a data node, and the node holding a check block is a check node. Multiple computing devices 101 and multiple storage devices 102 can form a storage cluster that jointly provides storage services through coordinated operation.
Erasure coding (Erasure Code, EC) is a technology for storing data redundantly: the original data is encoded by the erasure coding algorithm to obtain redundant check blocks, and the data blocks and check blocks are stored on different storage nodes. In one embodiment, the data to be stored is split into N data blocks, EC encoding is performed on them with a redundancy algorithm to generate M check blocks, and the N data blocks and M check blocks form an EC stripe; each data block or check block can be called a block of the EC stripe, and each block can be distributed to a different storage node for storage. Each EC stripe tolerates the loss of at most M blocks: once any storage nodes fail, as long as no more than M of them fail, the blocks stored on the failed nodes can be recovered from the blocks on the non-faulty nodes, so a distributed storage system using EC technology has high security and reliability. A simplified sketch of this split/encode/recover flow follows.
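As a hedged illustration of this flow (not code from the application), the Python sketch below uses a single XOR parity block, i.e., the M=1 case; real deployments use Reed-Solomon-style codes so that any M of the N+M blocks may be lost, but the structure of striping, encoding, and recovery is the same. The names and the choice N=3 are assumptions for illustration.

```python
# Simplified EC striping with one XOR parity block (M=1). A real EC
# system would use a Reed-Solomon-style code for M > 1; this sketch
# only illustrates the split/encode/recover structure.

N = 3  # data blocks per stripe (illustrative)

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def encode_stripe(data: bytes):
    """Split data into N equal blocks and append one parity block."""
    size = -(-len(data) // N)                      # ceiling division
    padded = data.ljust(N * size, b"\x00")
    blocks = [padded[i * size:(i + 1) * size] for i in range(N)]
    return blocks + [xor_blocks(blocks)]           # N data + 1 parity

def recover(stripe, lost: int):
    """Rebuild the block at index `lost` from the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost]
    return xor_blocks(survivors)

stripe = encode_stripe(b"hello erasure coding")
assert recover(stripe, lost=1) == stripe[1]        # lost block rebuilt
```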
For an EC stripe, at most M blocks may be lost when there are no update operations; update operations can reduce its recoverability. In one embodiment, when any data block in the stripe needs to be updated, that block and all check blocks must be updated. Taking M=2 as an example: if, while data block D1 is being updated to D1', the check block on the first check node is updated according to D1' but the second check node misses D1', the redundancy of D1' drops; if the data node holding D1' and the first check node then fail at the same time, the lost block D1' cannot be recovered through the EC algorithm. Concurrent updates of multiple data blocks can reduce recoverability further.
In an erasure-coded storage system, the check blocks can be updated in one of two ways: the full-stripe (reconstruct-write) update, or the incremental update.
For the reconstruct-write method, taking an N+M erasure-coded system as an example, the full-stripe update recomputes the M check blocks from the current versions of the N data blocks in the EC stripe. The system must obtain the data blocks updated by the write operation as well as the data blocks not involved in the write request (also called the old data blocks), and recompute the M check blocks from all N data blocks of the stripe. Concurrent updates may modify several of the N data blocks; the update operations must then be serialized, and every data block update requires an EC computation to keep the stripe consistent, so update performance degrades greatly.
The incremental update method computes a difference block from the current version of the data block involved in the write request (i.e., the data after the write completes) and the old data, and computes the new check block from the difference block and the old data on the check block. Concurrent updates under this method can make data hard to recover. Figure 2 shows a scenario in which data cannot be recovered because of concurrent updates by multiple clients. As shown, client 1 and client 2 update data block 1 and data block 2 at the same time; the differences between the updated and old data are denoted Diff D1 and Diff D2. Because check blocks R1 and R2 must also be updated as the data blocks change, each check node must obtain both Diff D1 and Diff D2 under the incremental method. Under concurrency, however, the check nodes may receive Diff D1 and Diff D2 in different orders: check block R1 obtains Diff D1 first and then Diff D2, so R1 is updated first with Diff D1 and then, on top of that result, with Diff D2, while check block R2 is updated in the opposite order, first with Diff D2 and then, on top of that result, with Diff D1. Since the update involves some nonlinear operations, the two orders produce different results on R1 and R2, making it difficult to restore the pre-update data from the check blocks; the small sketch below makes this order sensitivity concrete.
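The update rule in the sketch is an arbitrary non-commutative stand-in, not the parity arithmetic of any particular code; it only demonstrates that when the check-block update does not commute, two check nodes that receive Diff D1 and Diff D2 in opposite orders end up with different check values.

```python
# Toy model of the Figure 2 failure mode: a non-commutative check-block
# update applied in different diff orders yields different results.

def apply_diff(check: int, diff: int) -> int:
    # Arbitrary update with a nonlinear step; any non-commutative
    # combine function shows the same divergence.
    return (check * 31 + diff) % (1 << 32)

R_old = 0x1234
diff_d1, diff_d2 = 0x0F, 0xF0

r1 = apply_diff(apply_diff(R_old, diff_d1), diff_d2)  # R1: D1 then D2
r2 = apply_diff(apply_diff(R_old, diff_d2), diff_d1)  # R2: D2 then D1
assert r1 != r2   # the two check blocks now disagree
```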
On this basis, embodiments of the present application provide a data processing method, including: after obtaining first data to be written, the data node first sends the first data to the check node; after receiving it, the check node obtains backup data corresponding to the first data (the data waiting to be updated), derives target data from the first data and the backup data, and stores the target data as the check node's updated data, which can be used to calculate redundant data. With this method, in scenarios where data nodes and check nodes must update data, the check node is guaranteed to hold backup data before performing the update and derives the target data from the backup data and the newly written first data. Because the check node stores backup data, in systems with frequent concurrent updates fetching the backup data takes less time than waiting for EC computations to execute in sequence, so the method has lower latency; meanwhile, when a client issues a read request, both the check node and the data node can immediately serve the most recently updated data, giving better read performance, and the EC stripe remains recoverable during updates, improving the overall security of data storage.
Figure 3 shows an application scenario of the data processing method involved in embodiments of the present application. As shown, the distributed block storage architecture includes a metadata service (MetaData Service, MDS) 201, a client interface (Client Interface, CLI) 202, and multiple chunk storage daemons (Chunk Storage Daemon, CSD) 203. The metadata service 201 is mainly used for cluster configuration and metadata management; the client interface 202 provides the access interface for block storage, in either user mode or kernel mode; and the chunk storage daemons 203 provide the actual block data storage.
On top of this architecture, a virtual block device interface is provided to the application layer, which sees a unified virtual data volume. The virtual data volume is split into multiple data blocks scattered across different chunk storage daemons 203, and each data block corresponds to at least one physical CSD device; a CSD device here may be one or more of the storage devices in the embodiment of Figure 1. A chunk group (ChunkGroup, CG) implements the group mapping, from which the physical storage location of a data block can be computed; the physical storage of a data block is actually handled by a group of CSDs that store multiple copies of the block.
As shown in Figure 4, Figure 4 is a flowchart of a data processing method provided by one embodiment of the present application. In the embodiment of Figure 4, the method may include, but is not limited to, steps S100, S200, and S300.
Step S100: Receive first data sent by a data node, the first data being data to be written to the data node.
It should be noted that when a client needs to write new data, it calculates, from the routing configuration, the chunk group CG to which the current write request belongs and the data nodes in that CG, and sends a new-write request to the corresponding data node. When the data node receives the request, it forwards it via routing to all check nodes in the same EC stripe; the request carries the newly written first data.
Step S200: After the first data has been received, obtain backup data corresponding to the first data according to the first data, the backup data being data waiting to be updated.
In one embodiment, step S200 is described further. As shown in Figure 5, step S200 may include, but is not limited to, the following steps.
Step S210: Perform a query on the local database according to the first data, to determine whether the local database contains backup data corresponding to the first data.
It should be noted that the backup data is data waiting to be updated, which can be understood as the original data before the update. The check node first queries whether its local database contains backup data corresponding to the first data in order to know the backup status of the original data in its own local database and to ensure that, before updating, it holds at least one copy of the pre-update original data; together with updating the check data in one pass after all update data has been received, this guarantees update consistency.
In one embodiment, the query can be implemented through the database key. Each data block stored in the database has a unique key, and the keys of the first data and its backup data are related: their low-order bits are identical and their high-order bits differ. During query processing, a lookup on the low-order bits retrieves the two related entries, and comparing the high-order bits then determines whether they stand in the backup-data/first-data relationship, as sketched below.
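A minimal sketch of such a key scheme follows, under the assumption that a key is a fixed-width integer whose low-order bits carry the logical block identifier and whose high-order bits tag the role (newly written first data versus its backup). The field width and tag values are illustrative, not values from the application.

```python
# Key layout (assumed): [ role tag | logical block id ]. The first data
# and its backup share the low-order bits and differ in the high bits.

LOW_BITS = 48                       # width of the block-id field
ROLE_FIRST, ROLE_BACKUP = 0, 1      # high-order tag values (assumed)

def make_key(block_id: int, role: int) -> int:
    return (role << LOW_BITS) | (block_id & ((1 << LOW_BITS) - 1))

def find_backup(db: dict, first_key: int):
    """Look up the entry sharing the low bits but tagged as backup."""
    block_id = first_key & ((1 << LOW_BITS) - 1)
    return db.get(make_key(block_id, ROLE_BACKUP))  # None: fetch remotely

db = {make_key(7, ROLE_BACKUP): b"old block contents"}
assert find_backup(db, make_key(7, ROLE_FIRST)) == b"old block contents"
```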
It should be noted that, in some embodiments, multiple clients concurrently update data blocks stored on multiple data nodes. In this case, the data processing method provided by embodiments of the present application can still be executed for each data node separately: the check node receives the first data sent by each data node and performs the subsequent update only after the first data from all data nodes has been received. The method can therefore resolve inconsistent check-data updates under concurrent updates from multiple clients.
If the query result is yes, that is, the local database contains backup data corresponding to the first data, proceed to step S211.
Step S211: Obtain the backup data from the local database.
If the local database already holds backup data corresponding to the first data, the backup data is obtained directly, without requesting it from the data node. The benefit is that under frequent writes, as long as the backup data exists, it need not be read from the data node every time, which saves network overhead and improves storage update performance.
If the query result is no, that is, the local database does not contain backup data corresponding to the first data, proceed to step S212.
Step S212: Send a data acquisition request for the backup data to the data node, and receive the backup data the data node sends in response.
If the local database does not hold backup data corresponding to the first data, a data acquisition request must be sent to the data node to obtain it.
Step S300: Obtain target data according to the first data and the backup data, the target data being used to calculate redundant data.
The check node writes the first data to local storage and merges the first data with the backup data to obtain the target data, which must also be written to the check node. After writing the target data, the check node notifies the data node that its write has completed, and the data node then writes the first data, with the ordering sketched below.
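The write ordering can be sketched as follows, with the network reduced to direct method calls. Class and method names are illustrative, and the check node's backup lookup and merge are abstracted into a single store operation; this is a sketch of the ordering only, not a real RPC layer.

```python
# Ordering sketch: the data node writes locally only after every check
# node has written its target data and acknowledged.

class SketchCheckNode:
    def __init__(self):
        self.store = {}

    def on_first_data(self, block_id, first_data):
        # Stands in for: query/fetch backup, merge, write target data.
        self.store[block_id] = first_data
        return "ack"                           # write-completion notice

class SketchDataNode:
    def __init__(self, check_nodes):
        self.check_nodes = check_nodes
        self.store = {}

    def handle_write(self, block_id, first_data):
        acks = [c.on_first_data(block_id, first_data)
                for c in self.check_nodes]
        assert all(a == "ack" for a in acks)   # all check nodes first
        self.store[block_id] = first_data      # only then write locally
        return "ack"                           # completion to the client

node = SketchDataNode([SketchCheckNode(), SketchCheckNode()])
assert node.handle_write("D1", b"new data") == "ack"
```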
The data processing method provided by embodiments of the present application sacrifices part of the check node's storage space to keep the original version of the updated data as backup data, so the check node can decide on its own when to compute redundant data. Before the redundancy computation, a data block exists as copies on 1 data node and M check nodes, equivalent to M+1 redundancy; after the computation, the blocks are stored as an EC stripe and the redundancy is still M+1. The redundancy computation can be triggered when the check node has received the updated data for all data blocks of the entire EC stripe, or when the check node's storage capacity reaches a threshold.
Figure 6 is a schematic diagram of the data storage modes in the data processing method provided by one embodiment of the present application, showing the method applied across two different storage modes.
The figure shows the transition from the replica storage mode for hot data to the erasure-coded (EC) storage mode for cold data. Taking N=3, M=2 as an example, hot data is stored in replica mode: the three data nodes 301 store data block a, data block b, and data block c respectively, and each of the two check nodes 302 stores data blocks a, b, and c. The data processing method provided by embodiments of the present application applies this replica mode on the check nodes, ensuring that a check node holds complete backup data. When the hot data meets preset conditions, i.e., qualifies as cold data, the data on the check nodes is treated as cold data and switched to the EC storage mode: the data stored on the check node is run through the erasure coding algorithm to compute redundancy and obtain new check data blocks, corresponding to the target data used for redundancy computation in embodiments of the present application. The combination of the hot-data replica mode and the cold-data EC mode is thus one application scenario of embodiments of the present application.
Those skilled in the art know that hot data is generally online data that computing nodes access frequently, while cold data is generally offline data that is rarely accessed, such as enterprise backups, business and operation logs, call records, and statistics.
In one embodiment, when the amount of data stored in the local database reaches a preset threshold, the data pool is considered full and must be processed: EC computation can start on the data stored on the current node, and more storage space becomes available after the EC computation.
In one embodiment, if data stored in the local database has not been accessed within a preset access window, the data stored on the current node can likewise be considered cold, and EC computation is performed.
It can be understood that any preset condition may be used to decide whether data has turned from hot to cold; that is, any preset condition can serve as the trigger for running the erasure coding algorithm's redundancy computation over the data stored in the local database. A sketch of such a trigger check follows.
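This hedged sketch checks the two trigger conditions named above; the default threshold and idle window are illustrative placeholders, not values from the application.

```python
import time

def should_run_ec(db_bytes: int, last_access: float,
                  max_bytes: int = 64 << 20,     # assumed 64 MB pool
                  idle_seconds: float = 3600.0   # assumed 1 h window
                  ) -> bool:
    """True when either preset condition marks the data as cold."""
    over_capacity = db_bytes >= max_bytes
    gone_cold = (time.time() - last_access) >= idle_seconds
    return over_capacity or gone_cold
```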
To show the full flow of a write and update, Figures 7 and 8 detail two write-update cases.
Figure 7 is a flowchart of a data processing method provided by one embodiment of the present application. In the embodiment of Figure 7, the method may include, but is not limited to, steps S701 to S713.
Step S701: Send a first data update request.
When a client needs to write new data, it calculates, from the routing configuration, the chunk group CG to which the current write request belongs and the data nodes in that CG, and sends a new-write request to the corresponding data node. When the data node receives the request, it forwards it via routing to all check nodes; the request carries the newly written first data.
Step S702: Forward the first data update request.
After receiving the first data update request from the client, the data node forwards it to all check nodes; the forwarded request also carries the first data.
Step S703: Create the data node's write-ahead log.
After forwarding the first data update request, the data node creates a write-ahead log (Write-Ahead Log, WAL) on its own node.
Those skilled in the art know that the central idea of a WAL is that modifications to data files may only happen after those modifications have been logged, that is, after the log records describing the changes have been flushed to permanent storage. Data pages then need not be flushed to disk on every transaction commit, because the log can be used to recover the database after a crash; and since the log is small and written sequentially, it is more efficient. A minimal sketch follows.
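The sketch below is a minimal WAL under assumed conventions (one JSON entry per node, file naming illustrative): it records the intended update durably before the database write, deletes the record once the write is applied, and exposes any surviving record for crash recovery.

```python
import json
import os

class WriteAheadLog:
    def __init__(self, path="node.wal"):
        self.path = path

    def create(self, entry: dict):
        """Persist the log record before the data write (step S703)."""
        with open(self.path, "w") as f:
            json.dump(entry, f)
            f.flush()
            os.fsync(f.fileno())       # on disk before touching data

    def delete(self):
        """Drop the record once the update is applied (step S712)."""
        if os.path.exists(self.path):
            os.remove(self.path)

    def pending(self):
        """Return the surviving record, if any, for crash recovery."""
        if os.path.exists(self.path):
            with open(self.path) as f:
                return json.load(f)
        return None
```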
Step S704: Create the check node's write-ahead log.
After receiving the first data update request, the check node likewise creates a write-ahead log (Write-Ahead Log, WAL) on its own node.
Step S705: Query whether backup data is present.
To guarantee that the pre-update original data, i.e., the backup data, is still retained after the update, the check node queries whether the local database on its own node contains the backup data, where the backup data is the data waiting to be updated, i.e., the original data corresponding to the first data.
Step S706: Obtain the backup data.
When the query shows that the local database already contains the backup data, the check node obtains it directly from the local database for the write and update in the subsequent steps.
Step S707: Merge the first data with the backup data to obtain the target data.
It should be noted that since a data block stored on the data node may be 1 MB while an individual write or update is often only 4 KB or 8 KB, the newly written first data must be merged with the backup data to obtain a 1 MB target data block for storage, as sketched below.
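A sketch of that merge, using the 1 MB block size and 4 KB write size from the text as examples; the offset-based patch policy is an assumption for illustration.

```python
BLOCK_SIZE = 1 << 20   # 1 MB data block, as in the example above

def merge(backup: bytes, first_data: bytes, offset: int) -> bytes:
    """Patch the small update into the backup block at its offset."""
    assert len(backup) == BLOCK_SIZE
    assert offset + len(first_data) <= BLOCK_SIZE
    target = bytearray(backup)
    target[offset:offset + len(first_data)] = first_data
    return bytes(target)

block = merge(bytes(BLOCK_SIZE), b"\x01" * 4096, offset=8192)
assert block[8192:8192 + 4096] == b"\x01" * 4096   # 4 KB write applied
```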
Step S708: Write the target data.
The check node writes the merged target data to the local database, completing the check node's write and update.
Step S709: Send a write-completion notification.
Because the data node does not perform its own write until the check node's write has completed, the check node sends a write-completion notification to the data node, telling it to proceed with its write.
Step S710: Write the first data.
After receiving the write-completion notification from the check node, the data node writes the first data into its own local database, completing the data node's write and update.
Step S711: Send a write-completion notification.
The data node sends a write-completion notification to the client, informing it that the data node has completed the write and update.
Step S712: Delete the data node's write-ahead log.
The data node deletes the WAL on its own node, indicating that all of the data node's writes have completed.
Step S713: Send a WAL-deletion notification.
The data node sends the WAL-deletion notification to the check node; on receiving it, the check node learns that all of the data node's writes have completed.
It should be noted that the method provided by embodiments of the present application still applies when multiple data nodes update concurrently: only after the check node has received WAL-deletion notifications from all data nodes does it consider the updates of all data nodes complete and perform the check node's redundancy computation over all the updated data, as in the sketch below.
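A sketch of this gating logic, with the EC step abstracted into a callback; the names are illustrative.

```python
class RedundancyGate:
    """Start the EC computation only after every data node reports
    that its write-ahead log has been deleted."""

    def __init__(self, data_node_ids, run_ec):
        self.waiting = set(data_node_ids)   # nodes still updating
        self.run_ec = run_ec                # invoked when all are done

    def on_wal_deleted(self, node_id):
        self.waiting.discard(node_id)
        if not self.waiting:                # all updates complete
            self.run_ec()

gate = RedundancyGate({"d1", "d2", "d3"},
                      run_ec=lambda: print("start EC computation"))
for node in ("d1", "d2", "d3"):
    gate.on_wal_deleted(node)               # fires once, after "d3"
```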
With the data processing method provided by embodiments of the present application, in scenarios where data nodes and check nodes must update data, the check node is guaranteed to hold backup data before performing the update, and the target data used to update the check node is derived from the backup data and the newly written first data. Because the check node stores backup data, the EC stripe remains recoverable during updates, improving the security of data storage.
Figure 8 is a flowchart of a data processing method provided by one embodiment of the present application. In the embodiment of Figure 8, the method may include, but is not limited to, steps S801 to S815.
Step S801: Send a first data update request.
When a client needs to write new data, it calculates, from the routing configuration, the chunk group CG to which the current write request belongs and the data nodes in that CG, and sends a new-write request to the corresponding data node. When the data node receives the request, it forwards it via routing to all check nodes; the request carries the newly written first data.
Step S802: Forward the first data update request.
After receiving the first data update request from the client, the data node forwards it to all check nodes; the forwarded request also carries the first data.
Step S803: Create the data node's write-ahead log.
After forwarding the first data update request, the data node creates a write-ahead log (Write-Ahead Log, WAL) on its own node.
Those skilled in the art know that the central idea of a WAL is that modifications to data files may only happen after those modifications have been logged, that is, after the log records describing the changes have been flushed to permanent storage; data pages then need not be flushed to disk on every transaction commit, because the log can be used to recover the database after a crash, and since the log is small and written sequentially, it is more efficient.
Step S804: Create the check node's write-ahead log.
After receiving the first data update request, the check node likewise creates a write-ahead log (Write-Ahead Log, WAL) on its own node.
Step S805: Query whether backup data is present.
To guarantee that the pre-update original data, i.e., the backup data, is still retained after the update, the check node queries whether the local database on its own node contains the backup data, where the backup data is the data waiting to be updated, i.e., the original data corresponding to the first data.
Step S806: Send a backup data acquisition request.
If the check node's query shows that its own local database does not contain the backup data, it must request the backup data from the data node.
Notably, the check node needs to fetch the backup data from the data node only once, because it stores the fetched backup data in its local database. The benefit is that under frequent writes, as long as the backup data exists, it need not be read from the data node every time, which saves network overhead and improves storage update performance.
Step S807: Send the backup data.
After receiving the backup data acquisition request from the check node, the data node sends the backup data to the check node.
Step S808: Write the backup data.
The check node stores the fetched backup data in its local database, i.e., writes the backup data.
Step S809: Merge the first data with the backup data to obtain the target data.
It should be noted that since a data block stored on the data node may be 1 MB while an individual write or update is often only 4 KB or 8 KB, the newly written first data must be merged with the backup data to obtain a 1 MB target data block for storage.
Step S810: Write the target data.
The check node writes the merged target data to the local database, completing the check node's write and update.
Step S811: Send a write-completion notification.
Because the data node does not perform its own write until the check node's write has completed, the check node sends a write-completion notification to the data node, telling it to proceed with its write.
Step S812: Write the first data.
After receiving the write-completion notification from the check node, the data node writes the first data into its own local database, completing the data node's write and update.
Step S813: Send a write-completion notification.
The data node sends a write-completion notification to the client, informing it that the data node has completed the write and update.
Step S814: Delete the data node's write-ahead log.
The data node deletes the WAL on its own node, indicating that all of the data node's writes have completed.
Step S815: Send a WAL-deletion notification.
The data node sends the WAL-deletion notification to the check node; on receiving it, the check node learns that all of the data node's writes have completed.
After the check node finishes writing the target data, the erasure coding algorithm is applied to the data stored in the check node's local database to compute redundancy and obtain check data blocks; an erasure code stripe is then formed from the data stored in the local database and the check data blocks, and the stripe is stored in the local database.
With the data processing method provided by embodiments of the present application, in scenarios where data nodes and check nodes must update data, the check node is guaranteed to hold backup data before performing the update, and the target data used to update the check node is derived from the backup data and the newly written first data. Because the check node stores backup data, the EC stripe remains recoverable during updates, improving the security of data storage.
Referring to Figures 9a-9d, one embodiment of the present application further provides a schematic view of the data processing method, showing the data redundancy of data nodes and check nodes under concurrent updates. The embodiment corresponding to Figure 9a includes a first data node 911, a second data node 912, a third data node 913, a first check node 921, and a second check node 922. The three data nodes store data blocks D1, D2, and D3 respectively, and the two check nodes store copies of the data held by the corresponding data nodes, so the data redundancy is 3.
As shown in Figure 9b, EC computation over the current data of the data nodes produces check data blocks P1 and P2, which are stored on the two check nodes respectively.
When two clients, Client1 and Client2, both update data blocks D1 and D2, the updated blocks D1' and D2' are produced, and the updated blocks arrive at the two check nodes in different orders. As shown in Figure 9c, check node 921 backs up D1' first and then D2', whereas check node 922 backs up D2' first and then D1'.
Check node 921 has performed the EC computation and obtained the updated check block P1', while check node 922 has not yet been updated, as shown in Figure 9d. At this point, data block D1' has 2 copies plus the check block P1'; data block D2' likewise has 2 copies plus P1'; and data block D3 has one copy, one P1' (encoded from D1', D2', and D3), and one P2 (encoded from D1, D2, and D3). The redundancy of the three blocks D1', D2', and D3 therefore remains 3, so the data can be recovered from the redundant data even if any 2 nodes lose their data or fail.
Compared with related techniques that must serialize multiple data block update operations, the data block updates in the data processing method provided by embodiments of the present application need not execute sequentially, and the check nodes need not update in any fixed order; the data redundancy remains unchanged throughout the update and EC computation, ensuring recoverability.
In addition, referring to Figure 10, an embodiment of the present application further provides a data processing device 400, which includes a memory 410, a processor 420, and a computer program stored on the memory 410 and runnable on the processor 420.
The processor 420 and the memory 410 may be connected by a bus 430 or by other means.
As a non-transitory computer-readable storage medium, the memory 410 can store non-transitory software programs and non-transitory computer-executable programs. The memory 410 may include high-speed random access memory and non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 410 may include memory remote from the processor 420, connected to the processor 420 through a network; examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
It should be noted that the data processing device 400 in this embodiment can be used to implement, or can form part of, the implementation environment in the embodiment shown in Figure 1; these embodiments belong to the same inventive concept and therefore share the same implementation principles and technical effects, which are not detailed again here.
The non-transitory software programs and instructions required to implement the data processing method of the above embodiments are stored in the memory 410; when executed by the processor 420, they perform the data processing method of the above embodiments, for example, method steps S100 to S300 in Figure 4, S210 to S212 in Figure 5, S701 to S713 in Figure 7, and S801 to S815 in Figure 8.
In addition, an embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor or controller, for example by a processor in the above network element embodiment, cause the processor to perform the data processing method of the above embodiments, for example, method steps S100 to S300 in Figure 4, S210 to S212 in Figure 5, S701 to S713 in Figure 7, and S801 to S815 in Figure 8.
In addition, embodiments of the present application further provide a computer program product including a computer program or computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer program or computer instructions from the storage medium and executes them, causing the computer device to perform the data processing method described above, for example, method steps S100 to S300 in Figure 4, S210 to S212 in Figure 5, S701 to S713 in Figure 7, and S801 to S815 in Figure 8.
Embodiments of the present application provide a data processing method and apparatus therefor, a storage medium, and a computer program product, which can effectively improve the consistency and recoverability of EC stripes under concurrent data updates. The embodiments include: after obtaining first data to be written, the data node first sends the first data to the check node; after receiving it, the check node obtains backup data corresponding to the first data (the data waiting to be updated), derives target data from the first data and the backup data, and stores the target data as its updated data, usable for computing redundant data. With this scheme, in scenarios where data nodes and check nodes must update data, the check node is guaranteed to hold backup data before updating, and the target data used to update the check node is derived from the backup data and the newly written first data. Because the check node stores backup data, in systems with frequent concurrent updates fetching it takes less time than waiting for serialized EC computation, so the data processing method provided by the embodiments has lower latency; meanwhile, on read requests both the check node and the data node can immediately serve the most recently updated data, giving the system better read performance, and the EC stripe remains recoverable during updates, improving overall data storage security.
Those of ordinary skill in the art will understand that all or some of the steps and systems in the methods disclosed above may be implemented as software, firmware, hardware, or suitable combinations thereof. Some or all physical components may be implemented as software executed by a processor such as a central processing unit, a digital signal processor, or a microprocessor, as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The above describes several embodiments of the present application, but the present application is not limited to them; those skilled in the art can make various equivalent modifications or substitutions without departing from the scope of the present application, and such equivalent modifications or substitutions fall within the scope defined by the claims of the present application.

Claims (11)

  1. A data processing method, comprising:
    receiving first data sent by a data node, wherein the first data is data to be written to the data node;
    after the first data has been received, obtaining backup data corresponding to the first data according to the first data, wherein the backup data is data waiting to be updated; and
    obtaining target data according to the first data and the backup data, wherein the target data is used to calculate redundant data.
  2. The method according to claim 1, wherein the obtaining backup data corresponding to the first data according to the first data comprises:
    performing a query on a local database according to the first data, to obtain a query result indicating whether the local database contains backup data corresponding to the first data; and
    obtaining the backup data according to the query result.
  3. The method according to claim 2, wherein the obtaining the backup data according to the query result comprises:
    when the query result indicates that the local database contains the backup data, obtaining the backup data from the local database;
    or,
    when the query result indicates that the local database does not contain the backup data, sending a data acquisition request for the backup data to the data node, and receiving the backup data sent by the data node according to the data acquisition request.
  4. The method according to claim 1, wherein the obtaining target data according to the first data and the backup data comprises:
    merging the first data with the backup data to obtain the target data; and
    writing the target data to a local database.
  5. The method according to claim 4, wherein before the obtaining backup data corresponding to the first data according to the first data, the method further comprises:
    generating write-ahead log information corresponding to the logical data block to which the first data is to be written;
    and wherein after the writing the target data to the local database, the method further comprises:
    sending, to the data node, completion notification information indicating that writing of the target data has completed, wherein the completion notification information causes the data node to write the first data and to send deletion notification information for notifying deletion of the write-ahead log information;
    receiving the deletion notification information sent by the data node according to the completion notification information; and
    deleting the write-ahead log information according to the deletion notification information.
  6. The method according to claim 5, wherein after the deleting the write-ahead log information according to the deletion notification information, the method further comprises:
    performing redundancy computation on the data stored in the local database by using an erasure coding algorithm, to obtain a check data block;
    obtaining an erasure code stripe according to the data stored in the local database and the check data block; and
    storing the erasure code stripe in the local database.
  7. The method according to claim 6, wherein the performing redundancy computation on the data stored in the local database by using an erasure coding algorithm, to obtain a check data block, comprises:
    determining whether the data stored in the local database meets a preset condition; and
    when the data stored in the local database meets the preset condition, performing redundancy computation on the data stored in the local database by using the erasure coding algorithm, to obtain the check data block.
  8. The method according to claim 7, wherein the preset condition comprises:
    the amount of data stored in the local database reaching a preset threshold;
    or,
    the data stored in the local database not having been accessed within a preset access window.
  9. A data processing device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the data processing method according to any one of claims 1 to 8.
  10. A computer-readable storage medium storing computer-executable instructions, wherein the computer-executable instructions are used to execute the data processing method according to any one of claims 1 to 8.
  11. A computer program product, comprising a computer program or computer instructions, wherein the computer program or the computer instructions are stored in a computer-readable storage medium, a processor of a computer device reads the computer program or the computer instructions from the computer-readable storage medium, and the processor executes the computer program or the computer instructions, causing the computer device to perform the data processing method according to any one of claims 1 to 8.
PCT/CN2023/086720 2022-04-15 2023-04-06 Data processing method and apparatus therefor, storage medium, and computer program product WO2023197937A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210395783.2A CN114676000A (zh) 2022-04-15 2022-04-15 Data processing method and apparatus therefor, storage medium, and computer program product
CN202210395783.2 2022-04-15

Publications (1)

Publication Number Publication Date
WO2023197937A1 true WO2023197937A1 (zh) 2023-10-19

Family

ID=82077683

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/086720 WO2023197937A1 (zh) Data processing method and apparatus therefor, storage medium, and computer program product

Country Status (2)

Country Link
CN (1) CN114676000A (zh)
WO (1) WO2023197937A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114676000A (zh) 2022-04-15 2022-06-28 中兴通讯股份有限公司 Data processing method and apparatus therefor, storage medium, and computer program product

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190220356A1 (en) * 2016-09-30 2019-07-18 Huawei Technologies Co., Ltd. Data Processing Method, System, and Apparatus
CN112328435A * 2020-12-07 2021-02-05 武汉绿色网络信息服务有限责任公司 Method, apparatus, device, and storage medium for backing up and restoring target data
CN112463450A * 2020-11-27 2021-03-09 北京浪潮数据技术有限公司 Incremental backup management method and system, electronic device, and storage medium
CN114676000A * 2022-04-15 2022-06-28 中兴通讯股份有限公司 Data processing method and apparatus therefor, storage medium, and computer program product


Also Published As

Publication number Publication date
CN114676000A (zh) 2022-06-28


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23787582

Country of ref document: EP

Kind code of ref document: A1