WO2024001863A1 - Data processing method and related device - Google Patents

Data processing method and related device

Info

Publication number
WO2024001863A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
node
block
data block
request
Prior art date
Application number
PCT/CN2023/101259
Other languages
English (en)
French (fr)
Inventor
李航
惠卫锋
陈续强
韩峰哲
罗日新
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2024001863A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems

Definitions

  • the present application relates to the field of storage technology, and in particular, to a data processing method, a data processing device, a computing device cluster, a computer-readable storage medium, and a computer program product.
  • EC: erasure code.
  • a set of data blocks and the set of check blocks generated through check computation based on this set of data blocks is called an EC stripe.
  • data read and update operations are usually completed by the node where a data block (for example, the first data block) in the EC stripe is located.
  • the node where the first data block in the EC stripe is located can also be called the master node, and the node where the other data blocks in the EC stripe are located and the node where the check block is located are called slave nodes.
  • when updating an EC stripe, for example, updating a data block in the EC stripe, the master node usually needs to perform a large number of read operations and write operations. These read operations and write operations can occupy a large amount of network resources, reducing the system performance of the distributed storage system.
  • This application provides a data processing method. This method offloads the operation of updating data blocks from the master node to the slave nodes, reducing the number of read operations in the process of updating EC stripes, avoiding occupying a large amount of network resources, and ensuring the system performance of the distributed storage system.
  • This application also provides data processing devices, computing equipment clusters, computer-readable storage media, and computer program products corresponding to the above methods.
  • this application provides a data processing method.
  • This method can be executed by the master node in the distributed storage system.
  • the master node obtains a first request, which is used to update a data block in the EC stripe, and then determines a first data block according to the first request.
  • the first data block is the data block associated with the first request.
  • the master node then sends a processing request to a slave node set including at least one slave node in the distributed storage system, to instruct that the master node's operation of updating the data block is offloaded to one or more slave nodes in the slave node set.
  • the number of read operations during the process of updating data blocks in the EC stripe can be reduced, network transmission overhead can be reduced, and system performance can be guaranteed.
  • the master node may send a second request including the second data block to the first slave node, and then receive the first data block returned by the first slave node after it updates the first data block to the second data block. The master node then determines check block update information based on the first data block and the second data block, and sends a third request including the check block update information to the second slave node, where the check block update information is used to update the check block.
  • This method pushes down part of the operation for calculating the new check block to the slave nodes, preventing the master node from reading the check block from the second slave node where the check block is located, reducing the number of read operations, reducing network transmission overhead, and ensuring system performance.
  • the master node may send a second request including the second data block to the first slave node, where the second request is used to instruct the first slave node to update the first data block to the second data block and to determine check block update information according to the first data block and the second data block. Then, the master node sends, through the first slave node, a third request including the check block update information to the second slave node, and the check block update information is used to update the check block.
  • This method pushes down all of the operation for calculating the new check block to the slave nodes, specifically to the first slave node where the first data block is located (which can also be called the update node) and the second slave node where the check block is located. This prevents the first slave node from reading the check block from the second slave node, reduces the number of read operations, reduces network transmission overhead, and ensures system performance.
  • the second request sent by the master node to the first slave node is an update request.
  • the return value of the update request is the first data block; the update request is used to instruct the first slave node to update the first data block to the second data block and to return the first data block.
  • the first data block may be stored in the first slave node, and the master node and the first slave node may be the same node.
  • the check block may be stored in the master node, and the master node and the second slave node may be the same node.
  • the master node can read the first data block or check block locally, reducing the number of remote read operations, thereby reducing the occupied network resources and ensuring system performance.
  • before the master node obtains the first request, it can also obtain a fourth request including a data stream. The master node then divides the data in the data stream into multiple data blocks, and writes the multiple data blocks in columns to data block storage nodes in the distributed storage system.
  • the data block storage nodes include the master node and the first slave node. The master node then calculates a check block based on each group of data blocks in the multiple data blocks, and writes the check block to a check block storage node in the distributed storage system, where the check block storage node includes the second slave node.
  • storing data blocks in columns in a distributed storage system can reduce the number of disk crossings during subsequent data reading and reduce reading overhead.
  • when the multiple data blocks obtained by segmenting the data stream cannot fill at least one EC stripe, the master node can also perform a no-operation on the shards without data in the at least one EC stripe, without having to perform a padding operation, which reduces write amplification.
  • the master node may also obtain the fifth request including the starting address, and then determine the target node based on the starting address, and read the target data block in columns from the target node. In this way, when reading data, the required data can be read by only reading the hard disk once on one node, reducing read amplification.
  • this application provides a data processing method.
  • the method is applied to distributed storage systems, including:
  • the master node obtains a first request, the first request is used to update the first data block in the erasure code (EC) stripe, and determines the first data block according to the first request, where the first data block is the data block associated with the first request. The master node sends a processing request to a set of slave nodes.
  • the set of slave nodes includes at least one slave node in the distributed storage system.
  • the processing request is used to instruct that the master node's operation of updating the data block is offloaded to one or more slave nodes in the slave node set;
  • the set of slave nodes updates the first data block and the check block according to the processing request.
  • This method offloads the operation of updating data blocks from the master node to a set of slave nodes, reduces the number of read operations in the process of updating the data blocks of the EC stripe, reduces network transmission overhead, and ensures system performance.
  • the master node sends a processing request to a set of slave nodes, including:
  • the master node sends a second request including the second data block to the first slave node
  • the first slave node updates the first data block according to the processing request, including:
  • the first slave node updates the first data block to a second data block and returns the first data block;
  • the method also includes:
  • the master node determines check block update information based on the first data block and the second data block;
  • the master node sends a processing request to the slave node set, including:
  • the second slave node updates the check block according to the processing request, including:
  • the second slave node updates the check block according to the check block update information.
  • This method pushes down the operator part of calculating the new check block when the master node updates the data block to the slave node, reducing the operation of reading the check block from the slave node where the check block is located, reducing network transmission overhead, and ensuring the system performance.
  • the master node sends a processing request to a set of slave nodes, including:
  • the master node sends a second request including the second data block to the first slave node
  • the first slave node updates the first data block according to the processing request, including:
  • the first slave node updates the first data block to a second data block
  • the method also includes:
  • the first slave node determines check block update information based on the first data block and the second data block
  • the master node sends a processing request to the slave node set, including:
  • the second slave node updates the check block according to the processing request, including:
  • the second slave node updates the check block according to the check block update information.
  • This method pushes down all the operators for calculating the new check block when the master node updates the data block to the slave node, reducing the operation of reading the check block from the slave node where the check block is located, reducing network transmission overhead, and ensuring the system performance.
  • the second request is an update request
  • the update request is used to instruct the first slave node to update the first data block to the second data block and to return the first data block.
  • the master node only needs one update operation to replace one read operation and one write operation in related technologies, reducing the number of operations, reducing network transmission overhead, and ensuring system performance.
  • this application provides a data processing device, which includes various modules for executing the data processing method in the first aspect or any possible implementation of the first aspect.
  • the present application provides a data processing device.
  • the data processing device includes various units for executing the data processing method in the second aspect or any possible implementation of the second aspect.
  • this application provides a data processing system.
  • the data processing system includes various devices for executing the data processing method in the second aspect or any possible implementation of the second aspect.
  • this application provides a computing device cluster.
  • the cluster of computing devices includes at least one computing device including at least one processor and at least one memory.
  • the at least one processor and the at least one memory communicate with each other.
  • the at least one processor is configured to execute instructions stored in the at least one memory, so that the computing device or a cluster of computing devices executes the data processing method described in any implementation of the first aspect or the second aspect.
  • the present application provides a computer-readable storage medium that stores instructions instructing a computing device or a cluster of computing devices to execute the data processing method described in the first aspect or any implementation of the first aspect.
  • the present application provides a computer program product containing instructions that, when run on a computing device or a cluster of computing devices, cause the computing device or the cluster of computing devices to execute the data processing method described in the first aspect or any implementation of the first aspect.
  • Figure 1 is a schematic flow chart of an EC stripe update provided by this application.
  • FIG. 2 is a system architecture diagram of a distributed storage system provided by this application.
  • FIG. 3 is a system architecture diagram of a distributed storage system provided by this application.
  • Figure 4 is a schematic diagram of an application scenario of a distributed storage system provided by this application.
  • Figure 5 is a flow chart of a data processing method provided by this application.
  • Figure 6 is a schematic diagram of row storage and column storage provided by this application.
  • Figure 7 is a schematic flow chart of writing data provided by this application.
  • Figure 8 is a schematic flow chart of an EC stripe update provided by this application.
  • Figure 9 is a schematic flow chart of an EC stripe update provided by this application.
  • Figure 10 is a schematic structural diagram of a data processing device provided by this application.
  • Figure 11 is a schematic structural diagram of a data processing system provided by this application.
  • Figure 12 is a schematic structural diagram of a computing device cluster provided by this application.
  • Figure 13 is a schematic structural diagram of a computing device cluster provided by this application.
  • EC stripe update, which can also be referred to as EC stripe overwrite, specifically replaces one or more data blocks in the EC stripe with new data blocks, and updates the check blocks in the EC stripe accordingly based on the updated data blocks.
  • EC stripe overwriting can be further divided into EC small write and EC large write.
  • EC small write refers to reading the check block and the data block to be modified, and determining the new check block based on the check block, the data block before modification, and the modified data block.
  • EC large write refers to reading the other data blocks in the EC stripe, and determining the new check block based on the modified data block and the other data blocks in the EC stripe.
  • when the overwritten data block is small, the amount of data read using the EC small-write method is smaller and the efficiency is higher; when the overwritten data block is large, the amount of data read using the EC large-write method is smaller and the efficiency is higher.
  • the EC stripe includes data blocks D0 to Dk-1, and check blocks P and Q.
  • the node where data block D0 is located is the master node.
  • the nodes where data block D1 to data block Dk-1 are located are the first slave nodes.
  • the nodes storing data block D0 to data block Dk-1 are also called data block storage nodes, and the nodes where check block P and check block Q are located are the second slave nodes, also called check block storage nodes.
  • the client requests to overwrite data block D1 in the EC stripe with data block D′1.
  • it is usually necessary to first read the data to be modified, such as data block D1, to the master node, and to read check block P and check block Q to the master node.
  • the master node calculates a new check block P′ based on data block D1, data block D′1 and check block P, and a new check block Q′ based on data block D1, data block D′1 and check block Q:
  • P′ = α0(D′1 - D1) + P (1)
  • Q′ = β0(D′1 - D1) + Q (2)
  • α0 and β0 are different check coefficients, respectively.
  • the master node writes data block D′1, the new check block P′, and the new check block Q′ to the nodes where data block D1, check block P, and check block Q are located, respectively.
  • updating one data block in the EC stripe therefore requires three read operations and three write operations, which consumes a large amount of network resources and reduces system performance.
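  • As a concrete illustration of formulas (1) and (2), the following minimal sketch (not part of the patent) computes the new check blocks from the difference between the old and new data block. Integer arithmetic and the coefficient values are illustrative stand-ins for the Galois-field arithmetic a real EC implementation would use.

```python
# Minimal sketch of the baseline EC small-write update described above.
# Assumption: alpha0/beta0 arithmetic is shown over plain integers for clarity;
# a real EC implementation performs these operations in a Galois field (e.g. GF(2^8)).

def small_write_update(d1_old, d1_new, p_old, q_old, alpha0=1, beta0=2):
    """Return the new check blocks P' and Q' per formulas (1) and (2)."""
    delta = d1_new - d1_old          # difference between new and old data block
    p_new = alpha0 * delta + p_old   # P' = alpha0 * (D'1 - D1) + P
    q_new = beta0 * delta + q_old    # Q' = beta0  * (D'1 - D1) + Q
    return p_new, q_new

# In the baseline flow the master node performs:
#   3 reads  : D1 (from its node), P and Q (from the check block storage nodes)
#   3 writes : D'1, P', Q' back to their respective nodes
p_new, q_new = small_write_update(d1_old=5, d1_new=9, p_old=7, q_old=11)
```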
  • this application provides a data processing method applied to a distributed storage system.
  • the master node in the distributed storage system obtains a first request, which is used to update a data block in the EC stripe.
  • the master node can determine a first data block according to the first request, where the first data block is the data block associated with the first request. The master node then sends a processing request to a slave node set including at least one slave node in the distributed storage system, to instruct that the master node's operation of updating the data block is offloaded to one or more slave nodes in the slave node set.
  • This method offloads the master node's operation of updating the data block to the set of slave nodes, for example, by pushing down the operation of calculating the new check block to the second slave node where the check block is located. This avoids the need for the master node, or the first slave node where the first data block is located (which can also be called the update node), to read the check block from the second slave node, which reduces the number of read operations, reduces network transmission overhead, and ensures system performance.
  • this application focuses on changing the data transmission process and data distribution of EC to enhance data transmission and disk access efficiency, so it is suitable for a variety of storage scenarios and has high availability.
  • this method also supports optimizing the update process of the data block; for example, a read operation and a write operation can be combined into one read-and-write operation. In this way, only one read-and-write operation plus two write operations are needed to complete the update of the EC stripe.
  • the network transmission overhead is reduced by half, which greatly reduces the network resource occupancy and improves the system performance.
  • the system has a storage and computing separation structure, and the system includes a computing node cluster and a storage node cluster.
  • the computing node cluster includes one or more computing nodes 110 (two computing nodes 110 are shown in FIG. 2, but are not limited to two computing nodes 110), and each computing node 110 can communicate with each other.
  • the computing node 110 is a computing device, such as a server, a desktop computer, or a controller of a storage array.
  • the computing node 110 at least includes a processor 112 , a memory 113 and a network card 114 .
  • the processor 112 is a central processing unit (CPU), used for processing data access requests from outside the computing node 110, or requests generated internally by the computing node 110. For example, when the processor 112 receives write data requests sent by the user, the data in these write data requests will be temporarily stored in the memory 113 . When the total amount of data in the memory 113 reaches a certain threshold, the processor 112 sends the data stored in the memory 113 to the storage node 100 for persistent storage. In addition, the processor 112 is also used for data calculation or processing, such as metadata management, data deduplication, data compression, virtualized storage space, address translation, etc. Only one CPU 112 is shown in Figure 2. In actual applications, there are often multiple CPUs 112, and one CPU 112 has one or more CPU cores. This embodiment does not limit the number of CPUs and CPU cores.
  • Memory 113 refers to the internal memory that directly exchanges data with the processor. It can read and write data at any time and very quickly, and serves as a temporary data storage for the operating system or other running programs.
  • Memory includes at least two types of memory.
  • memory can be either random access memory or read-only memory (Read Only Memory, ROM).
  • random access memory is dynamic random access memory (Dynamic Random Access Memory, DRAM), or storage class memory (Storage Class Memory, SCM).
  • DRAM is a semiconductor memory that, like most Random Access Memory (RAM), is a volatile memory device.
  • SCM is a composite storage technology that combines the characteristics of traditional storage devices and memory.
  • Storage-level memory can provide faster read and write speeds than hard disks, but is slower than DRAM in terms of access speed and cheaper than DRAM in cost.
  • DRAM and SCM are only exemplary in this embodiment, and the memory may also include other random access memories, such as static random access memory (Static Random Access Memory, SRAM).
  • the read-only memory for example, it can be a programmable read-only memory (Programmable Read Only Memory, PROM), an erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), etc.
  • the memory 113 can also be a dual in-line memory module (Dual In-line Memory Module, DIMM for short), that is, a module composed of dynamic random access memory (DRAM), or a solid state drive (Solid State Disk, SSD).
  • the computing node 110 may be configured with multiple memories 113 and different types of memories 113 . This embodiment does not limit the number and type of memories 113 .
  • the memory 113 can be configured to have a power-saving function.
  • the power-saving function means that the data stored in the memory 113 will not be lost when the system is powered off and then on again. Memory with a power-saving function is called non-volatile memory.
  • the network card 114 is used to communicate with the storage node 100. For example, when the total amount of data in the memory 113 reaches a certain threshold, the computing node 110 may send a request to the storage node 100 through the network card 114 to persistently store the data.
  • the computing node 110 may also include a bus for communication between components within the computing node 110 .
  • since remote storage can be used for persistent storage of data, the computing node 110 can have less local storage than a conventional server, thus achieving cost and space savings. However, this does not mean that the computing node 110 cannot have local storage.
  • the computing node 110 may also have a small number of built-in hard disks or a small number of external hard disks.
  • Any computing node 110 can access any storage node 100 in the storage node cluster through the network.
  • the storage node cluster includes multiple storage nodes 100 (three storage nodes 100 are shown in FIG. 2, but are not limited to three storage nodes 100).
  • a storage node 100 includes one or more controllers 101, network cards 104, and multiple hard disks 105.
  • Network card 104 is used to communicate with computing node 110.
  • the hard disk 105 is used to store data, and can be a magnetic disk or other types of storage media, such as a solid-state hard disk or a shingled magnetic recording hard disk.
  • the controller 101 is configured to write data to the hard disk 105 or read data from the hard disk 105 according to the read/write data request sent by the computing node 110 . During the process of reading and writing data, the controller 101 needs to convert the address carried in the read/write data request into an address that the hard disk can recognize. It can be seen that the controller 101 also has some simple calculation functions.
  • FIG. 3 is a system architecture diagram of another distributed storage system applied in the embodiment of the present application.
  • the system is an integrated storage and computing architecture, and the system includes a storage cluster.
  • the storage cluster includes one or more servers 110 (three servers 110 are shown in FIG. 3 , but are not limited to three servers 110 ), and each server 110 can communicate with each other.
  • the server 110 is a device with both computing capabilities and storage capabilities, such as a server, a desktop computer, etc.
  • an ARM server or an X86 server can be used as the server 110 here.
  • the server 110 at least includes a processor 112, a memory 113, a network card 114 and a hard disk 105.
  • the processor 112, memory 113, network card 114 and hard disk 105 are connected through a bus.
  • the processor 112 and the memory 113 are used to provide computing resources.
  • the processor 112 is a central processing unit CPU, which is used to process data access requests from outside the server 110 (application server or other servers 110), and is also used to process requests generated within the server 110.
  • Memory 113 refers to the internal memory that directly exchanges data with the processor. It can read and write data at any time and very quickly, and serves as a temporary data storage for the operating system or other running programs.
  • the server 110 may be configured with multiple memories 113 and different types of memories 113 . This embodiment does not limit the number and type of memories 113 .
  • the memory 113 can be configured to have a power-saving function.
  • the hard disk 105 is used to provide storage resources, such as storing data. It can be a disk or other type of storage medium, such as a solid state drive or a shingled magnetic recording hard drive.
  • Network card 114 is used to communicate with other servers 110 .
  • FIGS. 2 and 3 are only a schematic architecture of a distributed storage system.
  • the distributed storage system can also use other architectures; for example, the distributed storage system can adopt a fully integrated architecture or a Memory Fabric architecture.
  • the above-mentioned distributed storage system can provide storage services, for example, providing users with a storage service in the form of a storage interface, so that users can use the storage resources of the distributed storage system through the above-mentioned storage interface.
  • users can access the distributed storage system through a client (such as an application client).
  • the distributed storage system can be the architecture shown in Figure 2 or Figure 3 above.
  • the client can call the storage interface provided by the storage service to generate storage request, and send the storage request to the distributed storage system.
  • the computing node that receives the storage request in the distributed storage system can divide the data into multiple groups of data blocks, then calculate a check block based on each group of data blocks, and write each group of data blocks and the check blocks determined from that group of data blocks, scattered across different storage nodes 100, for example into the hard disks 105 of different storage nodes 100, to form an EC stripe.
  • the storage node 100 that stores the first data block in the EC stripe (for example, data block D0) can be regarded as the master node, the storage nodes 100 that store the other data blocks in the EC stripe (for example, data block D1, ..., data block Dk-1) can be regarded as the first slave nodes, and the storage node 100 that stores the check blocks (for example, check block P and check block Q) can be regarded as the second slave node.
  • the client can call the storage interface provided by the storage service to generate a storage request and send the storage request to the distributed storage system.
  • the server 110 in the distributed storage system can divide the data into multiple groups of data blocks, then calculate a check block based on each group of data blocks, and write each group of data blocks and the check blocks determined from that group of data blocks, scattered across different servers 110, for example into the hard disks 105 of different servers 110.
  • the server 110 that stores the first data block in the EC stripe (for example, data block D0) can be regarded as the master node, the servers 110 that store the other data blocks in the EC stripe (for example, data block D1, ..., data block Dk-1) can be regarded as the first slave nodes, and the server 110 that stores the check blocks (for example, check block P and check block Q) can be regarded as the second slave node.
  • the client is used to access the distributed storage system through the storage service.
  • the distributed storage system can respond to the user's access to the distributed storage system through the storage service and return the access result.
  • the access results can be different.
  • the access result can represent a notification of successful writing.
  • the access result can be a read data block.
  • the master node can obtain a first request for updating a data block of the EC stripe, determine the first data block according to the first request, and then send a processing request to the slave node set to instruct that the master node's operation of updating the data block is offloaded to one or more slave nodes in the slave node set.
  • the processing request may indicate that the master node's operation of updating the data block is offloaded to the first slave node where the first data block is located and the second slave node where the check block is located.
  • first data block and check block can be stored in different nodes in the distributed storage system except the master node.
  • first data block may also be stored on the master node.
  • the master node and the first slave node are the same node.
  • the check block may also be stored in the master node.
  • the master node and the second slave node may be the same node.
  • the data processing method in the embodiment of the present application will be introduced in a scenario where the master node, the first slave node, and the second slave node are different nodes.
  • the method includes:
  • Request 1 includes the data stream.
  • Request 1 is used to request to write the data in the data stream to the distributed storage system for persistent storage.
  • request 1 can be generated by the application client based on business requirements. This request 1 can be a write request or other requests that need to write data.
  • request 1 may include different types of data streams. For example, when the application client is a short video application or a long video application, request 1 may include a video data stream; for example, when the application client is a file management application or a text editing application, request 1 may include a text data stream.
  • the master node can receive request 1 sent by the application client to persistently store the data in the data stream carried by request 1.
  • S504 The master node divides the data in the data stream included in request 1 into blocks and obtains multiple data blocks.
  • a data stream can be an ordered sequence of bytes with a start and end point.
  • the master node can use fixed-length chunking or variable-length chunking to chunk the data in the data stream carried by request 1, thereby obtaining multiple data chunks.
  • fixed-length chunking refers to chunking the data in the data stream according to the set chunking granularity.
  • Variable-length chunking is to divide the data in the data stream into data chunks of variable size.
  • Variable-length chunking can include variable-length chunking based on sliding windows and variable-length chunking based on content (content-defined chunking, CDC) .
  • the master node can evenly divide the data in the data stream into multiple data blocks.
  • the master node can pad the data in the data stream, for example by filling in zeros at the end of the data stream, so that the size of the padded data in the data stream is an integer multiple of the block granularity, and then the master node evenly divides the data in the data stream into multiple data blocks according to the block granularity. For example, if the size of the data in the data stream is 20KB and the master node divides it at a block granularity of 4KB, it obtains 5 data blocks each of size 4KB.
  • the master node may not fill the data in the data stream, but divide it into K-1 blocks according to the block granularity.
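  • A short sketch of the fixed-length chunking with zero padding described above follows; the function and parameter names are chosen for illustration only.

```python
# Illustrative sketch of fixed-length chunking with zero padding, as described above.

def chunk_fixed_length(data: bytes, granularity: int = 4096) -> list[bytes]:
    """Pad the data stream with zeros to a multiple of the block granularity,
    then split it into equally sized data blocks."""
    remainder = len(data) % granularity
    if remainder:
        data += b"\x00" * (granularity - remainder)   # zero padding at the end
    return [data[i:i + granularity] for i in range(0, len(data), granularity)]

# Example: 20 KB of data at 4 KB granularity yields 5 data blocks.
blocks = chunk_fixed_length(b"a" * 20 * 1024)
assert len(blocks) == 5
```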
  • the above-mentioned S504 may not be executed when executing the data processing method in the embodiment of the present application. For example, when the size of the data in the data stream is too small to be divided into blocks, or the data in the data stream has been pre-blocked, the above S504 does not need to be executed.
  • the master node writes multiple data blocks to the data block storage node including the master node and the first slave node in columns.
  • the master node can first write the multiple data blocks to the master node in columns; when the columns in the master node are full, the remaining data blocks are then written to the first slave node in columns.
  • the master node can first write the remaining data blocks to the first first slave node in columns.
  • the master node writes the remaining data blocks to the next first slave node in columns.
  • Row storage refers to storage in rows
  • column storage refers to storage in columns.
  • the size of each data block is 4K
  • the data block storage node can store 256 data blocks in each column
  • the number of data block storage nodes is 4,
  • the number of check block storage nodes is 2. If the master node writes to the data block storage nodes row by row, data block D0 to data block D3 are written to the 4 data block storage nodes respectively, specifically one master node and three first slave nodes, and P0 and Q0 are written to different check block storage nodes respectively.
  • similarly, data blocks D4 to D7 are respectively written to the four data block storage nodes, and P1 and Q1 are respectively written to different check block storage nodes. If the master node writes to the data block storage nodes in columns, then data blocks D0 to D255 are written to the master node, data blocks D256 to D511 are written to the first first slave node, and so on: data blocks D512 to D767 are written to the second first slave node, and data blocks D768 to D1023 are written to the third first slave node.
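  • The column-wise placement in the example above can be expressed as a simple index mapping; the helper name and return shape below are illustrative assumptions, not part of the patent.

```python
# Sketch of the column-wise placement from the example above: each data block
# storage node holds one column of 256 blocks, so blocks D0..D255 land on the
# master node, D256..D511 on the first first-slave node, and so on.

BLOCKS_PER_COLUMN = 256   # blocks each node stores per column
DATA_NODES = 4            # one master node plus three first slave nodes

def place_block_column_wise(block_index: int) -> tuple[int, int]:
    """Return (node_index, offset_in_column) for column-wise writes."""
    node_index = block_index // BLOCKS_PER_COLUMN
    offset = block_index % BLOCKS_PER_COLUMN
    return node_index, offset

assert place_block_column_wise(0) == (0, 0)      # D0   -> master node
assert place_block_column_wise(256) == (1, 0)    # D256 -> first first-slave node
assert place_block_column_wise(767) == (2, 255)  # D767 -> second first-slave node
```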
  • the multiple data blocks may not be able to fill an EC stripe.
  • for example, each column stores 256 data blocks and the number of data block storage nodes is 4. If the number of data blocks in the data stream is less than 769 (256*3+1), so that the data blocks in the data stream are not enough to fill an EC stripe, write amplification can be reduced by not writing the empty part.
  • the master node can perform a no-operation (recorded as a zero op) on the shards without data in the EC stripe, without having to perform a padding operation, which can reduce write amplification.
  • the master node can first determine the size of the data stream. If the size of the data in the data stream is not enough to fill the stripe, it writes only the shards that need to be filled, and the idle shards are not padded, as sketched below. This not only improves writing performance, but also reduces space waste.
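  • A minimal sketch of this "zero op" handling follows; all names and the callback interface are illustrative assumptions.

```python
# Sketch of the zero-op handling described above: when the data blocks cannot
# fill the stripe, only occupied shards are written; empty shards are skipped
# (treated as all-zero for check-computation purposes) instead of being padded
# and written to disk.

def write_partial_stripe(shards: list, write_shard) -> int:
    """Write only the shards that contain data; return the number of writes issued."""
    writes = 0
    for index, shard in enumerate(shards):
        if shard is None:          # empty shard: no-op, nothing written to disk
            continue
        write_shard(index, shard)  # occupied shard: issue the write
        writes += 1
    return writes
```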
  • S508 The master node calculates a check block based on each group of data blocks in multiple data blocks.
  • the master node can group multiple data blocks. For example, the multiple data blocks can be grouped according to the row in which each data block is located. Rows containing the same set of data blocks have the same row number. Then, the master node can calculate each group of data blocks according to the verification algorithm and generate a verification block. Among them, the master node can use different verification algorithms to generate different verification blocks. In order to facilitate understanding, the process of calculating the check block is still illustrated in FIG. 6 .
  • when the master node writes data blocks in columns, the master node can calculate check block P0 and check block Q0 based on data block D0, data block D256, data block D512, and data block D768. Similarly, the master node can calculate check block P1 and check block Q1 based on data block D1, data block D257, data block D513, and data block D769.
  • an EC stripe may include data blocks in different data segments rather than contiguous data blocks in one data segment.
  • an EC stripe may include data block D0 , data block D256 , data block D512 , data block D768 and parity blocks P0 , Q0 .
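  • The row grouping over column-stored blocks can be sketched as follows; XOR stands in for the real check computation, which may use Reed-Solomon coefficients for P and Q, and the function names are illustrative.

```python
# Sketch of the per-row check grouping described above: with column storage,
# row i of the stripe is made up of blocks D[i], D[i+256], D[i+512], D[i+768].

BLOCKS_PER_COLUMN = 256

def row_group(blocks: list, row: int, data_nodes: int = 4) -> list:
    """Collect the data blocks that share row number `row` across the data nodes."""
    return [blocks[row + col * BLOCKS_PER_COLUMN] for col in range(data_nodes)]

def xor_parity(group: list) -> bytes:
    """Compute a simple XOR check block over one row group (illustrative only)."""
    out = bytearray(len(group[0]))
    for block in group:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)
```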
  • S510 The master node writes the check block to the check block storage node including the second slave node.
  • the master node can write the check blocks to their corresponding second slave nodes respectively.
  • S506 and S508 can be executed sequentially according to the set order, and then S510 is executed.
  • the above S506 and S508 can be executed in parallel, and then S510 is executed.
  • S506 and S510 can also be executed in parallel.
  • S508 can be executed first to obtain the check block, and then the data block and the check block can be written to the corresponding node in parallel.
  • the embodiment of the present application does not limit the order of the above S506, S508, and S510.
  • S502 to S510 are optional steps in the embodiment of the present application, and the above steps may not be performed when performing the data processing method in the embodiment of the present application.
  • the data processing method in the embodiment of the present application can directly perform the following steps to update the EC stripe, as explained in detail below.
  • Request 2 is used to update the data block in the EC stripe.
  • request 2 is used to update the first data block in the EC stripe to the second data block.
  • Request 2 includes the second data block.
  • the request 2 may also include the logical address of the first data block to quickly address the first data block.
  • S512 The master node determines the first data block according to request 2.
  • the first data block is specifically the data block associated with request 2.
  • the master node can parse request 2, obtain the logical address of the data block that needs to be updated in request 2, and determine the first data block based on the logical address.
  • S514 The master node sends request 3 to the first slave node where the first data block is located.
  • S516 The first slave node where the first data block is located updates the first data block to the second data block.
  • S518 The master node receives the first data block returned by the first slave node where the first data block is located.
  • the master node determines the check block update information based on the first data block and the second data block.
  • S522 The master node sends request 4 to the second slave node where the verification block is located.
  • S524 The second slave node updates the check block according to the check block update information in request 4.
  • request 2 can also be called the first request, and request 3 and request 4 can also be collectively called processing requests.
  • the processing request is a request sent by the master node to the set of slave nodes, where request 3 can be called the second request and request 4 can be called the third request.
  • request 1 can also be called the fourth request.
  • request 3 and request 4 are used to indicate that part of the master node's operation of updating the data block is offloaded to the first slave node and the second slave node.
  • the offloading process is described in detail below.
  • the request 3 sent by the master node to the first slave node includes the second data block, and request 3 is specifically used to instruct the first slave node to update the first data block to the second data block. Considering that when the first data block in the EC stripe is updated the check block also changes accordingly, the master node can obtain the first data block through request 3 for use in calculating the new check block.
  • request 3 can be an update request, and the return value of the update request is the first data block.
  • the first slave node where the first data block is located can read the first data block when updating the first data block, then write the second data block, and in addition return the first data block to the master node. In this way, writing the second data block and reading the first data block are completed through one update operation (specifically, one read-and-write operation).
  • the master node can also send an additional request to read the first data block for calculating the check block update information.
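  • The update-request semantics described above can be sketched from the first slave node's point of view; the class name and the in-memory dict standing in for disk storage are purely illustrative.

```python
# Sketch of an "update request" handler: a single request carries the second
# (new) data block, and the node reads the first (old) data block before
# overwriting it, returning the old block as the request's return value.

class FirstSlaveNode:
    def __init__(self):
        self.blocks: dict[int, bytes] = {}   # in-memory stand-in for on-disk blocks

    def handle_update_request(self, address: int, new_block: bytes) -> bytes:
        old_block = self.blocks[address]     # read the first data block
        self.blocks[address] = new_block     # write the second data block
        return old_block                     # return value: the old data block
```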
  • when the master node receives the first data block returned by the first slave node where the first data block is located, it can determine the check block update information through the EC algorithm based on the first data block and the second data block. For example, the master node may use formula (1) or formula (2) to determine the check block update information based on the first data block and the second data block.
  • the request 4 sent by the master node to the second slave node includes check block update information.
  • Request 4 is specifically used to update the check block.
  • the second slave node can update the check block according to the check block update information in request 4. For example, the second slave node can read the check block, determine a new check block based on the check block and the check block update information, and then store the new check block, thereby updating the check block.
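  • The split between computing the check block update information on the master node and applying it on the second slave node can be sketched as follows; XOR-style arithmetic and the method names are illustrative simplifications of the coefficient-based formulas (1)/(2).

```python
# Sketch of the split described above: the master node computes the check block
# update information from the old and new data blocks, and the second slave node
# applies it to its locally stored check block.

def compute_update_info(old_block: bytes, new_block: bytes) -> bytes:
    """Master node: check block update information as the XOR of old and new data."""
    return bytes(a ^ b for a, b in zip(old_block, new_block))

class SecondSlaveNode:
    def __init__(self, check_block: bytes):
        self.check_block = check_block

    def handle_update_check_request(self, update_info: bytes) -> None:
        """Read the local check block, combine it with the update info, write it back."""
        self.check_block = bytes(a ^ b for a, b in zip(self.check_block, update_info))
```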
  • the master node calculates a new check block based on the first data block, the second data block, and the check block, and then delivers the new check block to the check block storage node to complete the update.
  • the embodiment of the present application offloads the operation of updating the data block to the first slave node and the second slave node. Specifically, the process of updating the check block within the data block update operation is decomposed into two steps, completed by different nodes.
  • the master node completes the first step, specifically calculating the check block update information based on the first data block and the second data block and then sending the check block update information to the check block storage node, and the check block storage node completes the second step, namely updating the check block based on the check block update information.
  • in this way, the read-then-write of the new and old data, D′256 and D256, is turned into a single update operation issued by the master node. Specifically, the update request carries the data to be written, but before writing, the data at the original address is read out as the return value of the request; then, after the data is written to disk, the read data is returned to the master node.
  • S516 to S518 may not be executed when performing the data processing method in the embodiment of the present application.
  • request 3 and request 4 may be used to instruct that all of the master node's operation of updating the data block is offloaded to the first slave node where the first data block is located and the second slave node where the check block is located.
  • the first slave node (that is, the update node) where the first data block is located can directly calculate the check block update information based on the read first data block and the second data block.
  • the master node can send request 4 through the first slave node; the first slave node carries the check block update information in request 4 and pushes it down to the second slave node where the check block is located, so that the second slave node can calculate a new check block based on the check block update information in request 4, thereby updating the check block.
  • the read-then-write of the new and old data, D′256 and D256, is likewise turned into a single update operation. Specifically, the update request carries the data to be written, but before writing, the data at the original address is read out as the return value of the request; then, after the data is written to disk, the read data is returned to the master node.
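  • The fully offloaded variant can be sketched end to end as follows; all names, the in-memory storage model, and XOR standing in for the real check computation are illustrative assumptions.

```python
# Sketch of the fully offloaded variant described above: the first slave node
# applies the update, computes the check block update information from the old
# and new data blocks, and pushes it down to the second slave node, so the
# master node never reads or writes the check block itself.

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

class ParityNode:
    def __init__(self, check_block: bytes):
        self.check_block = check_block

    def apply_update_info(self, update_info: bytes) -> None:
        self.check_block = xor_bytes(self.check_block, update_info)

class UpdateNode:
    def __init__(self):
        self.blocks: dict[int, bytes] = {}

    def handle_offloaded_update(self, address: int, new_block: bytes,
                                parity_node: ParityNode) -> None:
        old_block = self.blocks[address]               # read the first data block locally
        self.blocks[address] = new_block               # write the second data block
        update_info = xor_bytes(old_block, new_block)  # check block update information
        parity_node.apply_update_info(update_info)     # push request 4 down to the parity node
```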
  • the master node sends a processing request to the slave node set, and the slave node set updates the first data block and the check block according to the processing request.
  • the master node and the slave node can also update the first data block and the check block through other method steps.
  • the master node can also receive request 5, which can be a read request, and the master node can then read the target data block according to the read request. It should be noted that, in the EC stripe query scenario, request 5 can also be called the fifth request.
  • the master node can read the target data blocks in columns.
  • the read request may include a start address. Further, the read request may also include the length of the data to be read.
  • the master node may determine the target node from the data block storage nodes based on the above start address, and then read the target data block from the target node in columns.
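  • The column-wise read path can be sketched as a direct mapping from the start address to one target node; the constants and helper name are illustrative.

```python
# Sketch of the column-wise read path described above: the start address maps to
# one target node and one in-column offset, so the requested data can be served
# from a single node (and a single disk) rather than being scattered across all
# data block storage nodes.

BLOCK_SIZE = 4 * 1024          # 4 KB data blocks
BLOCKS_PER_COLUMN = 256        # blocks per node in one column

def locate_read(start_address: int, length: int) -> tuple[int, int, int]:
    """Return (target_node, first_block_offset, block_count) for a column read."""
    first_block = start_address // BLOCK_SIZE
    target_node = first_block // BLOCKS_PER_COLUMN
    offset_in_column = first_block % BLOCKS_PER_COLUMN
    block_count = -(-length // BLOCK_SIZE)   # ceiling division
    return target_node, offset_in_column, block_count
```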
  • the embodiment of the present application provides a data processing method.
  • This method divides the check block update process during an EC stripe update into local check computation and remote update, pushing down to the check block storage node where the check block is located the generation of the new check block that would otherwise be computed entirely by the master node.
  • the update process is optimized so that the master node or the update node calculates the check block update information, and the check block storage node where the check block is located generates a new check block based on the check block update information and writes the new check block.
  • in this way, the master node or update node is prevented from reading the check block from the check block storage node, reducing the number of read operations, reducing network transmission overhead, and ensuring system performance.
  • this method supports converting row storage into column storage when writing data, so that when reading data, it can be completed on the same disk of a machine, reducing the number of cross-disk data readings and improving read performance.
  • the data processing device 1000 can be deployed on a master node in a distributed storage system.
  • the device 1000 includes:
  • Obtaining unit 1002, configured to obtain a first request, the first request being used to update a data block in the erasure code (EC) stripe;
  • Determining unit 1004 configured to determine the first data block according to the first request, where the first data block is the data block associated with the first request;
  • Communication unit 1006 configured to send a processing request to a set of slave nodes, the set of slave nodes including at least one slave node in the distributed storage system, and the processing request is used to instruct the master node to offload the operation of updating the data block. to one or more slave nodes in the slave node set.
  • the device 1000 in the embodiment of the present application can be implemented by a central processing unit (CPU), an application-specific integrated circuit (ASIC), or a programmable logic device (PLD).
  • the above PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), a data processing unit (DPU), a system on chip (SoC), or any combination thereof.
  • the communication unit 1006 is specifically used to:
  • the determining unit 1004 is also used to:
  • the communication unit 1006 is specifically used for:
  • a third request including the check block update information is sent to the second slave node, where the check block update information is used to update the check block.
  • the communication unit 1006 is specifically used to:
  • the first slave node sends a third request including the check block update information to the second slave node, where the check block update information is used to update the check block.
  • the first data block is stored in a first slave node, and the master node and the first slave node are the same node; or, the check block in the EC strip is stored in The second slave node, the master node and the second slave node are the same node.
  • the acquisition unit 1002 is also used to:
  • the device 1000 also includes:
  • the reading and writing unit 1008 is used to divide the data in the data stream into multiple data blocks, and write the multiple data blocks into the data block storage nodes in the distributed storage system in columns.
  • the data block storage node includes the master node and the first slave node;
  • the reading and writing unit 1008 is also configured to calculate a check block according to each group of data blocks in the plurality of data blocks, and write the check block to a check block storage node in the distributed storage system, where the check block storage node includes the second slave node;
  • the read-write unit is specifically configured to perform a no-operation on the slices without data in the at least one EC stripe.
  • the acquisition unit 1002 is also used to:
  • the reading and writing unit 1008 is also used for:
  • the target node is determined according to the starting address, and the target data blocks are read in columns.
  • the data processing system 1100 includes a first data processing device 1000A and a second data processing device 1000B.
  • the first data processing device 1000A is deployed on the master node in the distributed storage system, and the second data processing device 1000B is deployed on a slave node in the distributed storage system.
  • the first data processing device 1000A is configured to: obtain a first request, the first request being used to update the first data block in the erasure code (EC) stripe; determine the first data block according to the first request, where the first data block is the data block associated with the first request; and send a processing request to a set of slave nodes.
  • the set of slave nodes includes at least one slave node in the distributed storage system, and the processing request is used to instruct that the master node's operation of updating data blocks is offloaded to one or more slave nodes in the slave node set;
  • the second data processing device 1000B is configured to update the first data block and the check block according to the processing request.
  • the first data processing device 1000A is specifically used for:
  • the second data processing device 1000B on the first slave node is specifically used for:
  • the first data processing device 1000A is also used for:
  • the first data processing device 1000A is specifically used for:
  • the second data processing device 1000B on the second slave node is specifically used for:
  • the first data processing device 1000A is specifically used for:
  • the second data processing device 1000B on the first slave node is specifically used for:
  • the second data processing device 1000B on the first slave node is also used for:
  • the first data processing device 1000A is specifically used for:
  • the second data processing device 1000B on the second slave node is specifically used for:
  • the second request is an update request
  • the update request is used to instruct the first slave node to update the first data block to the second data block and to return the first data block.
  • Figure 12 is a hardware structure diagram of a computing device 1200 provided by this application.
  • the computing device 1200 may be the aforementioned master node and is used to implement the functions of the data processing device 1000.
  • the computing device 1200 may be a server or a terminal device. Terminal devices include but are not limited to desktop computers, laptops, tablets or smartphones.
  • computing device 1200 includes: bus 1202, processor 1204, memory 1206, and communication interface 1208.
  • the processor 1204, the memory 1206 and the communication interface 1208 communicate through the bus 1202. It should be understood that this application does not limit the number of processors and memories in the computing device 1200.
  • the bus 1202 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one line is used in Figure 12, but it does not mean that there is only one bus or one type of bus.
  • Bus 1202 may include a path that carries information between various components of computing device 1200 (e.g., memory 1206, processor 1204, communication interface 1208).
  • the processor 1204 may include any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
  • Memory 1206 may include volatile memory, such as random access memory (RAM). Memory 1206 may also include non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
  • the communication interface 1208 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 1200 and other devices or communication networks.
  • the computing device cluster includes at least one computing device.
  • the computing device may be a server, such as a central server, an edge server, or a local server in a local data center.
  • the computing device may also be a terminal device such as a desktop computer, a laptop computer, or a smartphone.
  • the computing device cluster includes at least one computing device 1200.
  • the memory 1206 of one or more computing devices 1200 in the computing device cluster may store the same instructions of the data processing system 1100 for performing the data processing method.
  • one or more computing devices 1200 in the computing device cluster may also be used to execute part of the instructions of the data processing system 1100 for executing the data processing method.
  • a combination of one or more computing devices 1200 may collectively execute instructions of data processing system 1100 for performing a data processing method.
  • the memory 1206 in different computing devices 1200 in the computing device cluster can store different instructions for executing part of the functions of the data processing system 1100 .
  • An embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be any available medium that a computing device can store, or a data storage device, such as a data center, that contains one or more available media.
  • the available media may be magnetic media (e.g., floppy disk, hard disk, tape), optical media (e.g., DVD), or semiconductor media (e.g., solid state drive), etc.
  • the computer-readable storage medium includes instructions that instruct the computing device to perform the above-mentioned data processing method.
  • An embodiment of the present application also provides a computer program product containing instructions.
  • the computer program product may be software or a program product containing instructions that can run on a computing device or be stored in any available medium.
  • when the computer program product runs on at least one computing device, the at least one computing device is caused to execute the above data processing method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Multi Processors (AREA)

Abstract

A data processing method and a related device, relating to the field of storage technologies. The method is performed by a master node in a distributed storage system and includes: obtaining a first request, where the first request is used to update a data block in an erasure code (EC) stripe; determining a first data block according to the first request; and sending a processing request to a slave node set, so as to instruct that the operation of updating the data block by the master node be offloaded to one or more slave nodes in the slave node set. By offloading the master node's data block update operation to slave nodes, the method reduces the number of read operations during an EC stripe update, avoids occupying a large amount of network resources, and safeguards the system performance of the distributed storage system.

Description

一种数据处理方法及相关设备
本申请要求于2022年06月27日提交中国国家知识产权局、申请号为202210740423.1、发明名称为“一种数据处理的方法”的中国专利申请的优先权,以及要求于2022年08月23日提交中国国家知识产权局、申请号为202211017671.X、发明名称为“一种数据处理方法及相关设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及存储技术领域,尤其涉及一种数据处理方法、数据处理装置、计算设备集群、计算机可读存储介质、计算机程序产品。
背景技术
随着信息化技术的不断发展,越来越多的产业应用采用信息化部署方式,由此产生了大量的数据。为了降低数据的存储成本并保障数据的可靠性,业界提出了纠删码(Erasure Code,EC)技术。EC具体是将数据分成多组数据块,然后根据每组数据块计算得到校验块,将该组数据块和校验块分散存储在分布式存储系统的不同的节点。
一组数据块和基于这组数据块通过校验生成的校验块的集合称作EC条带(stripe)。为保证数据一致性,数据的读取和更新操作通常由EC条带中的一个数据块(例如是第一个数据块)所在的节点完成。其中,EC条带中的第一个数据块所在的节点也可以称作主节点,EC条带中的其他数据块所在的节点以及校验块所在的节点称作从节点。
然而,在对EC条带进行更新,例如是更新EC条带中的一个数据块时,通常需要主节点进行较多次读操作、写操作,多次读操作、写操作可以占用大量的网络资源,降低分布式存储系统的系统性能。
发明内容
本申请提供了一种数据处理方法,该方法将主节点更新数据块的操作卸载至从节点,减少了更新EC条带过程中读操作的次数,避免占用大量的网络资源,保障了分布式存系统的系统性能。本申请还提供了上述方法对应的数据处理装置、计算设备集群、计算机可读存储介质、计算机程序产品。
第一方面,本申请提供一种数据处理方法。该方法可以由分布式存储系统中主节点执行。具体地,主节点获取第一请求,第一请求用于更新EC条带中的数据块,然后根据第一请求确定第一数据块,第一数据块为第一请求关联的数据块,接着主节点向包括分布式存储系统中至少一个从节点的从节点集合发送处理请求,以指示将主节点更新数据块的操作卸载至从节点集合中的一个或多个从节点。如此,可以减少更新EC条带中的数据块的过程中读操作的次数,降低网络传输开销,保障系统性能。
在一些可能的实现方式中,主节点可以向第一从节点发送包括第二数据块的第二请求,然后接收第一从节点将第一数据块更新为第二数据块所返回的第一数据块,接着根据第一数据块和第二数据块确定校验块更新信息,再向第二从节点发送包括校验块更新信息的第三请求,其中,校验块更新信息用于更新校验块。
该方法通过将计算新的校验块的算子部分下推至从节点,避免主节点从校验块所在的第二从节点读取校验块,减少读操作的次数,降低网络传输开销,保障系统性能。
在一些可能的实现方式中,主节点可以向第一从节点发送包括第二数据块的第二请求,第二请求用于指示第一从节点将第一数据块更新为第二数据块,以及根据第一数据块和所述第二数据块确定校验块更新信息。然后,主节点通过第一从节点向第二从节点发送包括校验块更新信息的第三请求,该校验块更新信息用于更新校验块。
该方法通过将计算新的校验块的算子全部下推至从节点,具体是下推至第一数据块所在的第一从节点(也可以称为更新节点)以及校验块所在的第二从节点,避免第一从节点从第二从节点读取校验块,减少读操作的次数,降低网络传输开销,保障系统性能。
在一些可能的实现方式中,主节点向第一从节点发送的第二请求为更新请求,该更新请求的返回值为第一数据块,用于指示第一从节点将第一数据块更新为第二数据块,并返回第一数据块。如此,主节点仅需一次更新操作即可替代相关技术中一次读操作和一次写操作,减少操作次数,降低网络传输开销,保障了系统性能。
在一些可能的实现方式中,第一数据块可以存储在第一从节点,主节点可以与第一从节点为同一节点。类似地,在另一些实施例中,校验块可以存储在第二主节点,主节点可以与第二从节点为同一节点。
如此,主节点可以从本地读取第一数据块或校验块,减少了远程读操作的次数,进而减少占用的网络资源,保障系统性能。
在一些可能的实现方式中,主节点获取第一请求之前,还可以获取包括数据流的第四请求,然后主节点将数据流中的数据分块得到多个数据块,将多个数据块按列写入分布式存储系统中的数据块存储节点,该数据块存储节点包括主节点和第一从节点,接着主节点根据多个数据块中每组数据块计算校验块,将校验块写入分布式存储系统中的校验块存储节点,该校验块存储节点包括第二从节点。
其中,将数据块按列存储至分布式存储系统,可以减少后续读数据过程中跨磁盘的次数,减少读开销。
在一些可能的实现方式中,数据流分块所得的多个数据块无法写满至少一个EC条带时,主节点还可以对至少一个EC条带中无数据的分片执行空操作,而不必执行填充操作,如此可以减少写放大。
在一些可能的实现方式中,主节点还可以获取包括起始地址的第五请求,然后根据起始地址确定目标节点,从目标节点按列读取目标数据块。如此,在读数据时,只用在一个节点读取一次硬盘即可读取所需的数据,减少了读放大。
第二方面,本申请提供一种数据处理方法。所述方法应用于分布式存储系统,包括:
主节点获取第一请求,所述第一请求用于更新纠删码EC条带中的第一数据块,根据所述第一请求确定所述第一数据块,所述第一数据块为所述第一请求关联的数据块,向从节点集合发送处理请求,所述从节点集合包括所述分布式存储系统中至少一个从节点,所述处理请求用于指示将所述主节点更新数据块的操作卸载至所述从节点集合中的一个或多个从节点;
所述从节点集合根据所述处理请求更新所述第一数据块和校验块。
该方法将主节点更新数据块的操作卸载至从节点集合,减少更新EC条带的数据块的过程中读操作的次数,降低网络传输开销,保障系统性能。
在一些可能的实现方式中,所述主节点向从节点集合发送处理请求,包括:
所述主节点向第一从节点发送包括所述第二数据块的第二请求;
所述第一从节点根据所述处理请求更新所述第一数据块,包括:
所述第一从节点将所述第一数据块更新为第二数据块,并返回所述第一数据块;
所述方法还包括:
所述主节点根据所述第一数据块和所述第二数据块确定校验块更新信息;
所述主节点向从节点集合发送处理请求,包括:
向第二从节点发送包括校验块更新信息的第三请求;
所述第二从节点根据所述处理请求更新校验块,包括:
所述第二从节点根据所述校验块更新信息更新校验块。
该方法将主节点更新数据块的过程中计算新的校验块的算子部分下推至从节点,减少从校验块所在的从节点读校验块的操作,降低网络传输开销,保障系统性能。
在一些可能的实现方式中,所述主节点向从节点集合发送处理请求,包括:
所述主节点向第一从节点发送包括所述第二数据块的第二请求;
所述第一从节点根据所述处理请求更新所述第一数据块,包括:
所述第一从节点将所述第一数据块更新为第二数据块;
所述方法还包括:
所述第一从节点根据所述第一数据块和所述第二数据块确定校验块更新信息;
所述主节点向从节点集合发送处理请求,包括:
通过所述第一从节点向第二从节点发送包括所述校验块更新信息的第三请求;
所述第二从节点根据所述处理请求更新校验块,包括:
所述第二从节点根据所述校验块更新信息更新校验块。
该方法将主节点更新数据块的过程中计算新的校验块的算子全部下推至从节点,减少从校验块所在的从节点读校验块的操作,降低网络传输开销,保障系统性能。
在一些可能的实现方式中,所述第二请求为更新请求,所述更新请求用于指示所述第一从节点将所述第一数据块更新为所述第二数据块,并返回所述第一数据块。如此,主节点仅需一次更新操作即可替代相关技术中一次读操作和一次写操作,减少操作次数,降低网络传输开销,保障了系统性能。
第三方面,本申请提供一种数据处理装置,所述装置包括用于执行第一方面或第一方面任一种可能实现方式中的数据处理方法的各个模块。
第三方面,本申请提供一种数据处理装置。该数据处理装置包括用于执行第二方面或第二方面任一种可能实现方式中的数据处理方法的各个单元。
第四方面,本申请提供一种数据处理系统。该数据处理系统包括用于执行第二方面或第二方面任一种可能实现方式中的数据处理方法的各个装置。
第五方面,本申请提供一种计算设备集群。所述计算设备集群包括至少一台计算设备,所述至少一台计算设备包括至少一个处理器和至少一个存储器。所述至少一个处理器、所述至少一个存储器进行相互的通信。所述至少一个处理器用于执行所述至少一个存储器中存储的指令,以使得计算设备或计算设备集群执行如第一方面或第二方面的任一种实现方式所述的数据处理方法。
第六方面,本申请提供一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,所述指令指示计算设备或计算设备集群执行上述第一方面或第一方面的任一种实现方式所述的数据处理方法。
第七方面,本申请提供了一种包含指令的计算机程序产品,当其在计算设备或计算设备集群上运行时,使得计算设备或计算设备集群执行上述第一方面或第一方面的任一种实现方式所述的数据处理方法。
本申请在上述各方面提供的实现方式的基础上,还可以进行进一步组合以提供更多实现方式。
附图说明
图1为本申请提供的一种EC条带更新的流程示意图;
图2为本申请提供的一种分布式存储系统的系统架构图;
图3为本申请提供的一种分布式存储系统的系统架构图;
图4为本申请提供的一种分布式存储系统的应用场景示意图;
图5为本申请提供的一种数据处理方法的流程图;
图6为本申请提供的一种行存与列存的示意图;
图7为本申请提供的一种写数据的流程示意图;
图8为本申请提供的一种EC条带更新的流程示意图;
图9为本申请提供的一种EC条带更新的流程示意图;
图10为本申请提供的一种数据处理装置的结构示意图;
图11为本申请提供的一种数据处理系统的结构示意图;
图12为本申请提供的一种计算设备集群的结构示意图;
图13为本申请提供的一种计算设备集群的结构示意图。
具体实施方式
为了便于理解,首先对本申请实施例中所涉及到的一些技术术语进行介绍。
EC条带更新,也可以简称为EC条带覆盖写,具体是采用新的若干数据块替换EC条带中的若干数据块,并基于对数据块的更新,对EC条带中校验块进行相应更新。根据生成新的校验块的方式不同,EC条带覆盖写还可以分为EC小写、EC大写。EC小写是指读取校验块、被修改的数据块、修改后的数据块确定新的校验块,EC大写是指读取EC条带中的其他数据块,根据修改后的数据块以及EC条带中的其他数据块,确定新的校验块。在覆盖写数据块较小时,采用EC小写方式读取的数据量较小,效 率较高。在覆盖写数据块较大时,采用EC大写方式读取的数据量较小,效率较高。
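To make the contrast concrete, the following minimal Python sketch (the coefficients and function names are illustrative assumptions, with blocks modeled as plain integers rather than Galois-field byte arrays) shows that the small-write and large-write paths arrive at the same new parity while reading different sets of blocks.

```python
ALPHA = [3, 4, 6, 7]   # hypothetical parity coefficients, one per data block in the stripe

def parity(blocks):
    """Parity of a full stripe row, modeled as a weighted integer sum."""
    return sum(a * d for a, d in zip(ALPHA, blocks))

def small_write_parity(old_parity, idx, old_block, new_block):
    """EC small write: read only the old data block and the old parity block."""
    return old_parity + ALPHA[idx] * (new_block - old_block)

def large_write_parity(blocks, idx, new_block):
    """EC large write: read the other data blocks and recompute the parity from scratch."""
    updated = list(blocks)
    updated[idx] = new_block
    return parity(updated)

blocks = [1, 2, 3, 4]
assert small_write_parity(parity(blocks), 1, blocks[1], 9) == large_write_parity(blocks, 1, 9)
```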
下面对EC条带更新过程进行示例说明。参见图1所示的EC条带更新的流程示意图,在该示例中,EC条带包括数据块D0至数据块Dk-1以及校验块P、校验块Q。数据块D0所在的节点为主节点,数据块D1至数据块Dk-1所在的节点为第一从节点,存储数据块D0至数据块Dk-1的节点也称作数据块存储节点,校验块P和校验块Q所在的节点为第二从节点,也称作校验块存储节点。
如图1所示,客户端请求将数据块D′1覆盖写至EC条带中的数据块D1,为保证数据的强一致性,通常需要将待修改的数据如数据块D1先读到主节点,以及将校验块P、校验块Q读到主节点,主节点根据数据块D1、数据块D′1和校验块P计算新的校验块P′,根据数据块D1、数据块D′1和校验块Q计算新的校验块Q′,具体如下所示:
P′ = α₀(D′₁ − D₁) + P        (1)
Q′ = β₀(D′₁ − D₁) + Q        (2)
其中，α₀和β₀分别为不同的校验系数。
然后,主节点将数据块D′1、新的校验块P′、新的校验块Q′写入数据块D1、校验块P、校验块Q所在的节点。如此导致更新EC条带中的一个数据块,需要读操作3次,写操作3次,占用大量的网络资源,降低了系统性能。
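The master-centric flow just described can be sketched as follows; the read/write callables, the coefficient values and the integer-valued blocks are hypothetical stand-ins for real node RPCs and Galois-field arithmetic, and the comments count the three remote reads and three remote writes.

```python
ALPHA_0 = 3   # hypothetical coefficient for parity P
BETA_0 = 5    # hypothetical coefficient for parity Q

def small_write_on_master(read_block, write_block, new_d1):
    """Master-centric overwrite of D1: three remote reads plus three remote writes."""
    old_d1 = read_block("D1")          # remote read 1
    old_p = read_block("P")            # remote read 2
    old_q = read_block("Q")            # remote read 3

    delta = new_d1 - old_d1
    new_p = ALPHA_0 * delta + old_p    # formula (1): P' = a0(D1' - D1) + P
    new_q = BETA_0 * delta + old_q     # formula (2): Q' = b0(D1' - D1) + Q

    write_block("D1", new_d1)          # remote write 1
    write_block("P", new_p)            # remote write 2
    write_block("Q", new_q)            # remote write 3

# In-memory stand-in for the three nodes holding D1, P and Q:
stripe = {"D1": 7, "P": ALPHA_0 * 7, "Q": BETA_0 * 7}
small_write_on_master(stripe.get, stripe.__setitem__, new_d1=9)
assert stripe["P"] == ALPHA_0 * 9 and stripe["Q"] == BETA_0 * 9
```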
为了解决传统技术中多次读操作和写操作占用大量的网络资源导致系统性能下降的问题,本申请提供了一种应用于分布式存储系统的数据处理方法。具体地,分布式存储系统中的主节点获取第一请求,该第一请求用于更新EC条带中的数据块,主节点可以根据第一请求确定第一数据块,该第一数据块为第一请求关联的数据块,然后主节点向包括分布式存储系统中至少一个从节点的从节点集合发送处理请求,以指示将主节点更新数据块的操作卸载至从节点集合中的一个或多个从节点。
该方法将主节点更新数据块的操作卸载至从节点集合,例如是将计算新的校验块的算子下推至校验块所在的第二从节点,避免了主节点或第一数据块所在的第一从节点(也可以称为更新节点)从第二从节点读取校验块,减少了读操作的次数,降低了网络传输开销,保障了系统性能。区别于其他EC优化技术,本申请着重改变EC的数据传输流程和数据分布,增强数据传输和磁盘访问效率,因而适用多种存储场景,具有较高可用性。
进一步地,该方法还支持对数据块的更新过程进行优化,例如可以将一次读操作和一次写操作合并为一次读写操作。如此,只需一次读写操作加上两次写操作,即可完成EC条带的更新,网络传输开销减少一半,大幅降低了网络资源占有率,提升了系统性能。
下面结合附图对本申请实施例的系统架构进行介绍。
参见图2所示的分布式存储系统的系统架构图,该系统为存算分离结构,该系统包括计算节点集群和存储节点集群。计算节点集群包括一个或多个计算节点110(图2中示出了两个计算节点110,但不限于两个计算节点110),各个计算节点110之间可以相互通信。计算节点110是一种计算设备,如服务器、台式计算机或者存储阵列的控制器等。在硬件上,如图2所示,计算节点110至少包括处理器112、内存113和网卡114。其中,处理器112是一个中央处理器(central processing unit,CPU),用于处理来自计算节点110外部的数据访问请求,或者计算节点110内部生成的请求。示例性的,处理器112接收用户发送的写数据请求时,会将这些写数据请求中的数据暂时保存在内存113中。当内存113中的数据总量达到一定阈值时,处理器112将内存113中存储的数据发送给存储节点100进行持久化存储。除此之外,处理器112还用于数据进行计算或处理,例如元数据管理、重复数据删除、数据压缩、虚拟化存储空间以及地址转换等。图2中仅示出了一个CPU 112,在实际应用中,CPU 112的数量往往有多个,其中,一个CPU 112又具有一个或多个CPU核。本实施例不对CPU的数量,以及CPU核的数量进行限定。
内存113是指与处理器直接交换数据的内部存储器,它可以随时读写数据,而且速度很快,作为操作系统或其他正在运行中的程序的临时数据存储器。内存包括至少两种存储器,例如内存既可以是随机存取存储器,也可以是只读存储器(Read Only Memory,ROM)。举例来说,随机存取存储器是动态随机存取存储器(Dynamic Random Access Memory,DRAM),或者存储级存储器(Storage Class Memory,SCM)。 DRAM是一种半导体存储器,与大部分随机存取存储器(Random Access Memory,RAM)一样,属于一种易失性存储器(volatile memory)设备。SCM是一种同时结合传统储存装置与存储器特性的复合型储存技术,存储级存储器能够提供比硬盘更快速的读写速度,但存取速度上比DRAM慢,在成本上也比DRAM更为便宜。然而,DRAM和SCM在本实施例中只是示例性的说明,内存还可以包括其他随机存取存储器,例如静态随机存取存储器(Static Random Access Memory,SRAM)等。而对于只读存储器,举例来说,可以是可编程只读存储器(Programmable Read Only Memory,PROM)、可抹除可编程只读存储器(Erasable Programmable Read Only Memory,EPROM)等。另外,内存113还可以是双列直插式存储器模块或双线存储器模块(Dual In-line Memory Module,简称DIMM),即由动态随机存取存储器(DRAM)组成的模块,还可以是固态硬盘(Solid State Disk,SSD)。实际应用中,计算节点110中可配置多个内存113,以及不同类型的内存113。本实施例不对内存113的数量和类型进行限定。此外,可对内存113进行配置使其具有保电功能。保电功能是指系统发生掉电又重新上电时,内存113中存储的数据也不会丢失。具有保电功能的内存被称为非易失性存储器。
网卡114用于与存储节点100通信。例如,当内存113中的数据总量达到一定阈值时,计算节点110可通过网卡114向存储节点100发送请求以对所述数据进行持久化存储。另外,计算节点110还可以包括总线,用于计算节点110内部各组件之间的通信。在功能上,由于图2中的计算节点110的主要功能是计算业务,在存储数据时可以利用远程存储器来实现持久化存储,因此它具有比常规服务器更少的本地存储器,从而实现了成本和空间的节省。但这并不代表计算节点110不能具有本地存储器,在实际实现中,计算节点110也可以内置少量的硬盘,或者外接少量硬盘。
任意一个计算节点110可通过网络访问存储节点集群中的任意一个存储节点100。存储节点集群包括多个存储节点100(图2中示出了三个存储节点100,但不限于三个存储节点100)。一个存储节点100包括一个或多个控制器101、网卡104与多个硬盘105。网卡104用于与计算节点110通信。硬盘105用于存储数据,可以是磁盘或者其他类型的存储介质,例如固态硬盘或者叠瓦式磁记录硬盘等。控制器101用于根据计算节点110发送的读/写数据请求,往硬盘105中写入数据或者从硬盘105中读取数据。在读写数据的过程中,控制器101需要将读/写数据请求中携带的地址转换为硬盘能够识别的地址。由此可见,控制器101也具有一些简单的计算功能。
图3为本申请实施例所应用的另一种分布式存储系统的系统架构图,该系统为存算一体架构,该系统包括存储集群。存储集群包括一个或多个服务器110(图3中示出了三个服务器110,但不限于三个服务器110),各个服务器110之间可以相互通信。服务器110是一种既具有计算能力又具有存储能力的设备,如服务器、台式计算机等。示例型的,ARM服务器或者X86服务器都可以作为这里的服务器110。
在硬件上,如图3所示,服务器110至少包括处理器112、内存113、网卡114和硬盘105。处理器112、内存113、网卡114和硬盘105之间通过总线连接。其中,处理器112和内存113用于提供计算资源。具体地,处理器112是一个中央处理器CPU,用于处理来自服务器110外部(应用服务器或者其他服务器110)的数据访问请求,也用于处理服务器110内部生成的请求。内存113是指与处理器直接交换数据的内部存储器,它可以随时读写数据,而且速度很快,作为操作系统或其他正在运行中的程序的临时数据存储器。实际应用中,服务器110中可配置多个内存113,以及不同类型的内存113。本实施例不对内存113的数量和类型进行限定。此外,可对内存113进行配置使其具有保电功能。硬盘105用于提供存储资源,例如存储数据。它可以是磁盘或者其他类型的存储介质,例如固态硬盘或者叠瓦式磁记录硬盘等。网卡114用于与其他服务器110通信。
需要说明的是,上述图2、图3仅仅是分布式存储系统的一种示意性架构,在本申请实施例其他可能的实现方式中,分布式存储系统也可以使用其他架构,例如分布式存储系统也可以采用全融合架构,或者Memory Fabric架构。
进一步地,上述分布式存储系统可以提供存储服务,例如,以存储接口形式为用户提供存储服务器,使用户可通过上述存储接口使用分布式存储系统的存储资源。参见图4所示的分布式存储系统的应用场景示意图,用户可以通过客户端(如应用客户端)访问分布式存储系统,该分布式存储系统可以是上述图2或图3所示的架构。
例如,分布式存储系统采用图2所示的架构时,客户端可以调用存储服务提供的存储接口生成存储 请求,并发送存储请求至分布式存储系统,分布式存储系统中接收到该存储请求的计算节点可以将数据分为多组数据块,然后根据每组数据块计算出校验块,并将每组数据块以及由该组数据块确定的校验块分散写入不同的存储节点100中,例如是写入不同存储节点100的硬盘105中,形成EC条带。存储EC条带中第一个数据块(例如是数据块D0)的存储节点100可以视为主节点,存储EC条带中其他数据块(例如是数据块D1,…数据块Dk-1)的存储节点100可以视为第一从节点,存储校验块(例如是校验块P和校验块Q)的存储节点100可以视为第二从节点。
又例如,分布式存储系统采用图3所示的架构时,客户端可以调用存储服务提供的存储接口生成存储请求,并发送存储请求至分布式存储系统,分布式存储系统中的服务器110可以将数据分为多组数据块,然后根据每组数据块计算出校验块,并将每组数据块以及由该组数据块确定的校验块分散写入不同的服务器110中,例如是写入不同服务器100的硬盘105中。存储EC条带中第一个数据块(例如是数据块D0)的服务器110可以视为主节点,存储EC条带中其他数据块(例如是数据块D1,…数据块Dk-1)的存储节点100可以视为第一从节点,存储校验块(例如是校验块P和校验块Q)的服务器110可以视为第二从节点。
其中,客户端用于通过存储服务访问分布式存储系统,分布式存储系统可以通过存储服务响应用户对分布式存储系统的访问,返回访问结果。根据访问操作不同,访问结果可以是不同的。例如访问操作为写操作时,访问结果可以表征写成功的通知,又例如访问操作为读操作时,访问结果可以是读取的数据块。
针对EC条带覆盖写的场景,主节点可以获取用于更新EC条带的数据块的第一请求,根据第一请求确定第一数据块,然后向从节点集合发送处理请求,以指示将主节点更新数据块的操作卸载至从节点集合中的一个或多个从节点。例如处理请求可以指示将主节点更新数据块的操作卸载至第一数据块所在的第一从节点和校验块所在的第二从节点。
需要说明的是,上述第一数据块、校验块可以存储在分布式存储系统中除主节点之外的其他不同节点。在一些实施例中,第一数据块也可以存储在主节点,在该情况下,主节点与第一从节点为同一节点。在另一些实施例中,校验块也可以存储在主节点,在该情况下,主节点与第二从节点可以为同一节点。
接下来,以主节点、第一从节点、第二从节点为不同节点的场景,对本申请实施例的数据处理方法进行介绍。
参见图5所示的数据处理方法的流程图,该方法包括:
S502:主节点获取请求1。
请求1中包括数据流。请求1用于请求将数据流中的数据写入分布式存储系统进行持久化存储。其中,请求1可以由应用客户端基于业务需求产生,该请求1可以是写请求,也可以是需要写入数据的其他请求。根据应用客户端的业务需求不同,请求1可以包括不同类型的数据流。例如,应用客户端为短视频应用或长视频应用时,请求1可以包括视频数据流;又例如,应用客户端为文件管理应用或文本编辑应用时,请求1可以包括文本数据流。主节点可以接收应用客户端下发的请求1,以将请求1携带的数据流中的数据进行持久化存储。
S504:主节点将请求1包括的数据流中数据进行分块,获得多个数据块。
数据流可以是具有起点和终点的有序字节序列。具体地,主节点可以采用定长分块或变长分块,对请求1携带的数据流中的数据进行分块,从而获得多个数据块。其中,定长分块是指按照设置好的分块粒度对数据流中的数据进行分块。变长分块是将数据流中的数据分为大小不固定的数据块,变长分块可以包括基于滑动窗口的变长分块和基于内容的变长分块(content-defined chunking,CDC)。
为了便于理解,下面以定长分块进行示例说明。具体地,数据流中数据的大小为分块粒度的整数倍时,主节点可以将数据流中数据均匀地切分为多个数据块。数据流中数据的大小并非分块粒度的整数倍时,主节点可以将数据流中数据进行填充,例如是在数据流的末端填零,使得填充后的数据流中数据的大小为分块粒度的整数倍,接着主节点按照该分块粒度将数据流中的数据均匀地切分为多个数据块。例如,数据流中数据的大小为20KB,主节点按照4KB的分块粒度进行分块,可以获得5个大小为4KB的数据块。
在一些实施例中,针对数据流中数据的大小并非分块粒度的整数倍的情况,主节点也可以不对数据流中数据进行填充,而是按照分块粒度切分出K-1个大小等于分块粒度的数据块以及一个大小不等于分块粒度的数据块。
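As one possible illustration of the fixed-length chunking described above, the sketch below (the 4 KB block size and the zero-padding policy are assumptions taken from the example) splits a byte stream into equal-sized blocks and either pads the tail with zeros or leaves the last block short.

```python
def split_stream(data: bytes, block_size: int = 4096, pad: bool = True) -> list:
    """Split a byte stream into fixed-size blocks.

    pad=True  : zero-fill the tail so every block is exactly block_size bytes.
    pad=False : keep the full blocks plus one shorter tail block.
    """
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if pad and blocks and len(blocks[-1]) < block_size:
        blocks[-1] = blocks[-1].ljust(block_size, b"\x00")
    return blocks

# 20 KB of data at a 4 KB granularity yields 5 blocks of 4 KB each.
assert [len(b) for b in split_stream(b"x" * 20 * 1024)] == [4096] * 5
```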
需要说明的是,执行本申请实施例的数据处理方法也可以不执行上述S504。例如,数据流中数据的大小较小,不足以进行分块,或者数据流中的数据已预先分块时,也可以不执行上述S504。
S506:主节点将多个数据块按列写入包括主节点和第一从节点在内的数据块存储节点。
假设数据块存储节点可以在每列存储L个数据块,其中,L为正整数,则主节点可以将多个数据块先按列写入主节点,当主节点中的列写满时,则将剩余的数据块按列写入第一从节点。
第一从节点包括多个时,主节点可以将剩余的数据块先按列写入第一个第一从节点。以此类推,当第一个从节点写满时,如果数据块还有剩余,则主节点将剩余的数据块按列写入下一个第一从节点。
为了便于理解,下面结合一示例进行说明。参见图6所示的行存与列存的示意图,其中,行存是指按行存储,列存是指按列存储。该示例中,每个数据块的大小为4K,数据块存储节点可以在每列存储256个数据块,数据块存储节点的数量为4,校验块存储节点的数量为2。若主节点按行写入数据块存储节点,则数据块D0至数据块D3分别写入4个数据块存储节点,具体是一个主节点和3个第一从节点,P0和Q0分别写入不同的校验块存储节点。类似地,数据块D4至数据块D7分别写入4个数据块存储节点,P1和Q1分别写入不同的校验块存储节点。若主节点按列写入数据块存储节点,则数据块D0至D255写入主节点,数据块D256至D511写入第一个第一从节点,以此类推,数据块D512至D767写入第二个第一从节点,数据块D768至D1023写入第三个第一从节点。
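A small sketch of the two placements in this example (4 data block storage nodes, 256 blocks per column) is given below; the mapping functions return, for a logical block number, the node index and the slot within that node's column.

```python
DATA_NODES = 4        # data block storage nodes in the example
COLUMN_DEPTH = 256    # data blocks stored per column on each node

def row_major_placement(block_no):
    """Row store: consecutive blocks go to consecutive nodes (D0..D3 on nodes 0..3)."""
    return block_no % DATA_NODES, block_no // DATA_NODES

def column_major_placement(block_no):
    """Column store: one node is filled with COLUMN_DEPTH consecutive blocks before
    the next node is used (D0..D255 on node 0, D256..D511 on node 1, and so on)."""
    return block_no // COLUMN_DEPTH, block_no % COLUMN_DEPTH

assert [row_major_placement(i)[0] for i in (0, 1, 4)] == [0, 1, 0]
assert [column_major_placement(i)[0] for i in (0, 255, 256, 768)] == [0, 0, 1, 3]
```

With the column-major mapping, blocks at adjacent logical addresses land on the same disk, which is what later allows a read of neighbouring blocks to stay on a single node.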
在一些可能的实现方式中,多个数据块可以存在无法写满一个EC条带的情况,例如,每列存储256个数据块,数据块存储节点的数量为4时,若数据流中数据块的数量小于769(256*3+1),则数据流中的至少一个数据块不足以写满一个EC条带时,可以通过将空的部分省略写的方式减少写放大。具体实现时,主节点可以对至少一个EC条带中无数据的分片(chunk)执行空操作(记作zero Op),而不必执行填充操作,如此可以减少写放大。
如图7所示,主节点在chunk1写入一个数据块时,可以在主节点对应的事务1(记作transaction1)中加入zero Op,而事务2至4(记作transaction2~4)中的write Op均采用zero Op进行替代。相应地,chunk2~4实际上不会分配空间,也不会有实际的数据落盘,且在读取的时候,能够向上层请求返回正确数据,同时不会真正读盘。如此,可以减少写放大。
基于此,主节点可以先判断数据流中的大小,如果数据流中数据的大小不足满条带,则可以仅写需要填充的chunk,空闲的chunk不填充。这样不仅可以提高写的性能,同时可以减少空间浪费。
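The idea of replacing writes of empty chunks with a zero operation can be sketched as follows (operation names and the transaction layout are hypothetical): only chunks that actually carry data produce a write op, the remaining chunks of a partially filled stripe carry a marker that allocates no space, and reads of such chunks return zeros without touching the disk.

```python
def build_stripe_ops(chunks_with_data):
    """One op per chunk: a 'write' op for real data, a 'zero' op for an empty chunk."""
    ops = []
    for idx, payload in enumerate(chunks_with_data):
        if payload is None:
            ops.append({"chunk": idx, "op": "zero"})             # no allocation, no disk I/O
        else:
            ops.append({"chunk": idx, "op": "write", "data": payload})
    return ops

def read_chunk(ops, idx, chunk_size=4096):
    """Reads of a zero chunk are answered with zeros without touching the disk."""
    op = ops[idx]
    return op["data"] if op["op"] == "write" else b"\x00" * chunk_size

ops = build_stripe_ops([b"new data", None, None, None])   # only the first chunk holds data
assert read_chunk(ops, 2) == b"\x00" * 4096
```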
S508:主节点根据多个数据块中每组数据块计算校验块。
具体地,主节点可以对多个数据块可以进行分组,例如,可以按照各数据块所在的行对多个数据块进行分组。同一组数据块所在的行具有相同行号。然后,主节点可以根据校验算法,对每组数据块进行计算,生成校验块。其中,主节点可以采用不同的校验算法,生成不同的校验块。为了便于理解,仍以图6对计算校验块的过程进行示例说明。
在该示例中，主节点按列写入数据块时，主节点可以根据数据块D0、数据块D256、数据块D512、数据块D768计算得到校验块P0和校验块Q0，类似地，主节点可以根据数据块D1、数据块D257、数据块D513、数据块D769计算得到校验块P1和校验块Q1。
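Continuing the column-store example, the sketch below groups the i-th block of every data node's column into one EC stripe row and derives that row's P and Q (the coefficient values are assumptions; the real code words come from the configured EC algorithm).

```python
ALPHA = [1, 2, 3, 4]   # hypothetical per-column coefficients for parity P
BETA = [1, 5, 7, 9]    # hypothetical per-column coefficients for parity Q

def parities_for_row(columns, row):
    """Row `row` of every data node's column forms one stripe; return its P and Q."""
    group = [col[row] for col in columns]            # e.g. D0, D256, D512, D768 for row 0
    p = sum(a * d for a, d in zip(ALPHA, group))
    q = sum(b * d for b, d in zip(BETA, group))
    return p, q

# Four data nodes, two blocks per column:
cols = [[10, 11], [20, 21], [30, 31], [40, 41]]
assert parities_for_row(cols, 0) == (300, 680)       # P0, Q0 for D0, D256, D512, D768
```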
本申请实施例中,数据分布的方法可以由按照行存的方式调整为列存的方式。如此,相邻地址的数据块可以集中放置在相同的磁盘,如数据块D0、数据块D1放在相同的磁盘。相应地,一个EC条带可以包括不同数据段中的数据块,而不是一个数据段中的连续数据块。如图6所示,一个EC条带可以包括数据块D0、数据块D256、数据块D512、数据块D768和校验块P0、Q0。当数据块D0所在磁盘或节点故障导致数据块D0丢失时,可以根据数据块D256、数据块D512、数据块D768和校验块P0、Q0恢复上述数据块D0
S510:主节点将校验块写入包括第二从节点在内的校验块存储节点。
当校验块存储节点的数量为多个时,也即第二从节点的数量为多个时,主节点可以将校验块分别写入各自对应的第二从节点。
需要说明的是,上述S506、S508可以按照设定顺序先后执行,然后执行S510。在一些实施例中, 上述S506、S508可以并行执行,然后执行S510。在另一些实施例中,S506和S510也可以并行执行,例如可以先执行S508获得校验块后,将数据块和校验块并行写入相应的节点。本申请实施例对上述S506、S508、S510的顺序不作限制。
还需要说明的是,上述S502至S510为本申请实施例的可选步骤,执行本申请实施例的数据处理方法也可以不执行上述步骤。例如,本申请实施例的数据处理方法可以直接执行以下步骤,从而对EC条带进行更新。下面进行详细说明。
S511:主节点获取请求2。
请求2用于更新EC条带中的数据块,例如请求2用于将EC条带中的第一数据块更新为第二数据块。请求2中包括第二数据块。在一些实施例中,请求2中还可以包括第一数据块的逻辑地址,以用于快速寻址第一数据块。
S512:主节点根据请求2,确定第一数据块。
第一数据块具体为请求2关联的数据块。具体地,主节点可以解析请求2,获得请求2中需要更新的数据块的逻辑地址,根据该逻辑地址确定第一数据块。
S514:主节点向第一数据块所在的第一从节点发送请求3。
S516:第一数据块所在的第一从节点将第一数据块更新为第二数据块。
S518:主节点接收第一数据块所在的第一从节点返回的第一数据块。
S520:主节点根据第一数据块和第二数据块确定校验块更新信息。
S522:主节点向校验块所在的第二从节点发送请求4。
S524:第二从节点根据请求4中的校验块更新信息,更新校验块。
在更新EC条带的场景中,请求2也可以称作第一请求,请求3、请求4也可以统称为处理请求,处理请求为主节点向从节点集合发送的请求,其中,请求3可以称作第二请求,请求4可以称作第三请求。在构建EC条带场景中,请求1也可以称作第四请求。
在图5的示例中,请求3和请求4用于指示将主节点更新数据块的操作部分卸载至第一从节点和第二从节点。下面对卸载过程进行详细说明。
主节点向第一从节点发送的请求3包括第二数据块,请求3具体用于指示第一从节点将第一数据块更新为第二数据块。考虑到更新EC条带中的第一数据块时,校验块也会相应发生变化,主节点可以根据请求3,读出第一数据块,以用于计算新的校验块。
需要说明,请求3可以为更新请求,更新请求的返回值为第一数据块,如此,第一数据块所在的第一从节点可以在更新第一数据块时,读出第一数据块,然后写入第二数据块,此外,第一从节点还可以向主节点返回第一数据块。如此,通过一次更新操作(具体为读写操作)实现写入第二数据块,读取第一数据块。在一些可能的实现方式中,主节点也可以额外发送一个请求,以读取第一数据块,用于计算校验块更新信息。
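A sketch of such an update request handler on the first slave node is shown below (the storage layer is reduced to a dictionary and is purely illustrative): the node first reads the old block at the target address, then persists the new block, and returns the old block as the response, so a single round trip replaces a separate read plus a separate write.

```python
def handle_update_request(disk, address, new_block):
    """Update op on the first slave node: write the new data, return the old data."""
    old_block = disk.get(address, b"")   # read the original block first
    disk[address] = new_block            # then persist the new data block
    return old_block                     # returned so the caller can compute the parity delta

disk = {0x1000: b"old-D1"}
assert handle_update_request(disk, 0x1000, b"new-D1") == b"old-D1"
assert disk[0x1000] == b"new-D1"
```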
主节点接收到第一数据块所在的第一从节点返回的第一数据块,可以根据第一数据块、第二数据块,通过EC算法,确定校验块更新信息。例如,主节点可以根据第一数据块、第二数据块,采用公式(1)或公式(2)确定校验块更新信息。
主节点向第二从节点发送的请求4包括校验块更新信息。请求4具体用于更新校验块。第二从节点可以根据请求4中的校验块更新信息,更新校验块。例如,第二从节点可以读取校验块,根据校验块和校验块更新信息确定新的校验块,然后存储新的校验块,从而实现更新校验块。
区别于传统方法中将校验块读取到主节点,主节点根据第一数据块、第二数据块和校验块计算新的校验块,然后下发新的校验块至校验块存储节点进行更新,本申请实施例将更新数据块的操作部分卸载至第一从节点和第二从节点,具体是将更新数据块的操作中更新校验块的过程分解为两步,由不同节点完成。
具体地,主节点可以完成前一步,具体是根据第一数据块和第二数据块计算校验块更新信息,然后向校验块存储节点下发校验块更新信息,校验块存储节点完成后一步,具体是根据校验块更新信息更新校验块。
为了便于理解,下面结合一具体示例进行说明。
如图8所示,主节点完成校验块更新信息P″=α0(D′256-D256)和校验块更新信息Q″=β0(D′256-D256)的计算,然后主节点将校验块更新信息P″和校验块更新信息Q″下推至校验块存储节点。校验块存储节点完成新的校验块P′=P″+P和新的校验块Q′=Q″+Q的计算。而且在主节点将D′256和D256新旧数据的先读后写操作变成一次更新操作。具体地,在更新请求中自带要写入的数据,但在写入之前先读出原地址的数据作为请求的返回值。然后将数据写入磁盘后,再将读出的数据返回到主节点。
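The two halves of this partial offload can be sketched as follows (integer-valued blocks and coefficient values are, as before, assumptions): the master derives only the update information P″ and Q″ from the old and new data blocks, and each check block storage node folds that delta into the parity it already holds, so no parity block ever travels to the master.

```python
ALPHA_0 = 3   # hypothetical coefficient for parity P
BETA_0 = 5    # hypothetical coefficient for parity Q

def master_compute_update_info(old_block, new_block):
    """Runs on the master node: only the deltas P'' and Q'' are computed and sent out."""
    delta = new_block - old_block
    return {"P": ALPHA_0 * delta, "Q": BETA_0 * delta}

def parity_node_apply(local_parity, update_info):
    """Runs on the check block storage node: P' = P'' + P (and likewise Q' = Q'' + Q)."""
    return local_parity + update_info

info = master_compute_update_info(old_block=7, new_block=9)
assert parity_node_apply(21, info["P"]) == ALPHA_0 * 9   # same result as recomputing P
assert parity_node_apply(35, info["Q"]) == BETA_0 * 9    # same result as recomputing Q
```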
还需要说明的是,执行本申请实施例的数据处理方法也可以不执行上述S516至S518。例如,请求3和请求4可以用于指示将主节点更新数据块的操作全部卸载至第一数据块所在的第一从节点和校验块所在的第二从节点。
例如,第一数据块所在的第一从节点(即更新节点)可以直接根据读出的第一数据块和第二数据块计算校验块更新信息。主节点可以通过第一从节点发送请求4,由该第一从节点将校验块更新信息携带在请求4中,下推至校验块所在的第二从节点,以使第二从节点根据请求4中的校验块更新信息计算新的校验块,从而实现校验块更新。
为了便于理解,下面仍以更新EC条带中的数据块D256进行示例说明。
如图9所示,第一从节点完成校验块更新信息P″=α0(D′256-D256)和校验块更新信息Q″=β0(D′256-D256)的计算,然后第一从节点将校验块更新信息P″和校验块更新信息Q″下推至校验块存储节点。第二从节点完成新的校验块P′=P″+P和新的校验块Q′=Q″+Q。而且在主节点只用得到操作的结果即可,不使用数据无需读到主节点,减少数据传输。在第一从节点将D′256和D256新旧数据的先读后写操作变成一次更新操作。具体地,在更新请求中自带要写入的数据,但在写入之前先读出原地址的数据作为请求的返回值。然后将数据写入磁盘后,再将读出的数据返回到主节点。
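For the fully offloaded variant, a sketch of the first slave node's role is given below (the messaging helper is a hypothetical stand-in for the third request): after swapping in the new data block with one combined update, it computes the parity deltas itself and pushes them straight to the check block storage nodes, so the master only receives the completion result.

```python
ALPHA_0 = 3   # hypothetical coefficient for parity P
BETA_0 = 5    # hypothetical coefficient for parity Q

def update_node_offload(disk, address, new_block, send_to_parity):
    """Runs on the first slave node that holds the data block being overwritten."""
    old_block = disk.get(address, 0)
    disk[address] = new_block                 # one combined read-then-write update
    delta = new_block - old_block
    send_to_parity("P", ALPHA_0 * delta)      # push P'' directly to the P storage node
    send_to_parity("Q", BETA_0 * delta)       # push Q'' directly to the Q storage node
    return "ok"                               # only the outcome goes back to the master

parities = {"P": ALPHA_0 * 7, "Q": BETA_0 * 7}
def send_to_parity(kind, delta):              # stand-in for the third request
    parities[kind] += delta

disk = {0x2000: 7}
assert update_node_offload(disk, 0x2000, 9, send_to_parity) == "ok"
assert parities == {"P": ALPHA_0 * 9, "Q": BETA_0 * 9}
```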
以上为本申请实施例中主节点向从节点集合发送处理请求,从节点集合根据处理请求更新第一数据块和校验块的一些具体实现方式,在本申请实施例其他可能的实现方式中,主节点、从节点也可以通过其他方法步骤更新第一数据块和校验块。
在一些可能的实现方式中,主节点还可以接收请求5,请求5可以是读请求,然后主节点可以根据读请求,读取目标数据块。需要说明,在EC条带查询场景中,请求5也可以称作第五请求。其中,数据块采用列存方式时,主节点可以按列读取目标数据块。具体地,读请求可以包括起始地址,进一步地,读请求中还可以包括读数据的长度,主节点可以根据上述起始地址,从数据块存储节点中确定目标节点,然后主节点可以从目标节点按列读取目标数据块。
如此,在读数据时,只用在一个节点读取一次硬盘即可读取所需的数据,减少了读放大。以图6进行示例说明,如果想从起始地址读取8KB或者16KB的数据,虽然还是数据块D0、数据块D1、数据块D2、数据块D3,但只用在一台机器的同一块磁盘里完成,减少了跨盘读数据的次数。
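A sketch of locating the target node for such a column read is given below (the address-to-column mapping mirrors the earlier placement example and is an assumption): because adjacent addresses sit in the same column, an 8 KB or 16 KB read starting at address 0 resolves to one node and one contiguous on-disk range.

```python
BLOCK_SIZE = 4096
COLUMN_DEPTH = 256                        # blocks per column, as in the earlier example
COLUMN_BYTES = BLOCK_SIZE * COLUMN_DEPTH

def locate_column_read(start_address, length):
    """Return (node index, offset inside that node's column, bytes served by that node)."""
    node = start_address // COLUMN_BYTES
    offset = start_address % COLUMN_BYTES
    readable = min(length, COLUMN_BYTES - offset)
    return node, offset, readable

# Reading 16 KB from address 0 touches only node 0, as one contiguous on-disk range.
assert locate_column_read(0, 16 * 1024) == (0, 0, 16 * 1024)
```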
基于上述内容描述,本申请实施例提供了一种数据处理方法。该方法通过将更新EC条带时对校验块的更新流程分为本地校验、远端更新,将原来由主节点计算出新的校验块发送至校验块所在的校验块存储节点进行更新的过程,优化为由主节点或更新节点计算校验块更新信息,校验块所在的校验块存储节点根据校验块更新信息生成新的校验块并写入该新的校验块。如此,避免了主节点或更新节点等从校验块存储节点读取校验块,减少了读操作的次数,降低了网络传输开销,保障了系统性能。进一步地,该方法支持写数据时将行存转为列存,如此在读数据时,可以实现在一台机器的同一块磁盘里完成,减少了跨盘读数据的次数,提升了读性能。
以上结合图1至图9对本申请提供的数据处理方法进行介绍,接下来结合附图对本申请提供的数据处理装置、数据处理系统的功能以及实现该数据处理装置、数据处理系统的计算设备或计算设备集群进行介绍。
首先,参见图10,示出了一种数据处理装置的结构示意图,数据处理装置1000可以部署于分布式存储系统中主节点,装置1000包括:
获取单元1002,用于获取第一请求,所述第一请求用于更新纠删码EC条带中的数据块;
确定单元1004,用于根据所述第一请求确定所述第一数据块,所述第一数据块为所述第一请求关联的数据块;
通信单元1006,用于向从节点集合发送处理请求,所述从节点集合包括所述分布式存储系统中至少一个从节点,所述处理请求用于指示将所述主节点更新数据块的操作卸载至所述从节点集合中的一个或多个从节点。
应理解的是,本发明本申请实施例的装置1000可以通过中央处理单元(central processing unit,CPU)实现,也可以通过专用集成电路(application-specific integrated circuit,ASIC)实现,或可编程逻辑器件(programmable logic device,PLD)实现,上述PLD可以是复杂程序逻辑器件(complex programmable logical device,CPLD),现场可编程门阵列(field-programmable gate array,FPGA),通用阵列逻辑(generic array logic,GAL)、数据处理单元(data processing unit,DPU)、片上系统(system on chip,SoC)或其任意组合。也可以通过软件实现图5至图9所示的数据处理方法时,装置1000及其各个模块也可以为软件模块。
在一些可能的实现方式中,所述通信单元1006具体用于:
向第一从节点发送包括所述第二数据块的第二请求;
接收所述第一从节点将所述第一数据块更新为第二数据块所返回的所述第一数据块;
所述确定单元1004还用于:
根据所述第一数据块和所述第二数据块确定校验块更新信息;
所述通信单元1006具体用于:
向第二从节点发送包括所述校验块更新信息的第三请求,所述校验块更新信息用于更新所述校验块。
在一些可能的实现方式中,所述通信单元1006具体用于:
向第一从节点发送包括所述第二数据块的第二请求,所述第二请求用于指示所述第一从节点将所述第一数据块更新为所述第二数据块,以及根据所述第一数据块和所述第二数据块确定校验块更新信息;
通过所述第一从节点向第二从节点发送包括所述校验块更新信息的第三请求,所述校验块更新信息用于更新所述校验块。
在一些可能的实现方式中,所述第一数据块存储在第一从节点,所述主节点与所述第一从节点为同一节点;或者,所述EC条带中的校验块存储在第二从节点,所述主节点与所述第二从节点为同一节点。
在一些可能的实现方式中,所述获取单元1002还用于:
所述获取第一请求之前,获取包括数据流的第四请求;
所述装置1000还包括:
读写单元1008,用于将所述数据流中的数据分块得到多个数据块,将所述多个数据块按列写入所述分布式存储系统中的数据块存储节点,所述数据块存储节点包括所述主节点和所述第一从节点;
所述读写单元1008,还用于根据所述多个数据块中每组数据块计算校验块,将所述校验块写入所述分布式存储系统中的校验块存储节点,所述校验块存储节点包括所述第二从节点;
其中,所述多个数据块无法写满至少一个EC条带时,所述读写单元具体用于对所述至少一个EC条带中无数据的分片执行空操作。
在一些可能的实现方式中,所述获取单元1002还用于:
获取包括起始地址的第五请求;
所述读写单元1008还用于:
根据所述起始地址确定目标节点,按列读取所述目标数据块。
由于图10所示的数据处理装置1000对应于图5、图8、图9所示的方法,故图10所示的数据处理装置1000的具体实现方式及其所具有的技术效果,可以参见前述实施例中的相关之处描述,在此不做赘述。
然后,参见图11,示出了一种数据处理系统的结构示意图,数据处理系统1100包括第一数据处理装置1000A和第二数据处理装置1000B,第一数据处理装置1000A部署于分布式存储系统中主节点,第二数据处理装置1000B部署于分布式存储系统中从节点。
第一数据处理装置1000A,用于:获取第一请求,所述第一请求用于更新纠删码EC条带中的第一数据块,根据所述第一请求确定所述第一数据块,所述第一数据块为所述第一请求关联的数据块,向从节点集合发送处理请求,所述从节点集合包括所述分布式存储系统中至少一个从节点,所述处理请求用于 指示将所述主节点更新数据块的操作卸载至所述从节点集合中的一个或多个从节点;
所述第二数据处理装置1000B,用于根据所述处理请求更新所述第一数据块和校验块。
在一些可能的实现方式中,所述第一数据处理装置1000A,具体用于:
向第一从节点发送包括所述第二数据块的第二请求;
所述第一从节点上的第二数据处理装置1000B,具体用于:
将所述第一数据块更新为第二数据块,并返回所述第一数据块;
所述第一数据处理装置1000A还用于:
根据所述第一数据块和所述第二数据块确定校验块更新信息;
所述第一数据处理装置1000A具体用于:
向第二从节点发送包括所述校验块更新信息的第三请求;
所述第二从节点上的第二数据处理装置1000B,具体用于:
根据所述校验块更新信息更新校验块。
在一些可能的实现方式中,所述第一数据处理装置1000A,具体用于:
向第一从节点发送包括所述第二数据块的第二请求;
所述第一从节点上的第二数据处理装置1000B,具体用于:
将所述第一数据块更新为第二数据块;
所述第一从节点上的第二数据处理装置1000B还用于:
根据所述第一数据块和所述第二数据块确定校验块更新信息;
所述第一数据处理装置1000A,具体用于:
通过所述第一从节点向第二从节点发送包括所述校验块更新信息的第三请求;
所述第二从节点上的第二数据处理装置1000B,具体用于:
根据所述校验块更新信息更新校验块。
在一些可能的实现方式中,所述第二请求为更新请求,所述更新请求用于指示所述第一从节点将所述第一数据块更新为所述第二数据块,并返回所述第一数据块。
由于图11所示的数据处理系统1100对应于图5、图8、图9所示的方法,故图11所示的数据处理系统1100的具体实现方式及其所具有的技术效果,可以参见前述实施例中的相关之处描述,在此不做赘述。
图12为本申请提供的一种计算设备1200的硬件结构图,该计算设备1200可以是前述主节点,用于实现数据处理装置1000的功能。该计算设备1200可以是服务器或终端设备。终端设备包括但不限于台式机、笔记本电脑、平板电脑或智能手机。
如图12所示,计算设备1200包括:总线1202、处理器1204、存储器1206和通信接口1208。处理器1204、存储器1206和通信接口1208之间通过总线1202通信。应理解,本申请不限定计算设备1200中的处理器、存储器的个数。
总线1202可以是外设部件互连标准(peripheral component interconnect,PCI)总线或扩展工业标准结构(extended industry standard architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,图12中仅用一条线表示,但并不表示仅有一根总线或一种类型的总线。总线1202可包括在计算设备1200各个部件(例如,存储器1206、处理器1204、通信接口1208)之间传送信息的通路。
处理器1204可以包括中央处理器(central processing unit,CPU)、图形处理器(graphics processing unit,GPU)、微处理器(micro processor,MP)或者数字信号处理器(digital signal processor,DSP)等处理器中的任意一种或多种。
存储器1206可以包括易失性存储器(volatile memory),例如随机存取存储器(random access memory,RAM)。存储器1206还可以包括非易失性存储器(non-volatile memory),例如只读存储器(read-only memory,ROM),快闪存储器,机械硬盘(hard disk drive,HDD)或固态硬盘(solid state drive,SSD)。存储器1206中存储有可执行的程序代码,处理器1204执行该可执行的程序代码以实现前述数据处理方法。具体的,存储器1206上存有数据处理装置1000用于执行数据处理方法的指令。
通信接口1208使用例如但不限于网络接口卡、收发器一类的收发模块,来实现计算设备1200与其他 设备或通信网络之间的通信。
本申请还提供了一种计算设备集群。该计算设备集群包括至少一台计算设备。该计算设备可以是服务器,例如是中心服务器、边缘服务器,或者是本地数据中心中的本地服务器。在一些实施例中,计算设备也可以是台式机、笔记本电脑或者智能手机等终端设备。
如图13所示,所述计算设备集群包括至少一个计算设备1200。计算设备集群中的一个或多个计算设备1200中的存储器1206中可以存有相同的数据处理系统1100用于执行数据处理方法的指令。
在一些可能的实现方式中,该计算设备集群中的一个或多个计算设备1200也可以用于执行数据处理系统1100用于执行数据处理方法的部分指令。换言之,一个或多个计算设备1200的组合可以共同执行数据处理系统1100用于执行数据处理方法的指令。
需要说明的是,计算设备集群中的不同的计算设备1200中的存储器1206可以存储不同的指令,用于执行数据处理系统1100的部分功能。
本申请实施例还提供了一种计算机可读存储介质。所述计算机可读存储介质可以是计算设备能够存储的任何可用介质或者是包含一个或多个可用介质的数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘)等。该计算机可读存储介质包括指令,所述指令指示计算设备执行上述执行数据处理方法。
本申请实施例还提供了一种包含指令的计算机程序产品。所述计算机程序产品可以是包含指令的,能够运行在计算设备上或被储存在任何可用介质中的软件或程序产品。当所述计算机程序产品在至少一个计算设备上运行时,使得至少一个计算设备执行上述数据处理方法。
最后应说明的是:以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的保护范围。

Claims (15)

  1. A data processing method, wherein the method is performed by a master node in a distributed storage system and comprises:
    obtaining a first request, wherein the first request is used to update a data block in an erasure code (EC) stripe;
    determining the first data block according to the first request, wherein the first data block is a data block associated with the first request;
    sending a processing request to a slave node set, wherein the slave node set comprises at least one slave node in the distributed storage system, and the processing request is used to instruct that the operation of updating the data block by the master node be offloaded to one or more slave nodes in the slave node set.
  2. The method according to claim 1, wherein the sending a processing request to a slave node set comprises:
    sending, to a first slave node, a second request comprising the second data block;
    receiving the first data block returned by the first slave node after updating the first data block to the second data block;
    determining check block update information according to the first data block and the second data block;
    sending, to a second slave node, a third request comprising the check block update information, wherein the check block update information is used to update the check block.
  3. The method according to claim 1, wherein the sending a processing request to a slave node set comprises:
    sending, to a first slave node, a second request comprising the second data block, wherein the second request is used to instruct the first slave node to update the first data block to the second data block and to determine check block update information according to the first data block and the second data block;
    sending, through the first slave node, a third request comprising the check block update information to a second slave node, wherein the check block update information is used to update the check block.
  4. The method according to any one of claims 1 to 3, wherein
    the first data block is stored in a first slave node, and the master node and the first slave node are the same node; or,
    the check block in the EC stripe is stored in a second slave node, and the master node and the second slave node are the same node.
  5. The method according to any one of claims 1 to 4, wherein before the obtaining a first request, the method further comprises:
    obtaining a fourth request comprising a data stream;
    dividing data in the data stream into a plurality of data blocks, and writing the plurality of data blocks by column into data block storage nodes in the distributed storage system, wherein the data block storage nodes comprise the master node and a first slave node;
    calculating a check block according to each group of data blocks in the plurality of data blocks, and writing the check block into a check block storage node in the distributed storage system, wherein the check block storage node comprises a second slave node;
    wherein, when the plurality of data blocks cannot fill at least one EC stripe, a null operation is performed on chunks without data in the at least one EC stripe.
  6. The method according to claim 5, wherein the method further comprises:
    obtaining a fifth request comprising a start address;
    determining a target node according to the start address, and reading the target data block by column.
  7. A data processing apparatus, wherein the apparatus is deployed on a master node in a distributed storage system and comprises:
    an obtaining unit, configured to obtain a first request, wherein the first request is used to update a data block in an erasure code (EC) stripe;
    a determining unit, configured to determine the first data block according to the first request, wherein the first data block is a data block associated with the first request;
    a communication unit, configured to send a processing request to a slave node set, wherein the slave node set comprises at least one slave node in the distributed storage system, and the processing request is used to instruct that the operation of updating the data block by the master node be offloaded to one or more slave nodes in the slave node set.
  8. The apparatus according to claim 7, wherein the communication unit is specifically configured to:
    send, to a first slave node, a second request comprising the second data block;
    receive the first data block returned by the first slave node after updating the first data block to the second data block;
    the determining unit is further configured to:
    determine check block update information according to the first data block and the second data block;
    the communication unit is specifically configured to:
    send, to a second slave node, a third request comprising the check block update information, wherein the check block update information is used to update the check block.
  9. The apparatus according to claim 7, wherein the communication unit is specifically configured to:
    send, to a first slave node, a second request comprising the second data block, wherein the second request is used to instruct the first slave node to update the first data block to the second data block and to determine check block update information according to the first data block and the second data block;
    send, through the first slave node, a third request comprising the check block update information to a second slave node, wherein the check block update information is used to update the check block.
  10. The apparatus according to any one of claims 7 to 9, wherein
    the first data block is stored in a first slave node, and the master node and the first slave node are the same node; or,
    the check block in the EC stripe is stored in a second slave node, and the master node and the second slave node are the same node.
  11. The apparatus according to any one of claims 7 to 10, wherein the obtaining unit is further configured to:
    obtain, before the first request is obtained, a fourth request comprising a data stream;
    the apparatus further comprises:
    a read-write unit, configured to divide data in the data stream into a plurality of data blocks, and write the plurality of data blocks by column into data block storage nodes in the distributed storage system, wherein the data block storage nodes comprise the master node and the first slave node;
    the read-write unit is further configured to calculate a check block according to each group of data blocks in the plurality of data blocks, and write the check block into a check block storage node in the distributed storage system, wherein the check block storage node comprises the second slave node;
    wherein, when the plurality of data blocks cannot fill at least one EC stripe, the read-write unit is specifically configured to perform a null operation on chunks without data in the at least one EC stripe.
  12. The apparatus according to claim 11, wherein the obtaining unit is further configured to:
    obtain a fifth request comprising a start address;
    the read-write unit is further configured to:
    determine a target node according to the start address, and read the target data block by column.
  13. A computing device cluster, wherein the computing device cluster comprises at least one computing device, the at least one computing device comprises at least one processor and at least one memory, the at least one memory stores computer-readable instructions, and the at least one processor executes the computer-readable instructions, so that the computing device cluster performs the method according to any one of claims 1 to 6.
  14. A computer-readable storage medium, comprising computer-readable instructions, wherein the computer-readable instructions are used to implement the method according to any one of claims 1 to 6.
  15. A computer program product, comprising computer-readable instructions, wherein the computer-readable instructions are used to implement the method according to any one of claims 1 to 6.
PCT/CN2023/101259 2022-06-27 2023-06-20 一种数据处理方法及相关设备 WO2024001863A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210740423 2022-06-27
CN202210740423.1 2022-06-27
CN202211017671.XA CN117349075A (zh) 2022-06-27 2022-08-23 一种数据处理方法及相关设备
CN202211017671.X 2022-08-23

Publications (1)

Publication Number Publication Date
WO2024001863A1 true WO2024001863A1 (zh) 2024-01-04

Family

ID=89360122

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/101259 WO2024001863A1 (zh) 2022-06-27 2023-06-20 一种数据处理方法及相关设备

Country Status (2)

Country Link
CN (1) CN117349075A (zh)
WO (1) WO2024001863A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930103A (zh) * 2016-05-10 2016-09-07 南京大学 一种分布式存储ceph的纠删码覆盖写方法
US20210096954A1 (en) * 2019-09-30 2021-04-01 Dell Products L.P. Method and system for replica placement in a linked node system
US20210097034A1 (en) * 2019-09-30 2021-04-01 Dell Products L.P. Method and system for efficient updating of data in a linked node system
US20210096952A1 (en) * 2019-09-30 2021-04-01 Dell Products L.P. Method and system for erasure coded data placement in a linked node system
WO2021189905A1 (zh) * 2020-10-20 2021-09-30 平安科技(深圳)有限公司 分布式数据调取方法、装置、电子设备及存储介质


Also Published As

Publication number Publication date
CN117349075A (zh) 2024-01-05

Similar Documents

Publication Publication Date Title
US11119668B1 (en) Managing incompressible data in a compression-enabled log-structured array storage system
US10754550B2 (en) Optimized data placement for individual file accesses on deduplication-enabled sequential storage systems
US9934194B2 (en) Memory packet, data structure and hierarchy within a memory appliance for accessing memory
US11636089B2 (en) Deferred reclamation of invalidated entries that are associated with a transaction log in a log-structured array
CN102609361B (zh) 虚拟机存储数据迁移方法和装置
US11886729B2 (en) Data storage method and apparatus
US11487460B2 (en) Deferred reclamation of invalidated entries associated with replication in a log-structured array
US20140089562A1 (en) Efficient i/o processing in storage system
WO2021213281A1 (zh) 数据读取方法和系统
CN107423425B (zh) 一种对k/v格式的数据快速存储和查询方法
US20230236971A1 (en) Memory management method and apparatus
CN114327278A (zh) 数据的追加写方法、装置、设备以及存储介质
US20240070120A1 (en) Data processing method and apparatus
CN109375868B (zh) 一种数据存储方法、调度装置、系统、设备及存储介质
WO2022007225A1 (zh) 数据存储方法、存储系统、存储设备及存储介质
WO2023020136A1 (zh) 存储系统中的数据存储方法及装置
WO2023050856A1 (zh) 数据处理方法及存储系统
WO2024001863A1 (zh) 一种数据处理方法及相关设备
WO2023000686A1 (zh) 存储系统中的数据存储方法以及装置
WO2024021470A1 (zh) 一种跨区域的数据调度方法、装置、设备及存储介质
CN116594551A (zh) 一种数据存储方法及装置
CN116566396A (zh) 数据压缩方法、装置、存储介质、设备集群及程序产品
CN114265791A (zh) 一种数据调度方法、芯片以及电子设备
WO2023279833A1 (zh) 一种数据处理方法及装置
US20230244635A1 (en) Performance of file system operations

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23830043

Country of ref document: EP

Kind code of ref document: A1