WO2021046693A1 - Data processing method and apparatus in a storage system, and storage system - Google Patents

Data processing method and apparatus in a storage system, and storage system

Info

Publication number
WO2021046693A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
metadata
storage node
data block
check
Prior art date
Application number
PCT/CN2019/104981
Other languages
English (en)
French (fr)
Inventor
王道辉
宋驰
王同雷
湛云
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN201980092966.3A (CN113544635A)
Priority to PCT/CN2019/104981 (WO2021046693A1)
Priority to EP19945241.8A (EP3971701A4)
Publication of WO2021046693A1
Priority to US17/569,908 (US20220129346A1)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/108Parity data distribution in semiconductor storages, e.g. in SSD
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1471Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/835Timestamp

Definitions

  • This application relates to the field of data storage technology, and in particular to a data processing method, device and storage medium in a storage system.
  • the target device in the storage system may use erasure coding (EC) technology to write the business data to be written into the storage device on the corresponding storage node in the storage system in the form of stripes.
  • EC erasure coding
  • the process of writing data by a client may be as follows: the storage system client divides the business data to be written into data blocks according to the size of the stripe unit in the stripe, and generates check blocks for the data blocks according to the EC algorithm.
  • the client sends each data block and the metadata of the data block to the storage node that stores the data block, and sends each check block to the storage node that stores the check block.
  • after the storage node receives the data block and the metadata of the data block, it backs up the metadata of the data block to the storage node that stores the metadata of the data block according to the backup strategy.
  • when the metadata of the data block has been successfully backed up to the storage node that stores the metadata of the data block, the storage node writes the data block to its storage device at the storage location indicated by the metadata, and returns a write success response to the client.
  • the storage node has to wait for the metadata backup to succeed before writing the data block to the storage device, which makes the write operation take too long.
  • the embodiments of the present application provide a data processing method, a storage system, a computer device, and a storage medium in a storage system to overcome the problem of excessively long write operation time in related technologies.
  • this application provides a data processing method in a storage system, the storage system including a client, a data storage node, and a check storage node; the method includes:
  • the client sends the check block in the stripe and the metadata of the data block to the check storage node; wherein the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of the stripe unit in the stripe; the metadata of the data block contains the correspondence between the user access address of the data block and the identifier of the stripe unit that stores the data block in the stripe.
  • the client sends the data block and the metadata of the data block to the data storage node, and sends the metadata of the data block and the check block to the check storage node, so that the metadata of the data block is backed up on the check storage node, thereby reducing the time of the write operation.
  • the metadata of the data block further includes a time stamp.
  • the client obtains the timestamp of the data block;
  • the client determines the timestamp of a new data block according to that timestamp; wherein the new data block and the data block have the same user access address.
  • this application provides a data processing method in a storage system, the storage system including a client, a data storage node, a metadata storage node, and a check storage node; the method includes:
  • the data storage node receives a write request sent by the client; the write request includes a data block of a stripe to be written to a storage device in the data storage node and the metadata of the data block; wherein the check block of the stripe is stored on a storage device of the check storage node; the memory of the check storage node is also used to store the metadata of the data block and to record, in a log, the write operation of the metadata of the data block; the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of the stripe unit in the stripe; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit that stores the data block in the stripe;
  • the data storage node sends a write request response to the client; the write request response is used to indicate the completion of the write request operation.
  • the metadata of the data block is written to the check storage device when the client writes the check block to the check storage device, which improves the storage efficiency of the metadata copy of the data block, thereby reducing the write operation time.
  • the method further includes: after the data storage node sends the write request response to the client, the data storage node stores the metadata of the data block in the metadata storage node.
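The ordering described above (respond to the client first, store the metadata on the metadata storage node afterwards) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class `DataStorageNode`, its method names, the dictionary-based metadata layout, and the dictionary standing in for the metadata storage node are all hypothetical.

```python
class DataStorageNode:
    """Hypothetical data storage node that acknowledges a write before
    its metadata reaches the metadata storage node."""

    def __init__(self):
        self.device = {}            # stripe unit id -> data block bytes
        self.pending_metadata = []  # metadata not yet stored on the metadata storage node

    def handle_write(self, block, metadata):
        # Write the data block to the stripe unit named in the metadata.
        self.device[metadata["stripe_unit_id"]] = block
        # Defer the metadata store; the check storage node already holds a copy.
        self.pending_metadata.append(metadata)
        return {"status": "ok"}  # write request response

    def flush_metadata(self, metadata_node):
        """Store deferred metadata on the metadata storage node (a dict here).
        Returns the number of entries stored."""
        flushed = 0
        while self.pending_metadata:
            entry = self.pending_metadata.pop(0)
            metadata_node[entry["user_address"]] = entry
            flushed += 1
        return flushed
```

In this sketch the response is returned from `handle_write` without waiting for `flush_metadata`, which is the property the text attributes to the method.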
  • the metadata of the data block further includes a timestamp; the method further includes:
  • the data storage node notifies the check storage node to delete the metadata of old data blocks in the memory of the check storage node and the write operations of the metadata of the old data blocks recorded in the log; wherein the old data block and the data block contain the same user access address, and the timestamp of the old data block is earlier than the timestamp of the metadata of the data block.
  • the method further includes:
  • when the data storage node recovers from a failure, the data storage node obtains from the check storage node the write operations, recorded in the log, of the metadata of the data blocks containing the user access address;
  • the metadata of the data block with the latest timestamp is used as the metadata of the data block corresponding to the user access address.
  • the storage node selects, from the metadata of the data blocks with the same user access address, the metadata with the latest timestamp as the metadata of the latest data block, thereby ensuring data consistency.
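The recovery rule above (for each user access address, keep the metadata with the latest timestamp) can be sketched as follows. The `BlockMetadata` record and the function name `latest_metadata` are hypothetical, and the sketch assumes timestamps are integers where a larger value means a newer write.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockMetadata:
    user_address: int    # user access address, e.g. an LBA
    stripe_unit_id: str  # stripe unit that stores the data block
    timestamp: int       # assumption: larger value means newer


def latest_metadata(log_entries):
    """Return {user_address: newest BlockMetadata} from log entries
    obtained from the check storage node."""
    latest = {}
    for entry in log_entries:
        current = latest.get(entry.user_address)
        if current is None or entry.timestamp > current.timestamp:
            latest[entry.user_address] = entry
    return latest
```

The result does not depend on the order in which the log entries arrive, because the comparison is made on the timestamp alone.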
  • this application provides a data processing method in a storage system.
  • the storage system includes a client, a data storage node, a metadata storage node, and a check storage node; the method includes:
  • the check storage node receives a write request sent by the client; the write request includes the check block and the metadata of the data block written to a stripe of a storage device in the data storage node; wherein the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of the stripe unit in the stripe; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit that stores the data block in the stripe;
  • the check storage node caches the metadata of the data block in its memory and records the write operation of the metadata of the data block in a log.
  • the metadata of the data block further includes a time stamp.
  • the method further includes:
  • the check storage node receives a notification sent by the data storage node; the notification is used to instruct the check storage node to delete the metadata of old data blocks in the memory of the check storage node and the write operations of that metadata recorded in the log;
  • the check storage node deletes, according to the notification, the metadata of the old data block in the memory and the write operation of the metadata of the old data block recorded in the log.
  • the method further includes:
  • the check storage node receives a log acquisition request sent by the data storage node; the log acquisition request is used to acquire, from the check storage node, the write operations recorded in the log of the metadata of the data block containing the user access address;
  • the check storage node sends to the data storage node the write operations of the metadata of the data block containing the user access address.
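The three behaviours of the check storage node described above (cache metadata and log the write, delete old entries on notification, answer a log acquisition request) can be sketched as follows. The class `CheckNodeMetadataStore`, its method names, and the tuple layout of a log entry are assumptions made for illustration; the source does not specify a log format.

```python
class CheckNodeMetadataStore:
    """Hypothetical in-memory metadata cache plus write log on a check storage node."""

    def __init__(self):
        self.memory = {}  # (user_address, timestamp) -> stripe_unit_id
        self.log = []     # entries: ("write", user_address, timestamp, stripe_unit_id)

    def record_write(self, user_address, timestamp, stripe_unit_id):
        # Cache the metadata and log the metadata write operation.
        self.memory[(user_address, timestamp)] = stripe_unit_id
        self.log.append(("write", user_address, timestamp, stripe_unit_id))

    def delete_older_than(self, user_address, timestamp):
        # Notification from the data storage node: drop metadata of old data
        # blocks (same user access address, earlier timestamp) from memory and log.
        stale = [k for k in self.memory if k[0] == user_address and k[1] < timestamp]
        for key in stale:
            del self.memory[key]
        self.log = [e for e in self.log
                    if not (e[1] == user_address and e[2] < timestamp)]

    def log_entries_for(self, user_address):
        # Log acquisition request: return logged writes for one user access address.
        return [e for e in self.log if e[1] == user_address]
```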
  • the present application provides a storage system, the storage system including a data storage node and a check storage node;
  • the data storage node is used for:
  • the first write request includes a data block of a stripe to be written to a storage device in the data storage node and the metadata of the data block; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit that stores the data block in the stripe;
  • the write request response is used to indicate completion of the first write request;
  • the check storage node is used to receive a second write request sent by the client; the second write request includes the metadata of the data block and the check block; wherein the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of the stripe unit in the stripe;
  • the metadata of the data block is cached in the memory of the check storage node, and the write operation of the metadata of the data block is recorded in a log.
  • the data storage node is further configured to store metadata of the data block in the metadata storage node after sending a write request response to the client.
  • the metadata of the data block further includes a timestamp
  • the data storage node is also used to notify the check storage node to delete the metadata of the old data block in the memory of the check storage node and the write operation of the metadata of the old data block recorded in the log; wherein the old data block and the data block contain the same user access address, and the timestamp of the old data block is earlier than the timestamp of the metadata of the data block.
  • the check storage node is further configured to receive the notification sent by the data storage node, and to delete, according to the notification, the metadata of the old data block in the memory and the write operation of the metadata of the old data block recorded in the log.
  • the present application provides a client; the client includes a processor and an interface, the processor communicates with the interface, and the processor is configured to perform the operations of the method provided in the first aspect or any optional manner of the first aspect.
  • the present application provides a data storage node; the data storage node includes a processor and an interface, the processor communicates with the interface, and the processor is configured to perform the operations of the method provided in the second aspect or any optional manner of the second aspect.
  • the present application provides a check storage node; the check storage node includes a processor and an interface, the processor communicates with the interface, and the processor is configured to perform the operations of the method provided in the third aspect or any optional manner of the third aspect.
  • the present application provides a data processing device in a storage system, which is used to execute the above-mentioned data processing method in the storage system.
  • the data processing apparatus in the storage system includes a module for executing the data processing method in the storage system provided in the foregoing first aspect or any optional manner of the foregoing first aspect.
  • the present application provides a data processing device in a storage system, which is used to execute the above-mentioned data processing method in the storage system.
  • the data processing device in the storage system includes a module for executing the data processing method in the storage system provided in the foregoing second aspect or any optional manner of the foregoing second aspect.
  • the present application provides a data processing device in a storage system, which is used to execute the above-mentioned data processing method in the storage system.
  • the data processing device in the storage system includes a module for executing the data processing method in the storage system provided in the third aspect or any optional manner of the third aspect.
  • the present application provides a storage medium; the storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations of the method provided in the first aspect or any optional manner of the first aspect.
  • the present application provides a storage medium; the storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations of the method provided in the second aspect or any optional manner of the second aspect.
  • the present application provides a storage medium; the storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations of the method provided in the third aspect or any optional manner of the third aspect.
  • FIG. 1 is a schematic diagram of a storage system provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a data processing method in a storage system according to an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a data processing device in a storage system provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a data processing device in a storage system provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a data processing device in a storage system provided by an embodiment of the present application.
  • FIG. 1 is a schematic diagram of a storage system provided by an embodiment of the present application.
  • the storage system includes a client 101, a management device 102, and a storage node 103.
  • the storage system can store the user's business data in the form of stripes.
  • a stripe can include N+M blocks, of which N are data blocks and M are check blocks; a data block is used to store the user's business data (the service data to be written), and a check block is used to store the check data of the business data, where N and M are positive integers. Both data blocks and check blocks are called blocks.
  • the client 101 is responsible for writing and reading business data.
  • the client can request the management device 102 to allocate a stripe.
  • the stripe includes a plurality of stripe units, and each stripe unit is used to store a block. The stripe units of a stripe form an erasure coding (EC) relationship and all have the same size.
  • each stripe unit has a stripe unit identifier.
  • the management device 102 records the correspondence between the identifier of each stripe unit of a stripe and the storage device in the storage node 103 that provides the storage space of that stripe unit.
  • the client 101 requests a stripe from the management device 102; the management device 102 allocates a stripe to the client and provides the identifiers of the stripe units in the stripe; the client divides the business data into data blocks according to the size of the stripe unit, and generates check blocks for the data blocks according to the EC algorithm.
  • the data block and the check block have the same size, which is equal to the size of the stripe unit.
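The size relationship above can be illustrated with the simplest possible check algorithm. The source does not name a specific EC algorithm, so the sketch below swaps in single-parity XOR (the M = 1 case) purely as a stand-in; the function names `xor_parity` and `rebuild_missing` are hypothetical. It shows that the check block has the same size as each data block, and that one lost data block can be rebuilt from the survivors plus the check block.

```python
def xor_parity(blocks):
    """Return one check block that is the byte-wise XOR of equally sized blocks."""
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the stripe unit size")
    parity = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def rebuild_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the check block."""
    return xor_parity(list(surviving_blocks) + [parity])
```

A stripe with M greater than 1 needs a code that tolerates multiple losses, which plain XOR does not provide.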
  • the user sends service data to the client 101 together with the user access address corresponding to the service data.
  • the user access address may be a logical block address (LBA).
  • LBA logical block address
  • it may also be other identifiers that can be recognized by the user, which is not limited in the embodiment of the present application.
  • the user in the embodiment of the present application refers to a device that writes service data or reads service data.
  • the client 101 divides the service data into data blocks, and each data block has a corresponding user access address.
  • according to the stripe unit identifiers of the stripe allocated by the management device 102, the client 101 records the correspondence between the user access address of each data block and the identifier of the stripe unit storing that data block, that is, it records the metadata of the data block. The metadata of a block indicates the storage location of the block, and may include the storage location, on the storage node 103, of the stripe unit storing the block.
  • the storage system also includes a partition view, and the partition view is used to record the correspondence between the strip units of all stripes in a partition and the storage device of the storage node 103 that provides storage space for the strip units of the strip.
  • a specific implementation may be a correspondence between stripe unit identifiers and storage device identifiers. Therefore, after the management device 102 allocates a stripe, the client 101 can write each data block and check block into the storage device corresponding to its stripe unit identifier.
  • the storage device in the embodiment of the present application may be a mechanical hard disk or a solid-state hard disk.
  • the business data may include block data, file data, or object data, etc.
  • the specific content of the business data is not limited in the embodiment of the present application.
  • the storage node 103 used to provide a storage device and store data partitions is referred to as a data storage node
  • the storage node 103 used to provide storage equipment and store check partitions is referred to as a check storage node.
  • an embodiment of the present application provides a data writing method.
  • the client 101 sends a stripe allocation request to the management device 102, and the stripe allocation request is used to indicate the request for stripe allocation.
  • the management device 102 returns the stripe information to the client 101 based on the stripe allocation request, and the stripe information carries the identification of the stripe unit of the stripe.
  • the client 101 receives the strip information returned by the management device 102.
  • FIG. 2 is a schematic structural diagram of a computer device provided in an embodiment of the present application.
  • the processor may be a central processing unit (CPU).
  • the processor 201 is configured to execute various method embodiments and provide methods executed by corresponding devices.
  • the interface 202 may be a network interface card or a host bus adapter, etc., which is not limited in the embodiment of the present application.
  • the processor 201 communicates with the interface 202.
  • the computer device 200 may also include a hard disk, such as a mechanical disk or a solid state drive (SSD), and the computer device 200 may also include other components for implementing device functions, which will not be repeated here.
  • SSD solid state drive
  • a computer-readable storage medium such as a memory including instructions, which can be executed by a processor in a computer device to complete the data reconstruction method in the following embodiments.
  • the computer-readable storage medium may be read-only memory (ROM), random access memory (RAM), compact disc read-only memory (CD-ROM), magnetic tape, a floppy disk, an optical data storage device, etc.
  • the client 101 sends a stripe allocation request to the management device 102, where the stripe allocation request is used to indicate a request for stripe allocation.
  • This strip is used to store the business data to be written.
  • the client 101 may divide the service data to be written into data blocks based on the strips allocated by the management device 102, and generate check blocks of the data blocks according to the EC algorithm. According to the partition view, the client 101 writes the data partition and the check partition into the storage device of the storage node 103 corresponding to the stripe unit, respectively.
  • the strip allocation request can carry the device identification of the client 101 so that the management device 102 can return strip information to the client 101 corresponding to the device identification based on the device identification.
  • the device identifier of the client can be the internet protocol (IP) address of the client 101, or the media access control (MAC) address of the client 101.
  • IP internet protocol address
  • MAC media access control
  • the embodiment of the present application does not specifically limit the device identifier.
  • N and M in the embodiment of the present application can have multiple values, so the storage system of the embodiment of the present application has multiple EC modes. Therefore, the client 101 requesting the management device 102 to allocate a strip may also carry EC mode information.
  • the embodiment of the present application does not specifically limit the values of M and N.
  • the management device 102 allocates strips to the client 101 based on the received strip allocation request, and returns strip information to the client, where the strip information includes the strip unit identifier of the strip.
  • the strip unit identifier of the strip is used to uniquely indicate the strip unit contained in the strip.
  • the stripe unit identifier can be a number randomly assigned by the management device 102 to the stripe unit, or an identifier assigned by the management device 102 based on a preset stripe unit identifier allocation rule.
  • the stripe unit identifier may also be assigned by other devices of the storage system, which is not limited in the embodiment of the present application.
  • the embodiment of the present application also does not specifically limit the preset strip unit identification allocation rules, the allocation method of the strip unit identification, and the representation manner of the strip unit identification.
  • the client 101 obtains the stripe unit identifiers from the received stripe information.
  • the client 101 divides the service data to be written into data blocks, and generates check blocks of the data blocks according to the data blocks in the stripe and the EC algorithm.
  • the client 101 divides the service data to be written into data blocks of the strip unit size according to the strip unit size of the strips, and generates check blocks of the data blocks according to the EC algorithm.
  • the client 101 records the metadata of the data segment, that is, the correspondence between the user access address of the data segment and the strip unit identifier of the strip unit corresponding to the storage data segment.
  • the stripe unit storing data blocks is called a data stripe unit
  • the stripe unit storing check blocks is called a check stripe unit.
  • the process of dividing the data into blocks can be any one of the following process 1 to process 3.
  • the client 101 may divide the business data to be written into data blocks according to the block size.
  • the client 101 may divide the business data to be written into data blocks according to the block size, and the number of divided data blocks may be smaller than the number of data stripe units of the stripe. The missing data blocks can be filled with zeros or left empty, which is not specifically limited in the embodiment of the present application.
  • the client 101 may divide the business data to be written into data blocks according to the block size. For the remaining service data to be written, the client 101 may execute step 301, so that the management device 102 redistributes stripes.
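The splitting cases above can be sketched in one function, under the assumption that missing data blocks are filled with zeros (the source also allows leaving them empty) and that data exceeding one stripe is carried over to a further stripe. The function name `split_into_stripes` and its parameters are hypothetical.

```python
def split_into_stripes(data, unit_size, n_data_units):
    """Split data into stripes of n_data_units data blocks, each unit_size bytes.

    A final partial stripe is padded with zero bytes; data beyond one stripe
    goes to the next stripe (which the client would request separately).
    """
    if unit_size <= 0 or n_data_units <= 0:
        raise ValueError("unit_size and n_data_units must be positive")
    stripe_bytes = unit_size * n_data_units
    stripes = []
    for start in range(0, len(data), stripe_bytes):
        # Pad the last chunk with zeros up to a full stripe of data blocks.
        chunk = data[start:start + stripe_bytes].ljust(stripe_bytes, b"\x00")
        stripes.append([chunk[i:i + unit_size]
                        for i in range(0, stripe_bytes, unit_size)])
    return stripes
```

For example, 8 bytes with a 2-byte stripe unit and 3 data stripe units fill one stripe completely and leave one real block plus two zero-filled blocks in a second stripe.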
  • the client 101 adds a time stamp to the metadata of the data segment, where the time stamp is used to indicate the time when the data segment is written.
  • the time stamp is the time stamp of the data block, that is, the writing time of the data block.
  • the time for writing the data block may be the time when the client 101 sends the data block to the storage device that provides storage space for the data strip unit corresponding to the data block, or the time when the user writes data to the storage system.
  • the timestamp may be a sequence number or version number that the client assigns to data blocks with the same user access address, or a specific storage system clock time, etc.
  • the embodiment of the present application does not specifically limit the time stamp.
  • the user will modify the data block, that is, the user modifies the data block of the same user access address. Based on the timestamp, it is possible to determine which data block is the latest data block.
  • the process of determining the latest data block can be: the client obtains the timestamp of the data block, and determines the timestamp of the new data block according to that timestamp; the new data block has the same user access address as the data block.
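One of the timestamp forms mentioned above, a version number per user access address, can be sketched as follows. The class `VersionStamper` and its method name are hypothetical, and starting the count at 1 is an arbitrary choice for the sketch.

```python
class VersionStamper:
    """Hypothetical client-side helper that issues a growing version number
    per user access address, used as the timestamp in data block metadata."""

    def __init__(self):
        self._last = {}  # user_address -> last version issued

    def next_timestamp(self, user_address):
        # The new data block's timestamp is derived from the previous one
        # for the same user access address.
        version = self._last.get(user_address, 0) + 1
        self._last[user_address] = version
        return version
```

Because each address has its own counter, two versions are only comparable when they belong to the same user access address, which matches how the text uses the timestamp.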
  • the client 101 sends the data block and the metadata of the data block to the data storage device that stores the data block, and sends the check block and the metadata of the data block to the check storage device that stores the check block; the metadata of the data block contains a timestamp.
  • the relationship between the stripe unit of the stripe and the storage device is implemented by a distributed algorithm.
  • the strip units at the same position of multiple strips belonging to the same partition are located in the same storage device.
  • for example, the first stripe units of stripe 1 and stripe 2, which belong to the same partition, are located in the same storage device; that is, the same storage device provides their storage space. Therefore, after a data block with a given user access address is modified, the modified data block is located in a different stripe of the same partition, but still on the same storage device as the data block before the modification.
  • for example, the stripe includes 5 stripe units: 3 data stripe units and 2 check stripe units, namely data stripe unit 1, data stripe unit 2, data stripe unit 3, check stripe unit 1, and check stripe unit 2.
  • each block is sent to its corresponding storage device.
  • data block 1 is stored in data stripe unit 1, data block 2 in data stripe unit 2, data block 3 in data stripe unit 3, check block 1 in check stripe unit 1, and check block 2 in check stripe unit 2.
  • data stripe unit 1 corresponds to data storage device 1, data stripe unit 2 to data storage device 2, data stripe unit 3 to data storage device 3, check stripe unit 1 to check storage device 1, and check stripe unit 2 to check storage device 2.
  • because a storage device is located in a storage node, when the client 101 sends a block to a storage device, the block must first be sent to the storage node where that storage device is located.
  • the client 101 sends data block 1 and the metadata of data block 1 to data storage device 1 on data storage node 1; sends data block 2 and the metadata of data block 2 to data storage device 2 on data storage node 2; and sends data block 3 and the metadata of data block 3 to data storage device 3 on data storage node 3.
  • the client 101 sends check block 1 together with the metadata of data block 1, data block 2, and data block 3 to check storage device 1 on check storage node 4, and sends check block 2 together with the same metadata to check storage device 2 on check storage node 5.
  • the data storage device receives the data block and the metadata of the data block sent by the client 101, and the check storage device receives the check block and the metadata of the data block sent by the client.
  • data storage device 1 on data storage node 1 receives data block 1 and the metadata of data block 1 sent by the client 101; data storage device 2 on data storage node 2 receives data block 2 and the metadata of data block 2; and data storage device 3 on data storage node 3 receives data block 3 and the metadata of data block 3.
  • check storage device 1 on check storage node 4 receives check block 1 together with the metadata of data block 1, data block 2, and data block 3 sent by the client 101; check storage device 2 on check storage node 5 receives check block 2 together with the same metadata.
  • The metadata of the data blocks can be backed up on the check storage device.
  • The metadata of a data block is stored not only on the corresponding data storage device but also on the check storage device, which completes the write operation and therefore reduces the write operation time.
  • Because multiple copies of the metadata of a data block exist, the reliability of that metadata is improved.
  • Because the client 101 writes the metadata of the data blocks to the check storage device together with the check block, the metadata copies are saved more efficiently, which reduces the time of the write operation.
  • The data storage node writes the metadata of the data block into memory, records the metadata write in the log, and writes the data block to the data storage device on the data storage node; the check storage node writes the metadata of the data blocks into memory, records the metadata write in the log, and writes the check block to the check storage device on the check storage node.
  • the memory of the data storage node stores metadata and logs of data blocks of data storage devices distributed on the data storage node, and the logs are used to record metadata write operations of the data blocks.
  • the metadata and log of the data block are stored in the memory of the verification storage node, and the log is used to record the metadata write operation of the data block.
  • the data block of the same user access address has different versions due to modification. That is, there will be data blocks with the same user access address located on the data storage device of the same data storage node, and the memory of the data storage node will store metadata and logs of the data blocks with the same user access address. According to the time stamp in the metadata of the data block, it can be judged which metadata of the data block is the metadata of the new data block.
  • the memory of the verification storage node will also store metadata and logs of multiple data blocks with the same user access address.
  • the data storage node backs up the metadata of the data block to the metadata storage node, and deletes the metadata of the data block in the memory of the data storage node and the metadata write operation of the data block in the log.
  • In one implementation, the data storage node backs up the metadata of the data block by sending it to the metadata storage node, and the metadata storage node does not need to write the metadata to a metadata storage device on the metadata storage node.
  • In another implementation, the data storage node backs up the metadata of the data block by sending it to the metadata storage node, and the metadata storage node must write the metadata to a metadata storage device on the metadata storage node.
  • Step 309 stores multiple copies of the metadata of the data blocks, which improves the reliability of that metadata.
  • a storage node that stores metadata of data partitions is referred to as a metadata storage node.
  • a storage device that stores metadata of data blocks in a metadata storage node is called a metadata storage device.
  • the number of metadata storage nodes is not limited.
  • the data storage node notifies the check storage node to delete the metadata of the old data block in the memory of the check storage node and the write operation of the metadata of the old data block recorded in the log.
  • the old data block and the data block have the same user access address, and the time stamp in the metadata of the old data block is earlier than the time stamp in the metadata of the data block.
  • Before the data storage node stores the metadata of a data block, if the timestamp in that metadata is greater than the timestamps of the target metadata, the data block is the latest block, and the storage node can write the metadata directly into memory. It does not need to wait for the metadata to be written to a storage device, which reduces the time of the write operation.
  • The target metadata is the metadata of all data blocks in the data storage node that have the same user access address as the data block.
  • the log of the write operation of the data block in the memory of the verification storage node can be obtained.
  • the write operation of data blocks can be obtained from the logs of multiple check storage nodes, so as to prevent the loss or inconsistency of the metadata of the data blocks caused by an error in a check storage node.
  • the data storage node may send a log acquisition request to the verification storage node, and the verification storage node may send the data storage node a write operation of the metadata of the data block containing the user access address according to the log acquisition request.
  • In one implementation, the check storage node sends all logs to the data storage node; in another implementation, the data storage node carries the user access address in the log acquisition request, and the check storage node returns the write operations of the metadata of data blocks that contain that user access address.
  • The data storage node selects, from the metadata of data blocks with the same user access address, the metadata with the latest timestamp as the metadata of the latest data block, which ensures data consistency.
  • In one implementation, the latest timestamp is the largest timestamp.
  • the client can query the data storage node for the current latest time stamp of the data block with the same user access address, and the data storage node returns the same user access address to the client The current latest timestamp of the data block. Based on the latest timestamp, the client starts to assign a timestamp to the metadata of the data segment of the same user access address.
  • The client sends the data blocks and their metadata to the data storage node, and sends the metadata of the data blocks and the check blocks to the check storage node. The data storage node or check storage node does not need to wait for the metadata backup to succeed and can directly store the received block in the storage device, which reduces the time of the write operation.
  • In some implementations, the client sends the data blocks and their metadata to the data storage node, and sends the metadata of the data blocks and the check blocks to the check storage node; the data storage node and the check storage node store the blocks and metadata they receive.
  • an embodiment of the present application provides a data processing method in a storage system.
  • the storage system includes a client, a data storage node, and a check storage node; the method includes:
  • the client sends a data block in a stripe and the metadata of the data block to the data storage node;
  • the client sends the check block in the stripe and the metadata of the data block to the check storage node; wherein the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of a stripe unit in the stripe; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit in the stripe that stores the data block.
  • the metadata of the data block further includes a timestamp.
  • the method further includes:
  • the client obtains the timestamp of the data block;
  • the timestamp of the new data block is determined according to the timestamp; wherein the new data block and the data block have the same user access address.
  • the embodiment of the present application provides a data processing method in a storage system.
  • the storage system includes a client, a data storage node, a metadata storage node, and a verification storage node; the method includes:
  • the data storage node receives a write request sent by the client; the write request includes a data block of a stripe to be written to a storage device in the data storage node and the metadata of the data block; wherein the check block of the stripe is stored on a storage device of the check storage node; the memory of the check storage node is also used to store the metadata of the data block and to record the write operation of the metadata of the data block in a log; the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of a stripe unit in the stripe; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit in the stripe that stores the data block;
  • the data storage node sends a write request response to the client; the write request response is used to indicate the completion of the write request operation.
  • the method further includes: after the data storage node sends a write request response to the client, the data storage node stores the metadata of the data block in the metadata storage node.
  • the metadata of the data block further includes a timestamp; the method further includes:
  • the data storage node notifies the check storage node to delete the metadata of old data blocks in the memory of the check storage node and the write operations of the metadata of the old data blocks recorded in the log; wherein the old data block and the data block have the same user access address, and the timestamp of the old data block is earlier than the timestamp of the metadata of the data block.
  • the method further includes:
  • when the data storage node recovers from a failure, the data storage node obtains from the check storage node the write operations, recorded in the log, of the metadata of data blocks containing the user access address;
  • the data storage node determines, from those write operations, the metadata of the data block with the latest timestamp;
  • the metadata of the data block with the latest timestamp is used as the metadata of the data block corresponding to the user access address.
  • the embodiment of the present application provides a data processing method in a storage system.
  • the storage system includes a client, a data storage node, a metadata storage node, and a verification storage node; the method includes:
  • the check storage node receives a write request sent by the client; the write request includes the check block and the metadata of the data blocks of a stripe written to storage devices in the data storage node; wherein the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of a stripe unit in the stripe; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit in the stripe that stores the data block;
  • the verification storage node caches the data block in the memory and records the write operation of the metadata of the data block in a log.
  • the metadata of the data block further includes a time stamp.
  • the method further includes:
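  • the check storage node receives a notification sent by the data storage node; the notification instructs the check storage node to delete the metadata of old data blocks in the memory of the check storage node and the write operations of the metadata of the old data blocks recorded in the log; wherein the old data block and the data block have the same user access address, and the timestamp of the old data block is earlier than the timestamp of the metadata of the data block;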
  • the verification storage node deletes the metadata of the old data block in the memory and the write operation of the metadata of the old data block recorded in the log according to the notification.
  • the method further includes:
  • the verification storage node receives a log acquisition request sent by the data storage node; the log acquisition request is used to acquire metadata of the data block containing the user access address in the log from the verification storage node Write operation;
  • the check storage node sends to the data storage node a write operation of metadata of the data block including the user access address.
  • an embodiment of the present application also provides a storage system, the storage system including a data storage node and a check storage node;
  • the data storage node is used for:
  • receiving a first write request sent by the client; the first write request includes a data block of a stripe to be written to a storage device in the data storage node and the metadata of the data block; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit in the stripe that stores the data block;
  • sending a write request response to the client; the write request response indicates completion of the first write request;
  • the check storage node is used to receive a second write request sent by the client; the second write request includes the metadata of the data block and the check block; wherein the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of a stripe unit in the stripe;
  • the check storage node is also used to cache the data block in the memory of the check storage node and to record the write operation of the metadata of the data block in a log.
  • the data storage node is further configured to store the metadata of the data block in the metadata storage node after sending a write request response to the client.
  • the metadata of the data block further includes a timestamp
  • the data storage node is also used to notify the check storage node to delete the metadata of the old data block in the memory of the check storage node and the write operation of the metadata of the old data block recorded in the log; wherein the old data block and the data block have the same user access address, and the timestamp of the old data block is earlier than the timestamp of the metadata of the data block.
  • the check storage node is further configured to receive the notification sent by the data storage node and, according to the notification, delete the metadata of the old data block in the memory and the write operation of the metadata of the old data block recorded in the log.
  • the storage system includes the device, a data storage node, and a check storage node; the device includes:
  • the first sending module 401 is configured to send data blocks in a strip and metadata of the data blocks to the data storage node;
  • the second sending module 402 is configured to send the check block in the stripe and the metadata of the data block to the check storage node; wherein the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of a stripe unit in the stripe; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit in the stripe that stores the data block.
  • the metadata of the data block further includes a time stamp.
  • the device further includes:
  • An obtaining module configured to obtain the timestamp of the data block
  • the determining module is configured to determine the timestamp of the new data block according to the timestamp; wherein, the new data block and the data block have the same user access address.
  • FIG. 5 is a schematic structural diagram of a data processing device in a storage system provided by an embodiment of the present application.
  • the storage system includes a client, the device, a metadata storage node, and a verification storage node; the device includes:
  • the receiving module 501 is configured to receive a write request sent by the client; the write request includes a data block of a stripe to be written to a storage device in the data storage node and the metadata of the data block; wherein the check block of the stripe is stored on a storage device of the check storage node; the memory of the check storage node is also used to store the metadata of the data block and to record the write operation of the metadata of the data block in a log; the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of a stripe unit in the stripe; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit in the stripe that stores the data block;
  • the sending module 502 is configured to send a write request response to the client; the write request response is used to indicate the completion of the write request operation.
  • the device further includes:
  • the storage module is configured to store the metadata of the data block in the metadata storage node after the sending module sends a write request response to the client.
  • the metadata of the data block further includes a time stamp; the device further includes:
  • the notification module is used to perform step 310 described above.
  • the device further includes:
  • an obtaining module configured to obtain from the check storage node, when the device recovers from a failure, the write operations, recorded in the log, of the metadata of data blocks containing the user access address;
  • a determining module configured to determine the metadata of the data block with the latest time stamp from the write operation of the metadata of the data block of the user access address in the log;
  • the determining module is further configured to use the metadata of the data block with the latest time stamp as the metadata of the data block corresponding to the user access address.
  • the storage system includes a client, a data storage node, a metadata storage node, and the device; the device includes:
  • the receiving module 601 is configured to receive a write request sent by the client; the write request includes the check block and the metadata of the data blocks of a stripe written to storage devices in the data storage node; wherein the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of a stripe unit in the stripe; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit in the stripe that stores the data block;
  • the cache module 602 is configured to cache the data block in the memory and record the write operation of the metadata of the data block in the log.
  • the metadata of the data block further includes a time stamp.
  • the device further includes: a deletion module
  • the receiving module 601 is further configured to receive a notification sent by the data storage node; the notification instructs the check storage node to delete the metadata of old data blocks in the memory of the check storage node and the write operations of the metadata of the old data blocks recorded in the log; wherein the old data block and the data block have the same user access address, and the timestamp of the old data block is earlier than the timestamp of the metadata of the data block;
  • the deletion module is configured to delete the metadata of the old data block in the memory and the write operation of the metadata of the old data block recorded in the log according to the notification.
  • the device further includes: a sending module
  • the receiving module 601 is further configured to receive a log obtaining request sent by the data storage node; the log obtaining request is used to obtain data blocks containing the user access address in the log from the verification storage node Write operation of metadata;
  • the sending module is configured to send to the data storage node a write operation of metadata of the data block including the user access address.
  • the embodiment of the present application also provides a storage medium; the storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations performed by the client, the data storage node, or the check storage node in the foregoing data processing method in a storage system.
  • modules in the devices in the embodiments of the present application may be hardware components, software modules, or a combination of the two, which is not limited in the embodiments of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

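A data processing method in a storage system, and a storage system, relating to the field of data storage technologies. A client sends the data blocks in a stripe and the metadata of the data blocks to a data storage node, and sends the metadata of the data blocks and the check blocks to a check storage node. The metadata of the data blocks is thereby backed up on the check storage node, which reduces the time of a write operation.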

Description

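Data processing method and apparatus in a storage system, and storage system
Technical Field
This application relates to the field of data storage technologies, and in particular to a data processing method and apparatus in a storage system, and a storage medium.
Background
A target device in a storage system may use erasure coding (EC) to write the service data to be written, in the form of stripes, to storage devices on the corresponding storage nodes in the storage system.
A client may write data as follows. The storage system client divides the service data to be written into data blocks according to the size of a stripe unit in a stripe, and generates check blocks for the data blocks according to the EC algorithm. The client sends the data blocks and the metadata of the data blocks to the storage nodes that store the data blocks, and sends the check blocks to the storage nodes that store the check blocks. After the storage device receives a data block and its metadata, the metadata is backed up, according to a backup policy, to the storage node that stores the metadata of data blocks. Once the metadata has been backed up successfully, the storage node writes the data block to its corresponding storage device at the storage location indicated by the metadata, and returns a write success response to the client.
In this write process, a storage node must wait until the metadata backup succeeds before it can write the data block to the storage device, which makes the write operation take too long.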
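Summary
Embodiments of this application provide a data processing method in a storage system, a storage system, a computer device, and a storage medium, to overcome the problem in the related art that write operations take too long.
According to a first aspect, this application provides a data processing method in a storage system. The storage system includes a client, a data storage node, and a check storage node. The method includes:
the client sends a data block in a stripe and the metadata of the data block to the data storage node;
the client sends a check block in the stripe and the metadata of the data block to the check storage node; wherein the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of a stripe unit in the stripe; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit in the stripe that stores the data block.
In this implementation, the client sends the data blocks in the stripe and their metadata to the data storage node, and sends the metadata of the data blocks and the check block to the check storage node. The metadata of the data blocks is thereby backed up on the check storage node, which reduces the time of the write operation.
In a possible implementation, the metadata of the data block further includes a timestamp.
In a possible implementation, the client obtains the timestamp of the data block;
and determines the timestamp of a new data block according to that timestamp; wherein the new data block and the data block have the same user access address.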
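According to a second aspect, this application provides a data processing method in a storage system. The storage system includes a client, a data storage node, a metadata storage node, and a check storage node. The method includes:
the data storage node receives a write request sent by the client; the write request includes a data block of a stripe to be written to a storage device in the data storage node and the metadata of the data block; wherein the check block of the stripe is stored on a storage device of the check storage node; the memory of the check storage node is also used to store the metadata of the data block and to record the write operation of the metadata of the data block in a log; the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of a stripe unit in the stripe; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit in the stripe that stores the data block;
the data storage node sends a write request response to the client; the write request response indicates that the write request operation is complete.
Based on this implementation, because the client writes the metadata of the data block to the check storage device together with the check block, the metadata copy of the data block is saved more efficiently, which reduces the write operation time.
In a possible implementation, the method further includes: after the data storage node sends the write request response to the client, the data storage node stores the metadata of the data block to the metadata storage node.
In a possible implementation, the metadata of the data block further includes a timestamp; the method further includes:
the data storage node notifies the check storage node to delete the metadata of an old data block in the memory of the check storage node and the write operation of the metadata of the old data block recorded in the log; wherein the old data block and the data block have the same user access address, and the timestamp of the old data block is earlier than the timestamp of the metadata of the data block.
Based on this possible implementation, the metadata of old data blocks with the same user access address is evicted from the memory of the check storage node, which frees memory space on the check storage node.
In a possible implementation, the method further includes:
when the data storage node recovers from a failure, the data storage node obtains from the check storage node the write operations, recorded in the log, of the metadata of data blocks containing the user access address;
determines, from those write operations, the metadata of the data block with the latest timestamp;
and uses the metadata of the data block with the latest timestamp as the metadata of the data block corresponding to the user access address.
Based on this possible implementation, the storage node selects, from the metadata of data blocks with the same user access address, the metadata with the latest timestamp as the metadata of the latest data block, which ensures data consistency.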
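According to a third aspect, this application provides a data processing method in a storage system. The storage system includes a client, a data storage node, a metadata storage node, and a check storage node. The method includes:
the check storage node receives a write request sent by the client; the write request includes the check block and the metadata of the data blocks of a stripe written to storage devices in the data storage node; wherein the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of a stripe unit in the stripe; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit in the stripe that stores the data block;
the check storage node caches the data block in memory and records the write operation of the metadata of the data block in a log.
In a possible implementation, the metadata of the data block further includes a timestamp.
In a possible implementation, the method further includes:
the check storage node receives a notification sent by the data storage node; the notification instructs the check storage node to delete the metadata of an old data block in the memory of the check storage node and the write operation of the metadata of the old data block recorded in the log; wherein the old data block and the data block have the same user access address, and the timestamp of the old data block is earlier than the timestamp of the metadata of the data block;
the check storage node deletes, according to the notification, the metadata of the old data block in the memory and the write operation of the metadata of the old data block recorded in the log.
In a possible implementation, the method further includes:
the check storage node receives a log acquisition request sent by the data storage node; the log acquisition request is used to obtain from the check storage node the write operations, recorded in the log, of the metadata of data blocks containing the user access address;
the check storage node sends to the data storage node the write operations of the metadata of data blocks containing the user access address.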
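According to a fourth aspect, this application provides a storage system. The storage system includes a data storage node and a check storage node;
the data storage node is configured to:
receive a first write request sent by the client; the first write request includes a data block of a stripe to be written to a storage device in the data storage node and the metadata of the data block; the metadata of the data block includes the correspondence between the user access address of the data block and the identifier of the stripe unit in the stripe that stores the data block;
send a write request response to the client; the write request response indicates completion of the first write request;
the check storage node is configured to receive a second write request sent by the client; the second write request includes the metadata of the data block and a check block; wherein the check block is generated according to the data blocks in the stripe and a check algorithm; one check block and one data block in the stripe have the same size, which is equal to the size of a stripe unit in the stripe;
and to cache the data block in the memory of the check storage node and record the write operation of the metadata of the data block in a log.
In a possible implementation, the data storage node is further configured to store the metadata of the data block to the metadata storage node after sending the write request response to the client.
In a possible implementation, the metadata of the data block further includes a timestamp;
the data storage node is further configured to notify the check storage node to delete the metadata of an old data block in the memory of the check storage node and the write operation of the metadata of the old data block recorded in the log; wherein the old data block and the data block have the same user access address, and the timestamp of the old data block is earlier than the timestamp of the metadata of the data block.
The check storage node is further configured to receive the notification sent by the data storage node and, according to the notification, delete the metadata of the old data block in the memory and the write operation of the metadata of the old data block recorded in the log.
Based on this possible implementation, the metadata of old data blocks with the same user access address is evicted from the memory of the check storage node, which frees memory space on the check storage node.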
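According to a fifth aspect, this application provides a client. The client includes a processor and an interface, the processor communicates with the interface, and the processor is configured to perform the operations of the first aspect or of any method provided in the first aspect.
According to a sixth aspect, this application provides a data storage node. The data storage node includes a processor and an interface, the processor communicates with the interface, and the processor is configured to perform the operations of the second aspect or of any method provided in the second aspect.
According to a seventh aspect, this application provides a check storage node. The check storage node includes a processor and an interface, the processor communicates with the interface, and the processor is configured to perform the operations of the third aspect or of any method provided in the third aspect.
According to an eighth aspect, this application provides a data processing apparatus in a storage system, configured to perform the foregoing data processing method in a storage system. Specifically, the apparatus includes modules for performing the data processing method provided in the first aspect or in any optional manner of the first aspect.
According to a ninth aspect, this application provides a data processing apparatus in a storage system, configured to perform the foregoing data processing method in a storage system. Specifically, the apparatus includes modules for performing the data processing method provided in the second aspect or in any optional manner of the second aspect.
According to a tenth aspect, this application provides a data processing apparatus in a storage system, configured to perform the foregoing data processing method in a storage system. Specifically, the apparatus includes modules for performing the data processing method provided in the third aspect or in any optional manner of the third aspect.
According to an eleventh aspect, this application provides a storage medium. The storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations of the first aspect or of any method provided in the first aspect.
According to a twelfth aspect, this application provides a storage medium. The storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations of the second aspect or of any method provided in the second aspect.
According to a thirteenth aspect, this application provides a storage medium. The storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the operations of the third aspect or of any method provided in the third aspect.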
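Brief Description of Drawings
FIG. 1 is a schematic diagram of a storage system according to an embodiment of this application;
FIG. 2 is a schematic structural diagram of a computer device according to an embodiment of this application;
FIG. 3 is a flowchart of a data processing method in a storage system according to an embodiment of this application;
FIG. 4 is a schematic structural diagram of a data processing apparatus in a storage system according to an embodiment of this application;
FIG. 5 is a schematic structural diagram of a data processing apparatus in a storage system according to an embodiment of this application;
FIG. 6 is a schematic structural diagram of a data processing apparatus in a storage system according to an embodiment of this application.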
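Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the implementations of this application are described in further detail below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of a storage system according to an embodiment of this application. Referring to FIG. 1, the storage system includes a client 101, a management device 102, and storage nodes 103. To ensure redundancy, the storage system may store a user's service data in the form of stripes. A stripe may include N+M blocks: N data blocks and M check blocks. The data blocks store the user's service data (the service data to be written), and the check blocks store the check data of the service data, where N and M are positive integers. Data blocks and check blocks are both referred to as blocks.
The client 101 is responsible for writing and reading service data. The client may request the management device 102 to allocate a stripe. A stripe includes multiple stripe units, and each stripe unit stores one block. The stripe units of a stripe form an erasure coding (EC) relationship. The stripe units in a stripe have the same size, and each stripe unit has a stripe unit identifier. The management device 102 records the correspondence between the identifiers of the stripe units of a stripe and the storage devices in the storage nodes 103 that provide storage space for those stripe units. The client 101 requests the management device 102 to allocate a stripe; the management device 102 allocates a stripe to the client and provides the identifiers of the stripe units in the stripe. The client then divides the service data into data blocks according to the size of a stripe unit, and generates check blocks for the data blocks according to the EC algorithm. The data blocks and the check blocks all have the same size, which is equal to the size of a stripe unit.
When a user sends service data to the client 101, the service data carries a corresponding user access address, for example a logical block address (LBA). In other scenarios it may be another identifier that the user can recognize; this is not limited in the embodiments of this application. A user in the embodiments of this application is a device that writes or reads service data. The client 101 divides the service data into data blocks, and each data block has a corresponding user access address. Based on the stripe unit identifiers of the stripe allocated by the management device 102, the client 101 records the correspondence between the user access address of a data block and the identifier of the stripe unit that stores the data block; that is, it records the metadata of the data block. The metadata of a block indicates the storage location of the block, and may include the storage location, on the storage node 103, of the stripe unit that stores the block.
Further, in the embodiments of this application, the storage system also contains a partition view, which records the correspondence between the stripe units of all stripes in a partition and the storage devices of the storage nodes 103 that provide storage space for those stripe units. A specific implementation may be a correspondence between stripe unit identifiers and storage device identifiers. Therefore, after the management device 102 allocates a stripe, the client 101 can, according to the identifiers of the stripe, write the data blocks and the check blocks to the storage devices corresponding to those identifiers.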
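A storage device in the embodiments of this application may be a mechanical hard disk or a solid state drive. Service data may include block data, file data, object data, or the like; the embodiments of this application do not limit the specific content of the service data.
In the embodiments of this application, a storage node 103 that provides storage devices and stores data blocks is called a data storage node, and a storage node 103 that provides storage devices and stores check blocks is called a check storage node.
With reference to the foregoing description of the devices in the storage system, an embodiment of this application provides a data writing method. The client 101 sends a stripe allocation request to the management device 102; the stripe allocation request indicates a request to allocate a stripe. Based on the stripe allocation request, the management device 102 returns stripe information to the client 101; the stripe information carries the identifiers of the stripe units of the stripe. The client 101 receives the stripe information returned by the management device 102.
It should be noted that the client 101, the management device 102, and the storage nodes 103 may each be a computer device. FIG. 2 is a schematic structural diagram of a computer device according to an embodiment of this application. The computer device 200 may vary considerably depending on configuration or performance, and may include one or more processors 201 and an interface 202, where a processor may be a central processing unit (CPU). The processor 201 is configured to perform the method that each method embodiment assigns to the corresponding device. The interface 202 may be a network interface card, a host bus adapter, or the like; this is not limited in the embodiments of this application. The processor 201 communicates with the interface 202. The computer device 200 may also contain hard disks, such as mechanical disks or solid state drives (SSDs), and may include other components for implementing device functions, which are not described here.
In an exemplary embodiment, a computer-readable storage medium is also provided, for example a memory including instructions that can be executed by a processor in a computer device to complete the data reconstruction method in the following embodiments. For example, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
To illustrate how the storage system writes data, refer to FIG. 3, a flowchart of a data processing method in a storage system according to an embodiment of this application. The method includes the following steps 301 to 315.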
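301. The client 101 sends a stripe allocation request to the management device 102; the stripe allocation request indicates a request to allocate a stripe.
The stripe is used to store the service data to be written. Based on the stripe allocated by the management device 102, the client 101 can divide the service data to be written into data blocks and generate check blocks for the data blocks according to the EC algorithm. According to the partition view, the client 101 writes the data blocks and the check blocks to the storage devices of the storage nodes 103 that correspond to the stripe units.
Because the storage system contains many clients, the stripe allocation request may carry the device identifier of the client 101, so that the management device 102 can return the stripe information to the client 101 that corresponds to the device identifier. The device identifier of the client may be the internet protocol (IP) address of the client 101 or the media access control (MAC) address of the client 101; the embodiments of this application do not specifically limit the device identifier.
In an N+M EC implementation, N and M may take various values in the embodiments of this application, so the storage system has multiple EC modes. The request that the client 101 sends to the management device 102 to allocate a stripe may therefore also carry EC mode information. The embodiments of this application do not specifically limit the values of M and N.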
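302. Based on the received stripe allocation request, the management device 102 allocates a stripe to the client 101 and returns stripe information to the client; the stripe information contains the stripe unit identifiers of the stripe.
The stripe unit identifiers of the stripe uniquely indicate the stripe units that the stripe contains. A stripe unit identifier may be a number that the management device 102 assigns at random to a stripe unit of the stripe, or an identifier that the management device 102 assigns based on a preset stripe unit identifier allocation rule. Stripe unit identifiers may also be assigned by another device in the storage system; this is not limited in the embodiments of this application. The embodiments of this application also do not specifically limit the preset allocation rule, the manner of allocating stripe unit identifiers, or the representation of stripe unit identifiers.
303. The client 101 obtains the stripe unit identifiers from the received stripe information.
304. The client 101 divides the service data to be written into data blocks, and generates check blocks for the data blocks according to the data blocks in the stripe and the EC algorithm.
As described above, the client 101 divides the service data to be written into data blocks of stripe unit size according to the stripe unit size of the stripe, and generates check blocks for the data blocks according to the EC algorithm. The client 101 records the metadata of each data block, that is, the correspondence between the user access address of the data block and the identifier of the stripe unit that stores the data block. In the embodiments of this application, a stripe unit that stores a data block is called a data stripe unit, and a stripe unit that stores a check block is called a check stripe unit.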
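Step 304 can be sketched in a few lines. The source does not name a specific EC algorithm, so the sketch below substitutes a single byte-wise XOR check block (M = 1) purely as a stand-in; the `BlockMetadata` fields, the `xor_check_block` helper, and the `su-*` identifiers are illustrative assumptions, not names from the application.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class BlockMetadata:
    """Metadata of one data block (field names are illustrative)."""
    user_address: int     # user access address, e.g. an LBA
    stripe_unit_id: str   # identifier of the stripe unit that stores the block


def xor_check_block(data_blocks: list[bytes]) -> bytes:
    """Stand-in for the EC algorithm: byte-wise XOR of equally sized blocks."""
    size = len(data_blocks[0])
    if any(len(block) != size for block in data_blocks):
        raise ValueError("all blocks must equal the stripe unit size")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*data_blocks))


# Three 4-byte data blocks and hypothetical stripe unit identifiers.
blocks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xff\x00\xff\x00"]
unit_ids = ["su-1", "su-2", "su-3"]
metadata = [BlockMetadata(lba, uid) for lba, uid in zip((0, 4, 8), unit_ids)]
check = xor_check_block(blocks)

# With XOR, any single lost block can be rebuilt from the check block and the rest.
rebuilt = xor_check_block([check, blocks[1], blocks[2]])
```

A real N+M deployment with M greater than 1 would need a code such as Reed-Solomon; the XOR version only illustrates that the check block has the same size as a data block and is derived from all data blocks of the stripe.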
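Dividing the service data to be written into data blocks may follow any one of the following processes 1 to 3.
Process 1. When the size of the service data to be written equals the total size of all data stripe units of the stripe, the client 101 can divide the service data to be written into data blocks according to the block size.
Process 2. When the size of the service data to be written is smaller than the total size of all data stripe units of the stripe, the client 101 can divide the service data to be written into data blocks according to the block size, and the number of resulting data blocks is smaller than the number of data stripe units of the stripe. The missing data blocks can be filled with zeros or left empty; the embodiments of this application do not specifically limit this.
Process 3. When the size of the service data to be written is larger than the total size of all data stripe units of the stripe, the client 101 can divide the service data to be written into data blocks according to the block size. For the remaining service data to be written, the client 101 can perform step 301 so that the management device 102 allocates another stripe.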
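The three processes above can be covered by one small function. This is a minimal sketch, assuming zero padding for missing blocks (one of the two options the text allows) and treating each extra stripe as the result of another allocation request; `split_into_stripes` and its parameters are hypothetical names.

```python
def split_into_stripes(data: bytes, unit_size: int, n_data_units: int) -> list[list[bytes]]:
    """Split service data into stripes of n_data_units blocks of unit_size bytes each.

    A short final stripe is padded with zero bytes; every additional stripe
    stands for another stripe allocation request to the management device.
    """
    if unit_size <= 0 or n_data_units <= 0:
        raise ValueError("unit_size and n_data_units must be positive")
    stripe_capacity = unit_size * n_data_units
    stripes = []
    for start in range(0, len(data), stripe_capacity):
        chunk = data[start:start + stripe_capacity].ljust(stripe_capacity, b"\x00")
        stripes.append([chunk[i:i + unit_size] for i in range(0, stripe_capacity, unit_size)])
    return stripes


exact = split_into_stripes(b"abcdefghijkl", unit_size=4, n_data_units=3)   # process 1
short = split_into_stripes(b"abcde", unit_size=4, n_data_units=3)          # process 2
long_ = split_into_stripes(b"x" * 14, unit_size=4, n_data_units=3)         # process 3
```

Here `exact` fills one stripe, `short` yields one stripe whose second block is partly padded and whose third block is all zeros, and `long_` spills into a second stripe.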
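305. The client 101 adds a timestamp to the metadata of the data block; the timestamp indicates the time at which the data block is written.
In this embodiment, the timestamp is the timestamp of the data block, that is, the time at which the data block is written. The write time of a data block may be the time at which the client 101 sends the data block to the storage device that provides storage space for the corresponding data stripe unit, or the time at which the user writes the data to the storage system. In a specific implementation, the timestamp may be a number or version number that the client assigns to data blocks with the same user access address, or a concrete storage system clock time. The embodiments of this application do not specifically limit the timestamp. Take a given data block as an example: a user may modify it, that is, modify the data block at the same user access address, and the timestamps then determine which data block is the latest. The process of determining the latest block may be: the client obtains the timestamp of the data block, and determines the new data block according to that timestamp; wherein the new data block and the data block have the same user access address.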
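The timestamp handling in step 305 can be illustrated as follows. The sketch assumes the version-number variant of the timestamp (a per-address counter kept by the client); `TimestampAllocator`, `newest`, and the stripe unit identifiers are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockMetadata:
    user_address: int
    stripe_unit_id: str
    timestamp: int   # here a per-address version number; a clock value would also work


class TimestampAllocator:
    """Client-side helper (hypothetical) that hands out increasing timestamps per address."""

    def __init__(self) -> None:
        self._latest: dict[int, int] = {}

    def next_timestamp(self, user_address: int) -> int:
        ts = self._latest.get(user_address, 0) + 1
        self._latest[user_address] = ts
        return ts


def newest(candidates: list[BlockMetadata]) -> BlockMetadata:
    """Return the metadata with the largest timestamp among same-address candidates."""
    return max(candidates, key=lambda m: m.timestamp)


alloc = TimestampAllocator()
v1 = BlockMetadata(100, "stripe1-su1", alloc.next_timestamp(100))
v2 = BlockMetadata(100, "stripe2-su1", alloc.next_timestamp(100))  # same address, modified
latest = newest([v1, v2])
```

Using a counter instead of a wall clock avoids depending on synchronized clocks, at the cost of the client having to remember (or re-query) the last value per address.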
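306. The client 101 sends the data block and the metadata of the data block to the data storage device that stores the data block, and sends the check block and the metadata of the data blocks to the check storage device that stores the check block; wherein the metadata of the data block includes the timestamp.
In the storage system, the relationship between the stripe units of a stripe and the storage devices is implemented by a distributed algorithm. In one implementation, stripe units at the same position in multiple stripes of the same partition are located on the same storage device. For example, the first stripe units of stripe 1 and stripe 2, which belong to the same partition, are both located on the same storage device; that is, the same storage device provides their storage space. Therefore, after a data block at a given user access address is modified, the modified data block is located in a different stripe of the same partition but still on the same storage device as the data block before modification.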
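The placement rule described above (same position, same partition, same device) can be modeled with a lookup that ignores the stripe identifier. This is only a sketch of one possible partition view; the class names, the list-based table, and the device identifiers are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StripeUnit:
    partition_id: int
    stripe_id: int
    position: int   # index of the unit within its stripe, starting at 0


class PartitionView:
    """Maps (partition, position) to a storage device ID; stripe_id is deliberately ignored."""

    def __init__(self, devices_by_partition: dict[int, list[str]]) -> None:
        self._devices = devices_by_partition

    def device_for(self, unit: StripeUnit) -> str:
        return self._devices[unit.partition_id][unit.position]


view = PartitionView(
    {7: ["data-dev-1", "data-dev-2", "data-dev-3", "check-dev-1", "check-dev-2"]}
)
old_unit = StripeUnit(partition_id=7, stripe_id=1, position=0)
new_unit = StripeUnit(partition_id=7, stripe_id=2, position=0)   # rewritten block
```

Because the lookup depends only on partition and position, `old_unit` and `new_unit` resolve to the same device, which is why all versions of a block at one user access address end up on one data storage node.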
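As an example, suppose a stripe contains 5 stripe units: 3 data stripe units and 2 check stripe units, namely data stripe unit 1, data stripe unit 2, data stripe unit 3, check stripe unit 1, and check stripe unit 2. According to the relationship between the stripe units and the storage devices, for example the correspondence between stripe unit identifiers and storage device identifiers, each block is sent to its corresponding storage device. For example, data block 1 is stored in data stripe unit 1, data block 2 in data stripe unit 2, data block 3 in data stripe unit 3, check block 1 in check stripe unit 1, and check block 2 in check stripe unit 2. Data stripe unit 1 corresponds to data storage device 1, data stripe unit 2 to data storage device 2, data stripe unit 3 to data storage device 3, check stripe unit 1 to check storage device 1, and check stripe unit 2 to check storage device 2. Because a storage device is located in a storage node, when the client 101 sends a block to a storage device, the block must first be sent to the storage node where that storage device is located.
The client 101 sends data block 1 and the metadata of data block 1 to data storage device 1 on data storage node 1; sends data block 2 and the metadata of data block 2 to data storage device 2 on data storage node 2; sends data block 3 and the metadata of data block 3 to data storage device 3 on data storage node 3; sends check block 1 and the metadata of data block 1, data block 2, and data block 3 to check storage device 1 on check storage node 4; and sends check block 2 and the metadata of data block 1, data block 2, and data block 3 to check storage device 2 on check storage node 5.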
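The 3+2 example can be written out as a fan-out of write requests. The sketch below only builds the requests and does not send anything; `WriteRequest`, `build_write_requests`, and the node and device names are hypothetical, and the block contents are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteRequest:
    node: str                    # storage node the request is routed to
    device: str                  # storage device on that node
    block: bytes                 # data block or check block
    metadata: tuple[dict, ...]   # one data block's metadata, or all of them


def build_write_requests(data_blocks, check_blocks, metadata, data_targets, check_targets):
    """Pair each block with its target; check nodes also get every data block's metadata."""
    requests = []
    for block, meta, (node, device) in zip(data_blocks, metadata, data_targets):
        requests.append(WriteRequest(node, device, block, (meta,)))
    for block, (node, device) in zip(check_blocks, check_targets):
        requests.append(WriteRequest(node, device, block, tuple(metadata)))
    return requests


metadata = [
    {"user_address": lba, "stripe_unit_id": f"data-su-{i}", "timestamp": 1}
    for i, lba in enumerate((0, 4, 8), start=1)
]
requests = build_write_requests(
    data_blocks=[b"d1d1", b"d2d2", b"d3d3"],
    check_blocks=[b"c1c1", b"c2c2"],
    metadata=metadata,
    data_targets=[("data-node-1", "data-dev-1"), ("data-node-2", "data-dev-2"),
                  ("data-node-3", "data-dev-3")],
    check_targets=[("check-node-4", "check-dev-1"), ("check-node-5", "check-dev-2")],
)
```

Each data node receives only its own block's metadata, while both check nodes receive the metadata of all three data blocks, so every metadata record exists on three nodes as soon as the stripe is written.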
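307. The data storage device receives the data block and the metadata of the data block sent by the client 101, and the check storage device receives the check block and the metadata of the data blocks sent by the client.
Data storage device 1 on data storage node 1 receives data block 1 and the metadata of data block 1 sent by the client 101; data storage device 2 on data storage node 2 receives data block 2 and the metadata of data block 2; data storage device 3 on data storage node 3 receives data block 3 and the metadata of data block 3; check storage device 1 on check storage node 4 receives check block 1 and the metadata of data block 1, data block 2, and data block 3; and check storage device 2 on check storage node 5 receives check block 2 and the metadata of data block 1, data block 2, and data block 3.
In the embodiments of this application, the metadata of the data blocks can be backed up on the check storage devices. The metadata of a data block is stored not only on the corresponding data storage device but also on the check storage devices, which completes the write operation and therefore reduces the write operation time. In addition, multiple copies of the metadata of a data block exist, which improves the reliability of that metadata. Furthermore, because the client 101 writes the metadata of the data blocks to the check storage device together with the check block, the metadata copies are saved more efficiently, which reduces the time of the write operation.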
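308. The data storage node writes the metadata of the data block into memory, records the metadata write in the log, and writes the data block to the data storage device on the data storage node; the check storage node writes the metadata of the data blocks into memory, records the metadata write in the log, and writes the check block to the check storage device on the check storage node.
The memory of a data storage node stores the metadata of the data blocks located on the data storage devices of that node, together with a log; the log records the write operations of the metadata of the data blocks. The memory of a check storage node stores the metadata of the data blocks and a log; the log records the write operations of the metadata of the data blocks. As described above, data blocks with the same user access address exist in different versions because of modification. Data blocks with the same user access address are therefore located on the data storage device of the same data storage node, and the memory of that data storage node stores the metadata and logs of these data blocks. The timestamps in the metadata make it possible to determine which metadata belongs to the new data block. Likewise, the memory of a check storage node stores the metadata and logs of multiple data blocks with the same user access address.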
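Step 308 can be modeled with an in-memory node that keeps a metadata map, an append-only log, and a list standing in for the storage device. The class and method names are illustrative assumptions; the point of the sketch is that the node acknowledges the write without waiting for any backup to a metadata storage node.

```python
class StorageNode:
    """In-memory model of a data or check storage node (all names are illustrative)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.memory: dict[int, list[dict]] = {}   # user address -> metadata versions
        self.log: list[tuple[str, dict]] = []     # ordered metadata write operations
        self.device: list[bytes] = []             # stands in for the storage device

    def handle_write(self, block: bytes, metadata: list[dict]) -> str:
        for meta in metadata:
            self.memory.setdefault(meta["user_address"], []).append(meta)
            self.log.append(("write_metadata", meta))
        self.device.append(block)
        # Acknowledge immediately; backing up to a metadata storage node happens later.
        return "write complete"

    def latest_metadata(self, user_address: int) -> dict:
        return max(self.memory[user_address], key=lambda m: m["timestamp"])


node = StorageNode("data-node-1")
node.handle_write(b"old!", [{"user_address": 0, "stripe_unit_id": "s1-su1", "timestamp": 1}])
response = node.handle_write(
    b"new!", [{"user_address": 0, "stripe_unit_id": "s2-su1", "timestamp": 2}]
)
```

After the two writes the node holds both versions for address 0, and `latest_metadata(0)` picks the one with the larger timestamp.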
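309. The data storage node backs up the metadata of the data chunk to a metadata storage node, and deletes the metadata of the data chunk from the memory of the data storage node together with the corresponding metadata write operation in the log.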
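In the embodiments of this application, the data storage node backs up the metadata of the data chunk to the metadata storage node. In one implementation, the backup consists of sending the metadata to the metadata storage node, without requiring the metadata storage node to write the metadata to a metadata storage device on the metadata storage node. In another implementation, the backup consists of sending the metadata to the metadata storage node and additionally requires the metadata storage node to write it to a metadata storage device on the metadata storage node. Step 309 stores the metadata of the data chunk in multiple copies, which improves its reliability, and also allows the memory of the data storage node to be released promptly. In the embodiments of this application, a storage node that stores the metadata of data chunks is called a metadata storage node, and the storage device in a metadata storage node that stores the metadata is called a metadata storage device. The embodiments do not limit the number of metadata storage nodes.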
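310. The data storage node notifies the parity storage node to delete, from the memory of the parity storage node, the metadata of the old data chunk, and to delete from the log the recorded metadata write operation of the old data chunk. The old data chunk has the same user access address as the data chunk, and the timestamp in the metadata of the old data chunk is earlier than the timestamp in the metadata of the data chunk.
Once the data storage node has backed up the metadata of the new data chunk to the metadata storage node, that metadata is stored on the metadata storage node. The parity storage node can therefore delete from its memory the metadata of old data chunks, that is, metadata with the same user access address and a timestamp earlier than that of the data chunk, and delete the metadata write operations of the old data chunks recorded in the log. In other words, the metadata of old data chunks with the same user access address is evicted from the memory of the parity storage node, which frees its memory space.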
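The eviction performed on the parity storage node in step 310 amounts to a filter on user access address and timestamp. The following sketch assumes a simple tuple layout for memory entries and log records; `evict_older` and the sample identifiers are hypothetical.

```python
def evict_older(memory, log, user_address, backed_up_ts):
    """Drop metadata and log records for user_address older than backed_up_ts.

    memory: dict mapping (user_address, timestamp) -> stripe_unit_id
    log:    list of (user_address, stripe_unit_id, timestamp) write records
    Returns a new (memory, log) pair; the inputs are not modified.
    """
    def stale(addr, ts):
        # Old chunk: same user access address, strictly earlier timestamp.
        return addr == user_address and ts < backed_up_ts

    kept_memory = {key: unit for key, unit in memory.items() if not stale(*key)}
    kept_log = [rec for rec in log if not stale(rec[0], rec[2])]
    return kept_memory, kept_log


parity_memory = {("addr-A", 1): "S1-D1", ("addr-A", 2): "S2-D1", ("addr-B", 1): "S1-D2"}
parity_log = [("addr-A", "S1-D1", 1), ("addr-B", "S1-D2", 1), ("addr-A", "S2-D1", 2)]
parity_memory, parity_log = evict_older(parity_memory, parity_log, "addr-A", 2)
```

Entries for other user access addresses are untouched, so a notification for one address never discards metadata that has not yet been backed up.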
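Before a data storage node stores the metadata of a data chunk, if the timestamp in that metadata is greater than the timestamp of the target metadata, the data chunk is the newest chunk. The storage node can then write the metadata of the data chunk directly into memory without waiting for it to be written to the storage device, which shortens the write operation. Here, the target metadata is the metadata of all data chunks on the data storage node that have the same user access address as the data chunk.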
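In the embodiments of this application, when a data storage node fails and the metadata of a data chunk is lost, the write operation logs for the data chunk in the memory of the parity storage nodes can be obtained. For example, the write operations can be obtained from the logs of multiple parity storage nodes, which guards against metadata being lost or inconsistent because of an error on a single parity storage node. In a specific implementation, the data storage node sends a log acquisition request to the parity storage node, and the parity storage node, in response, sends the data storage node the metadata write operations of the data chunks that contain the user access address. The parity storage node may send all of its logs to the data storage node; alternatively, the data storage node may carry a user access address in the log acquisition request, and the parity storage node then retrieves the metadata write operations of the data chunks containing that user access address. From the metadata of the data chunks with the same user access address, the data storage node selects the metadata with the newest timestamp as the metadata of the newest data chunk, which ensures data consistency. In one implementation, the newest timestamp is the largest timestamp. If the client restarts and loses the newest timestamp of a data chunk, the client can query the data storage node for the current newest timestamp of the data chunks with the same user access address, and the data storage node returns it. The client then allocates timestamps for the metadata of data chunks at that user access address starting from this newest timestamp.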
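The recovery path described above, in which logs from several parity storage nodes are merged and the largest timestamp per user access address is kept, can be sketched as follows. The function `recover_latest` and the record layout are hypothetical; treating "newest" as "largest timestamp" follows the one implementation the text mentions.

```python
def recover_latest(parity_logs, user_address=None):
    """Merge metadata write records from several parity-node logs.

    parity_logs:  iterable of logs; each log is a list of
                  (user_address, stripe_unit_id, timestamp) records.
    user_address: optional filter, mirroring a log acquisition request
                  that carries a user access address.
    Returns {user_address: (timestamp, stripe_unit_id)} holding the record
    with the largest timestamp for each address.
    """
    latest = {}
    for log in parity_logs:
        for addr, unit, ts in log:
            if user_address is not None and addr != user_address:
                continue
            if addr not in latest or ts > latest[addr][0]:
                latest[addr] = (ts, unit)
    return latest


log_p1 = [("addr-A", "S1-D1", 1), ("addr-A", "S2-D1", 2)]
# The second parity node is missing the newest addr-A write.
log_p2 = [("addr-A", "S1-D1", 1), ("addr-B", "S1-D2", 5)]
recovered = recover_latest([log_p1, log_p2])
```

Reading more than one log is what lets the data storage node tolerate a parity node whose log is incomplete.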
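With the method provided in the embodiments of this application, the client sends the data chunks of a stripe and their metadata to the data storage nodes, and sends the metadata of the data chunks together with the parity chunks to the parity storage nodes. The data storage node or parity storage node does not need to wait for the metadata backup to succeed and can store the received chunk on its storage device directly, which reduces the write operation time.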
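The ordering that this paragraph relies on, acknowledging the write first and backing up metadata afterwards, can be sketched as below. This is a single-process illustration under stated assumptions: the class `DataStorageNode`, the `pending_backup` queue, and the use of a plain dictionary as the metadata storage node are all hypothetical.

```python
from collections import deque


class DataStorageNode:
    """Hypothetical data storage node that acknowledges before metadata backup."""

    def __init__(self):
        self.memory = {}               # (user_address, timestamp) -> stripe_unit_id
        self.log = []                  # metadata write operations
        self.device = {}               # stripe_unit_id -> chunk bytes
        self.pending_backup = deque()  # metadata not yet on the metadata node

    def handle_write(self, chunk, user_address, stripe_unit_id, timestamp):
        key = (user_address, timestamp)
        self.memory[key] = stripe_unit_id
        self.log.append(("write", user_address, stripe_unit_id, timestamp))
        self.device[stripe_unit_id] = chunk
        self.pending_backup.append(key)
        return "ok"  # the response does not wait for the backup

    def backup_one(self, metadata_node):
        """Step 309: back up one entry, then free memory and log space."""
        key = self.pending_backup.popleft()
        metadata_node[key] = self.memory.pop(key)
        self.log = [rec for rec in self.log if (rec[1], rec[3]) != key]


node = DataStorageNode()
metadata_node = {}
reply = node.handle_write(b"data", "addr-A", "S1-D1", 1)
```

Between the acknowledgement and the backup, the copies held by the parity storage nodes are what protect the metadata against a failure of this node.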
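In some embodiments, the client may send the data chunk and its metadata to the data storage node and send the metadata of the data chunk and the parity chunk to the parity storage node, and the data storage node and the parity storage node store the chunks and metadata they receive. Accordingly, an embodiment of this application provides a data processing method in a storage system, where the storage system includes a client, a data storage node, and a parity storage node, and the method includes:
the client sends a data chunk of a stripe and the metadata of the data chunk to the data storage node;
the client sends a parity chunk of the stripe and the metadata of the data chunk to the parity storage node, where the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk includes a correspondence between the user access address of the data chunk and the identifier of the stripe unit in the stripe that stores the data chunk.
Optionally, the metadata of the data chunk further includes a timestamp.
Optionally, the method further includes:
the client obtains the timestamp of the data chunk;
and determines the timestamp of a new data chunk according to that timestamp, where the new data chunk and the data chunk have the same user access address.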
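The method above states only that a parity chunk is generated from the data chunks and a check algorithm, and that it has the same size as a data chunk. As a hedged illustration, the sketch below assumes bytewise XOR, which yields a single parity chunk; the application does not name its check algorithm, and a layout with two parity chunks would need a different erasure code. The function `xor_parity` is hypothetical.

```python
from functools import reduce


def xor_parity(data_chunks):
    """Compute one parity chunk as the bytewise XOR of equal-sized chunks."""
    size = len(data_chunks[0])
    if any(len(chunk) != size for chunk in data_chunks):
        raise ValueError("all chunks must match the stripe unit size")
    # zip(*chunks) walks the chunks column by column, one byte from each.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_chunks))


chunks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xff\x00\xff\x00"]
parity = xor_parity(chunks)
# Rebuild a lost chunk from the parity chunk and the surviving chunks.
rebuilt = xor_parity([parity, chunks[0], chunks[2]])
```

The size check mirrors the requirement that a parity chunk and a data chunk both equal the stripe unit size.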
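An embodiment of this application provides a data processing method in a storage system, where the storage system includes a client, a data storage node, a metadata storage node, and a parity storage node, and the method includes:
the data storage node receives a write request sent by the client, where the write request includes a data chunk of a stripe to be written to a storage device in the data storage node and the metadata of the data chunk; the parity chunk of the stripe is stored on a storage device of the parity storage node; the memory of the parity storage node is further used to store the metadata of the data chunk, and the metadata write operation of the data chunk is recorded in a log; the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk includes a correspondence between the user access address of the data chunk and the identifier of the stripe unit in the stripe that stores the data chunk;
the data storage node sends a write request response to the client, where the write request response indicates that the write request operation is complete.
Optionally, the method further includes: after the data storage node sends the write request response to the client, the data storage node stores the metadata of the data chunk on the metadata storage node.
Optionally, the metadata of the data chunk further includes a timestamp, and the method further includes:
the data storage node notifies the parity storage node to delete the metadata of an old data chunk from the memory of the parity storage node and to delete the metadata write operation of the old data chunk recorded in the log, where the old data chunk and the data chunk contain the same user access address, and the timestamp of the old data chunk is earlier than the timestamp in the metadata of the data chunk.
Optionally, the method further includes:
when the data storage node recovers from a failure, the data storage node obtains from the parity storage node the write operations, recorded in the log, of the metadata of data chunks containing the user access address;
determines, from those write operations, the metadata of the data chunk with the newest timestamp;
and uses the metadata with the newest timestamp as the metadata of the data chunk corresponding to the user access address.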
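An embodiment of this application provides a data processing method in a storage system, where the storage system includes a client, a data storage node, a metadata storage node, and a parity storage node, and the method includes:
the parity storage node receives a write request sent by the client, where the write request includes a parity chunk and the metadata of a data chunk of a stripe that is written to a storage device in the data storage node; the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk includes a correspondence between the user access address of the data chunk and the identifier of the stripe unit in the stripe that stores the data chunk;
the parity storage node caches the data chunk in memory and records the metadata write operation of the data chunk in a log.
Optionally, the metadata of the data chunk further includes a timestamp.
Optionally, the method further includes:
the parity storage node receives a notification sent by the data storage node, where the notification instructs the parity storage node to delete the metadata of an old data chunk from the memory of the parity storage node and to delete the metadata write operation of the old data chunk recorded in the log; the old data chunk and the data chunk contain the same user access address, and the timestamp of the old data chunk is earlier than the timestamp in the metadata of the data chunk;
the parity storage node deletes, according to the notification, the metadata of the old data chunk from the memory and the metadata write operation of the old data chunk recorded in the log.
Optionally, the method further includes:
the parity storage node receives a log acquisition request sent by the data storage node, where the log acquisition request is used to obtain from the parity storage node the write operations, recorded in the log, of the metadata of data chunks containing the user access address;
the parity storage node sends the data storage node the write operations of the metadata of the data chunks containing the user access address.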
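To further implement the foregoing methods, an embodiment of this application also provides a storage system, where the storage system includes a data storage node and a parity storage node.
The data storage node is configured to:
receive a first write request sent by the client, where the first write request includes a data chunk of a stripe to be written to a storage device in the data storage node and the metadata of the data chunk, and the metadata of the data chunk includes a correspondence between the user access address of the data chunk and the identifier of the stripe unit in the stripe that stores the data chunk;
send a write request response to the client, where the write request response indicates that the first write request is complete.
The parity storage node is configured to receive a second write request sent by the client, where the second write request includes the metadata of the data chunk and a parity chunk; the parity chunk is generated from the data chunks in the stripe and a check algorithm; and one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe;
and to cache the data chunk in the memory of the parity storage node and record the metadata write operation of the data chunk in a log.
Optionally, the data storage node is further configured to store the metadata of the data chunk on the metadata storage node after sending the write request response to the client.
Optionally, the metadata of the data chunk further includes a timestamp.
The data storage node is further configured to notify the parity storage node to delete the metadata of an old data chunk from the memory of the parity storage node and to delete the metadata write operation of the old data chunk recorded in the log, where the old data chunk and the data chunk contain the same user access address, and the timestamp of the old data chunk is earlier than the timestamp in the metadata of the data chunk.
The parity storage node is further configured to receive the notification sent by the data storage node and, according to the notification, delete the metadata of the old data chunk from the memory and the metadata write operation of the old data chunk recorded in the log.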
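FIG. 4 is a schematic structural diagram of a data processing apparatus in a storage system according to an embodiment of this application, where the storage system includes the apparatus, a data storage node, and a parity storage node, and the apparatus includes:
a first sending module 401, configured to send a data chunk of a stripe and the metadata of the data chunk to the data storage node;
a second sending module 402, configured to send a parity chunk of the stripe and the metadata of the data chunk to the parity storage node, where the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk includes a correspondence between the user access address of the data chunk and the identifier of the stripe unit in the stripe that stores the data chunk.
Optionally, the metadata of the data chunk further includes a timestamp.
Optionally, the apparatus further includes:
an obtaining module, configured to obtain the timestamp of the data chunk;
a determining module, configured to determine the timestamp of a new data chunk according to that timestamp, where the new data chunk and the data chunk have the same user access address.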
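FIG. 5 is a schematic structural diagram of a data processing apparatus in a storage system according to an embodiment of this application, where the storage system includes a client, the apparatus, a metadata storage node, and a parity storage node, and the apparatus includes:
a receiving module 501, configured to receive a write request sent by the client, where the write request includes a data chunk of a stripe to be written to a storage device in the data storage node and the metadata of the data chunk; the parity chunk of the stripe is stored on a storage device of the parity storage node; the memory of the parity storage node is further used to store the metadata of the data chunk, and the metadata write operation of the data chunk is recorded in a log; the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk includes a correspondence between the user access address of the data chunk and the identifier of the stripe unit in the stripe that stores the data chunk;
a sending module 502, configured to send a write request response to the client, where the write request response indicates that the write request operation is complete.
Optionally, the apparatus further includes:
a storage module, configured to store the metadata of the data chunk on the metadata storage node after the sending module sends the write request response to the client.
Optionally, the metadata of the data chunk further includes a timestamp, and the apparatus further includes:
a notification module, configured to perform step 310 above.
Optionally, the apparatus further includes:
an obtaining module, configured to, when the apparatus recovers from a failure, obtain from the parity storage node the write operations, recorded in the log, of the metadata of data chunks containing the user access address;
a determining module, configured to determine, from those write operations in the log, the metadata of the data chunk with the newest timestamp;
the determining module is further configured to use the metadata with the newest timestamp as the metadata of the data chunk corresponding to the user access address.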
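FIG. 6 is a schematic structural diagram of a data processing apparatus in a storage system according to an embodiment of this application, where the storage system includes a client, a data storage node, a metadata storage node, and the apparatus, and the apparatus includes:
a receiving module 601, configured to receive a write request sent by the client, where the write request includes a parity chunk and the metadata of a data chunk of a stripe that is written to a storage device in the data storage node; the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk includes a correspondence between the user access address of the data chunk and the identifier of the stripe unit in the stripe that stores the data chunk;
a caching module 602, configured to cache the data chunk in memory and record the metadata write operation of the data chunk in a log.
Optionally, the metadata of the data chunk further includes a timestamp.
Optionally, the apparatus further includes a deletion module.
The receiving module 601 is further configured to receive a notification sent by the data storage node, where the notification instructs the parity storage node to delete the metadata of an old data chunk from the memory of the parity storage node and to delete the metadata write operation of the old data chunk recorded in the log; the old data chunk and the data chunk contain the same user access address, and the timestamp of the old data chunk is earlier than the timestamp in the metadata of the data chunk;
the deletion module is configured to delete, according to the notification, the metadata of the old data chunk from the memory and the metadata write operation of the old data chunk recorded in the log.
Optionally, the apparatus further includes a sending module.
The receiving module 601 is further configured to receive a log acquisition request sent by the data storage node, where the log acquisition request is used to obtain from the parity storage node the write operations, recorded in the log, of the metadata of data chunks containing the user access address;
the sending module is configured to send the data storage node the write operations of the metadata of the data chunks containing the user access address.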
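An embodiment of this application also provides a storage medium that stores at least one instruction, where the instruction is loaded and executed by a processor to implement the operations performed by the client, the data storage node, or the parity storage node in the foregoing data processing method in a storage system.
The modules in the apparatuses of the embodiments of this application may be implemented as hardware components, as software modules, or as a combination of the two, which is not limited in the embodiments of this application.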

Claims (31)

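  1. A data processing method in a storage system, wherein the storage system comprises a client, a data storage node, and a parity storage node, and the method comprises:
    sending, by the client, a data chunk of a stripe and metadata of the data chunk to the data storage node;
    sending, by the client, a parity chunk of the stripe and the metadata of the data chunk to the parity storage node, wherein the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk comprises a correspondence between a user access address of the data chunk and an identifier of the stripe unit in the stripe that stores the data chunk.
  2. The method according to claim 1, wherein the metadata of the data chunk further comprises a timestamp.
  3. The method according to claim 2, wherein the method further comprises:
    obtaining, by the client, the timestamp of the data chunk;
    determining a timestamp of a new data chunk according to the timestamp, wherein the new data chunk and the data chunk have the same user access address.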
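  4. A data processing method in a storage system, wherein the storage system comprises a client, a data storage node, a metadata storage node, and a parity storage node, and the method comprises:
    receiving, by the data storage node, a write request sent by the client, wherein the write request comprises a data chunk of a stripe to be written to a storage device in the data storage node and metadata of the data chunk; a parity chunk of the stripe is stored on a storage device of the parity storage node; a memory of the parity storage node is further used to store the metadata of the data chunk, and a metadata write operation of the data chunk is recorded in a log; the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk comprises a correspondence between a user access address of the data chunk and an identifier of the stripe unit in the stripe that stores the data chunk;
    sending, by the data storage node, a write request response to the client, wherein the write request response indicates that the write request operation is complete.
  5. The method according to claim 4, wherein the method further comprises: after the data storage node sends the write request response to the client, storing, by the data storage node, the metadata of the data chunk on the metadata storage node.
  6. The method according to claim 5, wherein the metadata of the data chunk further comprises a timestamp, and the method further comprises:
    notifying, by the data storage node, the parity storage node to delete metadata of an old data chunk from the memory of the parity storage node and to delete the metadata write operation of the old data chunk recorded in the log, wherein the old data chunk and the data chunk contain the same user access address, and the timestamp of the old data chunk is earlier than the timestamp in the metadata of the data chunk.
  7. The method according to any one of claims 4 to 6, wherein the method further comprises:
    when the data storage node recovers from a failure, obtaining, by the data storage node from the parity storage node, the write operations, recorded in the log, of the metadata of data chunks containing the user access address;
    determining, from the write operations in the log of the metadata of the data chunks at the user access address, the metadata of the data chunk with the newest timestamp;
    using the metadata of the data chunk with the newest timestamp as the metadata of the data chunk corresponding to the user access address.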
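  8. A data processing method in a storage system, wherein the storage system comprises a client, a data storage node, a metadata storage node, and a parity storage node, and the method comprises:
    receiving, by the parity storage node, a write request sent by the client, wherein the write request comprises a parity chunk and metadata of a data chunk of a stripe that is written to a storage device in the data storage node; the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk comprises a correspondence between a user access address of the data chunk and an identifier of the stripe unit in the stripe that stores the data chunk;
    caching, by the parity storage node, the data chunk in memory and recording a metadata write operation of the data chunk in a log.
  9. The method according to claim 8, wherein the metadata of the data chunk further comprises a timestamp.
  10. The method according to claim 9, wherein the method further comprises:
    receiving, by the parity storage node, a notification sent by the data storage node, wherein the notification instructs the parity storage node to delete metadata of an old data chunk from the memory of the parity storage node and to delete the metadata write operation of the old data chunk recorded in the log; the old data chunk and the data chunk contain the same user access address; and the timestamp of the old data chunk is earlier than the timestamp in the metadata of the data chunk;
    deleting, by the parity storage node according to the notification, the metadata of the old data chunk from the memory and the metadata write operation of the old data chunk recorded in the log.
  11. The method according to claim 8, wherein the method further comprises:
    receiving, by the parity storage node, a log acquisition request sent by the data storage node, wherein the log acquisition request is used to obtain from the parity storage node the write operations, recorded in the log, of the metadata of data chunks containing the user access address;
    sending, by the parity storage node to the data storage node, the write operations of the metadata of the data chunks containing the user access address.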
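  12. A storage system, wherein the storage system comprises a data storage node and a parity storage node;
    the data storage node is configured to:
    receive a first write request sent by the client, wherein the first write request comprises a data chunk of a stripe to be written to a storage device in the data storage node and metadata of the data chunk, and the metadata of the data chunk comprises a correspondence between a user access address of the data chunk and an identifier of the stripe unit in the stripe that stores the data chunk;
    send a write request response to the client, wherein the write request response indicates that the first write request is complete;
    the parity storage node is configured to receive a second write request sent by the client, wherein the second write request comprises the metadata of the data chunk and a parity chunk; the parity chunk is generated from the data chunks in the stripe and a check algorithm; and one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe;
    and to cache the data chunk in a memory of the parity storage node and record a metadata write operation of the data chunk in a log.
  13. The storage system according to claim 12, wherein the data storage node is further configured to store the metadata of the data chunk on the metadata storage node after sending the write request response to the client.
  14. The storage system according to claim 13, wherein the metadata of the data chunk further comprises a timestamp;
    the data storage node is further configured to notify the parity storage node to delete metadata of an old data chunk from the memory of the parity storage node and to delete the metadata write operation of the old data chunk recorded in the log, wherein the old data chunk and the data chunk contain the same user access address, and the timestamp of the old data chunk is earlier than the timestamp in the metadata of the data chunk;
    the parity storage node is further configured to receive the notification sent by the data storage node and, according to the notification, delete the metadata of the old data chunk from the memory and the metadata write operation of the old data chunk recorded in the log.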
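  15. A client, wherein the client comprises a processor and an interface, the processor communicates with the interface, and the processor is configured to perform the method according to any one of claims 1 to 3.
  16. A data storage node, wherein the data storage node comprises a processor and an interface, the processor communicates with the interface, and the processor is configured to perform the method according to any one of claims 4 to 7.
  17. A parity storage node, wherein the parity storage node comprises a processor and an interface, the processor communicates with the interface, and the processor is configured to perform the method according to any one of claims 8 to 11.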
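  18. A data processing apparatus in a storage system, wherein the storage system comprises the apparatus, a data storage node, and a parity storage node, and the apparatus comprises:
    a first sending module, configured to send a data chunk of a stripe and metadata of the data chunk to the data storage node;
    a second sending module, configured to send a parity chunk of the stripe and the metadata of the data chunk to the parity storage node, wherein the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk comprises a correspondence between a user access address of the data chunk and an identifier of the stripe unit in the stripe that stores the data chunk.
  19. The apparatus according to claim 18, wherein the metadata of the data chunk further comprises a timestamp.
  20. The apparatus according to claim 19, wherein the apparatus further comprises:
    an obtaining module, configured to obtain the timestamp of the data chunk;
    a determining module, configured to determine a timestamp of a new data chunk according to the timestamp, wherein the new data chunk and the data chunk have the same user access address.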
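  21. A data processing apparatus in a storage system, wherein the storage system comprises a client, the apparatus, a metadata storage node, and a parity storage node, and the apparatus comprises:
    a receiving module, configured to receive a write request sent by the client, wherein the write request comprises a data chunk of a stripe to be written to a storage device in the data storage node and metadata of the data chunk; a parity chunk of the stripe is stored on a storage device of the parity storage node; a memory of the parity storage node is further used to store the metadata of the data chunk, and a metadata write operation of the data chunk is recorded in a log; the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk comprises a correspondence between a user access address of the data chunk and an identifier of the stripe unit in the stripe that stores the data chunk;
    a sending module, configured to send a write request response to the client, wherein the write request response indicates that the write request operation is complete.
  22. The apparatus according to claim 21, wherein the apparatus further comprises:
    a storage module, configured to store the metadata of the data chunk on the metadata storage node after the sending module sends the write request response to the client.
  23. The apparatus according to claim 22, wherein the metadata of the data chunk further comprises a timestamp, and the apparatus further comprises:
    a notification module, configured to notify the parity storage node to delete metadata of an old data chunk from the memory of the parity storage node and to delete the metadata write operation of the old data chunk recorded in the log, wherein the old data chunk and the data chunk contain the same user access address, and the timestamp of the old data chunk is earlier than the timestamp in the metadata of the data chunk.
  24. The apparatus according to any one of claims 21 to 23, wherein the apparatus further comprises:
    an obtaining module, configured to, when the apparatus recovers from a failure, obtain from the parity storage node the write operations, recorded in the log, of the metadata of data chunks containing the user access address;
    a determining module, configured to determine, from the write operations in the log of the metadata of the data chunks at the user access address, the metadata of the data chunk with the newest timestamp;
    wherein the determining module is further configured to use the metadata of the data chunk with the newest timestamp as the metadata of the data chunk corresponding to the user access address.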
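  25. A data processing apparatus in a storage system, wherein the storage system comprises a client, a data storage node, a metadata storage node, and the apparatus, and the apparatus comprises:
    a receiving module, configured to receive a write request sent by the client, wherein the write request comprises a parity chunk and metadata of a data chunk of a stripe that is written to a storage device in the data storage node; the parity chunk is generated from the data chunks in the stripe and a check algorithm; one parity chunk and one data chunk in the stripe have the same size, which equals the size of a stripe unit in the stripe; and the metadata of the data chunk comprises a correspondence between a user access address of the data chunk and an identifier of the stripe unit in the stripe that stores the data chunk;
    a caching module, configured to cache the data chunk in memory and record a metadata write operation of the data chunk in a log.
  26. The apparatus according to claim 25, wherein the metadata of the data chunk further comprises a timestamp.
  27. The apparatus according to claim 26, wherein the apparatus further comprises a deletion module;
    the receiving module is further configured to receive a notification sent by the data storage node, wherein the notification instructs the parity storage node to delete metadata of an old data chunk from the memory of the parity storage node and to delete the metadata write operation of the old data chunk recorded in the log; the old data chunk and the data chunk contain the same user access address; and the timestamp of the old data chunk is earlier than the timestamp in the metadata of the data chunk;
    the deletion module is configured to delete, according to the notification, the metadata of the old data chunk from the memory and the metadata write operation of the old data chunk recorded in the log.
  28. The apparatus according to claim 25, wherein the apparatus further comprises a sending module;
    the receiving module is further configured to receive a log acquisition request sent by the data storage node, wherein the log acquisition request is used to obtain from the parity storage node the write operations, recorded in the log, of the metadata of data chunks containing the user access address;
    the sending module is configured to send the data storage node the write operations of the metadata of the data chunks containing the user access address.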
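  29. A storage medium, wherein the storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the method according to any one of claims 1 to 3.
  30. A storage medium, wherein the storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the method according to any one of claims 4 to 7.
  31. A storage medium, wherein the storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the method according to any one of claims 8 to 11.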
PCT/CN2019/104981 2019-09-09 2019-09-09 存储系统中数据处理方法、装置以及存储系统 WO2021046693A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201980092966.3A CN113544635A (zh) 2019-09-09 2019-09-09 存储系统中数据处理方法、装置以及存储系统
PCT/CN2019/104981 WO2021046693A1 (zh) 2019-09-09 2019-09-09 存储系统中数据处理方法、装置以及存储系统
EP19945241.8A EP3971701A4 (en) 2019-09-09 2019-09-09 DATA PROCESSING METHODS IN A STORAGE SYSTEM, DEVICE AND STORAGE SYSTEM
US17/569,908 US20220129346A1 (en) 2019-09-09 2022-01-06 Data processing method and apparatus in storage system, and storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/104981 WO2021046693A1 (zh) 2019-09-09 2019-09-09 存储系统中数据处理方法、装置以及存储系统

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/569,908 Continuation US20220129346A1 (en) 2019-09-09 2022-01-06 Data processing method and apparatus in storage system, and storage system

Publications (1)

Publication Number Publication Date
WO2021046693A1 true WO2021046693A1 (zh) 2021-03-18

Family

ID=74865933

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/104981 WO2021046693A1 (zh) 2019-09-09 2019-09-09 存储系统中数据处理方法、装置以及存储系统

Country Status (4)

Country Link
US (1) US20220129346A1 (zh)
EP (1) EP3971701A4 (zh)
CN (1) CN113544635A (zh)
WO (1) WO2021046693A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114968668A (zh) * 2022-06-17 2022-08-30 重庆紫光华山智安科技有限公司 数据处理方法、装置、数据接入端及存储介质
CN115098046B (zh) * 2022-08-26 2023-01-24 苏州浪潮智能科技有限公司 磁盘阵列初始化方法、系统、电子设备及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103699494A (zh) * 2013-12-06 2014-04-02 北京奇虎科技有限公司 一种数据存储方法、数据存储设备和分布式存储系统
CN109144406A (zh) * 2017-06-28 2019-01-04 华为技术有限公司 分布式存储系统中元数据存储方法、系统及存储介质
CN109814805A (zh) * 2018-12-25 2019-05-28 华为技术有限公司 存储系统中分条重组的方法及分条服务器
US20190243553A1 (en) * 2017-03-28 2019-08-08 Hitachi, Ltd. Storage system, computer-readable recording medium, and control method for system

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101694984B1 (ko) * 2010-12-08 2017-01-11 한국전자통신연구원 비대칭 클러스터링 파일시스템에서의 패리티 산출 방법
CN102521269B (zh) * 2011-11-22 2013-06-19 清华大学 一种基于索引的计算机连续数据保护方法
KR20150061258A (ko) * 2013-11-27 2015-06-04 한국전자통신연구원 분산 raid 시스템에서 패리티 청크 운용 방법과 이를 지원하는 데이터 서버 장치
US9495255B2 (en) * 2014-08-07 2016-11-15 Pure Storage, Inc. Error recovery in a storage cluster
CN105353988A (zh) * 2015-11-13 2016-02-24 曙光信息产业(北京)有限公司 一种元数据读写方法及装置
EP3208714B1 (en) * 2015-12-31 2019-08-21 Huawei Technologies Co., Ltd. Data reconstruction method, apparatus and system in distributed storage system
WO2017145375A1 (ja) * 2016-02-26 2017-08-31 株式会社日立製作所 ストレージシステム
ES2899933T3 (es) * 2016-03-15 2022-03-15 Datomia Res Labs Ou Gestión y seguridad de datos del sistema de almacenamiento distribuido
US9672905B1 (en) * 2016-07-22 2017-06-06 Pure Storage, Inc. Optimize data protection layouts based on distributed flash wear leveling
US10635529B2 (en) * 2017-05-25 2020-04-28 Western Digital Technologies, Inc. Parity offload for multiple data storage devices
CN109213420A (zh) * 2017-06-29 2019-01-15 杭州海康威视数字技术股份有限公司 数据存储方法、装置及系统
CN107423422B (zh) * 2017-08-01 2019-09-24 武大吉奥信息技术有限公司 基于网格的空间数据分布式存储及检索方法和系统
CN109542342B (zh) * 2018-11-09 2022-04-26 锐捷网络股份有限公司 元数据管理与数据重构方法、设备及存储介质
US11074129B2 (en) * 2019-10-31 2021-07-27 Western Digital Technologies, Inc. Erasure coded data shards containing multiple data objects

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3971701A4 *

Also Published As

Publication number Publication date
EP3971701A1 (en) 2022-03-23
US20220129346A1 (en) 2022-04-28
EP3971701A4 (en) 2022-06-15
CN113544635A (zh) 2021-10-22


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19945241

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019945241

Country of ref document: EP

Effective date: 20211217

NENP Non-entry into the national phase

Ref country code: DE