WO2021043246A1 - Data reading method and apparatus - Google Patents

Data reading method and apparatus

Info

Publication number
WO2021043246A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
hard disk
target
storage block
checksum
Prior art date
Application number
PCT/CN2020/113420
Other languages
English (en)
Chinese (zh)
Inventor
张瑛
熊伟
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021043246A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0727 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a storage system, e.g. in a DASD or network based storage system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance

Definitions

  • This application relates to the field of data storage, and in particular to a data reading method and device.
  • Redundant array of independent disks (RAID) is a technology that implements data reading and writing across multiple hard disks. Depending on the implementation principle, RAID can be divided into hard RAID and soft RAID. Hard RAID implements RAID functions, including data reading and writing, in hardware, while soft RAID implements RAID functions through the operating system and the CPU.
  • In the related art, when a hard disk in a RAID fails, the hard disk may send a failure signal to the processor.
  • The processor may then mark the hard disk as a failed disk.
  • When a read request for the failed disk is subsequently received, the processor can read the data on the hard disks other than the failed disk, recover the data of the failed disk from the data read, and return the recovered data to the client.
  • Because the processor must read the data on the other hard disks to restore the data on the failed disk every time it receives a read request for the failed disk, data reads incur a large delay and consume considerable resources.
  • This application provides a data reading method and device that can solve the problem in the related art of large read delay and high resource consumption, which arises because the processor serves reads of a failed disk by reading data on other hard disks to restore the data on the failed disk. The technical solution is as follows:
  • In a first aspect, a data reading method is provided. The method includes: when failure information sent by a target hard disk is received indicating that the target hard disk is in a risk state, setting the hard disk state of the target hard disk to the risk state; and receiving a read request, sent by a client, to read the target hard disk.
  • The target hard disk is any hard disk in the redundant array of independent disks whose hard disk state is the risk state, and the risk state indicates that a software failure has occurred on the hard disk.
  • a read request sent by the client to read the target hard disk is received.
  • the target data stored on the target hard disk is read, the target data is verified, and if the verification passes, the target data is returned to the client.
  • The target hard disk is a hard disk in the RAID whose hard disk state is the risk state, and the risk state indicates that the hard disk has a software failure. That is, in the embodiment of the present application, after failure information sent by the target hard disk to indicate that it is in a risk state is received, the hard disk state of that hard disk may be set to the risk state.
  • Subsequently, when a read request for the target hard disk is received, the corresponding data can be read according to the read request and verified. If the verification passes, the read data can be returned.
  • Compared with the related art, in which data is recovered by reading the other hard disks regardless of what kind of failure the hard disk has, this shortens the delay of data reading and reduces the consumption of system resources.
  • The failure information is a software failure type identifier, which indicates that the failure of the target hard disk is a software failure.
  • In the embodiment of the present application, a hard disk that has failed may feed back to the operating system a fault type identifier indicating the type of the fault.
  • The operating system can identify a hard disk with a software failure based on the fault type identifier and then respond to the software failure.
  • Data on such a hard disk can then be read through the method provided in this application.
  • In the related art, by contrast, a failed hard disk only feeds back a failure signal, so the operating system cannot identify what type of failure has occurred. It can therefore only treat the hard disk as having a hardware failure, that is, restore the data directly.
  • The implementation process of verifying the target data may be: obtaining the stored reference checksum of the target data; calculating the actual checksum of the target data from the target data; and, if the reference checksum of the target data is the same as the actual checksum of the target data, determining that the verification of the target data is passed.
  • the method further includes: repairing the data stored on the target hard disk; and modifying the hard disk status of the target hard disk after the data repair is completed to a safe state.
  • That is, while reading the target data stored on the target hard disk according to the read request, the operating system can create a background task to repair the data stored on the target hard disk. After the repair is completed, the hard disk state is changed to the safe state.
  • The implementation process of repairing the data stored on the target hard disk may be: obtaining a checksum of the data index area of each storage block in the target hard disk, where the data index area is the area in the corresponding storage block that stores data index information, and the data index information includes the checksum of each piece of data stored in the corresponding storage block; verifying the data index information on each storage block according to the checksum of the data index area of that storage block; for a first storage block whose data index information passes verification, obtaining the checksum of each piece of data stored in the first storage block from the data index information of the first storage block; verifying each piece of data in the first storage block according to the obtained checksums; and repairing the data in the first storage block that fails verification.
  • In other words, the checksum of each piece of data stored in the data index area can be used to verify each piece of data in a storage block, and then only the data that fails verification is restored. In this way, the amount of data to be restored can be reduced.
  • the storage block can be directly reconstructed to restore all data on the storage block.
  • In a second aspect, a data reading device is provided. The data reading device has the function of realizing the behavior of the data reading method in the first aspect.
  • The data reading device includes at least one module, and the at least one module is used to implement the data reading method provided in the first aspect.
  • In a third aspect, a data reading device is provided. The data reading device includes a processor and a memory; the memory is used to store a program that supports the data reading device in performing the data reading method provided in the first aspect.
  • The processor is configured to execute the program stored in the memory.
  • The data reading device may further include a communication bus, and the communication bus is used to establish a connection between the processor and the memory.
  • A computer-readable storage medium is also provided. Instructions are stored in the computer-readable storage medium and, when run on a computer, cause the computer to execute the data reading method described in the first aspect.
  • A computer program product containing instructions is also provided; when run on a computer, it causes the computer to execute the data reading method described in the first aspect.
  • In the embodiment of the present application, after failure information indicating that the target hard disk is in a risk state is received, the hard disk state of the target hard disk may be set to the risk state. Subsequently, when a read request for the target hard disk is received, the corresponding data can be read according to the read request and verified. If the verification passes, the read data can be returned. In this way, compared with the related art, in which data is recovered by reading the other hard disks no matter what failure the hard disk has, the delay of data reading is shortened and the consumption of system resources is reduced.
  • FIG. 1 is a system architecture diagram involved in a data reading method provided by an embodiment of the present application
  • FIG. 2 is a flowchart of a data reading method provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of data distribution in a storage block provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a data reading device provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of another data reading device provided by an embodiment of the present application.
  • FIG. 1 is an architecture diagram of a storage system involved in a data reading method provided by an embodiment of the present application. As shown in Figure 1, the system includes client 01 and storage device 02. Among them, the client 01 and the storage device 02 can communicate.
  • the client 01 can send a read request or a write request to the storage device 02.
  • the storage device 02 may include a processor 021, a memory 022, and a hard disk 023.
  • The processor 021 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits used to control the execution of the program of this application.
  • the storage device may include multiple processors 021.
  • Each of these processors 021 may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • The processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
  • The memory 022 can be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM) or another type of dynamic storage device that can store information and instructions. It can also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
  • the memory 022 may exist independently and is connected to the processor 021.
  • the memory 022 may also be integrated with the processor 021.
  • the storage device 02 can be a storage array or a server.
  • When the storage device 02 is a storage array, it includes a controller and several hard disks. The processor 021 and the memory 022 may be located in the controller of the storage array, and the controller and the hard disks are connected through a back-end interface card.
  • When the storage device 02 is a server, the processor 021, the memory 022 and several hard disks are all located inside the server.
  • FIG. 1 is only a schematic diagram of some components included in the device.
  • RAID technology is often used to store data in practical applications, for example, RAID5, RAID6, RAIDTP, etc.
  • Take RAID5 as an example.
  • the space of each hard disk is divided into multiple storage blocks, each of which has the same size.
  • One storage block is taken out from different hard disks to form a storage block group.
  • the number of storage blocks is determined by the RAID type. Taking "4+1" RAID5 as an example, a storage block group consists of 4 data blocks and 1 parity block, so 5 storage blocks are required.
  • When the storage device 02 stores data, the data can be split into 4 data fragments, the check data of these 4 data fragments is calculated to generate a check fragment, and then the 4 data fragments and the check fragment are stored in the storage block group.
  • the storage block group corresponds to a segment of logical address, which is the logical address of the data.
  • the storage device 02 stores the mapping relationship between the logical address and the physical address where the data is actually stored.
  • The read request includes the logical address of the data to be read, and the processor 021 can run the operating system to determine, from that logical address, the storage block where the data to be read is located and its location (physical address) within the storage block. After the storage block and the location within it are determined, the data can be read from the storage block.
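  • The "4+1" RAID5 layout described above can be illustrated with a minimal sketch. It assumes the check fragment is the byte-wise XOR of the 4 data fragments, which is the usual RAID5 parity but is not stated in this application; the function names are hypothetical.

```python
from functools import reduce

def split_into_fragments(data: bytes, n: int = 4) -> list[bytes]:
    """Split data into n equal-size fragments, zero-padding the tail."""
    frag_len = -(-len(data) // n)  # ceiling division
    padded = data.ljust(frag_len * n, b"\x00")
    return [padded[i * frag_len:(i + 1) * frag_len] for i in range(n)]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))

def parity_fragment(fragments: list[bytes]) -> bytes:
    """Check fragment for a storage block group (assumed XOR parity)."""
    return reduce(xor_bytes, fragments)

def rebuild_fragment(surviving: list[bytes]) -> bytes:
    """Rebuild the single missing fragment from the 4 surviving ones."""
    return reduce(xor_bytes, surviving)

# One storage block group: 4 data fragments plus 1 check fragment.
fragments = split_into_fragments(b"sixteen byte msg")
group = fragments + [parity_fragment(fragments)]
```

  • Rebuilding one fragment needs all 4 surviving fragments, which is why serving every read by reconstruction is costly compared with one direct read.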
  • the storage device 02 may also include a communication bus and a communication interface (not shown in FIG. 1). Among them, the communication bus is used to transfer information between various components included in the storage device 02.
  • The communication interface is used to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
  • Fig. 2 is a flowchart of a data reading method provided by an embodiment of the present application.
  • the execution subject of this method may be the processor deployed in the storage device 02 described in FIG. 1. Referring to Figure 2, the method includes the following steps:
  • Step 201: When failure information sent by the target hard disk to indicate that the target hard disk is in a risk state is received, the hard disk state of the target hard disk is set to the risk state.
  • The risk state indicates that a small amount of data in the target hard disk is damaged or lost; data may still be written to the target hard disk, and the data that is neither damaged nor lost may still be read.
  • The failure information used to indicate that the target hard disk is in a risk state may be information indicating that data on the target hard disk is damaged or lost but the target hard disk is still usable.
  • The processor may send a fault detection instruction to each hard disk in the storage device at predetermined intervals. After a hard disk receives the fault detection instruction, it can detect whether it has a read/write abnormality or whether data is damaged or lost. If it detects such a problem, the hard disk can determine the fault type of its own fault and send a fault type identifier identifying that fault type to the processor.
  • The processor may also send a fault detection instruction to a hard disk when it continuously receives I/O abnormal error codes from that hard disk, so as to query the type of the fault. That is, when the processor is processing read or write requests for a hard disk, if it receives an I/O abnormal error code from the hard disk n times in a row, the hard disk may be malfunctioning. At this time, the processor may send a fault detection instruction to the hard disk to obtain its fault type.
  • n can be a preset value.
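  • The "n times in a row" trigger can be sketched as a per-disk counter. This is only an illustration: the class name, the default n = 3, and the choice to reset the counter after triggering are assumptions, not details given in this application.

```python
class IoErrorMonitor:
    """Tracks consecutive I/O abnormal error codes per hard disk."""

    def __init__(self, n: int = 3):
        self.n = n                # preset threshold (assumed default)
        self.consecutive = {}     # disk id -> current run of errors

    def record(self, disk_id: str, ok: bool) -> bool:
        """Record one I/O result; return True when a fault detection
        instruction should be sent to the disk."""
        if ok:
            self.consecutive[disk_id] = 0  # a success breaks the run
            return False
        count = self.consecutive.get(disk_id, 0) + 1
        if count >= self.n:
            self.consecutive[disk_id] = 0  # assumed: start a new run
            return True
        self.consecutive[disk_id] = count
        return False
```

  • A successful I/O in between resets the run, so isolated errors never trigger a query.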
  • the failure types of hard disk failures may include software failures and hardware failures.
  • the fault type identification may include a software fault type identification and a hardware fault type identification.
  • the software failure type identification is used to identify the failure of the hard disk as a software failure
  • the hardware failure type identification is used to identify the failure of the hard disk as a hardware failure.
  • hardware failure refers to a hardware device failure.
  • Software failure refers to data damage or loss caused by software abnormalities. It should be noted that when a hard disk has a software failure, only a small part of the data on the hard disk is damaged or lost; however, the location of the damaged or lost data cannot be determined.
  • In this case, data can still be written to the hard disk, and data that is neither damaged nor lost can still be read.
  • After a hard disk fails, if the hard disk determines that the failure is a software failure, it can report the software failure type identifier to the processor; if it is a hardware failure, it can report the hardware failure type identifier to the processor.
  • each hard disk may also actively report the fault type identification to the processor when a fault is detected.
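  • After the processor receives a software failure type identifier from a hard disk, the hard disk state of that hard disk may be set to the risk state.
  • After the processor receives a hardware failure type identifier from a hard disk, the processor may directly set the hard disk state of that hard disk to the failed state, which indicates that subsequent reading and writing of data on the hard disk is prohibited.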
  • the memory of the storage device may store the corresponding relationship between the hard disk identification and the status information of each of the multiple hard disks.
  • The status information may include a safe state, a risk state, and a failed state.
  • The safe state indicates that the data stored on the hard disk is not damaged or lost and that the hard disk is currently not malfunctioning.
  • The risk state indicates that a small part of the data stored on the hard disk is damaged or lost and that the hard disk has a software failure; the processor can continue to write data to the hard disk, and data can be read from the hard disk through the subsequent steps in the embodiment of this application.
  • The failed state indicates that the hard disk has a hardware failure and is currently unavailable; when a read or write request for the hard disk is subsequently received, reading and writing data on the hard disk is prohibited.
  • Based on this, after the processor receives the software failure type identifier reported by the target hard disk, it can set the status information corresponding to the hard disk identifier of the target hard disk in the above correspondence to the risk state, indicating that the target hard disk has a software failure and that a small amount of its data has been damaged or lost.
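  • The correspondence between hard disk identifiers and status information can be sketched as a small table. The enum and function names are hypothetical; the three states and two fault types are those described above.

```python
from enum import Enum

class DiskState(Enum):
    SAFE = "safe"      # no damaged or lost data, no fault
    RISK = "risk"      # software failure, a small amount of data damaged
    FAILED = "failed"  # hardware failure, reads and writes prohibited

class FaultType(Enum):
    SOFTWARE = "software"
    HARDWARE = "hardware"

def on_fault_report(states: dict, disk_id: str, fault: FaultType) -> None:
    """Update the state table when a disk reports a fault type identifier."""
    if fault is FaultType.SOFTWARE:
        states[disk_id] = DiskState.RISK
    else:
        states[disk_id] = DiskState.FAILED

def io_allowed(states: dict, disk_id: str) -> bool:
    """Reads and writes are prohibited only in the failed state."""
    return states[disk_id] is not DiskState.FAILED
```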
  • Step 202: Receive a read request sent by the client, where the read request is used to read data in the target hard disk in a risk state.
  • When the data that the client wants to read is stored in the target hard disk, the client may send a read request carrying the logical address of the data to be read to the processor.
  • the processor may receive the read request sent by the client, and determine the hard disk to be read as the target hard disk according to the logical address of the data to be read carried in the read request. That is, the read request is a read request for reading data in the target hard disk.
  • Step 203: According to the read request, read the target data stored on the target hard disk.
  • After receiving the read request, the processor can read the target data stored on the target hard disk according to the logical address carried in the read request.
  • the hard disk space can be divided into multiple storage blocks (that is, blocks). Among them, the size of each storage block is the same.
  • Each storage block may correspond to a segment of logical address, and each storage block may store multiple pieces of data, and the memory of the storage device may store the mapping relationship between the physical address of the data and the logical address.
  • The processor can determine the physical address at which the target data is stored according to the logical address carried in the read request, that is, determine the target storage block to be read, and then read the stored target data from the target storage block.
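  • A minimal sketch of the address lookup in step 203 follows. It assumes the mapping is a dictionary from logical address to a (block, offset, length) triple; the application only says that a mapping between logical and physical addresses is stored, so this layout and the names are hypothetical.

```python
from typing import NamedTuple

class PhysicalAddress(NamedTuple):
    block: int    # index of the storage block on the target hard disk
    offset: int   # position of the data inside the storage block
    length: int   # assumed field: size of the data in bytes

def read_target_data(blocks: list[bytes],
                     mapping: dict[int, PhysicalAddress],
                     logical_address: int) -> bytes:
    """Resolve the logical address and read the data from its block."""
    addr = mapping[logical_address]
    block = blocks[addr.block]
    return block[addr.offset:addr.offset + addr.length]
```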
  • Step 204: Verify the target data.
  • Because the target hard disk is in a risk state, a small amount of the data stored on it may be damaged or lost. When the processor reads the target data according to the logical address in the read request, it cannot determine whether the target data has been damaged. Based on this, the processor can verify the target data.
  • The processor may obtain the stored reference checksum of the target data and calculate the actual checksum of the target data from the target data. If the reference checksum of the target data is the same as the actual checksum, the verification of the target data is passed.
  • the storage device may store metadata of various data stored on each of the multiple hard disks.
  • the metadata includes the storage address of each data and the checksum of each data.
  • the processor may obtain metadata containing the logical address from the stored metadata according to the logical address carried in the read request, and obtain the checksum of the target data from the obtained metadata. At this time, the obtained checksum of the target data is the reference checksum of the correct target data originally stored in the space indicated by the logical address carried in the read request.
  • the processor may also calculate the actual checksum of the target data according to the acquired target data.
  • the calculation method for obtaining the actual checksum is the same as the calculation method for the reference checksum in the stored metadata.
  • the processor can compare the two. If the two are the same, it means that the obtained target data is correct and not damaged. At this time, the verification of the target data is passed. If the two are not the same, it means that the acquired target data is corrupted data. At this time, the verification of the target data fails.
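  • The comparison of reference and actual checksums in step 204 can be sketched as follows. CRC-32 from Python's `zlib` module stands in for the checksum algorithm, which this application does not specify; the metadata dictionary keyed by logical address is likewise an assumption.

```python
import zlib

def checksum(data: bytes) -> int:
    """Stand-in checksum (CRC-32); the real algorithm is not specified."""
    return zlib.crc32(data)

def verify_target_data(metadata: dict[int, int],
                       logical_address: int,
                       target_data: bytes) -> bool:
    """Compare the stored reference checksum with the actual checksum."""
    reference = metadata[logical_address]  # stored when data was written
    actual = checksum(target_data)         # same method as the reference
    return reference == actual

# Metadata is recorded at write time, then used to check later reads.
metadata = {0x1000: checksum(b"payload")}
```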
  • Step 205: If the verification of the target data is passed, the target data is sent to the client.
  • If the verification passes, the processor can directly return the target data to the client. If the verification fails, the processor can recover the target data by reading data on the hard disks other than the target hard disk among the multiple hard disks included in the RAID.
  • In the embodiment of the present application, after failure information indicating that the target hard disk is in a risk state is received, the hard disk state of the target hard disk may be set to the risk state.
  • Subsequently, when a read request for the target hard disk is received, the target data stored on the target hard disk is read and verified, and if the verification passes, the target data is returned to the client. That is, for a hard disk in a risk state, when a read request for this type of hard disk is received, the corresponding data can be read according to the read request and verified, and if the verification passes, the read data can be returned. Compared with the related art, in which data is restored by reading the other hard disks no matter what failure the hard disk has, this shortens the delay of data reading and reduces the consumption of system resources.
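  • Steps 203 to 205 together form a fast path with a fallback, sketched below. The callables `read_target` and `reconstruct` are hypothetical placeholders for the direct read and for recovery from the other hard disks, and CRC-32 is an assumed checksum.

```python
import zlib
from typing import Callable

def serve_read(read_target: Callable[[], bytes],
               reference_checksum: int,
               reconstruct: Callable[[], bytes]) -> tuple[bytes, str]:
    """Serve a read for a hard disk in the risk state."""
    data = read_target()                       # step 203: direct read
    if zlib.crc32(data) == reference_checksum: # step 204: verify
        return data, "direct"                  # step 205: return as-is
    # Verification failed: recover from the other hard disks instead.
    return reconstruct(), "reconstructed"
```

  • Reconstruction runs only for the small share of reads that hit damaged data, which is where the latency and resource saving comes from.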
  • After the processor receives a read request for the target hard disk, since the target hard disk is considered to hold damaged data, the processor can also establish a background task to repair the data stored on the target hard disk.
  • the space of the target hard disk can be divided into multiple storage blocks.
  • the processor may sequentially verify each storage block of the target hard disk starting from the first storage block of the target hard disk, and then repair the storage blocks that fail the verification.
  • The processor may obtain the checksum of the data index area of each storage block in the target hard disk. The data index area is the area in the corresponding storage block where the data index information is stored, and the data index information includes the checksum of each piece of data stored in the corresponding storage block. The processor then verifies the data index information on each storage block according to the checksum of the data index area of that storage block; for a first storage block whose data index information passes verification, obtains the checksum of each piece of data stored in the first storage block from the data index information of the first storage block; verifies each piece of data in the first storage block according to the obtained checksums; and repairs the data in the first storage block that fails verification.
  • the first storage block refers to any storage block that passes the data index information verification, that is, the storage blocks that pass the data index information verification can all be referred to as the first storage block. Conversely, the storage block that fails the verification of the data index information can be called the second storage block.
  • each storage block may include a user data area, a data index area, and a tail area.
  • the data requested by the client is user data stored in the user data area, that is, the target data in this application is user data.
  • the data index area stores the mapping information corresponding to each piece of user data.
  • the mapping information includes the checksum of the corresponding user data, the offset of the corresponding data in the storage block, and so on.
  • the checksum of the data index area is stored in the tail area.
  • Figure 3 shows a schematic diagram of data layout in a hard disk.
  • the hard disk may include multiple Blocks, that is, multiple storage blocks.
  • Taking Block1 as an example, multiple pieces of user data such as data0 and data1 can be stored in Block1, and the space occupied by these pieces of data can be called the user data region.
  • Each piece of user data corresponds to mapping information. As shown in Figure 3, the mapping information corresponding to data0 is data0 ref, and the mapping information corresponding to data1 is data1 ref.
  • the mapping information may include the offset and checksum of the user data.
  • the space occupied by the mapping information corresponding to multiple pieces of user data may be referred to as a data reference region.
  • the checksum of the data index area is stored in the last sector of Block1, that is, the tail area.
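  • The block layout of Figure 3 can be modeled in memory as below. This is a sketch under assumptions: CRC-32 is the checksum, each index entry also carries a length, and the index is serialized as fixed 12-byte records; none of these encoding details come from this application.

```python
import zlib
from dataclasses import dataclass

@dataclass
class IndexEntry:          # "dataN ref" in Figure 3
    offset: int            # offset of the user data in the block
    length: int            # assumed field, not named in the text
    checksum: int          # checksum of the user data

@dataclass
class StorageBlock:
    user_data: bytes           # user data area
    index: list[IndexEntry]    # data index area
    index_checksum: int        # stored in the tail area

def encode_index(index: list[IndexEntry]) -> bytes:
    """Serialize the data index area (hypothetical 12-byte records)."""
    return b"".join(
        e.offset.to_bytes(4, "little")
        + e.length.to_bytes(4, "little")
        + e.checksum.to_bytes(4, "little")
        for e in index
    )

def build_block(pieces: list[bytes]) -> StorageBlock:
    """Lay out user data, build its index, and checksum the index."""
    index, offset = [], 0
    for piece in pieces:
        index.append(IndexEntry(offset, len(piece), zlib.crc32(piece)))
        offset += len(piece)
    return StorageBlock(b"".join(pieces), index,
                        zlib.crc32(encode_index(index)))
```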
  • For each storage block, the processor can obtain the checksum of the data index area from the tail area of the storage block, that is, the reference checksum of the data index area. At the same time, the processor can read the data index information in the data index area and calculate the actual checksum of the data index area from it.
  • The actual checksum is determined from the data index information in the data index area using the same method as the reference checksum stored in the tail area. If the data index information stored in the data index area is not damaged, the actual checksum of the data index area will be the same as the checksum stored in the tail area; if the data index information is damaged, the actual checksum will differ from the checksum stored in the tail area. Based on this, after the processor calculates the actual checksum of the data index area, it can compare the actual checksum with the checksum of the data index area obtained from the tail area.
  • If the two are the same, the processor can determine that the data index information in the data index area is not damaged, that is, the verification of the data index area passes.
  • In this case, the storage block is a first storage block, that is, a storage block that passes the aforementioned data index information verification.
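  • The index-area check that separates first storage blocks from second storage blocks can be sketched as follows, assuming CRC-32 as the checksum and representing each block as a pair of raw index bytes and the tail-area checksum (both assumptions for illustration).

```python
import zlib

def classify_blocks(blocks: list[tuple[bytes, int]]) -> tuple[list[int], list[int]]:
    """Split block numbers into (first, second) storage blocks.

    Each block is (index_bytes, tail_checksum). A block whose data
    index area matches the tail checksum is a "first" storage block;
    otherwise it is a "second" storage block.
    """
    first, second = [], []
    for number, (index_bytes, tail_checksum) in enumerate(blocks):
        if zlib.crc32(index_bytes) == tail_checksum:
            first.append(number)
        else:
            second.append(number)
    return first, second
```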
  • The processor can use the checksum of each piece of data stored in the data index area to verify each piece of data in the user data area.
  • The processor can first read the first piece of data in the user data area on the storage block, and read the checksum of the first piece of data from the data index area. After that, the processor can calculate a checksum from the first piece of data and compare the calculated checksum with the checksum read from the data index area. If the two are the same, the first piece of data is not damaged, that is, the verification of the piece of data passes. At this time, the processor can continue to verify the next piece of data in the user data area to determine whether it is damaged. If the two checksums are not the same, the first piece of data has been damaged, that is, the verification of this piece of data has failed.
  • In that case, the processor can mark the first piece of data and then continue to verify the next piece of data. In this way, the processor can repair the marked data after verifying all the data in the user data area.
  • The processor may also repair a piece of data as soon as it determines that the piece of data fails the verification.
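The "mark first, repair later" variant of the per-piece check might look like the following sketch. `IndexEntry`, its `length` field, `find_damaged_pieces`, and the use of CRC32 are assumptions made for illustration; the text only states that the index holds an offset and a checksum per piece.

```python
import zlib
from dataclasses import dataclass
from typing import List


@dataclass
class IndexEntry:
    offset: int    # start of the piece within the user data area
    length: int    # assumed field, needed here to slice the piece
    checksum: int  # checksum recorded in the data index area


def find_damaged_pieces(user_area: bytes, index: List[IndexEntry]) -> List[int]:
    """Verify every piece and return the positions of the pieces that fail."""
    damaged = []
    for position, entry in enumerate(index):
        piece = user_area[entry.offset:entry.offset + entry.length]
        if zlib.crc32(piece) != entry.checksum:
            damaged.append(position)  # mark the piece and keep checking the rest
    return damaged
```

The returned list plays the role of the marks: once the whole user data area has been checked, only the listed pieces need to be repaired.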
  • When the processor repairs a piece of data, it can read data information related to the piece of data from hard disks other than the target hard disk, and calculate and restore the piece of data based on the read data information. After recovering the data, the processor can store the data in another hard disk or write it to another storage block of the target hard disk. At the same time, the processor can delete the piece of data stored in the user data area of the current storage block, that is, release the space occupied by the piece of data in the current storage block; subsequently, the processor may also write new data in that space.
  • When the processor finds the first piece of data in the storage block that fails the verification, it can directly read, from hard disks other than the target hard disk, the data information related to the data stored on the storage block, and then directly calculate and restore all data stored on the storage block according to the obtained data information. After recovering all the data, the processor can store the recovered data in another hard disk or another storage block of the target hard disk, and delete all the data in the storage block to release the storage block.
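The text does not say how the data is calculated from the information on the other hard disks. As one common possibility, the sketch below assumes a single-parity (RAID 5 style) stripe in which a lost strip equals the XOR of the surviving data strips and the parity strip; `xor_bytes` and `rebuild_strip` are hypothetical names.

```python
from functools import reduce
from typing import Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized strips."""
    return bytes(x ^ y for x, y in zip(a, b))


def rebuild_strip(surviving_strips: Sequence[bytes]) -> bytes:
    """Rebuild the missing strip from the surviving data strips plus the parity strip."""
    if not surviving_strips:
        raise ValueError("at least one surviving strip is required")
    length = len(surviving_strips[0])
    if any(len(strip) != length for strip in surviving_strips):
        raise ValueError("all strips must have the same length")
    return reduce(xor_bytes, surviving_strips)
```

Under this assumption, repairing one piece requires reading one strip from every other disk in the stripe, which is why limiting the repair to damaged pieces saves bandwidth.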
  • The processor may continue to verify the next storage block.
  • If the processor compares the actual checksum of the data index area with the reference checksum of the data index area stored in the tail area of the storage block and finds that the two are not the same, it indicates that the data index information stored in the data index area of the storage block is damaged.
  • In this case, the processor can directly read, from other hard disks, the data information related to the data stored on the storage block, and then directly calculate and restore all data stored on the storage block based on the obtained data information. After recovering all the data, the processor can store the recovered data in another hard disk or another storage block of the target hard disk, and delete all the data in the storage block to release the storage block. After that, the processor can continue to verify the next storage block.
  • The processor can implement the above-mentioned data repair on the target hard disk by establishing a background task.
  • The processor may divide the background task into multiple task fragments and process the multiple task fragments concurrently, so as to improve the speed of data repair.
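Splitting the background task into fragments and processing them concurrently can be sketched with a thread pool. The fragment size, the worker count, and the `repair_block` callback are assumptions for illustration; the text does not specify how fragments are formed or scheduled.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def split_into_fragments(block_ids: List[int], fragment_size: int) -> List[List[int]]:
    """Cut the list of storage blocks into fixed-size task fragments."""
    return [block_ids[i:i + fragment_size] for i in range(0, len(block_ids), fragment_size)]


def run_repair_task(block_ids: Iterable[int],
                    repair_block: Callable[[int], object],
                    fragment_size: int = 4,
                    workers: int = 4) -> list:
    """Verify and repair all blocks, one fragment per worker at a time."""
    fragments = split_into_fragments(list(block_ids), fragment_size)

    def process(fragment: List[int]) -> list:
        return [repair_block(block_id) for block_id in fragment]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the submitted fragments.
        results = list(pool.map(process, fragments))
    return [result for fragment_result in results for result in fragment_result]
```

Capping `workers` is one way to keep the repair from competing too aggressively with normal I/O for disk bandwidth.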
  • After the data repair is completed, the processor can modify the hard disk state of the target hard disk from a risk state to a safe state.
  • The processor can verify each storage block in the target hard disk one by one by creating a background task, and repair the data of any storage block that fails the verification. In this way, because little data in the target hard disk is damaged or lost, only a small amount of data needs to be repaired, which reduces the consumption of the processor's processing resources. At the same time, because less data needs to be restored, correspondingly less related data needs to be read from other hard disks. Therefore, when the total hard disk bandwidth is fixed, the background data-repair task can be effectively prevented from competing with normal I/O for bandwidth, which reduces the performance fluctuation of normal I/O.
  • The processor can write the data to be written to the target hard disk according to the write request. That is, in the embodiment of the present application, a hard disk whose hard disk state is the risk state can still be used.
  • An embodiment of the present application provides a data reading device 400, and the device 400 includes:
  • The setting module 401 is used to execute step 201 in the above embodiment. The setting module 401 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The receiving module 402 is configured to execute step 202 in the above embodiment. The receiving module 402 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The reading module 403 is used to execute step 203 in the above embodiment. The reading module 403 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The verification module 404 is configured to execute step 204 in the above embodiment. The verification module 404 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The sending module 405 is used to execute step 205 in the above embodiment. The sending module 405 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The failure information is a software failure type identification.
  • The software failure type identification is used to indicate that the failure of the target hard disk is a software failure.
  • The verification module is specifically used for:
  • The apparatus 400 further includes:
  • The repair module 406 is used to repair the data stored on the target hard disk.
  • The modification module 407 is used to modify the hard disk state of the target hard disk to a safe state after the data repair is completed.
  • The repair module 406 is specifically configured to:
  • The data index area refers to the area where data index information is stored in the corresponding storage block.
  • The data index information includes the checksum of each piece of data stored in the corresponding storage block.
  • The repair module 406 is specifically used to:
  • The repair module 406 and the modification module 407 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The hard disk state of the target hard disk may be set to the risk state.
  • The target data stored on the target hard disk is read, the target data is verified, and if the verification passes, the target data is returned to the client. That is, in the embodiment of the present application, for a hard disk in a risk state, when a read request for this type of hard disk is received, the corresponding data can be read according to the read request, and the data can be verified. If the verification passes, the read data can be returned. In the related technology, by contrast, the data is restored directly by reading the data on other hard disks no matter what failure the hard disk has; compared with that approach, this shortens the data reading delay and also reduces the consumption of system resources.
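The read path for a hard disk in the risk state can be sketched as follows. The three callables, the CRC32 checksum, and the fallback to recovery from other hard disks when verification fails are assumptions for illustration; this passage only describes the case in which verification passes.

```python
import zlib
from typing import Callable


def read_from_risk_disk(read_from_target: Callable[[], bytes],
                        expected_checksum: int,
                        recover_from_peers: Callable[[], bytes]) -> bytes:
    """Serve a read request aimed at a hard disk whose state is the risk state."""
    data = read_from_target()
    if zlib.crc32(data) == expected_checksum:
        # Verification passed: return the local copy without touching other disks.
        return data
    # Assumed fallback: rebuild the data from the other hard disks.
    return recover_from_peers()
```

The latency benefit comes from the first branch: most reads are served with one local read and one checksum, and the slower multi-disk recovery runs only when the local copy is actually damaged.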
  • When the data reading device provided in the above embodiment reads data, the division into the above functional modules is used only as an example for illustration. In practical applications, the above functions can be allocated to different functional modules as needed.
  • The data reading device provided in the above embodiment and the data reading method embodiment belong to the same concept. For the specific implementation process, refer to the method embodiment; details are not repeated here.
  • The computer program product includes one or more computer instructions.
  • The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • The computer instructions may be transmitted from a website, computer, server, or data center.
  • The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or tape), an optical medium (for example, a Digital Versatile Disc (DVD)), or a semiconductor medium (for example, a Solid State Disk (SSD)), and so on.
  • The program can be stored in a computer-readable storage medium.
  • The storage medium mentioned can be a read-only memory, a magnetic disk, an optical disk, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

The present invention relates to a data reading method and apparatus, pertaining to the technical field of storage. In the present invention, for a hard disk in a RAID whose corresponding hard disk state is a risk state, when a read request for this type of hard disk is received, the corresponding data stored on the hard disk can be read based on the read request and the data can be verified; if the verification passes, the read data can be returned. Thus, compared with the prior-art method of directly performing recovery by reading the data on the other hard disks regardless of the hard disk failure, the data reading delay is shortened and the consumption of system resources is reduced.
PCT/CN2020/113420 2019-09-06 2020-09-04 Data reading method and apparatus WO2021043246A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910841114.1 2019-09-06
CN201910841114.1A CN112463019A (zh) 2019-09-06 2019-09-06 Data reading method and apparatus

Publications (1)

Publication Number Publication Date
WO2021043246A1 true WO2021043246A1 (fr) 2021-03-11

Family

ID=74806893

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/113420 WO2021043246A1 (fr) 2019-09-06 2020-09-04 Data reading method and apparatus

Country Status (2)

Country Link
CN (1) CN112463019A (fr)
WO (1) WO2021043246A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115994236B (zh) * 2023-03-23 2023-08-04 杭州派迩信息技术有限公司 Collaborative processing method and system for aviation data

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102147713A (zh) * 2011-02-18 2011-08-10 杭州宏杉科技有限公司 Management method and apparatus for a network storage system
CN103970481A (zh) * 2013-01-29 2014-08-06 国际商业机器公司 Method and apparatus for rebuilding a storage array
CN105224891A (zh) * 2015-09-22 2016-01-06 苏州互盟信息存储技术有限公司 Method, system, and apparatus for secure data storage integrating magnetic disks and optical discs
CN105808161A (zh) * 2016-02-26 2016-07-27 四川效率源信息安全技术股份有限公司 Method for reading data from bad sectors of a hard disk
US20170358346A1 (en) * 2016-06-13 2017-12-14 SK Hynix Inc. Read threshold optimization in flash memories
US9891994B1 (en) * 2015-12-30 2018-02-13 EMC IP Holding Company LLC Updated raid 6 implementation
CN109582515A (zh) * 2018-12-03 2019-04-05 郑州云海信息技术有限公司 Hard disk detection method and system, electronic device, and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509156B (zh) * 2018-04-04 2021-06-11 腾讯科技(深圳)有限公司 Data reading method, apparatus, device, and system

Also Published As

Publication number Publication date
CN112463019A (zh) 2021-03-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20860691

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20860691

Country of ref document: EP

Kind code of ref document: A1