WO2021043246A1 - Data reading method and apparatus - Google Patents

Data reading method and apparatus

Info

Publication number
WO2021043246A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
hard disk
target
storage block
checksum
Prior art date
Application number
PCT/CN2020/113420
Other languages
French (fr)
Chinese (zh)
Inventor
张瑛
熊伟
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021043246A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0727 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a storage system, e.g. in a DASD or network based storage system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance

Definitions

  • This application relates to the field of data storage, and in particular to a data reading method and device.
  • A redundant array of independent disks (RAID) is a technology that implements data reading and writing across multiple hard disks. Depending on how it is implemented, RAID can be divided into hard RAID and soft RAID. Hard RAID implements the RAID functions, including data reading and writing, in hardware, whereas soft RAID implements them through the operating system and the CPU.
  • In the related art, when a hard disk in a RAID fails, the hard disk may send a failure signal to the processor.
  • On receiving the failure signal, the processor may mark the hard disk as a failed disk.
  • When a read request for the failed disk is subsequently received, the processor can read the data on the hard disks other than the failed disk, recover the data of the failed disk from the data read, and return the recovered data to the client.
  • Because the processor must read the data on the other hard disks to recover the data on the failed disk each time it receives a read request for the failed disk, data reads incur a large delay and consume considerable resources.
  • This application provides a data reading method and device that can address the large read delay and high resource consumption that arise in the related art, where the processor serves a read of a failed disk by reading data on other hard disks and recovering the failed disk's data. The technical solutions are as follows:
  • In a first aspect, a data reading method is provided. The method includes: on receiving failure information sent by a target hard disk to indicate that the target hard disk is in a risk state, setting the hard disk status of the target hard disk to the risk state; and receiving a read request, sent by a client, to read the target hard disk.
  • The target hard disk is any hard disk in the redundant array of independent disks whose hard disk status is the risk state, and the risk state indicates that the hard disk has a software failure.
  • A read request sent by the client to read the target hard disk is received.
  • According to the read request, the target data stored on the target hard disk is read and verified, and if the verification passes, the target data is returned to the client.
  • In the embodiments of this application, the target hard disk is a hard disk in the RAID whose hard disk status is the risk state, and the risk state means that the hard disk has a software failure. That is, after receiving failure information sent by a hard disk to indicate that the hard disk is in a risk state, the processor may set the hard disk state of that hard disk to the risk state.
  • When a read request for such a hard disk is subsequently received, the corresponding data can be read according to the read request and verified. If the verification passes, the data read can be returned.
  • Compared with the related art, in which data is always recovered by reading the data on other hard disks whatever the type of failure, this shortens the data read delay and reduces the consumption of system resources.
  • The failure information is a software failure type identifier, which indicates that the failure of the target hard disk is a software failure.
  • In this application, a hard disk that has failed may feed back to the operating system a fault type identifier indicating the type of the fault.
  • The operating system can then identify a hard disk with a software failure from the fault type identifier and respond to the software failure accordingly.
  • Data on such a hard disk can be read using the method provided in this application.
  • In the related art, by contrast, a failed hard disk feeds back only a failure signal, so the operating system cannot identify what type of failure has occurred. It can therefore only handle the hard disk as if it had a hardware failure, that is, recover the data directly.
  • The target data may be verified as follows: obtain the stored reference checksum of the target data; calculate the actual checksum of the target data from the target data; and, if the reference checksum is the same as the actual checksum, determine that the verification of the target data passes.
  • The method further includes: repairing the data stored on the target hard disk; and, after the data repair is complete, changing the hard disk status of the target hard disk to the safe state.
  • While reading the target data stored on the target hard disk according to the read request, the operating system can create a background task to repair the data stored on the target hard disk.
  • After the repair is complete, the hard disk status is changed to the safe state.
  • The data stored on the target hard disk may be repaired as follows: obtain the checksum of the data index area of each storage block in the target hard disk, where the data index area is the area of the storage block that stores data index information, and the data index information includes the checksum of each piece of data stored in that storage block; verify the data index information on each storage block according to the checksum of its data index area; for a first storage block whose data index information passes verification, obtain the checksum of each piece of data stored in the first storage block from its data index information; verify each piece of data in the first storage block according to the obtained checksums; and repair the data in the first storage block that fails verification.
  • For a storage block whose data index area passes verification, the checksums stored in the data index area can be used to verify each piece of data in the storage block, and only the data that fails verification is then restored. This reduces the amount of data to be restored.
  • For a storage block whose data index information fails verification, the storage block can be rebuilt directly to restore all the data on it.
  • In a second aspect, a data reading device is provided, which has the function of implementing the behavior of the data reading method in the first aspect.
  • The data reading device includes at least one module, which is used to implement the data reading method provided in the first aspect.
  • In a third aspect, a data reading device is provided. Its structure includes a processor and a memory, and the memory is used to store a program that supports the data reading device in performing the data reading method provided in the first aspect.
  • The processor is configured to execute the program stored in the memory.
  • The device may further include a communication bus, which is used to establish a connection between the processor and the memory.
  • In a fourth aspect, a computer-readable storage medium is provided. It stores instructions that, when run on a computer, cause the computer to execute the data reading method described in the first aspect.
  • In a fifth aspect, a computer program product containing instructions is provided. When run on a computer, it causes the computer to execute the data reading method described in the first aspect.
  • In the embodiments of this application, the hard disk state of the target hard disk may be set to the risk state. Subsequently, when a read request for the target hard disk is received, the corresponding data can be read according to the read request and verified. If the verification passes, the data read can be returned. Compared with the related art, in which data is always recovered by reading the data on other hard disks whatever failure the hard disk has, this shortens the data read delay and reduces the consumption of system resources.
  • FIG. 1 is a system architecture diagram involved in a data reading method provided by an embodiment of the present application
  • FIG. 2 is a flowchart of a data reading method provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of data distribution in a storage block provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a data reading device provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of another data reading device provided by an embodiment of the present application.
  • FIG. 1 is an architecture diagram of a storage system involved in a data reading method provided by an embodiment of the present application. As shown in FIG. 1, the system includes a client 01 and a storage device 02, which can communicate with each other.
  • The client 01 can send a read request or a write request to the storage device 02.
  • The storage device 02 may include a processor 021, a memory 022, and a hard disk 023.
  • The processor 021 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits used to control the execution of the programs of this application.
  • The storage device may include multiple processors 021.
  • Each of these processors 021 may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
  • The memory 022 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions; a random access memory (RAM) or another type of dynamic storage device that can store information and instructions; an electrically erasable programmable read-only memory (EEPROM); a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and so on); a magnetic disk storage medium or other magnetic storage device; or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but it is not limited to these.
  • The memory 022 may exist independently and be connected to the processor 021.
  • The memory 022 may also be integrated with the processor 021.
  • The storage device 02 can be a storage array or a server.
  • When it is a storage array, the storage device 02 includes a controller and several hard disks.
  • The processor 021 and the memory 022 may be located in the controller of the storage array, and the controller is connected to the hard disks through a back-end interface card.
  • When it is a server, the processor 021, the memory 022, and the hard disks are all located inside the server.
  • FIG. 1 is only a schematic diagram of some of the components included in the device.
  • In practical applications, RAID technology such as RAID5, RAID6, or RAIDTP is often used to store data.
  • Take RAID5 as an example.
  • The space of each hard disk is divided into multiple storage blocks of the same size.
  • One storage block is taken from each of several different hard disks to form a storage block group.
  • The number of storage blocks in a group is determined by the RAID type. Taking "4+1" RAID5 as an example, a storage block group consists of 4 data blocks and 1 parity block, so 5 storage blocks are required.
  • When the storage device 02 stores data, the data can be split into 4 data fragments, the check data of these 4 data fragments is calculated to generate a check fragment, and the 4 data fragments and the check fragment are then stored in the storage blocks of the storage block group.
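  • The "4+1" split can be sketched as follows. This is a minimal illustration, not the method claimed here: it assumes the check fragment is the byte-wise XOR of the four data fragments (the usual RAID5 parity, which this text does not spell out), and all function names are hypothetical.

```python
from functools import reduce
from typing import List


def split_into_fragments(data: bytes, n: int = 4) -> List[bytes]:
    """Split data into n equal-size fragments, zero-padding the tail."""
    frag_len = -(-len(data) // n)  # ceiling division
    padded = data.ljust(frag_len * n, b"\x00")
    return [padded[i * frag_len:(i + 1) * frag_len] for i in range(n)]


def xor_parity(fragments: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized fragments (assumed parity scheme)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*fragments))


def rebuild_fragment(surviving: List[bytes], parity: bytes) -> bytes:
    """Recover the one missing data fragment from the other three plus parity."""
    return xor_parity(surviving + [parity])


fragments = split_into_fragments(b"ABCDEFGHIJKLMNOP")
parity = xor_parity(fragments)
# Pretend the third fragment is unreadable and rebuild it from the rest.
recovered = rebuild_fragment(fragments[:2] + fragments[3:], parity)
```

  • Rebuilding a fragment requires reading every other block in the group, which is the cost the method in this application tries to avoid on the common read path.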
  • The storage block group corresponds to a segment of logical addresses, which are the logical addresses of the data.
  • The storage device 02 stores the mapping between the logical address and the physical address where the data is actually stored.
  • When the client 01 sends a read request to the storage device 02, the read request includes the logical address of the data to be read, and the processor 021 can run the operating system to determine, from that logical address, the storage block where the data to be read is located and its location (physical address) within the storage block. Once these are determined, the data can be read from the storage block.
  • The storage device 02 may also include a communication bus and a communication interface (not shown in FIG. 1). The communication bus is used to transfer information between the components included in the storage device 02.
  • The communication interface is used to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
  • Fig. 2 is a flowchart of a data reading method provided by an embodiment of the present application.
  • The method may be executed by the processor deployed in the storage device 02 shown in FIG. 1. Referring to FIG. 2, the method includes the following steps:
  • Step 201: When failure information sent by the target hard disk to indicate that the target hard disk is in a risk state is received, the hard disk state of the target hard disk is set to the risk state.
  • The risk state indicates that a small amount of data on the target hard disk is damaged or lost, that data may still be written to the target hard disk, and that the remaining data, which is neither damaged nor lost, may still be read.
  • The failure information used to indicate that the target hard disk is in a risk state may be information indicating that data on the target hard disk is damaged or lost but the target hard disk is still usable.
  • The processor may send a fault detection instruction to each hard disk in the storage device at predetermined intervals. After receiving the fault detection instruction, each hard disk can check whether it has a read/write abnormality or whether any of its data is damaged or lost. If so, the hard disk can determine the type of the fault that has occurred and send the processor a fault type identifier identifying that fault type.
  • Alternatively, the processor may send a fault detection instruction to a hard disk when it continuously receives I/O abnormal error codes from that hard disk, so as to query the type of the fault. That is, if the processor, while processing read or write requests for a hard disk, receives an I/O abnormal error code from the hard disk n times in a row, the hard disk may be malfunctioning, and the processor may send a fault detection instruction to it to obtain its fault type.
  • Here, n can be a preset value.
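  • The "n consecutive I/O errors" trigger can be sketched as a per-disk counter. The class and method names below are hypothetical, and resetting the counter after a trigger is an assumption; the text only states that n consecutive error codes prompt a fault detection instruction.

```python
from typing import Dict


class IoErrorTracker:
    """Counts consecutive I/O abnormal error codes per hard disk (illustrative)."""

    def __init__(self, n: int) -> None:
        self.n = n  # preset threshold from the text
        self.consecutive: Dict[str, int] = {}

    def record(self, disk_id: str, ok: bool) -> bool:
        """Record one I/O result; return True when a fault query should be sent."""
        if ok:
            self.consecutive[disk_id] = 0  # a success breaks the run
            return False
        count = self.consecutive.get(disk_id, 0) + 1
        if count >= self.n:
            self.consecutive[disk_id] = 0  # assumption: start counting afresh
            return True
        self.consecutive[disk_id] = count
        return False


tracker = IoErrorTracker(n=3)
results = [tracker.record("disk0", ok) for ok in (False, False, True, False, False, False)]
```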
  • The fault types of a hard disk may include software failures and hardware failures.
  • Correspondingly, the fault type identifier may be a software failure type identifier or a hardware failure type identifier.
  • The software failure type identifier identifies the failure of the hard disk as a software failure, and the hardware failure type identifier identifies the failure of the hard disk as a hardware failure.
  • A hardware failure refers to a failure of the hardware device.
  • A software failure refers to data damage or loss caused by a software abnormality. Note that when a hard disk has a software failure, only a small part of the data on the hard disk is damaged or lost, but the location of the damaged or lost data cannot be determined.
  • In this case, data can still be written to the hard disk, and the data that is neither damaged nor lost can still be read.
  • After a hard disk fails, it can report the software failure type identifier to the processor if it determines that the failure is a software failure, and the hardware failure type identifier if the failure is a hardware failure.
  • Each hard disk may also actively report the fault type identifier to the processor when it detects a fault.
  • When the processor receives a software failure type identifier from a hard disk, it may set the hard disk state of that hard disk to the risk state.
  • When the processor receives a hardware failure type identifier, it may directly set the hard disk state of that hard disk to the failed state, which indicates that subsequent reading and writing of data on the hard disk is prohibited.
  • The memory of the storage device may store the correspondence between the hard disk identifier and the status information of each of the multiple hard disks.
  • The status information may be the safe state, the risk state, or the failed state.
  • The safe state indicates that the data stored on the hard disk is not damaged or lost and that the hard disk is not currently malfunctioning.
  • The risk state indicates that a small part of the data stored on the hard disk is damaged or lost and that the hard disk has a software failure; the processor can continue to write data to the hard disk, and data can be read from the hard disk using the subsequent steps of this embodiment.
  • The failed state indicates that the hard disk has a hardware failure and is currently unavailable. When a read or write request for the hard disk is subsequently received, reading and writing data on the hard disk is prohibited.
  • Based on this, after the processor receives the software failure type identifier reported by the target hard disk, it can set the status information corresponding to the hard disk identifier of the target hard disk in the above correspondence to the risk state, to indicate that the target hard disk has a software failure and that a small amount of its data has been damaged or lost.
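  • The correspondence between hard disk identifiers and states can be sketched as a small table with three states. The enum and function names are hypothetical; only the three states and the two fault types come from the text.

```python
from enum import Enum
from typing import Dict


class DiskState(Enum):
    SAFE = "safe"      # no damaged or lost data, no fault
    RISK = "risk"      # software failure: a small amount of data damaged or lost
    FAILED = "failed"  # hardware failure: reads and writes prohibited


class FaultType(Enum):
    SOFTWARE = "software"
    HARDWARE = "hardware"


def on_fault_report(states: Dict[str, DiskState], disk_id: str, fault: FaultType) -> None:
    """Update the state table when a hard disk reports a fault type identifier."""
    states[disk_id] = DiskState.RISK if fault is FaultType.SOFTWARE else DiskState.FAILED


def on_repair_complete(states: Dict[str, DiskState], disk_id: str) -> None:
    """After the background repair finishes, a risk-state disk becomes safe again."""
    if states[disk_id] is DiskState.RISK:
        states[disk_id] = DiskState.SAFE


def io_allowed(states: Dict[str, DiskState], disk_id: str) -> bool:
    """Reads and writes are prohibited only in the failed state."""
    return states[disk_id] is not DiskState.FAILED


states = {"disk0": DiskState.SAFE, "disk1": DiskState.SAFE}
on_fault_report(states, "disk0", FaultType.SOFTWARE)
on_fault_report(states, "disk1", FaultType.HARDWARE)
```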
  • Step 202: Receive a read request sent by the client, where the read request is used to read data on the target hard disk in the risk state.
  • When the data that the client wants to read is stored on the target hard disk, the client may send the processor a read request carrying the logical address of the data to be read.
  • The processor may receive the read request sent by the client and determine, from the logical address of the data to be read carried in the read request, that the hard disk to be read is the target hard disk. That is, the read request is a request to read data on the target hard disk.
  • Step 203: According to the read request, read the target data stored on the target hard disk.
  • After receiving the read request, the processor can read the target data stored on the target hard disk according to the logical address carried in the read request.
  • As described above, the hard disk space can be divided into multiple storage blocks (that is, blocks) of the same size.
  • Each storage block may correspond to a segment of logical addresses and may store multiple pieces of data, and the memory of the storage device may store the mapping between the physical address and the logical address of the data.
  • The processor can therefore determine, from the logical address carried in the read request, the physical address at which the target data is stored, that is, the target storage block to be read, and then read the stored target data from the target storage block.
  • Step 204: Verify the target data.
  • Because the target hard disk is a hard disk in the risk state, a small amount of the data stored on it may be damaged or lost.
  • When the processor reads the target data according to the logical address in the read request, it cannot tell whether the target data has been damaged. The processor can therefore verify the target data.
  • The processor may obtain the stored reference checksum of the target data and calculate the actual checksum from the target data. If the reference checksum is the same as the actual checksum, the verification of the target data passes.
  • The storage device may store metadata for the data stored on each of the multiple hard disks.
  • The metadata includes the storage address and the checksum of each piece of data.
  • The processor may find, in the stored metadata, the metadata containing the logical address carried in the read request and obtain the checksum of the target data from it. The checksum obtained is the reference checksum of the correct target data originally stored in the space indicated by the logical address carried in the read request.
  • The processor may also calculate the actual checksum of the target data from the target data it has read.
  • The actual checksum is calculated in the same way as the reference checksum in the stored metadata.
  • After obtaining the reference checksum and the actual checksum, the processor can compare the two. If they are the same, the target data read is correct and undamaged, and the verification passes. If they differ, the target data read is corrupted, and the verification fails.
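  • The comparison of reference and actual checksums can be sketched as below. The text does not name a checksum algorithm, so CRC-32 from Python's `zlib` is used purely as a stand-in; the `metadata` dictionary keyed by logical address is likewise a hypothetical simplification of the stored metadata.

```python
import zlib
from typing import Dict


def compute_checksum(data: bytes) -> int:
    """Stand-in checksum; the same function must produce the reference value."""
    return zlib.crc32(data)


def verify_target_data(data: bytes, reference_checksum: int) -> bool:
    """Verification passes only if the actual checksum equals the reference."""
    return compute_checksum(data) == reference_checksum


# Hypothetical metadata: logical address -> reference checksum recorded at write time.
metadata: Dict[int, int] = {0x1000: compute_checksum(b"hello")}

ok = verify_target_data(b"hello", metadata[0x1000])       # data read back intact
damaged = verify_target_data(b"hellp", metadata[0x1000])  # one byte corrupted
```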
  • Step 205: If the verification of the target data passes, the target data is sent to the client.
  • If the target data passes verification, it is not damaged, and the processor can return it directly to the client.
  • If the target data fails verification, the processor can recover it by reading data on the hard disks in the RAID other than the target hard disk.
  • In the embodiments of this application, the hard disk state of the target hard disk may be set to the risk state.
  • Subsequently, when a read request for the target hard disk is received, the target data stored on the target hard disk is read and verified, and if the verification passes, the target data is returned to the client. That is, for a hard disk in the risk state, when a read request for such a hard disk is received, the corresponding data can be read according to the read request and verified, and if the verification passes, the data read can be returned. Compared with the related art, in which data is always recovered by reading the data on other hard disks whatever failure the hard disk has, this shortens the data read delay and reduces the consumption of system resources.
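  • Steps 203 to 205 can be put together as a read path with a fast case and a fallback. This is a sketch under stated assumptions: CRC-32 stands in for the unspecified checksum, and `read_from_target` and `recover_from_peers` are hypothetical callables representing a direct read of the target hard disk and a rebuild from the other hard disks in the RAID.

```python
import zlib
from typing import Callable, Dict


def read_target_data(
    logical_addr: int,
    read_from_target: Callable[[int], bytes],
    reference_checksums: Dict[int, int],
    recover_from_peers: Callable[[int], bytes],
) -> bytes:
    """Serve a read for a risk-state disk: verify first, rebuild only on failure."""
    data = read_from_target(logical_addr)
    if zlib.crc32(data) == reference_checksums[logical_addr]:
        return data  # fast path: a single read, no access to the other disks
    return recover_from_peers(logical_addr)  # slow path: same as the related art


good = b"payload"
refs = {7: zlib.crc32(good)}
fast = read_target_data(7, lambda addr: good, refs, lambda addr: b"unused")
slow = read_target_data(7, lambda addr: b"pay1oad", refs, lambda addr: good)
```

  • Because only a small part of the data on a risk-state disk is damaged, most reads are expected to take the fast path, which is where the latency saving comes from.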
  • After the processor receives the read request for the target hard disk, it assumes that there is damaged data on the target hard disk. The processor can therefore also create a background task to repair the data stored on the target hard disk.
  • The space of the target hard disk can be divided into multiple storage blocks.
  • The processor may verify each storage block of the target hard disk in sequence, starting from the first storage block, and then repair the storage blocks that fail verification.
  • The processor may obtain the checksum of the data index area of each storage block in the target hard disk, where the data index area is the area of the storage block that stores data index information, and the data index information includes the checksum of each piece of data stored in the storage block; verify the data index information on each storage block according to the checksum of its data index area; for a first storage block whose data index information passes verification, obtain the checksum of each piece of data stored in the first storage block from its data index information; verify each piece of data in the first storage block according to the obtained checksums; and repair the data in the first storage block that fails verification.
  • The first storage block refers to any storage block that passes the data index information verification; that is, every storage block that passes this verification can be called a first storage block. Conversely, a storage block that fails the data index information verification can be called a second storage block.
  • Each storage block may include a user data area, a data index area, and a tail area.
  • The data requested by the client is user data stored in the user data area; that is, the target data in this application is user data.
  • The data index area stores the mapping information corresponding to each piece of user data.
  • The mapping information includes the checksum of the corresponding user data, the offset of the corresponding data in the storage block, and so on.
  • The checksum of the data index area is stored in the tail area.
  • FIG. 3 shows a schematic diagram of the data layout in a hard disk.
  • The hard disk may include multiple Blocks, that is, multiple storage blocks.
  • Taking Block1 as an example, multiple pieces of user data such as data0 and data1 can be stored in Block1, and the space they occupy can be called the user data region.
  • Each piece of user data corresponds to mapping information. As shown in FIG. 3, the mapping information corresponding to data0 is data0 ref, and the mapping information corresponding to data1 is data1 ref.
  • The mapping information may include the offset and checksum of the user data.
  • The space occupied by the mapping information corresponding to the pieces of user data may be referred to as the data reference region, that is, the data index area.
  • The checksum of the data index area is stored in the last sector of Block1, that is, the tail area.
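  • The three areas of a storage block can be sketched as byte strings. Everything about the encoding here is an assumption made for illustration: each index entry is packed as three little-endian 32-bit integers (offset, length, checksum), the length field is an addition not mentioned in the text, and CRC-32 stands in for the unspecified checksum.

```python
import struct
import zlib
from typing import List, Tuple

# Hypothetical entry format: offset, length, checksum of one piece of user data.
ENTRY = struct.Struct("<III")


def build_block(pieces: List[bytes]) -> Tuple[bytes, bytes, int]:
    """Return (user data area, data index area, tail checksum of the index area)."""
    index = bytearray()
    offset = 0
    for piece in pieces:
        index += ENTRY.pack(offset, len(piece), zlib.crc32(piece))
        offset += len(piece)
    index_area = bytes(index)
    return b"".join(pieces), index_area, zlib.crc32(index_area)


def parse_index(index_area: bytes) -> List[Tuple[int, int, int]]:
    """Decode the data index area back into (offset, length, checksum) entries."""
    return [ENTRY.unpack_from(index_area, pos)
            for pos in range(0, len(index_area), ENTRY.size)]


user_area, index_area, tail_checksum = build_block([b"data0", b"data1!"])
entries = parse_index(index_area)
```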
  • the processor can obtain the checksum of the data index area from the tail area of the storage block, that is, the data index At the same time, the processor can read the data index information in the data index area. The actual checksum of the data index area is calculated according to the acquired data index information.
  • the actual checksum is determined according to the data index information in the data index area using the same method as the reference checksum stored in the tail area. In this case, if the data index area is stored If the data index information is not damaged, the actual checksum of the data index area will be the same as the checksum of the data index area stored in the tail area. If the data index information stored in the data index area is damaged, the actual checksum will be The checksum will be different from the checksum stored in the tail area. Based on this, in the embodiment of the present application, after the processor calculates the actual checksum of the data index area, it can compare the actual checksum with the checksum of the data index area obtained from the tail area.
  • the processor can determine that the data index information in the data index area is not damaged, that is, the verification of the data index area passes.
  • the storage block is also the first storage block that passes the aforementioned data index information verification.
  • The processor can use the checksum of each piece of data stored in the data index area to verify each piece of data in the user data area.
  • The processor can first read the first piece of data in the user data area of the storage block, and read the checksum of the first piece of data from the data index area. The processor can then calculate a checksum from the first piece of data and compare it with the checksum read from the data index area. If the two are the same, the first piece of data is not damaged, that is, the verification of this piece of data passes, and the processor can continue to verify the next piece of data in the user data area. If the two checksums are not the same, the first piece of data has been damaged, that is, the verification of this piece of data fails.
  • The processor can mark the first piece of data and then continue to verify the next piece of data. In this way, the processor can repair the marked data after verifying all the data in the user data area.
  • The processor may also repair a piece of data each time it determines that the piece of data fails the verification.
  • When the processor repairs a piece of data, it can read data information related to the piece of data from hard disks other than the target hard disk, and calculate and restore the piece of data based on the data information read. After recovering the data, the processor can store it on another hard disk or write it to another storage block of the target hard disk. At the same time, the processor can delete the piece of data from the user data area of the current storage block, that is, release the space occupied by the piece of data in the current storage block. Subsequently, the processor may also write new data into that space.
  • Alternatively, as soon as the processor finds the first piece of data in the storage block that fails the verification, it can directly read, from hard disks other than the target hard disk, the data information related to the data stored on the storage block, and then calculate and restore all data stored on the storage block from the data information obtained. After recovering all the data, the processor can store the recovered data on another hard disk or in another storage block of the target hard disk, and delete all the data in the storage block to release the storage block.
  • The processor may then continue to verify the next storage block.
  • If the processor compares the actual checksum of the data index area with the reference checksum of the data index area stored in the tail area of the storage block and finds that the two are not the same, the data index information stored in the data index area of the storage block is damaged.
  • In this case, the processor can directly read, from other hard disks, the data information related to the data stored on the storage block, and then calculate and restore all data stored on the storage block from the data information obtained. After recovering all the data, the processor can store the recovered data on another hard disk or in another storage block of the target hard disk, and delete all the data in the storage block to release the storage block. After that, the processor can continue to verify the next storage block.
  • The processor can carry out the above data repair on the target hard disk by creating a background task.
  • The processor may divide the background task into multiple task fragments and process them concurrently, so as to speed up data repair.
  • The processor can change the hard disk state of the target hard disk from the risk state to the safe state.
  • The processor can verify each storage block in the target hard disk one by one by creating a background task, and repair the data of the storage blocks that fail the verification. Because little data in the target hard disk is damaged or lost, only a small amount of data needs to be repaired, which reduces the consumption of processing resources. Likewise, because little data needs to be restored, little related data needs to be read from other hard disks. Therefore, when the total hard disk bandwidth is fixed, competition for bandwidth between the background repair task and normal I/O can be effectively avoided, which reduces the performance fluctuation of normal I/O.
  • The processor can write the data to be written to the target hard disk according to the write request. That is, in the embodiment of the present application, a hard disk whose hard disk state is the risk state can still be used.
  • An embodiment of the present application provides a data reading device 400, and the device 400 includes:
  • The setting module 401 is configured to execute step 201 in the above embodiment. The setting module 401 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The receiving module 402 is configured to execute step 202 in the above embodiment. The receiving module 402 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The reading module 403 is configured to execute step 203 in the above embodiment. The reading module 403 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The verification module 404 is configured to execute step 204 in the above embodiment. The verification module 404 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The sending module 405 is configured to execute step 205 in the above embodiment. The sending module 405 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The failure information is a software failure type identifier.
  • The software failure type identifier is used to indicate that the failure of the target hard disk is a software failure.
  • The verification module is specifically configured to:
  • The apparatus 400 further includes:
  • The repair module 406 is configured to repair the data stored on the target hard disk.
  • The modification module 407 is configured to change the hard disk state of the target hard disk whose data repair is complete to a safe state.
  • The repair module 406 is specifically configured to:
  • The data index area refers to the area of the corresponding storage block in which data index information is stored.
  • The data index information includes the checksum of each piece of data stored in the corresponding storage block.
  • The repair module 406 is specifically configured to:
  • The repair module 406 and the modification module 407 may be implemented by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 calling program code in the memory 022.
  • The hard disk state of the target hard disk may be set to the risk state.
  • The target data stored on the target hard disk is read and verified, and if the verification passes, the target data is returned to the client. That is, in the embodiment of the present application, when a read request for a hard disk in the risk state is received, the corresponding data can be read according to the read request and verified, and if the verification passes, the data read can be returned. Compared with the related technology, in which data is always recovered by reading the data on other hard disks regardless of the type of hard disk failure, this shortens the data reading delay and reduces the consumption of system resources.
  • When the data reading device provided in the above embodiment reads data, the division into the above functional modules is used only as an example for illustration. The above functions can be allocated to different functional modules as needed.
  • The data reading device provided in the above embodiment and the data reading method embodiment belong to the same concept. For the specific implementation process, refer to the method embodiment; it is not repeated here.
  • The computer program product includes one or more computer instructions.
  • The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • The computer instructions may be transmitted from a website, computer, server, or data center.
  • The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media.
  • The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or tape), an optical medium (for example, a Digital Versatile Disc (DVD)), or a semiconductor medium (for example, a Solid State Disk (SSD)), etc.
  • The program can be stored in a computer-readable storage medium.
  • The storage medium mentioned can be a read-only memory, a magnetic disk, an optical disk, etc.

Abstract

Disclosed in the present application are a data reading method and apparatus, relating to the technical field of storage. In the present application, for a hard disk in a RAID having a corresponding hard disk state of an at-risk state, when a read request for said type of hard disk is received, the corresponding data stored on the hard disk can be read on the basis of the read request and the data can be verified and, if verification passes, then the read data can be returned; thus, compared to the method in the prior art of directly implementing restoration by means of reading the data on the other hard disks no matter what the fault of the hard disk, the data reading delay is shortened and the consumption of system resources is reduced.

Description

Data reading method and device
Technical field
This application relates to the field of data storage, and in particular to a data reading method and device.
Background
Redundant array of independent disks (RAID) is a technology that reads and writes data across multiple hard disks. Depending on how it is implemented, RAID can be divided into hard RAID and soft RAID. Hard RAID implements RAID functions, including data reading and writing, in hardware, whereas soft RAID implements RAID functions through the operating system and CPU.
In the related art, for soft RAID, when one of the hard disks fails, that hard disk may send a failure signal to the processor. When a hard disk has a hardware failure, the data stored on it is usually entirely damaged or lost, whereas when a hard disk has a software failure, usually only part of the stored data is damaged or lost. After receiving the failure signal, the processor may mark the hard disk as a failed disk. Subsequently, on receiving a read request from the client for the failed disk, the processor can read data from the hard disks other than the failed disk, recover the data of the failed disk from the data read, and return the recovered data to the client.
However, because the processor, after receiving a read request for the failed disk, must read data on the other hard disks to recover the data on the failed disk, data reading incurs a large delay and consumes considerable resources.
Summary of the invention
This application provides a data reading method and device that can solve the problem in the related art that, when reading a failed disk, the processor recovers the failed disk's data by reading data on other hard disks, which leads to long read latency and high resource consumption. The technical solution is as follows:
In a first aspect, a data reading method is provided. The method includes: when failure information sent by a target hard disk and indicating that the target hard disk is in a risk state is received, setting the hard disk state of the target hard disk to the risk state; receiving a read request sent by a client for reading the target hard disk, where the target hard disk is any hard disk in a redundant array of independent disks (RAID) whose hard disk state is the risk state, and the risk state indicates that a software failure has occurred on the hard disk; reading, according to the read request, target data stored on the target hard disk; verifying the target data; and, if the verification of the target data passes, sending the target data to the client.
In the embodiment of the present application, a read request sent by the client for reading the target hard disk is received. According to the read request, the target data stored on the target hard disk is read and verified, and if the verification passes, the target data is returned to the client. The target hard disk is a hard disk in the RAID whose hard disk state is the risk state, and the risk state means that a software failure has occurred on the hard disk. That is, after the failure information sent by the target hard disk and indicating that the hard disk is in the risk state is received, the hard disk state of that hard disk may be set to the risk state. When a read request for such a hard disk is subsequently received, the corresponding data can be read according to the read request and verified, and if the verification passes, the data read can be returned. Compared with the related art, in which recovery is always performed by reading data on other hard disks regardless of the type of failure, this shortens the data reading delay and reduces the consumption of system resources.
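The read path of the first aspect can be sketched roughly as follows. This is only an illustrative sketch, not the implementation of the application: the names `DiskState`, `on_failure_info`, and `read_target` are hypothetical, the disk access, verification, and reconstruction steps are passed in as callables, and falling back to reconstruction from other hard disks when verification fails is an assumption borrowed from the related-art behaviour, since this passage only states what happens when verification passes.

```python
from enum import Enum
from typing import Callable, Dict, Tuple


class DiskState(Enum):
    SAFE = "safe"
    RISK = "risk"  # the disk has reported a software failure


def on_failure_info(states: Dict[str, DiskState], disk_id: str,
                    indicates_risk: bool) -> None:
    """Record the risk state when the failure information indicates it."""
    if indicates_risk:
        states[disk_id] = DiskState.RISK


def read_target(states: Dict[str, DiskState], disk_id: str,
                read_local: Callable[[], bytes],
                verify: Callable[[bytes], bool],
                reconstruct: Callable[[], bytes]) -> Tuple[bytes, str]:
    """Serve one read request for a disk in the risk state."""
    if states.get(disk_id) is not DiskState.RISK:
        raise ValueError("this sketch only covers disks in the risk state")
    data = read_local()              # read the target data from the disk itself
    if verify(data):                 # verification passed: return it directly
        return data, "local"
    # Assumption: on a failed verification, fall back to rebuilding the data
    # from the other hard disks, as the related art does for every read.
    return reconstruct(), "reconstructed"
```

In this sketch the expensive `reconstruct` callable is invoked only when the locally read data fails verification, which is where the latency and resource savings described above would come from.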
Optionally, the failure information is a software failure type identifier, which indicates that the failure of the target hard disk is a software failure.
In the embodiment of the present application, a failed hard disk may feed back to the operating system a failure type identifier indicating the type of failure. The operating system can then identify, from that identifier, a hard disk on which a software failure has occurred, and read data from such a hard disk using the method provided in this application. In the related art, by contrast, a failed hard disk only feeds back a failure signal, so the operating system cannot tell what kind of failure has occurred and can only treat the hard disk as one with a hardware failure, that is, perform data recovery directly.
Optionally, the target data may be verified as follows: obtain the stored reference checksum of the target data; calculate the actual checksum of the target data from the target data; and, if the reference checksum and the actual checksum of the target data are the same, determine that the verification of the target data passes.
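The comparison of reference and actual checksums can be illustrated with a short sketch. The application does not name a checksum algorithm, so the use of CRC-32 (Python's `zlib.crc32`) here is an assumption, and the function names `checksum` and `verify_target_data` are hypothetical.

```python
import zlib


def checksum(data: bytes) -> int:
    """Actual checksum of a piece of data (CRC-32 is an assumption)."""
    return zlib.crc32(data) & 0xFFFFFFFF


def verify_target_data(target_data: bytes, reference_checksum: int) -> bool:
    """Verification passes when the actual and reference checksums match."""
    return checksum(target_data) == reference_checksum


# The reference checksum is computed and stored when the data is written.
reference = checksum(b"hello")
print(verify_target_data(b"hello", reference))  # intact data passes
print(verify_target_data(b"hellp", reference))  # corrupted data fails
```

Whatever algorithm is used, the reference checksum and the actual checksum have to be computed by the same method for the comparison to be meaningful.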
Optionally, after the read request sent by the client for reading the target hard disk is received, the method further includes: repairing the data stored on the target hard disk; and changing the hard disk state of the target hard disk whose data repair is complete to a safe state.
In the embodiment of the present application, while reading the target data stored on the target hard disk according to the read request, the operating system can create a background task to repair the data stored on the target hard disk, and change the hard disk state of the target hard disk to the safe state after the data is repaired.
Optionally, the data stored on the target hard disk may be repaired as follows: obtain the checksum of the data index area of each storage block in the target hard disk, where the data index area is the area of the corresponding storage block that stores data index information, and the data index information includes the checksum of each piece of data stored in the corresponding storage block; verify the data index information on each storage block according to the checksum of that block's data index area; for a first storage block, whose data index information passes verification, obtain from its data index information the checksum of each piece of data stored in the first storage block; verify each piece of data in the first storage block according to the obtained checksums; and repair the data in the first storage block that fails verification.
In the embodiment of the present application, for any storage block on the target hard disk, if the data index area of the storage block passes verification, each piece of data can be verified against its checksum stored in the data index area, and only the data that fails verification is recovered, which reduces the amount of data to recover.
Optionally, after the data index information on each storage block is verified according to the checksum of the block's data index area, for a second storage block, whose data index information fails verification, all data stored on the second storage block is repaired.
That is, a storage block whose data index information is damaged or lost can be reconstructed directly to recover all data on the storage block.
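The two-level check described above (first the data index area, then each piece of data) can be sketched as follows. This is a simplified model under stated assumptions: the `Block` class, `make_block`, `pack_index`, and `scan_block` are hypothetical names, the data index information is reduced to a list of per-piece checksums, its serialization with `struct` is invented for the example, and CRC-32 stands in for the unspecified checksum algorithm.

```python
import struct
import zlib
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def crc(data: bytes) -> int:
    return zlib.crc32(data) & 0xFFFFFFFF


def pack_index(checksums: Sequence[int]) -> bytes:
    """Serialize the data index information (assumed layout: 32-bit values)."""
    return struct.pack(f"<{len(checksums)}I", *checksums)


@dataclass
class Block:
    user_data: List[bytes]  # pieces of data in the user data area
    index: List[int]        # data index area: one checksum per piece
    tail_checksum: int      # checksum of the data index area


def make_block(pieces: Sequence[bytes]) -> Block:
    index = [crc(p) for p in pieces]
    return Block(list(pieces), index, crc(pack_index(index)))


def scan_block(block: Block) -> Tuple[str, List[int]]:
    """Decide what must be repaired in one storage block."""
    # Step 1: verify the data index information against the tail checksum.
    if crc(pack_index(block.index)) != block.tail_checksum:
        # "Second storage block": the index is unusable, repair everything.
        return "repair_all", list(range(len(block.user_data)))
    # Step 2: "first storage block": check each piece against the index.
    damaged = [i for i, (piece, ref) in enumerate(zip(block.user_data, block.index))
               if crc(piece) != ref]
    return "repair_some", damaged
```

For example, after `block = make_block([b"data0", b"data1"])`, overwriting `block.user_data[1]` with different bytes makes `scan_block(block)` report only index 1 for repair, whereas altering an entry of `block.index` makes the tail checksum mismatch and the whole block is reported for repair.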
In a second aspect, a data reading device is provided, which has the function of implementing the behavior of the data reading method in the first aspect. The data reading device includes at least one module, and the at least one module is used to implement the data reading method provided in the first aspect.
In a third aspect, a data reading device is provided. Its structure includes a processor and a memory. The memory is used to store a program that supports the data reading device in executing the data reading method provided in the first aspect, and to store the data involved in implementing that method. The processor is configured to execute the program stored in the memory. The operating device of the storage device may further include a communication bus, which is used to establish a connection between the processor and the memory.
In a fourth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores instructions that, when run on a computer, cause the computer to execute the data reading method described in the first aspect.
In a fifth aspect, a computer program product containing instructions is provided, which, when run on a computer, causes the computer to execute the data reading method described in the first aspect.
The technical effects obtained by the second, third, fourth, and fifth aspects are similar to those obtained by the corresponding technical means in the first aspect, and are not repeated here.
The beneficial effects of the technical solution provided in this application include at least the following:
In the embodiment of the present application, when failure information sent by the target hard disk and indicating that the target hard disk is in a risk state is received, the hard disk state of the target hard disk may be set to the risk state. Subsequently, when a read request for the target hard disk is received, the corresponding data can be read according to the read request and verified, and if the verification passes, the data read can be returned. Compared with the related art, in which recovery is always performed by reading data on other hard disks regardless of the type of failure, this shortens the data reading delay and reduces the consumption of system resources.
Description of the drawings
FIG. 1 is a system architecture diagram involved in the data reading method provided by an embodiment of the present application;
FIG. 2 is a flowchart of a data reading method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of data distribution in a storage block provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a data reading device provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of another data reading device provided by an embodiment of the present application.
Detailed description
In order to make the objectives, technical solutions, and advantages of the present application clearer, the implementations of the present application are described in further detail below with reference to the accompanying drawings.
Before the embodiments of the present application are explained in detail, the system architecture involved in the embodiments is introduced.
FIG. 1 is an architecture diagram of the storage system involved in the data reading method provided by an embodiment of the present application. As shown in FIG. 1, the system includes a client 01 and a storage device 02, which can communicate with each other.
The client 01 can send a read request or a write request to the storage device 02.
The storage device 02 may include a processor 021, a memory 022, and a hard disk 023.
The processor 021 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits used to control the execution of the program of the solution of this application.
In a specific implementation, as an embodiment, the storage device may include multiple processors 021. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
An operating system is installed on the memory 022, and the processor 021 can read and write data by running the operating system. In addition, the memory may also store the program code of the solution of the present application, whose execution is controlled by the processor 021. The memory 022 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these. The memory 022 may exist independently and be connected to the processor 021, or may be integrated with the processor 021.
The storage device 02 can be a storage array or a server. When the storage device 02 is a storage array, it includes a controller and several hard disks; the processor 021 and the memory 022 may be located in the controller of the storage array, and the controller is connected to the hard disks through a back-end interface card. When the storage device 02 is a server, the processor 021, the memory 022, and the hard disks are all located inside the server. This embodiment does not limit the product form of the storage device 02, and FIG. 1 is only a schematic diagram of some of the components included in the device.
To ensure data reliability, RAID technology is often used to store data in practical applications, for example, RAID5, RAID6, or RAIDTP. Taking RAID5 as an example, the space of each hard disk is first divided into multiple storage blocks of equal size. One storage block is taken from each of several different hard disks to form a storage block group. The number of storage blocks in a group is determined by the RAID type. Taking "4+1" RAID5 as an example, a storage block group consists of 4 data blocks and 1 check block, so 5 storage blocks are required. When the storage device 02 stores data, the data can be split into 4 data fragments, the check data of these 4 data fragments is calculated to generate a check fragment, and the 4 data fragments and 1 check fragment are then stored in the storage block group. The storage block group corresponds to a segment of logical addresses, which is the logical address of the data. The storage device 02 stores the mapping between the logical address and the physical address where the data is actually stored. When the storage device 02 receives a read request sent by the client 01, the read request includes the logical address of the data to be read, and the processor 021, by running the operating system, can determine from that logical address the storage block where the data to be read is located and its position (physical address) within that storage block. After the storage block and the position within it are determined, the data can be read from the storage block.
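As a small worked example of the "4+1" layout, the sketch below splits data into 4 data fragments, derives one check fragment, and rebuilds a lost fragment from the remaining four. The passage above only says that check data is calculated; using bytewise XOR parity (the usual RAID 5 scheme), padding the data with zero bytes, and the names `make_stripe` and `rebuild` are assumptions made for illustration.

```python
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(data: bytes, k: int = 4) -> List[bytes]:
    """Split data into k data fragments and append one check fragment."""
    frag_len = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(frag_len * k, b"\0")      # zero padding (assumption)
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    check = reduce(xor_bytes, frags)              # XOR of all data fragments
    return frags + [check]


def rebuild(stripe: List[bytes], lost: int) -> bytes:
    """Recover the fragment at position `lost` from the other fragments."""
    survivors = [f for i, f in enumerate(stripe) if i != lost]
    return reduce(xor_bytes, survivors)


stripe = make_stripe(b"abcdefghijkl")   # 4 data fragments of 3 bytes + 1 check
print(rebuild(stripe, 2) == stripe[2])  # the lost fragment is recovered
```

Rebuilding a single fragment requires reading the other four fragments from four other hard disks, which is the cost the method avoids for data on a risk-state disk that still passes verification.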
除此之外,存储设备02中还可以包括通信总线和通信接口(图1中未示出)。其中,通信总线用于存储设备02包括的各个组件之间传送信息。In addition, the storage device 02 may also include a communication bus and a communication interface (not shown in FIG. 1). Among them, the communication bus is used to transfer information between various components included in the storage device 02.
通信接口,用于与其它设备或通信网络通信,如以太网,无线接入网(RAN),无线局域网(Wireless Local Area Networks,WLAN)等。The communication interface is used to communicate with other devices or communication networks, such as Ethernet, wireless access network (RAN), wireless local area network (Wireless Local Area Networks, WLAN), etc.
Next, the data reading method provided by the embodiments of the present application is described.
FIG. 2 is a flowchart of a data reading method provided by an embodiment of the present application. The method may be executed by the processor deployed in the storage device 02 described with reference to FIG. 1. Referring to FIG. 2, the method includes the following steps:
Step 201: When fault information that is sent by the target hard disk and indicates that the target hard disk is in a risk state is received, set the hard disk state of the target hard disk to the risk state.
The risk state indicates that a small amount of data in the target hard disk is damaged or lost, that data may still be written to the target hard disk, and that the remaining data, which is neither damaged nor lost, may still be read. Accordingly, the fault information indicating that the target hard disk is in a risk state may be information indicating that the target hard disk has damaged or lost data but is still usable.
In the embodiments of the present application, the processor may send a fault detection instruction to each hard disk in the storage device at predetermined intervals. After receiving the fault detection instruction, each hard disk may check whether it has a read/write exception or damaged or lost data. If so, the hard disk may determine the fault type of the fault that has occurred and send to the processor a fault type identifier that identifies the fault type.
Optionally, in another possible implementation, the processor may send a fault detection instruction to a hard disk after consecutively receiving I/O exception error codes from that hard disk, so as to query its fault type. That is, while the processor is processing read or write requests for a hard disk, if it receives an I/O exception error code from the hard disk n times in a row, the hard disk may have failed. In this case, the processor may send a fault detection instruction to the hard disk to obtain its fault type, where n may be a preset value.
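The "n consecutive error codes" trigger can be sketched as a per-disk counter. This is only an illustration: the class name `IoErrorMonitor` and its interface are hypothetical, and resetting the counter after the trigger fires is an assumption not stated in this application.

```python
class IoErrorMonitor:
    """Tracks consecutive I/O exception error codes per hard disk."""

    def __init__(self, threshold: int):
        self.threshold = threshold  # the preset value n
        self.consecutive = {}       # disk id -> current run of errors

    def record(self, disk_id: str, ok: bool) -> bool:
        """Record one I/O result; return True when a fault detection
        instruction should be sent to the disk."""
        if ok:
            # A successful I/O breaks the run of consecutive errors.
            self.consecutive[disk_id] = 0
            return False
        count = self.consecutive.get(disk_id, 0) + 1
        self.consecutive[disk_id] = count
        if count >= self.threshold:
            # Assumption: start counting afresh once the query is triggered.
            self.consecutive[disk_id] = 0
            return True
        return False
```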
The fault types of a hard disk fault may include software faults and hardware faults. Accordingly, the fault type identifier may be a software fault type identifier or a hardware fault type identifier. The software fault type identifier identifies the fault as a software fault, and the hardware fault type identifier identifies it as a hardware fault. Generally, a hardware fault is a fault of a hardware component, whereas a software fault refers to data damage or loss caused by a software exception. It should be noted that when a software fault occurs on a hard disk, only a small portion of the data on the hard disk is damaged or lost, but the location of the damaged or lost data cannot be determined. The hard disk can therefore continue to accept writes, and data that is neither damaged nor lost can still be read. In the embodiments of the present application, after a fault occurs, the hard disk may report the software fault type identifier to the processor if it determines that the fault is a software fault, or the hardware fault type identifier if the fault is a hardware fault.
Optionally, in the embodiments of the present application, each hard disk may also proactively report a fault type identifier to the processor when it detects a fault.
As described above, when a software fault occurs on a hard disk, usually only a small portion of the data is damaged or lost, and reads and writes can continue. Therefore, in the embodiments of the present application, the fault information indicating that a hard disk is in a risk state may be the software fault type identifier. Accordingly, if the fault type identifier received by the processor from a hard disk is the software fault type identifier, the processor may set the hard disk state of that hard disk to the risk state. Optionally, if the fault type identifier received from a hard disk is the hardware fault type identifier, the processor may directly set the hard disk state of that hard disk to a failed state, which indicates that subsequent data reads and writes on the hard disk are prohibited.
It should be noted that the memory of the storage device may store a correspondence between the hard disk identifier and the state information of each of the multiple hard disks. The state information may be a safe state, a risk state, or a failed state. The safe state indicates that the data stored on the hard disk is not damaged or lost and that the hard disk currently has no fault. The risk state indicates that a small portion of the data stored on the hard disk is damaged or lost and that a software fault has occurred on the hard disk; the processor may continue to write data to the hard disk and may read data from it through the subsequent steps of this embodiment. The failed state indicates that a hardware fault has occurred on the hard disk and that the hard disk is currently unavailable; when a read or write request for the hard disk is subsequently received, data reads and writes on the hard disk are prohibited. Accordingly, after the processor receives the software fault type identifier reported by the target hard disk, it may set the state information corresponding to the hard disk identifier of the target hard disk in the above correspondence to the risk state, thereby indicating that a software fault has occurred on the target hard disk and a small portion of its data is damaged or lost.
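The correspondence between hard disk identifiers and state information can be pictured as a small lookup table. The sketch below is illustrative only; the enum and function names are hypothetical, and treating a disk with no entry as safe is an assumption.

```python
from enum import Enum


class DiskState(Enum):
    SAFE = "safe"      # no damaged or lost data, no fault
    RISK = "risk"      # software fault: a little data damaged, disk still usable
    FAILED = "failed"  # hardware fault: reads and writes prohibited


class FaultType(Enum):
    SOFTWARE = "software"
    HARDWARE = "hardware"


def on_fault_report(states: dict, disk_id: str, fault: FaultType) -> DiskState:
    """Update the disk-id -> state table from a reported fault type identifier."""
    new_state = DiskState.RISK if fault is FaultType.SOFTWARE else DiskState.FAILED
    states[disk_id] = new_state
    return new_state


def io_allowed(states: dict, disk_id: str) -> bool:
    """Reads and writes are permitted unless the disk is in the failed state.
    Assumption: a disk with no entry is treated as safe."""
    return states.get(disk_id, DiskState.SAFE) is not DiskState.FAILED
```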
Step 202: Receive a read request sent by the client, where the read request is used to read data in the target hard disk that is in the risk state.
In the embodiments of the present application, when the data that the client wants to read is stored in the target hard disk, the client may send to the processor a read request carrying the logical address of the data to be read. The processor may receive the read request and, according to the logical address carried in it, determine that the hard disk to be read is the target hard disk. That is, the read request is a request to read data in the target hard disk.
Step 203: Read the target data stored on the target hard disk according to the read request.
After receiving the read request, the processor may read the target data stored on the target hard disk according to the logical address carried in the read request.
It should be noted that the space of a hard disk may be divided into multiple storage blocks (blocks) of equal size. Each storage block may correspond to a range of logical addresses, and each storage block may store multiple pieces of data. The memory of the storage device may store the mapping between the physical addresses and the logical addresses of the data. Accordingly, the processor may determine, from the logical address carried in the read request, the physical address at which the target data to be read is stored, that is, determine the target storage block to be read, and then read the target data stored in the target storage block.
Step 204: Verify the target data.
Because the target hard disk is in the risk state, a small amount of the data stored on it may be damaged or lost. However, because the location of the damaged or lost data cannot be determined, the processor cannot tell whether the target data it has read according to the logical address in the read request is damaged. The processor may therefore verify the target data.
For example, the processor may obtain the stored reference checksum of the target data and calculate the actual checksum of the target data from the target data. If the reference checksum and the actual checksum are the same, the processor determines that the target data passes verification.
It should be noted that the storage device may store metadata for each piece of data stored on each of the multiple hard disks. The metadata includes the storage address and the checksum of each piece of data. According to the logical address carried in the read request, the processor may look up the metadata containing that logical address and obtain the checksum of the target data from it. The checksum obtained in this way is the reference checksum of the correct target data originally stored in the space indicated by the logical address carried in the read request.
While obtaining the reference checksum of the target data, the processor may also calculate the actual checksum of the target data from the target data it has read. The actual checksum is calculated using the same method as the reference checksum stored in the metadata.
Because the actual checksum is calculated from the target data that was read, using the same method as the stored reference checksum, the actual checksum will differ from the reference checksum of the target data if the data that was read is damaged, and will equal the reference checksum if the data is not damaged. Accordingly, after obtaining the reference checksum from the stored metadata and calculating the actual checksum, the processor may compare the two. If they are the same, the target data that was read is correct and undamaged, and the target data passes verification. If they differ, the target data that was read is damaged, and the target data fails verification.
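The comparison of reference and actual checksums can be sketched as follows. This application does not name a checksum algorithm, so CRC-32 from Python's standard `zlib` module is used purely as a stand-in, and the `metadata` dictionary keyed by logical address is a hypothetical simplification of the stored metadata.

```python
import zlib


def checksum(data: bytes) -> int:
    """Stand-in checksum; the actual algorithm is not specified in the source."""
    return zlib.crc32(data)


def store(metadata: dict, logical_address: int, data: bytes) -> None:
    """On write, record the reference checksum alongside the address."""
    metadata[logical_address] = checksum(data)


def verify(metadata: dict, logical_address: int, data_read: bytes) -> bool:
    """Verification passes only if the actual checksum of the data that was
    read equals the reference checksum kept in the metadata."""
    reference = metadata[logical_address]
    actual = checksum(data_read)
    return actual == reference
```

The only requirement carried over from the text is that the same method is used to compute both checksums; any stronger hash could be substituted.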
Step 205: If the target data passes verification, send the target data to the client.
As described above, if the target data passes verification, the target data is in fact undamaged data in the target hard disk. In this case, the processor may return the target data directly to the client.
Optionally, if the target data fails verification, the data currently requested by the client happens to include damaged data in the target hard disk. In this case, the target data cannot be returned to the client as the read result. The processor may instead recover the target data by reading data on the hard disks in the RAID other than the target hard disk.
In the embodiments of the present application, when fault information that is sent by the target hard disk and indicates that the target hard disk is in a risk state is received, the hard disk state of the target hard disk may be set to the risk state. A read request sent by the client to read the target hard disk is received. According to the read request, the target data stored on the target hard disk is read and verified, and if verification passes, the target data is returned to the client. That is, for a hard disk in the risk state, when a read request for such a hard disk is received, the corresponding data can be read according to the read request and verified, and the data that was read can be returned if verification passes. Compared with the related art, in which data is recovered by reading the other hard disks regardless of what fault has occurred on the hard disk, this shortens the data read latency and reduces the consumption of system resources.
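The read path for a hard disk in the risk state (steps 203 to 205, plus the optional fallback) can be sketched as below. The callables `read_target` and `rebuild_from_peers` are hypothetical placeholders for the disk read and the RAID reconstruction, and CRC-32 is again only an assumed checksum.

```python
import zlib


def read_from_risk_disk(read_target, reference_checksum, rebuild_from_peers):
    """Serve a read against a disk in the risk state.

    read_target:        callable returning the bytes read from the target disk
    reference_checksum: checksum recorded in metadata for the requested data
    rebuild_from_peers: callable that reconstructs the data from other disks
    Returns (data, how), where how is "verified" or "rebuilt".
    """
    data = read_target()
    if zlib.crc32(data) == reference_checksum:
        # Fast path: the data is intact, so the other disks are never read.
        return data, "verified"
    # Slow path: the requested data happens to be damaged.
    return rebuild_from_peers(), "rebuilt"
```

The latency benefit described above comes from the fast path: reconstruction from the other hard disks is attempted only when verification fails.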
It should be noted that, in the embodiments of the present application, after the processor receives a read request for the target hard disk, and considering that damaged data exists on the target hard disk, the processor may also create a background task to repair the data stored on the target hard disk.
As described above, the space of the target hard disk may be divided into multiple storage blocks. Accordingly, in the embodiments of the present application, the processor may verify each storage block of the target hard disk in turn, starting from the first storage block, and then repair the storage blocks that fail verification.
For example, the processor may obtain the checksum of the data index area of each storage block in the target hard disk, where the data index area is the area of a storage block that stores data index information, and the data index information includes the checksum of each piece of data stored in that storage block; verify the data index information on each storage block against the checksum of its data index area; for a first storage block whose data index information passes verification, obtain from its data index information the checksum of each piece of data stored in the first storage block; verify each piece of data in the first storage block against the obtained checksums; and repair the data in the first storage block that fails verification. Here, a first storage block is any storage block whose data index information passes verification; that is, every storage block whose data index information passes verification may be called a first storage block. Conversely, a storage block whose data index information fails verification may be called a second storage block.
It should be noted that each storage block may include a user data area, a data index area, and a tail area. The user data area stores multiple pieces of user data. Generally, the data requested by a client is user data stored in the user data area; that is, the target data in this application is user data. The data index area stores the mapping information corresponding to each piece of user data. The mapping information includes the checksum of the corresponding user data, the offset of that data within the storage block, and so on. The tail area stores the checksum of the data index area.
FIG. 3 is a schematic diagram of the data layout in a hard disk. As shown in FIG. 3, the hard disk may include multiple Blocks, that is, multiple storage blocks. Taking Block1 as an example, Block1 may store multiple pieces of user data such as data0 and data1, and the space occupied by these pieces of data may be called the user data area (user data region). Each piece of user data has corresponding mapping information. Referring to FIG. 3, the mapping information corresponding to data0 is data0 ref, and the mapping information corresponding to data1 is data1 ref. The mapping information may include the offset and the checksum of the user data. The space occupied by the mapping information of the multiple pieces of user data may be called the data index area (data reference region). The last sector of Block1, that is, the tail area, stores the checksum of the data index area.
Based on the above, in the embodiments of the present application, taking any storage block in the target hard disk as an example, the processor may obtain the checksum of the data index area, that is, the reference checksum of the data index area, from the tail area of the storage block. The processor may also read the data index information in the data index area and calculate the actual checksum of the data index area from it.
It should be noted that the actual checksum is determined from the data index information in the data index area using the same method as the reference checksum stored in the tail area. Therefore, if the data index information stored in the data index area is not damaged, the actual checksum of the data index area will equal the checksum stored in the tail area; if the data index information is damaged, the actual checksum will differ from the checksum stored in the tail area. Accordingly, after calculating the actual checksum of the data index area, the processor may compare it with the checksum of the data index area obtained from the tail area. If the two are the same, the processor may determine that the data index information in the data index area is not damaged, that is, the data index area passes verification. The storage block is then a first storage block as defined above. Next, the processor may use the checksum of each piece of data stored in the data index area to verify each piece of data in the user data area.
The processor may first read the first piece of data in the user data area of the storage block and read the checksum of that piece of data from the data index area. The processor may then calculate a checksum from the first piece of data and compare the calculated checksum with the checksum that was read. If the two are the same, the first piece of data is not damaged, that is, it passes verification, and the processor may go on to verify the next piece of data in the user data area. If the two checksums differ, the first piece of data is damaged, that is, it fails verification. In this case, the processor may mark the first piece of data and then go on to verify the next piece of data. After the processor has verified all the data in the user data area, it may repair the marked data. Alternatively, in one possible implementation, the processor may repair each piece of data as soon as that piece of data is found to fail verification.
When repairing a piece of data, the processor may read the data information related to that piece of data from the hard disks other than the target hard disk and recover the piece of data by calculation from the information that was read. After recovering the piece of data, the processor may store it on another hard disk or write it to another storage block of the target hard disk. At the same time, the processor may delete the piece of data stored in the user data area of the current storage block, that is, release the space that the piece of data occupies in the current storage block, so that the processor can later write new data into that space.
Optionally, in one possible case, as soon as the processor finds the first piece of data in the storage block that fails verification, it may directly read, from the hard disks other than the target hard disk, the data information related to the data stored on the storage block and recover all the data stored on the storage block by calculation from that information. After recovering all the data, the processor may store the recovered data on another hard disk or in another storage block of the target hard disk and delete all the data in the storage block, so as to release the storage block.
Optionally, if all the data in the storage block passes verification, the storage block contains no damaged or lost data, and the processor may go on to verify the next storage block.
Optionally, if the processor finds that the actual checksum of the data index area differs from the reference checksum of the data index area stored in the tail area of the storage block, the data index information stored in the data index area of the storage block is damaged. In this case, the processor may directly read, from the other hard disks, the data information related to the data stored on the storage block and recover all the data stored on the storage block by calculation from that information. After recovering all the data, the processor may store the recovered data on another hard disk or in another storage block of the target hard disk and delete all the data in the storage block, so as to release the storage block. The processor may then go on to verify the next storage block.
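The two-level check of a storage block described above (tail area against data index area, then data index area against each piece of user data) can be sketched as follows. The on-disk encoding is not given in this application, so the fixed `(offset, length, checksum)` entry format, the `Block` class, and the use of CRC-32 are all assumptions made for illustration.

```python
import struct
import zlib
from dataclasses import dataclass

# Assumed index entry: offset, length, checksum as little-endian uint32 values.
ENTRY = struct.Struct("<III")


@dataclass
class Block:
    user_region: bytearray  # user data area (data0, data1, ...)
    index_region: bytes     # data index area (packed mapping entries)
    tail_checksum: int      # tail area: checksum of the data index area


def build_block(records: list) -> Block:
    """Lay out records and their mapping entries as in the described block."""
    user, index = bytearray(), b""
    for record in records:
        index += ENTRY.pack(len(user), len(record), zlib.crc32(record))
        user += record
    return Block(user, index, zlib.crc32(index))


def scrub_block(block: Block):
    """Return None if the index fails verification (a "second storage block",
    to be rebuilt in full); otherwise return the positions of damaged records
    (empty list if the block is clean)."""
    if zlib.crc32(block.index_region) != block.tail_checksum:
        return None
    damaged = []
    for i, (offset, length, ref) in enumerate(ENTRY.iter_unpack(block.index_region)):
        piece = bytes(block.user_region[offset:offset + length])
        if zlib.crc32(piece) != ref:
            damaged.append(i)  # mark for repair, then keep checking
    return damaged
```

Under this sketch, a `None` result corresponds to recovering all data of the block from the other hard disks, while a non-empty list corresponds to repairing only the marked pieces of data.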
In addition, as described above, in the embodiments of the present application the processor may repair the data on the target hard disk by creating a background task. The processor may split the background task into multiple task slices and process the task slices concurrently, thereby increasing the speed of data repair.
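Splitting the background task into task slices and processing them concurrently might look like the sketch below. The slicing by storage block number, the thread pool, and the names `slice_task`, `run_repair`, and `scrub_one` are assumptions; this application does not describe how the slices are formed or scheduled.

```python
from concurrent.futures import ThreadPoolExecutor


def slice_task(block_ids: list, slice_size: int) -> list:
    """Cut the list of storage blocks into fixed-size task slices."""
    return [block_ids[i:i + slice_size] for i in range(0, len(block_ids), slice_size)]


def run_repair(block_ids: list, scrub_one, slice_size: int = 4, workers: int = 4) -> list:
    """Scrub all blocks concurrently, one task slice per worker at a time.

    scrub_one(block_id) is a hypothetical callable that returns True when the
    block passes verification. Returns the sorted ids of blocks needing repair.
    """
    def run_slice(ids):
        return [block_id for block_id in ids if not scrub_one(block_id)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_slice, slice_task(block_ids, slice_size)))
    return sorted(block_id for part in results for block_id in part)
```

In a real system the degree of concurrency would presumably be bounded so that the repair task does not compete with normal I/O for bandwidth.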
After all the storage blocks in the target hard disk have been verified by the above method and the damaged data in the target hard disk has been repaired, the processor may change the hard disk state of the target hard disk from the risk state to the safe state.
It can be seen that, in the embodiments of the present application, for a target hard disk in the risk state, the processor may create a background task to verify the storage blocks in the target hard disk one by one and repair the data of the storage blocks that fail verification. Because little data in the target hard disk is damaged or lost, only a small amount of data needs to be repaired, which reduces the consumption of processing resources. Moreover, because little data needs to be recovered, correspondingly little related data needs to be read from the other hard disks. With the total hard disk bandwidth fixed, this effectively prevents the background repair task from competing with normal I/O for bandwidth, which reduces the performance fluctuation of normal I/O.
It should also be noted that if the processor receives a write request for the target hard disk, the processor may write the data to be written to the target hard disk according to the write request. That is, in the embodiments of the present application, a hard disk whose hard disk state is the risk state can continue to be used.
Next, the data reading apparatus provided by the embodiments of the present application is described.
Referring to FIG. 4, an embodiment of the present application provides a data reading apparatus 400. The apparatus 400 includes:
a setting module 401, configured to perform step 201 in the above embodiment, where the setting module 401 may be executed by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 invoking program code in the memory 022;
a receiving module 402, configured to perform step 202 in the above embodiment, where the receiving module 402 may be executed by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 invoking program code in the memory 022;
a reading module 403, configured to perform step 203 in the above embodiment, where the reading module 403 may be executed by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 invoking program code in the memory 022;
a verification module 404, configured to perform step 204 in the above embodiment, where the verification module 404 may be executed by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 invoking program code in the memory 022; and
a sending module 405, configured to perform step 205 in the above embodiment, where the sending module 405 may be executed by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 invoking program code in the memory 022.
Optionally, the fault information is a software fault type identifier, and the software fault type identifier indicates that the fault that has occurred on the target hard disk is a software fault.
Optionally, the verification module is specifically configured to:
obtain the stored reference checksum of the target data;
calculate the actual checksum of the target data from the target data; and
if the reference checksum of the target data is the same as the actual checksum of the target data, determine that the target data passes verification.
Optionally, referring to FIG. 5, the apparatus 400 further includes:
a repair module 406, configured to repair the data stored on the target hard disk; and
a modification module 407, configured to change the hard disk state of the target hard disk whose data repair is complete to the safe state.
Optionally, the repair module 406 is specifically configured to:
obtain the checksum of the data index area of each storage block in the target hard disk, where the data index area is the area of the corresponding storage block in which data index information is stored, and the data index information includes the checksum of each piece of data stored in the corresponding storage block;
verify the data index information on each storage block according to the checksum of the data index area of that storage block;
for a first storage block whose data index information passes verification, obtain the checksum of each piece of data stored in the first storage block from the data index information of the first storage block;
verify each piece of data in the first storage block according to the obtained checksum of that piece of data;
repair the data in the first storage block that fails verification.
Optionally, the repair module 406 is further specifically configured to:
for a second storage block whose data index information fails verification, repair all the data stored on the second storage block.
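The two-level check described above (first the data index area of each storage block, then the individual pieces of data) can be sketched as follows. This is a hedged illustration rather than the applicant's implementation: the `StorageBlock` layout, the `serialize_index` helper, the use of CRC32, and the function name `select_for_repair` are all assumptions introduced for the example. The sketch only decides which data needs repair; how the repair itself is carried out is not covered here.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


def checksum(data: bytes) -> int:
    # CRC32 is an assumption; the source does not name the algorithm.
    return zlib.crc32(data) & 0xFFFFFFFF


def serialize_index(index: Dict[str, int]) -> bytes:
    # Hypothetical deterministic byte form of the data index information.
    return ";".join(f"{k}={v}" for k, v in sorted(index.items())).encode()


@dataclass
class StorageBlock:
    block_id: int
    index: Dict[str, int]   # data index information: data key -> checksum
    index_checksum: int     # stored checksum of the data index area
    data: Dict[str, bytes]  # data key -> stored data


def select_for_repair(blocks: List[StorageBlock]) -> List[Tuple[int, str]]:
    """Return (block_id, data key) pairs that need to be repaired."""
    to_repair: List[Tuple[int, str]] = []
    for block in blocks:
        if checksum(serialize_index(block.index)) != block.index_checksum:
            # "Second storage block": the index cannot be trusted,
            # so all data stored on the block is marked for repair.
            to_repair.extend((block.block_id, key) for key in block.data)
            continue
        # "First storage block": the index is valid, so each piece of
        # data is checked against the checksum recorded in the index.
        for key, reference in block.index.items():
            if key not in block.data or checksum(block.data[key]) != reference:
                to_repair.append((block.block_id, key))
    return to_repair
```

Checking the index area first means that a block with an intact index only needs repair for the specific pieces of data that fail, while a block with a damaged index is repaired in full.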
The repair module 406 and the modification module 407 may be executed by the processor 021 in the storage device shown in FIG. 1, or by the processor 021 invoking program code in the memory 022.
In summary, in the embodiments of this application, when fault information sent by the target hard disk and indicating that the target hard disk is in a risk state is received, the hard disk state of the target hard disk may be set to the risk state. A read request sent by a client to read the target hard disk is then received. According to the read request, the target data stored on the target hard disk is read and verified, and if the verification passes, the target data is returned to the client. That is, for a hard disk in the risk state, when a read request for such a hard disk is received, the corresponding data can be read according to the read request and verified, and if the verification passes, the read data can be returned. Compared with the related art, in which data is recovered by reading data from other hard disks regardless of what fault the hard disk has, this shortens the data read latency and reduces the consumption of system resources.
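The overall read path summarized above can be sketched as follows. This is an illustration under stated assumptions: the class `StorageDevice`, its callback parameters, and the use of CRC32 are hypothetical. The fallback to recovery from other hard disks when verification fails, and the unverified return for a hard disk that is not in the risk state, are assumptions of the sketch and are not taken from this passage.

```python
import zlib
from enum import Enum
from typing import Callable, Dict


class DiskState(Enum):
    SAFE = "safe"
    RISK = "risk"


class StorageDevice:
    """Hypothetical storage device that serves reads from disks in the risk state."""

    def __init__(
        self,
        read_from_disk: Callable[[str, str], bytes],
        read_reference_checksum: Callable[[str, str], int],
        recover_from_other_disks: Callable[[str, str], bytes],
    ) -> None:
        self._read = read_from_disk
        self._reference = read_reference_checksum
        self._recover = recover_from_other_disks
        self.disk_state: Dict[str, DiskState] = {}

    def on_fault_info(self, disk_id: str) -> None:
        # Fault information indicating a risk state was received.
        self.disk_state[disk_id] = DiskState.RISK

    def mark_repaired(self, disk_id: str) -> None:
        # After data repair completes, the disk returns to the safe state.
        self.disk_state[disk_id] = DiskState.SAFE

    def handle_read(self, disk_id: str, key: str) -> bytes:
        state = self.disk_state.get(disk_id, DiskState.SAFE)
        if state is DiskState.RISK:
            data = self._read(disk_id, key)
            actual = zlib.crc32(data) & 0xFFFFFFFF  # CRC32 is an assumption
            if actual == self._reference(disk_id, key):
                # Verification passed: return the data directly.
                return data
            # Assumed fallback: recover the data from other hard disks.
            return self._recover(disk_id, key)
        # Assumption: disks not in the risk state are read without this check.
        return self._read(disk_id, key)
```

The latency benefit described in the summary corresponds to the branch that returns the data immediately after a successful check, without reading from other hard disks.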
It should be noted that, when the data reading apparatus provided in the foregoing embodiment reads data, the division into the foregoing functional modules is used merely as an example. In practice, the foregoing functions may be assigned to different functional modules as required; that is, the internal structure of the device may be divided into different functional modules to implement all or some of the functions described above. In addition, the data reading apparatus provided in the foregoing embodiment and the data reading method embodiment are based on the same concept; for the specific implementation process, see the method embodiment. Details are not repeated here.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or Digital Subscriber Line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a Digital Versatile Disc (DVD)), a semiconductor medium (for example, a Solid State Disk (SSD)), or the like.
A person of ordinary skill in the art can understand that all or some of the steps of the foregoing embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
It should be understood that "a plurality of" as used herein means two or more. In the description of this application, unless otherwise specified, "/" means "or"; for example, A/B may mean A or B. "And/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean that A exists alone, that A and B both exist, or that B exists alone. In addition, to describe the technical solutions of the embodiments of this application clearly, words such as "first" and "second" are used in the embodiments to distinguish between identical or similar items whose functions and effects are substantially the same. A person skilled in the art can understand that words such as "first" and "second" do not limit quantity or execution order, and do not necessarily indicate that the items are different.
The foregoing are embodiments provided by this application and are not intended to limit this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the protection scope of this application.

Claims (12)

  1. A data reading method, characterized in that the method comprises:
    when fault information sent by a target hard disk and indicating that the target hard disk is in a risk state is received, setting the hard disk state of the target hard disk to the risk state;
    receiving a read request sent by a client, the read request being used to read data in the target hard disk in the risk state;
    reading, according to the read request, the target data stored on the target hard disk;
    verifying the target data;
    if the verification of the target data passes, sending the target data to the client.
  2. The method according to claim 1, characterized in that the fault information is a software fault type identifier, and the software fault type identifier is used to indicate that the fault that occurred on the target hard disk is a software fault.
  3. The method according to claim 1, characterized in that the verifying the target data comprises:
    obtaining the stored reference checksum of the target data;
    calculating the actual checksum of the target data from the target data;
    if the reference checksum of the target data is the same as the actual checksum of the target data, determining that the verification of the target data has passed.
  4. The method according to claim 1, characterized in that, after the receiving the read request sent by the client to read the target hard disk, the method further comprises:
    repairing the data stored on the target hard disk;
    changing the hard disk state of the target hard disk whose data repair is complete to a safe state.
  5. The method according to claim 4, characterized in that the repairing the data stored on the target hard disk comprises:
    obtaining the checksum of the data index area of each storage block in the target hard disk, wherein the data index area is the area of the corresponding storage block in which data index information is stored, and the data index information includes the checksum of each piece of data stored in the corresponding storage block;
    verifying the data index information on each storage block according to the checksum of the data index area of that storage block;
    for a first storage block whose data index information passes verification, obtaining the checksum of each piece of data stored in the first storage block from the data index information of the first storage block;
    verifying each piece of data in the first storage block according to the obtained checksum of that piece of data;
    repairing the data in the first storage block that fails verification.
  6. The method according to claim 5, characterized in that, after the verifying the data index information on each storage block according to the checksum of the data index area of that storage block, the method further comprises:
    for a second storage block whose data index information fails verification, repairing all the data stored on the second storage block.
  7. A data reading apparatus, characterized in that the apparatus comprises:
    a setting module, configured to set the hard disk state of a target hard disk to a risk state when fault information sent by the target hard disk and indicating that the target hard disk is in the risk state is received;
    a receiving module, configured to receive a read request sent by a client, the read request being used to read data in the target hard disk in the risk state;
    a reading module, configured to read, according to the read request, the target data stored on the target hard disk;
    a verification module, configured to verify the target data;
    a sending module, configured to send the target data to the client if the verification of the target data passes.
  8. The apparatus according to claim 7, characterized in that the fault information is a software fault type identifier, and the software fault type identifier is used to indicate that the fault that occurred on the target hard disk is a software fault.
  9. The apparatus according to claim 7, characterized in that the verification module is specifically configured to:
    obtain the stored reference checksum of the target data;
    calculate the actual checksum of the target data from the target data;
    if the reference checksum of the target data is the same as the actual checksum of the target data, determine that the verification of the target data has passed.
  10. The apparatus according to claim 7, characterized in that the apparatus further comprises:
    a repair module, configured to repair the data stored on the target hard disk;
    a modification module, configured to change the hard disk state of the target hard disk whose data repair is complete to a safe state.
  11. The apparatus according to claim 10, characterized in that the repair module is specifically configured to:
    obtain the checksum of the data index area of each storage block in the target hard disk, wherein the data index area is the area of the corresponding storage block in which data index information is stored, and the data index information includes the checksum of each piece of data stored in the corresponding storage block;
    verify the data index information on each storage block according to the checksum of the data index area of that storage block;
    for a first storage block whose data index information passes verification, obtain the checksum of each piece of data stored in the first storage block from the data index information of the first storage block;
    verify each piece of data in the first storage block according to the obtained checksum of that piece of data;
    repair the data in the first storage block that fails verification.
  12. The apparatus according to claim 11, characterized in that the repair module is specifically configured to:
    for a second storage block whose data index information fails verification, repair all the data stored on the second storage block.
PCT/CN2020/113420 2019-09-06 2020-09-04 Data reading method and apparatus WO2021043246A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910841114.1 2019-09-06
CN201910841114.1A CN112463019A (en) 2019-09-06 2019-09-06 Data reading method and device

Publications (1)

Publication Number Publication Date
WO2021043246A1 (en) 2021-03-11

Family

ID=74806893

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/113420 WO2021043246A1 (en) 2019-09-06 2020-09-04 Data reading method and apparatus

Country Status (2)

Country Link
CN (1) CN112463019A (en)
WO (1) WO2021043246A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115994236B (en) * 2023-03-23 2023-08-04 杭州派迩信息技术有限公司 Collaborative processing method and system for aviation data

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102147713A (en) * 2011-02-18 2011-08-10 杭州宏杉科技有限公司 Method and device for managing network storage system
CN103970481A (en) * 2013-01-29 2014-08-06 国际商业机器公司 Method and device for reconstructing memory array
CN105224891A (en) * 2015-09-22 2016-01-06 苏州互盟信息存储技术有限公司 Magnetic disc optic disc fused data method for secure storing, system and device
CN105808161A (en) * 2016-02-26 2016-07-27 四川效率源信息安全技术股份有限公司 Reading method of bad sector data of hard disk
US20170358346A1 (en) * 2016-06-13 2017-12-14 SK Hynix Inc. Read threshold optimization in flash memories
US9891994B1 (en) * 2015-12-30 2018-02-13 EMC IP Holding Company LLC Updated raid 6 implementation
CN109582515A (en) * 2018-12-03 2019-04-05 郑州云海信息技术有限公司 A kind of hard disk detection method, system and electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112463019A (en) 2021-03-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20860691

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20860691

Country of ref document: EP

Kind code of ref document: A1