WO2021043246A1

WO2021043246A1 - Data reading method and apparatus

Info

Publication number: WO2021043246A1
Application number: PCT/CN2020/113420
Authority: WO
Inventors: 张瑛; 熊伟
Original assignee: 华为技术有限公司
Priority date: 2019-09-06
Filing date: 2020-09-04
Publication date: 2021-03-11
Also published as: CN112463019A

Abstract

Disclosed in the present application are a data reading method and apparatus, relating to the technical field of storage. In the present application, for a hard disk in a RAID having a corresponding hard disk state of an at-risk state, when a read request for said type of hard disk is received, the corresponding data stored on the hard disk can be read on the basis of the read request and the data can be verified and, if verification passes, then the read data can be returned; thus, compared to the method in the prior art of directly implementing restoration by means of reading the data on the other hard disks no matter what the fault of the hard disk, the data reading delay is shortened and the consumption of system resources is reduced.

Description

Data reading method and device

Technical field

This application relates to the field of data storage, and in particular to a data reading method and device.

Background technique

Redundant array of independent disks (RAID) is a technology that reads and writes data across multiple hard disks. Depending on how it is implemented, RAID can be divided into hard RAID and soft RAID. Hard RAID implements the RAID functions, including data reading and writing, in hardware, while soft RAID implements them through the operating system and the CPU.

In the related art, for soft RAID, when one of the multiple hard disks fails, the hard disk may send a failure signal to the processor. When the hard disk has a hardware failure, the data stored on it is usually completely damaged or lost; when the hard disk has a software failure, usually only part of the data stored on it is damaged or lost. After receiving the failure signal, the processor may mark the hard disk as a failed disk. Subsequently, when a read request from the client for the failed disk is received, the processor reads the data on the hard disks other than the failed disk, recovers the data of the failed disk based on the data read, and returns the recovered data to the client.

However, after receiving a read request for the failed disk, the processor must read the data on the other hard disks to restore the data on the failed disk. This causes a large data reading delay and high resource consumption.

Summary of the invention

This application provides a data reading method and device, which can be used to solve the problem in the related art that, when reading a failed disk, the processor restores the data on the failed disk by reading data on other hard disks, resulting in a large data reading delay and high resource consumption. The technical solution is as follows:

In a first aspect, a data reading method is provided. The method includes: when failure information sent by a target hard disk and indicating that the target hard disk is in a risk state is received, setting the hard disk state of the target hard disk to the risk state; receiving a read request sent by a client to read the target hard disk, where the target hard disk is any hard disk in the redundant array of independent disks whose hard disk state is the risk state, and the risk state indicates that a software failure has occurred on the hard disk; reading, according to the read request, the target data stored on the target hard disk; verifying the target data; and if the verification of the target data passes, sending the target data to the client.

In the embodiment of the present application, a read request sent by the client to read the target hard disk is received. According to the read request, the target data stored on the target hard disk is read and verified, and if the verification passes, the target data is returned to the client. The target hard disk is a hard disk in the RAID whose hard disk state is the risk state, and the risk state indicates that a software failure has occurred on the hard disk. That is, in the embodiment of the present application, after failure information sent by the target hard disk and indicating that the hard disk is in a risk state is received, the hard disk state of that hard disk may be set to the risk state. When a read request for such a hard disk is subsequently received, the corresponding data can be read according to the read request and verified, and if the verification passes, the read data can be returned. Compared with the related art, in which data is recovered by reading the other hard disks regardless of the type of failure, this shortens the data reading delay and reduces the consumption of system resources.

Optionally, the failure information is a software failure type identification, and the software failure type identification is used to indicate that the failure of the target hard disk is a software failure.

In the embodiment of the present application, a hard disk that has failed may feed back to the operating system a fault type identifier indicating the type of the fault. The operating system can thus identify a hard disk with a software failure based on the fault type identifier, and then read data from that hard disk through the method provided in this application. In the related art, a failed hard disk only feeds back a failure signal, so the operating system cannot identify what type of failure has occurred. It can therefore only treat the hard disk as one with a hardware failure, that is, directly restore the data.

Optionally, the implementation process of verifying the target data may be: obtaining the stored reference checksum of the target data; calculating the actual checksum of the target data according to the target data; and if the reference checksum of the target data is the same as the actual checksum of the target data, determining that the verification of the target data passes.

Optionally, after receiving the read request sent by the client to read the target hard disk, the method further includes: repairing the data stored on the target hard disk; and, after the data repair is completed, changing the hard disk state of the target hard disk to a safe state.

In the embodiment of the present application, while reading the target data stored on the target hard disk according to the read request, the operating system can create a background task to repair the data stored on the target hard disk, and change the hard disk state of the target hard disk to the safe state after the repair is completed.

Optionally, the implementation process of repairing the data stored on the target hard disk may be: obtaining the checksum of the data index area of each storage block in the target hard disk, where the data index area is the area of the corresponding storage block that stores data index information, and the data index information includes the checksum of each piece of data stored in the corresponding storage block; verifying the data index information on each storage block according to the checksum of the data index area of that storage block; for a first storage block whose data index information passes the verification, obtaining the checksum of each piece of data stored in the first storage block from the data index information of the first storage block; verifying each piece of data in the first storage block according to the obtained checksums; and repairing the data in the first storage block that fails the verification.

In the embodiment of the present application, for any storage block on the target hard disk, if the verification of the data index area of the storage block passes, the checksum of each piece of data stored in the data index area can be used to verify each piece of data in the storage block, and then only the data that fails the verification is restored. In this way, the amount of data to be restored can be reduced.

Optionally, after the data index information on each storage block is verified according to the checksum of the data index area of each storage block, for a second storage block whose data index information fails the verification, all data stored on the second storage block is repaired.

That is, for a storage block whose data index information is damaged or lost, the storage block can be directly reconstructed to restore all data on the storage block.

In a second aspect, a data reading device is provided, and the data reading device has the function of realizing the behavior of the data reading method in the first aspect. The data reading device includes at least one module, and the at least one module is used to implement the data reading method provided in the above-mentioned first aspect.

In a third aspect, a data reading device is provided. The structure of the data reading device includes a processor and a memory. The memory is used to store a program that supports the data reading device in performing the data reading method provided in the first aspect, and to store the data involved in implementing that method. The processor is configured to execute the program stored in the memory. The data reading device may further include a communication bus, which is used to establish a connection between the processor and the memory.

In a fourth aspect, a computer-readable storage medium is provided, and instructions are stored in the computer-readable storage medium, which when run on a computer, cause the computer to execute the data reading method described in the first aspect.

In a fifth aspect, a computer program product containing instructions is provided, which when running on a computer, causes the computer to execute the data reading method described in the first aspect.

The technical effects obtained by the second, third, fourth, and fifth aspects described above are similar to those obtained by the corresponding technical means in the first aspect, and will not be repeated here.

The beneficial effects brought about by the technical solution provided in this application include at least:

In the embodiment of the present application, when fault information sent by the target hard disk and indicating that the target hard disk is in a risk state is received, the hard disk state of the target hard disk may be set to the risk state. Subsequently, when a read request for the target hard disk is received, the corresponding data can be read according to the read request and verified, and if the verification passes, the read data can be returned. Compared with the related art, in which data is recovered by reading the other hard disks regardless of the type of failure, this shortens the data reading delay and reduces the consumption of system resources.

Description of the drawings

FIG. 1 is a system architecture diagram involved in a data reading method provided by an embodiment of the present application;

FIG. 2 is a flowchart of a data reading method provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of data distribution in a storage block provided by an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a data reading device provided by an embodiment of the present application;

FIG. 5 is a schematic structural diagram of another data reading device provided by an embodiment of the present application.

Detailed description

In order to make the objectives, technical solutions, and advantages of the present application clearer, the implementation manners of the present application will be further described in detail below with reference to the accompanying drawings.

Before explaining the embodiments of the present application in detail, the system architecture involved in the embodiments of the present application will be introduced first.

FIG. 1 is an architecture diagram of a storage system involved in a data reading method provided by an embodiment of the present application. As shown in Figure 1, the system includes client 01 and storage device 02. Among them, the client 01 and the storage device 02 can communicate.

The client 01 can send a read request or a write request to the storage device 02.

The storage device 02 may include a processor 021, a memory 022, and a hard disk 023.

The processor 021 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits used to control the execution of the program of this application.

In a specific implementation, as an embodiment, the storage device may include multiple processors 021. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. The processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).

An operating system is installed on the memory 022, and the processor 021 can read and write data by running the operating system. In addition, the memory may also store the program code of the solution of the present application, whose execution is controlled by the processor 021. The memory 022 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM) or another type of dynamic storage device that can store information and instructions. It may also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 022 may exist independently and be connected to the processor 021, or it may be integrated with the processor 021.

The storage device 02 can be a storage array or a server. When the storage device 02 is a storage array, it includes a controller and several hard disks. The processor 021 and the memory 022 may be located in the controller of the storage array, and the controller and the hard disks are connected through a back-end interface card. When the storage device 02 is a server, the processor 021, the memory 022, and the hard disks are all located inside the server. This embodiment does not limit the product form of the storage device 02, and FIG. 1 is only a schematic diagram of some components included in the device.

To ensure the reliability of data, RAID technology is often used to store data in practical applications, for example, RAID5, RAID6, or RAIDTP. Take RAID5 as an example. First, the space of each hard disk is divided into multiple storage blocks of the same size. One storage block is taken from each of several different hard disks to form a storage block group, and the number of storage blocks in the group is determined by the RAID type. Taking "4+1" RAID5 as an example, a storage block group consists of 4 data blocks and 1 parity block, so 5 storage blocks are required. When the storage device 02 stores data, the data can be split into 4 data fragments, the check data of these 4 data fragments is calculated to generate a parity fragment, and then the 4 data fragments and the 1 parity fragment are stored in the storage block group. The storage block group corresponds to a segment of logical addresses, which is the logical address of the data. The storage device 02 stores the mapping relationship between the logical address and the physical address where the data is actually stored. When the storage device 02 receives a read request sent by the client 01, the read request includes the logical address of the data to be read, and the processor 021 can run the operating system to determine, according to that logical address, the storage block where the data to be read is located and its location (physical address) within the storage block. The data can then be read from the storage block.
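The "4+1" layout described above can be illustrated with a minimal Python sketch. It assumes byte-wise XOR parity, which is the conventional choice for RAID5; the application itself does not state how the check data is calculated. The function names `split_into_fragments`, `xor_parity`, and `rebuild_fragment` are hypothetical and introduced only for illustration.

```python
from functools import reduce
from typing import List

def split_into_fragments(data: bytes, count: int = 4) -> List[bytes]:
    """Split data into `count` equal-size fragments, zero-padding the tail."""
    size = -(-len(data) // count)  # ceiling division
    padded = data.ljust(size * count, b"\x00")
    return [padded[i * size:(i + 1) * size] for i in range(count)]

def xor_parity(fragments: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length fragments (assumed parity scheme)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*fragments))

def rebuild_fragment(surviving: List[bytes], parity: bytes) -> bytes:
    """Recover the one missing data fragment from the others plus parity."""
    return xor_parity(surviving + [parity])

data = b"0123456789abcdef"
fragments = split_into_fragments(data)   # 4 data fragments
parity = xor_parity(fragments)           # 1 parity fragment
# Pretend the hard disk holding fragment 2 cannot be read.
recovered = rebuild_fragment(fragments[:2] + fragments[3:], parity)
```

Rebuilding one fragment requires reading every other fragment of the storage block group, which is the cost that the method of this application tries to avoid for hard disks that are only partially damaged.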

In addition, the storage device 02 may also include a communication bus and a communication interface (not shown in FIG. 1). Among them, the communication bus is used to transfer information between various components included in the storage device 02.

The communication interface is used to communicate with other devices or communication networks, such as Ethernet, wireless access network (RAN), wireless local area network (Wireless Local Area Networks, WLAN), etc.

Next, the data reading method provided by the embodiment of the present application will be introduced.

Fig. 2 is a flowchart of a data reading method provided by an embodiment of the present application. The execution subject of this method may be the processor deployed in the storage device 02 described in FIG. 1. Referring to Figure 2, the method includes the following steps:

Step 201: When receiving the failure information sent by the target hard disk to indicate that the target hard disk is in a risk state, the hard disk state of the target hard disk is set to the risk state.

The risk state indicates that a small amount of data on the target hard disk is damaged or lost, while the target hard disk may still be written to and the remaining data that is not damaged or lost may still be read. Accordingly, the failure information indicating that the target hard disk is in a risk state may be information indicating that data on the target hard disk is damaged or lost but the target hard disk is still usable.

In the embodiment of the present application, the processor may send a fault detection instruction to each hard disk in the storage device at predetermined intervals. After receiving the fault detection instruction, each hard disk can detect whether it has a read/write abnormality or whether any data is damaged or lost. If so, the hard disk can determine the fault type of the fault that has occurred and send a fault type identifier identifying that fault type to the processor.

Optionally, in another possible implementation, the processor may also send a fault detection instruction to a hard disk when it repeatedly receives I/O error codes from that hard disk, so as to query the type of fault of the hard disk. That is, when the processor is processing a read request or write request for a hard disk, if it receives an I/O error code from the hard disk n times in a row, the hard disk may be malfunctioning. At this time, the processor may send a fault detection instruction to the hard disk to obtain the fault type of the hard disk. Here, n can be a preset value.

The failure types of hard disk failures may include software failures and hardware failures. Accordingly, the fault type identifier may be a software fault type identifier or a hardware fault type identifier. The software fault type identifier identifies the failure of the hard disk as a software failure, and the hardware fault type identifier identifies the failure of the hard disk as a hardware failure. Generally, a hardware failure refers to a failure of a hardware device, and a software failure refers to data damage or loss caused by a software abnormality. It should be noted that when the hard disk has a software failure, only a small part of the data on the hard disk is damaged or lost, although the location of the damaged or lost data cannot be determined. The hard disk can therefore continue to be written to, and the data that is not damaged or lost can still be read. In the embodiment of the present application, after a hard disk fails, if it determines that the failure is a software failure, it can report the software fault type identifier to the processor, and if it is a hardware failure, it can report the hardware fault type identifier to the processor.
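The trigger described above, n consecutive I/O error codes from the same hard disk, can be sketched as a small per-disk counter. This is only an illustration: the class name `IoErrorMonitor`, the disk identifier, and the threshold value 3 are assumptions, since the application only states that n is a preset value.

```python
from collections import defaultdict

ERROR_THRESHOLD = 3  # stands in for the preset value n; 3 is an arbitrary choice

class IoErrorMonitor:
    """Counts consecutive I/O error codes per hard disk (hypothetical helper)."""

    def __init__(self, threshold=ERROR_THRESHOLD):
        self.threshold = threshold
        self.consecutive_errors = defaultdict(int)

    def record(self, disk_id, io_ok):
        """Record one I/O result. Return True when a fault detection
        instruction should be sent to the disk."""
        if io_ok:
            self.consecutive_errors[disk_id] = 0  # a success breaks the run
            return False
        self.consecutive_errors[disk_id] += 1
        if self.consecutive_errors[disk_id] >= self.threshold:
            self.consecutive_errors[disk_id] = 0  # start counting afresh
            return True
        return False

monitor = IoErrorMonitor()
outcomes = (False, False, True, False, False, False)  # False = I/O error code
results = [monitor.record("disk-3", ok) for ok in outcomes]
```

In this run only the last call returns True, because the successful I/O in the middle resets the count before three errors in a row have accumulated.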

Optionally, in the embodiment of the present application, each hard disk may also actively report the fault type identification to the processor when a fault is detected.

Based on the foregoing description, when a hard disk has a software failure, usually only a small part of the data is damaged or lost, and reading and writing can continue. Therefore, in the embodiment of the present application, the information used to indicate that the hard disk is in a risk state may be the software fault type identifier. Accordingly, if the fault type identifier that the processor receives from a hard disk is a software fault type identifier, the hard disk state of that hard disk may be set to the risk state. Optionally, if the fault type identifier that the processor receives from a hard disk is a hardware fault type identifier, the processor may directly set the hard disk state of that hard disk to a failed state, which indicates that subsequent reading and writing of data on the hard disk is prohibited.

It should be noted that the memory of the storage device may store the correspondence between the hard disk identifier and the state information of each of the multiple hard disks. The state information may be a safe state, a risk state, or a failed state. The safe state indicates that the data stored on the hard disk is not damaged or lost and that the hard disk is currently not malfunctioning. The risk state indicates that a small part of the data stored on the hard disk is damaged or lost and that the hard disk has a software failure; the processor can continue to write data to the hard disk, and data can be read from the hard disk through the subsequent steps in the embodiment of this application. The failed state indicates that the hard disk has a hardware failure and is currently unavailable; when a read or write request for the hard disk is subsequently received, reading and writing data on the hard disk is prohibited. Accordingly, after the processor receives the software fault type identifier reported by the target hard disk, it can set the state information corresponding to the hard disk identifier of the target hard disk in the above correspondence to the risk state, to indicate that the target hard disk has a software failure and a small amount of its data has been damaged or lost.
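The state bookkeeping described above can be sketched as a small table keyed by hard disk identifier. The sketch below is illustrative only: the enum and function names and the disk identifiers are assumptions, and the application does not prescribe any particular data structure.

```python
from enum import Enum

class DiskState(Enum):
    SAFE = "safe"      # no damaged or lost data, no fault
    RISK = "risk"      # software failure: a small part of the data may be damaged
    FAILED = "failed"  # hardware failure: reads and writes are prohibited

class FaultType(Enum):
    SOFTWARE = "software"
    HARDWARE = "hardware"

# Correspondence between hard disk identifiers and state information,
# as kept in the storage device's memory (identifiers are made up).
disk_states = {
    "disk-0": DiskState.SAFE,
    "disk-1": DiskState.SAFE,
    "disk-2": DiskState.SAFE,
}

def on_fault_type_identifier(disk_id, fault_type):
    """Update the state table when a hard disk reports its fault type."""
    if fault_type is FaultType.SOFTWARE:
        disk_states[disk_id] = DiskState.RISK
    else:
        disk_states[disk_id] = DiskState.FAILED

def io_allowed(disk_id):
    """Reads and writes are refused only for disks in the failed state."""
    return disk_states[disk_id] is not DiskState.FAILED

on_fault_type_identifier("disk-1", FaultType.SOFTWARE)
on_fault_type_identifier("disk-2", FaultType.HARDWARE)
```

After these two reports, `disk-1` is in the risk state and remains readable and writable, while `disk-2` is in the failed state and is excluded from direct I/O.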

Step 202: Receive a read request sent by the client, where the read request is used to read data in the target hard disk in a risk state.

In the embodiment of the present application, when the data that the client wants to read is stored in the target hard disk, the client may send a read request carrying the logical address of the data to be read to the processor. The processor may receive the read request sent by the client, and determine the hard disk to be read as the target hard disk according to the logical address of the data to be read carried in the read request. That is, the read request is a read request for reading data in the target hard disk.

Step 203: According to the read request, read the target data stored on the target hard disk.

After receiving the read request, the processor can read the target data stored on the target hard disk according to the logical address carried in the read request.

It should be noted that the hard disk space can be divided into multiple storage blocks (that is, blocks) of the same size. Each storage block may correspond to a segment of logical addresses and may store multiple pieces of data, and the memory of the storage device may store the mapping relationship between the physical address and the logical address of the data. Accordingly, the processor can determine, from the logical address carried in the read request, the physical address at which the target data to be read is stored, that is, determine the target storage block to be read, and then read the stored target data from the target storage block.
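As a rough illustration of this lookup, the sketch below keeps a mapping from logical address to physical location and uses it to read from a toy in-memory disk. The `PhysicalLocation` fields, the addresses, and the disk contents are all assumptions made for the example; the application does not specify the format of the mapping relationship.

```python
from typing import Dict, List, NamedTuple

class PhysicalLocation(NamedTuple):
    disk_id: str
    block_index: int  # which storage block on the hard disk
    offset: int       # byte offset of the data inside the storage block
    length: int

# Mapping kept in the storage device's memory: logical address -> physical location.
address_map: Dict[int, PhysicalLocation] = {
    0x1000: PhysicalLocation("disk-1", block_index=0, offset=0, length=5),
    0x2000: PhysicalLocation("disk-1", block_index=0, offset=5, length=6),
}

# Toy disk contents: hard disk identifier -> list of storage blocks.
disks: Dict[str, List[bytearray]] = {
    "disk-1": [bytearray(b"helloworld!" + bytes(5))],
}

def read_by_logical_address(logical_address: int) -> bytes:
    """Resolve the logical address, then read from the target storage block."""
    loc = address_map[logical_address]
    block = disks[loc.disk_id][loc.block_index]
    return bytes(block[loc.offset:loc.offset + loc.length])

first = read_by_logical_address(0x1000)
second = read_by_logical_address(0x2000)
```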

Step 204: Verify the target data.

Since the target hard disk is a hard disk in a risk state, a small amount of data stored on the target hard disk may be damaged or lost. However, since it is not clear where the data is damaged or lost, after the processor reads the target data according to the logical address in the read request, it cannot determine whether the target data has been damaged. Based on this, the processor can verify the target data.

Exemplarily, the processor may obtain the stored reference checksum of the target data and calculate the actual checksum of the target data according to the target data. If the reference checksum of the target data is the same as the actual checksum, it is determined that the verification of the target data passes.

It should be noted that the storage device may store metadata of various data stored on each of the multiple hard disks. The metadata includes the storage address of each data and the checksum of each data. The processor may obtain metadata containing the logical address from the stored metadata according to the logical address carried in the read request, and obtain the checksum of the target data from the obtained metadata. At this time, the obtained checksum of the target data is the reference checksum of the correct target data originally stored in the space indicated by the logical address carried in the read request.

While acquiring the reference checksum of the target data, the processor may also calculate the actual checksum of the target data according to the acquired target data. Wherein, the calculation method for obtaining the actual checksum is the same as the calculation method for the reference checksum in the stored metadata.

Since the actual checksum is calculated from the obtained target data using the same calculation method as the stored reference checksum, if the obtained target data is damaged, the calculated actual checksum will differ from the previously obtained reference checksum of the target data. If the obtained target data is not damaged, the calculated actual checksum will be the same as the previously obtained reference checksum. Accordingly, after obtaining the reference checksum of the target data from the stored metadata and calculating the actual checksum of the target data, the processor can compare the two. If the two are the same, the obtained target data is correct and not damaged, and the verification of the target data passes. If the two are not the same, the obtained target data is corrupted, and the verification of the target data fails.
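The comparison in step 204 can be sketched as follows. CRC-32 from Python's standard `zlib` module is used purely as a stand-in, because the application does not name a checksum algorithm; the function names are likewise hypothetical.

```python
import zlib

def compute_checksum(data: bytes) -> int:
    """CRC-32 is an assumption; the application does not name an algorithm."""
    return zlib.crc32(data)

def verify(data: bytes, reference_checksum: int) -> bool:
    """Verification passes when the actual checksum equals the reference."""
    return compute_checksum(data) == reference_checksum

original = b"target data"
reference = compute_checksum(original)  # stored in the metadata at write time

intact_ok = verify(original, reference)              # data read back unchanged
corrupted_ok = verify(b"target dat\x00", reference)  # last byte damaged
```

The only requirement the text places on the algorithm is that the same calculation method is used when the reference checksum is stored and when the actual checksum is computed.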

Step 205: If the verification of the target data is passed, the target data is sent to the client.

It can be seen from the foregoing introduction that if the verification of the target data passes, it means that the target data is actually uncorrupted data in the target hard disk. In this case, the processor can directly return the target data to the client.

Optionally, if the verification of the target data fails, the data currently requested by the client happens to include damaged data on the target hard disk. In this case, the target data cannot be returned to the client as the read result. Instead, the processor can recover the target data by reading data on the hard disks other than the target hard disk among the multiple hard disks included in the RAID.

In the embodiment of the present application, when fault information sent by the target hard disk and indicating that the target hard disk is in a risk state is received, the hard disk state of the target hard disk may be set to the risk state. A read request sent by the client to read the target hard disk is then received. According to the read request, the target data stored on the target hard disk is read and verified, and if the verification passes, the target data is returned to the client. That is, in the embodiment of the present application, for a hard disk in a risk state, when a read request for such a hard disk is received, the corresponding data can be read according to the read request and verified, and if the verification passes, the read data can be returned. Compared with the related art, in which data is recovered by reading the other hard disks regardless of the type of failure, this shortens the data reading delay and reduces the consumption of system resources.

It should be noted that, in this embodiment of the application, after the processor receives the read request for the target hard disk, since there is damaged data on the target hard disk, the processor can also create a background task to repair the data stored on the target hard disk.
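Steps 203 to 205, together with the fallback just described, can be summarized in one hedged sketch. It assumes CRC-32 checksums and XOR parity across one stripe, neither of which is specified by the application, and `handle_read` and `xor_bytes` are hypothetical names.

```python
import zlib
from functools import reduce

def xor_bytes(chunks):
    """Byte-wise XOR of equal-length byte strings (assumed parity scheme)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))

def handle_read(stripe, target_index, reference_checksum):
    """stripe: one fragment per hard disk (data fragments plus one parity).
    Returns (data, path), where path records how the data was obtained."""
    target_data = stripe[target_index]                 # step 203: read directly
    if zlib.crc32(target_data) == reference_checksum:  # step 204: verify
        return target_data, "direct"                   # step 205: return as-is
    # Verification failed: rebuild from the other hard disks in the RAID.
    others = [frag for i, frag in enumerate(stripe) if i != target_index]
    return xor_bytes(others), "reconstructed"

fragments = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
stripe = fragments + [xor_bytes(fragments)]
reference = zlib.crc32(b"BBBB")

healthy = handle_read(stripe, 1, reference)

damaged_stripe = list(stripe)
damaged_stripe[1] = b"BxBB"  # simulate damaged data on the target hard disk
repaired = handle_read(damaged_stripe, 1, reference)
```

Only the second call touches the other hard disks; a read of undamaged data on a risk-state disk is served from the target hard disk alone.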

Among them, it can be seen from the foregoing introduction that the space of the target hard disk can be divided into multiple storage blocks. Based on this, in the embodiment of the present application, the processor may sequentially verify each storage block of the target hard disk starting from the first storage block of the target hard disk, and then repair the storage blocks that fail the verification.

Exemplarily, the processor may obtain the checksum of the data index area of each storage block in the target hard disk, where the data index area is the area of the corresponding storage block that stores data index information, and the data index information includes the checksum of each piece of data stored in the corresponding storage block. The processor verifies the data index information on each storage block according to the checksum of the data index area of that storage block. For a first storage block whose data index information passes the verification, the processor obtains the checksum of each piece of data stored in the first storage block from the data index information of the first storage block, verifies each piece of data in the first storage block according to the obtained checksums, and repairs the data in the first storage block that fails the verification. The first storage block refers to any storage block whose data index information passes the verification. Conversely, a storage block whose data index information fails the verification is called a second storage block.

It should be noted that each storage block may include a user data area, a data index area, and a tail area. There are multiple pieces of user data stored in the user data area. Generally, the data requested by the client is user data stored in the user data area, that is, the target data in this application is user data. The data index area stores the mapping information corresponding to each piece of user data. The mapping information includes the checksum of the corresponding user data, the offset of the corresponding data in the storage block, and so on. The checksum of the data index area is stored in the tail area.

Figure 3 shows a schematic diagram of the data layout in a hard disk. As shown in Figure 3, the hard disk may include multiple Blocks, that is, multiple storage blocks. Taking Block1 as an example, multiple pieces of user data such as data0 and data1 can be stored in Block1, and the space occupied by these pieces of data can be called the user data area (user data region). Each piece of user data corresponds to mapping information; as shown in Figure 3, the mapping information corresponding to data0 is data0 ref, and the mapping information corresponding to data1 is data1 ref. The mapping information may include the offset and checksum of the user data. The space occupied by the mapping information corresponding to the multiple pieces of user data may be referred to as the data index area (data reference region). The checksum of the data index area is stored in the last sector of Block1, that is, the tail area.
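The block layout described above can be illustrated with a minimal sketch. The text does not name a checksum algorithm or an on-disk encoding, so CRC32 (`zlib.crc32`), the `length` field, and the fixed-width serialization below are assumptions made only for illustration; the names `DataRef` and `StorageBlock` are likewise hypothetical.

```python
import zlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataRef:
    """Mapping information for one piece of user data (e.g. "data0 ref")."""
    offset: int    # offset of the piece inside the user data area
    length: int    # hypothetical field; the text names only offset and checksum
    checksum: int  # reference checksum of the piece (CRC32 is an assumption)


@dataclass
class StorageBlock:
    user_data: bytearray = field(default_factory=bytearray)  # user data area
    refs: List[DataRef] = field(default_factory=list)        # data index area
    index_checksum: int = 0                                   # tail area

    def serialize_index(self) -> bytes:
        # Assumed encoding of the data index area, used only to checksum it.
        return b"".join(
            r.offset.to_bytes(8, "little")
            + r.length.to_bytes(8, "little")
            + r.checksum.to_bytes(4, "little")
            for r in self.refs
        )

    def append(self, piece: bytes) -> DataRef:
        # Store the piece, record its mapping information, refresh the tail.
        ref = DataRef(len(self.user_data), len(piece), zlib.crc32(piece))
        self.user_data += piece
        self.refs.append(ref)
        self.index_checksum = zlib.crc32(self.serialize_index())
        return ref
```

Keeping the per-piece checksums in the index area and one checksum over the index in the tail area is what allows the two-level verification described in the following paragraphs.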

Based on the above introduction, in the embodiment of the present application, taking any storage block in the target hard disk as an example, the processor can obtain the checksum of the data index area from the tail area of the storage block, that is, the reference checksum of the data index area. At the same time, the processor can read the data index information in the data index area and calculate the actual checksum of the data index area according to the acquired data index information.

It should be noted that the actual checksum is calculated from the data index information in the data index area using the same method that was used to produce the reference checksum stored in the tail area. In this case, if the data index information stored in the data index area is not damaged, the actual checksum of the data index area will be the same as the checksum of the data index area stored in the tail area. If the data index information stored in the data index area is damaged, the actual checksum will differ from the checksum stored in the tail area. Based on this, in the embodiment of the present application, after the processor calculates the actual checksum of the data index area, it can compare the actual checksum with the checksum of the data index area obtained from the tail area. If the two are the same, the processor can determine that the data index information in the data index area is not damaged, that is, the verification of the data index area passes. In this case, the storage block is a first storage block as defined above, that is, one whose data index information passes the verification. Next, the processor can use the checksum of each piece of data stored in the data index area to verify each piece of data in the user data area.
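As a minimal sketch of this comparison, the following function recomputes the checksum of the data index area and compares it with the reference value taken from the tail area. CRC32 is assumed here because the text does not specify the checksum method; `index_area_ok` is a hypothetical name.

```python
import zlib


def index_area_ok(index_bytes: bytes, reference_checksum: int) -> bool:
    """Return True if the data index area matches the tail-area checksum."""
    # The actual checksum must be computed with the same method that
    # produced the reference checksum (CRC32 is an assumption).
    actual_checksum = zlib.crc32(index_bytes)
    return actual_checksum == reference_checksum
```

For example, with `reference = zlib.crc32(index_bytes)` recorded at write time, an unmodified index area passes, while an index area in which a single byte has been altered fails.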

Specifically, the processor can first read the first piece of data in the user data area of the storage block, and read the checksum of the first piece of data from the data index area. After that, the processor can calculate a checksum from the first piece of data and compare the calculated checksum with the checksum that was read. If the two are the same, the first piece of data is not damaged, that is, the verification of the piece of data passes. In this case, the processor can continue to verify the next piece of data in the user data area to determine whether it is damaged. If the two checksums are not the same, the first piece of data has been damaged, that is, the verification of the piece of data fails. In this case, the processor can mark the first piece of data and then continue to verify the next piece of data. In this way, the processor can repair the marked data after verifying all the data in the user data area. Of course, in a possible implementation manner, the processor may instead repair a piece of data as soon as it determines that the piece of data fails the verification.
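The mark-then-repair variant can be sketched as a single pass over the user data area that collects the indices of the pieces whose checksums do not match. The tuple layout `(offset, length, reference_checksum)` and the use of CRC32 are assumptions for illustration only.

```python
import zlib
from typing import List, Tuple

Ref = Tuple[int, int, int]  # (offset, length, reference checksum); assumed layout


def find_damaged_pieces(user_data: bytes, refs: List[Ref]) -> List[int]:
    """Verify every piece in the user data area and mark the damaged ones."""
    marked = []
    for i, (offset, length, reference) in enumerate(refs):
        piece = user_data[offset:offset + length]
        if zlib.crc32(piece) != reference:
            # Verification failed: mark the piece and keep checking the rest.
            marked.append(i)
    return marked
```

The returned list corresponds to the marked data that the processor repairs after the whole user data area has been verified.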

When the processor repairs a piece of data, the processor can read the data information related to the piece of data from the hard disks other than the target hard disk, and calculate and restore the piece of data based on the read data information. After recovering the data, the processor can store the data in another hard disk or write it to another storage block of the target hard disk. At the same time, the processor can delete the piece of data stored in the user data area of the current storage block, that is, release the space occupied by the piece of data in the current storage block; subsequently, the processor may write new data into that space.
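The text does not state how the piece of data is calculated from the information on the other hard disks. As one possible illustration, the sketch below assumes a single-parity (XOR) layout, in which the missing chunk equals the XOR of the surviving data chunks and the parity chunk; other RAID levels would use a different calculation. The function names are hypothetical.

```python
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_piece(surviving_chunks: List[bytes]) -> bytes:
    """Rebuild the missing chunk from the surviving data chunks and parity.

    Assumes XOR single parity; this is an illustrative assumption, not a
    scheme named in the text.
    """
    return reduce(xor_bytes, surviving_chunks)
```

In this sketch, `surviving_chunks` stands for the related data information read from the hard disks other than the target hard disk.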

Optionally, in a possible case, when the processor finds the first piece of data in the storage block that fails the verification, it can directly read, from the hard disks other than the target hard disk, the data information related to the data stored on the storage block, and then calculate and restore all data stored on the storage block according to the obtained data information. After recovering all the data, the processor can store the recovered data in another hard disk or in another storage block of the target hard disk, and delete all the data in the storage block to release the storage block.

Optionally, if the verification of all data in the storage block passes, it means that there is no damaged or missing data in the storage block. At this time, the processor may continue to verify the next storage block.

Optionally, if the processor compares the actual checksum of the data index area with the reference checksum of the data index area stored in the tail area of the storage block and finds that the two are not the same, the data index information stored in the data index area of the storage block is damaged. In this case, the processor can directly read, from the other hard disks, the data information related to the data stored on the storage block, and then calculate and restore all data stored on the storage block based on the obtained data information. After recovering all the data, the processor can store the recovered data in another hard disk or in another storage block of the target hard disk, and delete all the data in the storage block to release the storage block. After that, the processor can continue to verify the next storage block.
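The per-block decision described above (verify the data index area first, then the individual pieces) can be sketched as a small planning function. CRC32, the tuple layout of the mapping information, and the returned labels are assumptions chosen for illustration.

```python
import zlib
from typing import List, Tuple

Ref = Tuple[int, int, int]  # (offset, length, reference checksum); assumed layout


def plan_block_repair(index_bytes: bytes, index_reference: int,
                      user_data: bytes, refs: List[Ref]) -> Tuple[str, List[int]]:
    """Decide how one storage block should be handled."""
    # Damaged index information: the per-piece checksums cannot be trusted,
    # so every piece of data on the block is rebuilt (second storage block).
    if zlib.crc32(index_bytes) != index_reference:
        return ("repair_whole_block", [])
    # Index information is intact (first storage block): check each piece.
    damaged = [
        i for i, (offset, length, reference) in enumerate(refs)
        if zlib.crc32(user_data[offset:offset + length]) != reference
    ]
    if damaged:
        return ("repair_pieces", damaged)
    return ("clean", [])
```

A block reported as `"clean"` needs no repair, and the processor can move on to the next storage block.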

In addition, as mentioned above, in the embodiment of the present application, the processor can perform the above data repair on the target hard disk by creating a background task. The processor may divide the background task into multiple task fragments and process the task fragments concurrently, so as to improve the speed of data repair.
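One way to picture the task fragments is to split the list of storage blocks into fixed-size fragments and hand them to a thread pool. The fragment size, the worker count, and the `repair_block` callback (which returns True when it had to repair a block) are hypothetical parameters; the text does not prescribe how fragments are formed or scheduled.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_into_fragments(block_ids: Sequence[int],
                         fragment_size: int) -> List[Sequence[int]]:
    """Divide the background task into fragments of consecutive blocks."""
    return [block_ids[i:i + fragment_size]
            for i in range(0, len(block_ids), fragment_size)]


def run_repair_task(block_ids: Sequence[int],
                    repair_block: Callable[[int], bool],
                    fragment_size: int = 4,
                    workers: int = 4) -> List[int]:
    """Process the fragments concurrently; return the blocks that were repaired."""
    def run_fragment(fragment: Sequence[int]) -> List[int]:
        return [b for b in fragment if repair_block(b)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in fragment order, so the output stays sorted.
        results = list(pool.map(run_fragment,
                                split_into_fragments(block_ids, fragment_size)))
    return [b for part in results for b in part]
```

Limiting `workers` is one way to bound how much disk bandwidth the background task can take from normal I/O.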

After verifying all the storage blocks in the target hard disk by the above method and repairing the damaged data in the target hard disk, the processor can modify the hard disk state of the target hard disk from a risk state to a safe state.

It can be seen that, in the embodiment of the present application, for a target hard disk in a risk state, the processor can create a background task to verify the storage blocks in the target hard disk one by one, and repair the data of the storage blocks that fail the verification. In this way, because little data in the target hard disk is damaged or lost, only a small amount of data needs to be repaired, which reduces the consumption of the processor's processing resources. At the same time, because less data needs to be restored, correspondingly less related data needs to be read from the other hard disks. Therefore, when the total hard disk bandwidth is fixed, the background data repair task can be effectively prevented from competing with other normal I/O for bandwidth, which reduces the performance fluctuation of normal I/O.

In addition, it should be noted that if the processor receives a write request for the target hard disk, the processor can write the data to be written to the target hard disk according to the write request. That is, in the embodiment of the present application, for a hard disk whose hard disk status is in a risk state, the hard disk can still be used.

Next, the data reading device provided by the embodiment of the present application will be introduced.

Referring to FIG. 4, an embodiment of the present application provides a data reading device 400, and the device 400 includes:

The setting module 401 is used to execute step 201 in the above embodiment; the setting module 401 can be executed by the processor 021 in the storage device shown in FIG. 1, or the processor 021 can call the program code in the memory 022 for execution.

The receiving module 402 is configured to execute step 202 in the above embodiment; the receiving module 402 can be executed by the processor 021 in the storage device shown in FIG. 1, or the processor 021 can call the program code in the memory 022 for execution.

The reading module 403 is used to execute step 203 in the above embodiment; the reading module 403 can be executed by the processor 021 in the storage device shown in FIG. 1, or the processor 021 can call the program code in the memory 022 for execution.

The verification module 404 is configured to execute step 204 in the above embodiment; the verification module 404 can be executed by the processor 021 in the storage device shown in FIG. 1, or the processor 021 can call the program code in the memory 022 for execution.

The sending module 405 is used to execute step 205 in the above embodiment; the sending module 405 can be executed by the processor 021 in the storage device shown in FIG. 1, or the processor 021 can call the program code in the memory 022 for execution.

Optionally, the verification module is specifically used for:

Obtain the reference checksum of the stored target data;

Calculate the actual checksum of the target data according to the target data;

If the reference checksum of the target data is the same as the actual checksum of the target data, it is determined that the verification of the target data has passed.

Optionally, referring to FIG. 5, the apparatus 400 further includes:

The repair module 406 is used to repair the data stored on the target hard disk;

The modification module 407 is used to modify the hard disk state of the target hard disk to a safe state after the data repair is completed.

Optionally, the repair module 406 is specifically configured to:

Obtain the checksum of the data index area of each storage block in the target hard disk, where the data index area refers to the area in which data index information is stored in the corresponding storage block, and the data index information includes the checksum of each piece of data stored in the corresponding storage block;

Check the data index information on each storage block according to the checksum of the data index area of each storage block;

For the first storage block whose data index information has passed the check, obtain the checksum of each data stored in the first storage block from the data index information of the first storage block;

Perform verification on each data in the first storage block according to the obtained checksum of each data;

Repair the data in the first storage block that has not passed the verification.

Optionally, the repair module 406 is specifically used to:

For the second storage block that fails the verification of the data index information, all data stored on the second storage block is repaired.

The repair module 406 and the modification module 407 may be executed by the processor 021 in the storage device shown in FIG. 1, or the processor 021 may call program codes in the memory 022 for execution.

In summary, in the embodiment of the present application, when fault information sent by the target hard disk and indicating that the target hard disk is in a risk state is received, the hard disk state of the target hard disk may be set to the risk state. A read request sent by the client to read the target hard disk is then received. According to the read request, the target data stored on the target hard disk is read and verified, and if the verification passes, the target data is returned to the client. That is, in the embodiment of the present application, for a hard disk in a risk state, when a read request for this type of hard disk is received, the corresponding data can be read according to the read request and verified, and if the verification passes, the read data can be returned. In this way, compared with the related technology, in which data is restored by reading the data on other hard disks regardless of the type of hard disk failure, the data reading delay is shortened and the consumption of system resources is reduced.
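The read path for a hard disk in a risk state can be sketched as follows. The callbacks `read_from_disk` and `recover_from_peers` are hypothetical stand-ins for the disk read and for the restoration from other hard disks; CRC32 is an assumed checksum method, and falling back to restoration when the verification fails is an assumption of this sketch, since the summary above describes only the case in which the verification passes.

```python
import zlib
from typing import Callable


def read_from_risk_disk(read_from_disk: Callable[[], bytes],
                        reference_checksum: int,
                        recover_from_peers: Callable[[], bytes]) -> bytes:
    """Serve a read request that targets a hard disk in a risk state."""
    # Read the target data directly from the target hard disk.
    target_data = read_from_disk()
    # Verify: compare the actual checksum with the stored reference checksum.
    if zlib.crc32(target_data) == reference_checksum:
        return target_data  # verification passed, no other disk is touched
    # Assumed fallback: restore the data from the other hard disks.
    return recover_from_peers()
```

When the data is intact, only one disk read and one checksum calculation are needed, which is the source of the shorter read delay described above.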

It should be noted that, when the data reading device provided in the above embodiment reads data, the division into the above functional modules is used only as an example for illustration. In actual applications, the above functions can be allocated to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the data reading device provided in the above embodiment and the data reading method embodiment belong to the same concept. For the specific implementation process, refer to the method embodiment; details are not repeated here.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they can be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or Digital Subscriber Line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a Digital Versatile Disc (DVD)), or a semiconductor medium (for example, a Solid State Disk (SSD)).

A person of ordinary skill in the art can understand that all or part of the steps in the above embodiments can be implemented by hardware, or by a program to instruct relevant hardware. The program can be stored in a computer-readable storage medium. The storage medium mentioned can be a read-only memory, a magnetic disk or an optical disk, etc.

It should be understood that "plurality" herein refers to two or more. In the description of this application, unless otherwise specified, "/" means "or"; for example, A/B can mean A or B. "And/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B can mean: A exists alone, both A and B exist, or B exists alone. In addition, in order to describe the technical solutions of the embodiments of the present application clearly, words such as "first" and "second" are used in the embodiments of the present application to distinguish between identical or similar items that have substantially the same function and effect. Those skilled in the art can understand that words such as "first" and "second" do not limit quantity or execution order, and do not necessarily indicate that the items are different.

The above embodiments provided for this application are not intended to limit this application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of this application shall fall within the protection scope of this application.

Claims

A data reading method, characterized in that the method includes:

When receiving the failure information sent by the target hard disk for indicating that the target hard disk is in a risk state, set the hard disk state of the target hard disk to the risk state;

Receiving a read request sent by the client, where the read request is used to read data in the target hard disk in a risk state;

Reading the target data stored on the target hard disk according to the read request;

Verifying the target data;

If the verification of the target data is passed, the target data is sent to the client.
The method according to claim 1, wherein the failure information is a software failure type identification, and the software failure type identification is used to indicate that the failure of the target hard disk is a software failure.
The method according to claim 1, wherein the verifying the target data comprises:

Obtaining a reference checksum of the stored target data;

Calculating the actual checksum of the target data according to the target data;

If the reference checksum of the target data is the same as the actual checksum of the target data, it is determined that the verification of the target data is passed.
The method according to claim 1, wherein after receiving the read request sent by the client to read the target hard disk, the method further comprises:

Repairing the data stored on the target hard disk;

Modify the hard disk status of the target hard disk that has completed data repair to a safe state.
The method according to claim 4, wherein the repairing the data stored on the target hard disk comprises:

Obtain the checksum of the data index area of each storage block in the target hard disk, where the data index area refers to the area in which data index information is stored in the corresponding storage block, and the data index information includes the checksum of each piece of data stored in the corresponding storage block;

Check the data index information on each storage block according to the checksum of the data index area of each storage block;

For the first storage block whose data index information has passed the check, obtain the checksum of each data stored in the first storage block from the data index information of the first storage block;

Verify each data in the first storage block according to the obtained checksum of each data;

Repair the data in the first storage block that has not passed the check.
The method according to claim 5, wherein, after verifying the data index information on each storage block according to the checksum of the data index area of each storage block, the method further comprises:

For the second storage block that fails the verification of the data index information, repair all the data stored on the second storage block.
A data reading device, characterized in that the device comprises:

A setting module, which is used to set the hard disk state of the target hard disk to the risk state when receiving failure information sent by the target hard disk for indicating that the target hard disk is in a risk state;

A receiving module, configured to receive a read request sent by a client, the read request being used to read data in a target hard disk in a risk state;

A reading module, configured to read the target data stored on the target hard disk according to the read request;

A verification module for verifying the target data;

The sending module is configured to send the target data to the client if the verification of the target data is passed.
The device according to claim 7, wherein the failure information is a software failure type identifier, and the software failure type identifier is used to indicate that the failure of the target hard disk is a software failure.
The device according to claim 7, wherein the verification module is specifically configured to:

Obtaining a reference checksum of the stored target data;

Calculating the actual checksum of the target data according to the target data;

If the reference checksum of the target data is the same as the actual checksum of the target data, it is determined that the verification of the target data is passed.
The device according to claim 7, wherein the device further comprises:

The repair module is used to repair the data stored on the target hard disk;

The modification module is used to modify the hard disk status of the target hard disk that has completed data repair to a safe state.
The device according to claim 10, wherein the repair module is specifically configured to:

Obtain the checksum of the data index area of each storage block in the target hard disk, where the data index area refers to the area in which data index information is stored in the corresponding storage block, and the data index information includes the checksum of each piece of data stored in the corresponding storage block;

Check the data index information on each storage block according to the checksum of the data index area of each storage block;

For the first storage block whose data index information has passed the check, obtain the checksum of each data stored in the first storage block from the data index information of the first storage block;

Verify each data in the first storage block according to the obtained checksum of each data;

Repair the data in the first storage block that has not passed the check.
The device according to claim 11, wherein the repair module is specifically configured to:

For the second storage block that fails the verification of the data index information, repair all the data stored on the second storage block.