CN112463019A - Data reading method and device - Google Patents

Info

Publication number
CN112463019A
CN112463019A CN201910841114.1A CN201910841114A CN112463019A CN 112463019 A CN112463019 A CN 112463019A CN 201910841114 A CN201910841114 A CN 201910841114A CN 112463019 A CN112463019 A CN 112463019A
Authority
CN
China
Prior art keywords
data
hard disk
target
storage block
checksum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910841114.1A
Other languages
Chinese (zh)
Inventor
张瑛
熊伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201910841114.1A
Priority to PCT/CN2020/113420 (WO2021043246A1)
Publication of CN112463019A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0727 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a storage system, e.g. in a DASD or network based storage system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

The application discloses a data reading method and device, belonging to the technical field of storage. For a hard disk in a RAID whose hard disk state is a risk state, when a read request for that hard disk is received, the corresponding data stored on the hard disk is read according to the read request and then verified; if the verification passes, the read data is returned. In the related art, by contrast, data is recovered by reading the other hard disks regardless of the type of failure of the hard disk. The disclosed method therefore shortens the time delay of data reading and reduces the consumption of system resources.

Description

Data reading method and device
Technical Field
The present disclosure relates to the field of data storage, and in particular, to a data reading method and apparatus.
Background
Redundant Array of Independent Disks (RAID) is a technology for reading and writing data across multiple hard disks. Depending on how it is implemented, RAID may be classified into hard RAID and soft RAID. Hard RAID implements the RAID functions, including data reading and writing, in hardware, whereas soft RAID implements them through an operating system and a CPU.
In the related art, for soft RAID, when a certain hard disk of a plurality of hard disks fails, the hard disk may send a failure signal to a processor. When a hardware failure occurs to a hard disk, the data stored on the hard disk is usually damaged or lost completely, and when a software failure occurs to the hard disk, only part of the data stored on the hard disk is usually damaged or lost. The processor, upon receiving the failure signal, may mark the hard disk as a failed disk. Subsequently, when a read request sent by the client for reading the failed disk is received, the processor may read data in other hard disks except the failed disk, recover the data in the failed disk according to the read data, and return the recovered data to the client.
However, because the processor must recover the data on the failed disk by reading data from the other hard disks each time it receives a read request for the failed disk, data reading incurs a large time delay and consumes substantial resources.
Disclosure of Invention
The application provides a data reading method and device that address the large read delay and high resource consumption caused in the related art when a processor, on reading a failed disk, recovers that disk's data by reading data from the other hard disks. The technical solution is as follows:
In a first aspect, a data reading method is provided. The method includes: when fault information sent by a target hard disk and indicating that the target hard disk is in a risk state is received, setting the hard disk state of the target hard disk to the risk state; receiving a read request sent by a client for reading the target hard disk, where the target hard disk is any hard disk in a redundant array of independent disks (RAID) whose hard disk state is the risk state, and the risk state indicates that the hard disk has a software fault; reading target data stored on the target hard disk according to the read request; verifying the target data; and if the target data passes the verification, sending the target data to the client.
In the embodiment of the application, a read request sent by a client for reading a target hard disk is received. The target data stored on the target hard disk is read according to the read request and verified, and the target data is returned to the client if it passes the verification. The target hard disk is a hard disk in the RAID whose hard disk state is the risk state, and the risk state indicates that the fault that has occurred is a software fault. That is, after fault information sent by the target hard disk and indicating that the hard disk is in the risk state is received, the hard disk state of that hard disk may be set to the risk state. When a read request for the hard disk is subsequently received, the corresponding data can be read according to the read request and verified, and the read data can be returned if it passes the verification.
Optionally, the fault information is a software fault type identifier, and the software fault type identifier is used to indicate that a fault occurring in the target hard disk is a software fault.
In the embodiment of the application, a hard disk that has failed can feed back to the operating system a fault type identifier indicating the fault type, so that the operating system can identify hard disks with software faults and read their data by the method provided in the application. In the related art, a failed hard disk feeds back only a fault signal, so the operating system cannot tell which type of fault has actually occurred. It can therefore only treat every failed hard disk as having a hardware fault, that is, perform data recovery directly.
Optionally, the target data may be verified as follows: acquiring a stored reference checksum of the target data; calculating an actual checksum of the target data from the target data; and if the reference checksum is the same as the actual checksum, determining that the verification of the target data passes.
Optionally, after receiving the read request sent by the client for reading the target hard disk, the method further includes: repairing the data stored on the target hard disk; and, after the data repair is complete, changing the hard disk state of the target hard disk to a safe state.
In the embodiment of the application, while reading the target data stored on the target hard disk according to the read request, the operating system can create a background task to repair the data stored on the target hard disk, and change the hard disk state of the target hard disk to the safe state once the data has been repaired.
Optionally, the data stored on the target hard disk may be repaired as follows: acquiring a checksum of the data index area of each storage block in the target hard disk, where the data index area is the area in a storage block that stores data index information, and the data index information includes the checksum of each piece of data stored in that storage block; verifying the data index information on each storage block according to the checksum of the data index area of that storage block; for a first storage block whose data index information passes verification, acquiring the checksum of each piece of data stored in the first storage block from the data index information of the first storage block; verifying each piece of data in the first storage block according to the acquired checksums; and repairing the data in the first storage block that fails verification.
In the embodiment of the application, for any storage block on the target hard disk, if the check on the data index area of the storage block passes, each piece of data can be checked according to the check information of each piece of data stored in the data index area, and only the data which fails in the check is recovered, so that the data recovery amount can be reduced.
Optionally, after the data index information on each storage block is verified according to the checksum of the data index area of each storage block, for a second storage block for which the data index information verification fails, all data stored on the second storage block is repaired.
That is, for a storage block whose data index information is damaged or lost, the storage block may be directly reconstructed to recover all data on it.
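The two repair cases above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `StorageBlock` layout, the `index_digest` serialization, the function name `items_to_repair`, and the use of CRC32 as the checksum are all assumptions, since the application does not specify a checksum algorithm or an on-disk format.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class StorageBlock:
    # Hypothetical in-memory view of one storage block.
    index: Dict[int, int]    # data index information: data id -> checksum of that data
    index_checksum: int      # checksum of the data index area itself
    data: Dict[int, bytes]   # data id -> bytes currently stored in the block


def index_digest(index: Dict[int, int]) -> int:
    # Serialize the index deterministically, then checksum it (CRC32 is an assumption).
    raw = b"".join(k.to_bytes(8, "big") + v.to_bytes(4, "big")
                   for k, v in sorted(index.items()))
    return zlib.crc32(raw)


def items_to_repair(block: StorageBlock) -> Tuple[str, List[int]]:
    # "Second storage block" case: the index itself fails verification,
    # so every piece of data in the block has to be rebuilt.
    if index_digest(block.index) != block.index_checksum:
        return "whole_block", sorted(block.data)
    # "First storage block" case: the index is trustworthy, so only the
    # pieces whose own checksum does not match need to be rebuilt.
    bad = [k for k, crc in sorted(block.index.items())
           if zlib.crc32(block.data.get(k, b"")) != crc]
    return "partial", bad
```

Returning only the failing ids in the first case is what keeps the amount of recovered data small; the actual rebuild from the other hard disks in the RAID is outside this sketch.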
In a second aspect, a data reading apparatus is provided, which has a function of implementing the behavior of the data reading method in the first aspect described above. The data reading device comprises at least one module, and the at least one module is used for realizing the data reading method provided by the first aspect.
In a third aspect, a data reading apparatus is provided. The data reading apparatus includes a processor and a memory, where the memory is used to store a program that supports the data reading apparatus in executing the data reading method provided in the first aspect, and to store data used to implement that method. The processor is configured to execute the program stored in the memory. The data reading apparatus may further include a communication bus for establishing a connection between the processor and the memory.
In a fourth aspect, a computer-readable storage medium is provided, which has stored therein instructions that, when run on a computer, cause the computer to perform the data reading method of the first aspect described above.
In a fifth aspect, there is provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the data reading method of the first aspect described above.
The technical effects obtained by the above second, third, fourth and fifth aspects are similar to the technical effects obtained by the corresponding technical means in the first aspect, and are not described herein again.
The technical solution provided by this application brings at least the following beneficial effects:
In the embodiment of the application, when fault information sent by a target hard disk and indicating that the target hard disk is in a risk state is received, the hard disk state of the target hard disk can be set to the risk state. When a read request for the target hard disk is subsequently received, the corresponding data can be read according to the read request and verified, and the read data can be returned if it passes the verification. Compared with the related art, in which data is recovered by reading the other hard disks regardless of the type of failure of the hard disk, this shortens the time delay of data reading and reduces the consumption of system resources.
Drawings
Fig. 1 is a system architecture diagram for a data reading method provided in an embodiment of the present application;
Fig. 2 is a flowchart of a data reading method provided in an embodiment of the present application;
Fig. 3 is a schematic diagram illustrating data distribution in a storage block according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a data reading apparatus according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of another data reading apparatus according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Before explaining the embodiments of the present application in detail, a system architecture related to the embodiments of the present application will be described.
Fig. 1 is a diagram of the storage system architecture for a data reading method provided in an embodiment of the present application. As shown in Fig. 1, the system includes a client 01 and a storage device 02, which can communicate with each other.
The client 01 may send a read request or a write request to the storage device 02.
The storage device 02 may include a processor 021, a memory 022, and a hard disk 023.
The processor 021 can be a Central Processing Unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to control the execution of the programs of the present application.
In particular implementations, as one embodiment, a storage device may include multiple processors 021. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
The memory 022 has an operating system installed on it, and the processor 021 can read and write data by running the operating system. In addition, the memory 022 can store the program code of the present application, whose execution is controlled by the processor 021. The memory 022 can be, but is not limited to, a Read-Only Memory (ROM) or another type of static storage device that can store static information and instructions, a Random Access Memory (RAM) or another type of dynamic storage device that can store information and instructions, an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 022 may be separate from and coupled to the processor 021, or may be integrated with the processor 021.
The storage device 02 may be a storage array or a server. When the storage device 02 is a storage array, it includes a controller and several hard disks, the processor 021 and the memory 022 may be located in the controller of the storage array, and the controller is connected with several hard disks through a backend interface card. When the storage device 02 is a server, the processor 021, the memory 022 and the several hard disks are all located inside the server. The product form of the storage device 02 is not limited in this embodiment, and fig. 1 is only a schematic diagram of a part of the components included in the device.
To ensure the reliability of data, practical applications often store data using RAID technology, such as RAID5, RAID6, or RAIDTP. Taking RAID5 as an example, the space of each hard disk is first divided into a plurality of storage blocks (blocks) of the same size. One storage block is then taken from each of several different hard disks to form a storage block group. The number of storage blocks in a group is determined by the RAID type: in a "4 + 1" RAID5, for example, one storage block group consists of 4 data blocks and 1 check block, so 5 storage blocks are required. When the storage device 02 stores data, the data may be split into 4 data fragments, check data of the 4 data fragments is calculated to generate a check fragment, and the 4 data fragments and 1 check fragment are then stored in the storage block group. The storage block group corresponds to a range of logical addresses, which are the logical addresses of the data. The storage device 02 stores a mapping between the logical addresses and the physical addresses where the data is actually stored. When the storage device 02 receives a read request sent by the client 01 that includes the logical address of the data to be read, the processor 021, by running the operating system, may determine from that logical address the storage block where the data is located and the position (physical address) of the data within that storage block. After the storage block and the position within it are determined, the data may be read from the storage block.
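As an illustration of the "4 + 1" layout described above, the following sketch splits data into 4 data fragments, derives one check fragment, and rebuilds a lost fragment from the remaining four. It assumes byte-wise XOR parity, which is how RAID5 check data is conventionally computed; the function names and the zero-padding of short data are hypothetical choices, not taken from the application.

```python
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    # Byte-wise XOR of two equal-length fragments.
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(data: bytes, k: int = 4) -> List[bytes]:
    # Split data into k equal data fragments (zero-padded) plus one check fragment.
    frag_len = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(frag_len * k, b"\0")
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, frags)             # check fragment
    return frags + [parity]


def rebuild(stripe: List[bytes], missing: int) -> bytes:
    # Recover the fragment at index `missing` by XOR-ing the other fragments.
    others = [f for i, f in enumerate(stripe) if i != missing]
    return reduce(xor_bytes, others)
```

Rebuilding one fragment requires reading all four remaining fragments, which is the cost the related art pays on every read of a failed disk.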
In addition to this, a communication bus and a communication interface (not shown in fig. 1) may be included in the storage device 02. Wherein a communication bus is used for transferring information between the various components comprised by the storage device 02.
The communication interface is used for communicating with other devices or with a communication network, such as Ethernet, a Radio Access Network (RAN), or a Wireless Local Area Network (WLAN).
Next, a data reading method provided in an embodiment of the present application is described.
Fig. 2 is a flowchart of a data reading method according to an embodiment of the present application. The execution subject of the method may be the processor described in fig. 1 deployed in the storage device 02. Referring to fig. 2, the method comprises the steps of:
Step 201: When fault information sent by the target hard disk and indicating that the target hard disk is in a risk state is received, set the hard disk state of the target hard disk to the risk state.
The risk state indicates that a small amount of the data in the target hard disk is damaged or lost, while the target hard disk still accepts writes and the remaining data, which is neither damaged nor lost, can still be read. Accordingly, the fault information indicating that the target hard disk is in a risk state may be information indicating that data in the target hard disk is damaged or lost but the target hard disk is still available.
In this embodiment, the processor may send a fault detection instruction to each hard disk in the storage device at predetermined intervals. After receiving the fault detection instruction, each hard disk can detect whether it has a read-write abnormality or damaged or lost data. If so, the hard disk can determine the type of the fault and send the processor a fault type identifier identifying that fault type.
Optionally, in another possible implementation manner, the processor may also send a fault detection instruction to a hard disk to query a fault type of the hard disk when an I/O abnormal error code sent by the hard disk is continuously received. That is, when a processor is processing a read request or a write request for a certain hard disk, if an I/O abnormal error code sent by the hard disk is received n times consecutively, it indicates that a fault may occur in the hard disk, and at this time, the processor may send a fault detection instruction to the hard disk to obtain a fault type of the hard disk. Wherein n may be a predetermined number.
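The trigger described above, n consecutive I/O abnormal error codes from one hard disk, can be sketched with a per-disk counter. The class name `ErrorMonitor`, the default `n=3`, and the decision to reset the counter once the threshold is reached are illustrative assumptions; the application only states that n is a predetermined number.

```python
from typing import Dict


class ErrorMonitor:
    """Tracks consecutive I/O errors per hard disk (hypothetical helper)."""

    def __init__(self, n: int = 3) -> None:
        self.n = n                               # predetermined threshold (assumed value)
        self.consecutive: Dict[str, int] = {}    # disk id -> current error streak

    def record(self, disk_id: str, io_ok: bool) -> bool:
        # Return True when a fault detection instruction should be sent to the disk.
        if io_ok:
            self.consecutive[disk_id] = 0        # a successful I/O breaks the streak
            return False
        count = self.consecutive.get(disk_id, 0) + 1
        if count >= self.n:
            self.consecutive[disk_id] = 0        # start counting afresh after querying
            return True
        self.consecutive[disk_id] = count
        return False
```

Because a successful request resets the streak, isolated errors do not trigger a query; only an unbroken run of n errors does.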
The fault types of a hard disk may include a software fault and a hardware fault. Accordingly, the fault type identifier may be a software fault type identifier or a hardware fault type identifier. The hardware fault type identifier identifies the fault of the hard disk as a hardware fault. Typically, a hardware fault refers to a failure of the physical hardware, and a software fault refers to damage to or loss of data caused by a software exception. It should be noted that when a hard disk has a software fault, only a small portion of the data on the hard disk may be damaged or lost, although the location of the damaged or lost data cannot be determined. The hard disk can therefore continue to accept writes, and the data that is neither damaged nor lost can still be read. In the embodiment of the application, after a hard disk fails, if the hard disk determines that it has a software fault, it can report the software fault type identifier to the processor; if it has a hardware fault, it can report the hardware fault type identifier to the processor.
Optionally, in this embodiment of the present application, each hard disk may also actively report the fault type identifier to the processor when detecting a fault.
Based on the foregoing description, when a hard disk has a software fault, usually only a small portion of its data is damaged or lost, and reading and writing can continue. In this embodiment of the application, therefore, the fault information indicating that the hard disk is in a risk state may be the software fault type identifier. Accordingly, if the fault type identifier that the processor receives from a hard disk is the software fault type identifier, the processor can set the hard disk state of that hard disk to the risk state. Optionally, if the fault type identifier that the processor receives from a hard disk is the hardware fault type identifier, the processor may directly set the hard disk state of that hard disk to a failure state, which indicates that subsequent data reading and writing on the hard disk are prohibited.
It should be noted that the memory of the storage device may store a correspondence between the hard disk identifier and the state information of each of the plurality of hard disks. The state information may be a safe state, a risk state, or a failure state. The safe state indicates that the data stored on the hard disk is not damaged or lost and that the hard disk has not currently failed. The risk state indicates that a small part of the data stored on the hard disk is damaged or lost and that the hard disk has a software fault; the processor can still write data to the hard disk, and data can be read from the hard disk through the subsequent steps of this embodiment. The failure state indicates that the hard disk has a hardware fault and is currently unavailable, so that data reading and writing on the hard disk are prohibited when a read or write request for the hard disk is subsequently received. Accordingly, after the processor receives the software fault type identifier reported by the target hard disk, it may set the state information corresponding to the hard disk identifier of the target hard disk in the correspondence to the risk state, indicating that the target hard disk has a software fault and that a small part of its data is damaged or lost.
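The correspondence between hard disk identifiers and state information, and the way a reported fault type identifier updates it, can be sketched as a small state table. The enum names, the dictionary representation, and the helper functions are hypothetical; the application describes the three states but not how they are encoded.

```python
from enum import Enum
from typing import Dict


class FaultType(Enum):
    SOFTWARE = "software"   # software fault type identifier
    HARDWARE = "hardware"   # hardware fault type identifier


class DiskState(Enum):
    SAFE = "safe"       # no damaged or lost data, no current fault
    RISK = "risk"       # software fault: small part of data damaged, still readable/writable
    FAILED = "failed"   # hardware fault: reads and writes prohibited


def on_fault_report(states: Dict[str, DiskState], disk_id: str, fault: FaultType) -> None:
    # Update the state table when a disk reports its fault type identifier.
    states[disk_id] = DiskState.RISK if fault is FaultType.SOFTWARE else DiskState.FAILED


def io_allowed(states: Dict[str, DiskState], disk_id: str) -> bool:
    # Only a disk in the failure state is closed to direct reads and writes.
    return states[disk_id] is not DiskState.FAILED
```

For example, a table that starts as `{"disk0": DiskState.SAFE}` moves `disk0` to `DiskState.RISK` after `on_fault_report(states, "disk0", FaultType.SOFTWARE)`, and `io_allowed` still returns `True` for it.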
Step 202: Receive a read request sent by a client, where the read request is used for reading data in the target hard disk in the risk state.
In this embodiment of the present application, when the data that a client wants to read is stored in the target hard disk, the client may send the processor a read request carrying the logical address of the data to be read. The processor can receive the read request and determine, from the logical address carried in it, that the hard disk to be read is the target hard disk. That is, the read request is a request for reading data in the target hard disk.
Step 203: Read the target data stored on the target hard disk according to the read request.
After receiving the read request, the processor may read the target data stored on the target hard disk according to the logical address carried in the read request.
It should be noted that the space of the hard disk may be divided into a plurality of storage blocks (i.e., blocks) of the same size. Each storage block may correspond to a range of logical addresses and may store a plurality of pieces of data, and the memory of the storage device may store a mapping between the physical addresses and the logical addresses of the data. Accordingly, the processor may determine, from the logical address carried in the read request, the physical address where the target data is stored, that is, determine the target storage block to be read, and then read the stored target data from the target storage block.
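The address translation described above can be sketched as a table lookup followed by an offset calculation. The block size of 4096 bytes, the shape of the mapping table, and the function name `locate` are assumptions made for illustration; the application does not specify the block size or the format of the mapping.

```python
from typing import Dict, Tuple

BLOCK_SIZE = 4096  # assumed storage block size in bytes

# Hypothetical mapping table: logical block number -> (disk id, physical block number)
MAPPING: Dict[int, Tuple[str, int]] = {
    0: ("disk0", 17),
    1: ("disk1", 17),
    2: ("disk2", 17),
}


def locate(logical_addr: int, table: Dict[int, Tuple[str, int]]) -> Tuple[str, int, int]:
    # Translate a logical address into (disk id, storage block, offset in block).
    logical_block, offset = divmod(logical_addr, BLOCK_SIZE)
    disk_id, physical_block = table[logical_block]
    return disk_id, physical_block, offset
```

With this table, `locate(4096 + 100, MAPPING)` resolves to storage block 17 on `disk1` at offset 100, which is how the processor would learn that a request targets a particular hard disk.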
Step 204: Verify the target data.
Since the target hard disk is in a risk state, a small amount of the data stored on it is damaged or lost. However, because the location of the damaged or lost data cannot be determined, the processor cannot tell, after reading the target data according to the logical address in the read request, whether the target data is damaged. For this reason, the processor may verify the target data.
For example, the processor may obtain a stored reference checksum of the target data, calculate an actual checksum of the target data from the target data, and determine that the verification of the target data passes if the reference checksum is the same as the actual checksum.
It should be noted that the storage device may store metadata of respective data stored on each of the plurality of hard disks. The metadata includes a storage address of each data and a checksum of each data. The processor may obtain, according to the logical address carried in the read request, metadata including the logical address from the stored metadata, and obtain a checksum of the target data from the obtained metadata. At this time, the obtained checksum of the target data is the reference checksum of the correct target data originally stored in the space indicated by the logical address carried in the read request.
While obtaining the reference checksum of the target data, the processor may also calculate an actual checksum of the target data according to the obtained target data. Wherein the calculation method of obtaining the actual checksum is the same as the calculation method of the reference checksum in the stored metadata.
Because the actual checksum is calculated from the acquired target data using the same method as the stored reference checksum, the calculated actual checksum differs from the acquired reference checksum of the target data if the acquired target data is damaged, and is the same as the reference checksum if the acquired target data is not damaged. Accordingly, the processor may compare the reference checksum of the target data obtained from the stored metadata with the calculated actual checksum of the target data. If the two are the same, the acquired target data is correct and undamaged, and the verification of the target data passes. If the two differ, the acquired target data is damaged, and the verification of the target data fails.
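The comparison of the reference checksum with the actual checksum can be sketched as follows. CRC32 from Python's standard `zlib` module stands in for the checksum, which is an assumption since the application does not name an algorithm; the `disk` and `metadata` dictionaries and both function names are likewise hypothetical.

```python
import zlib
from typing import Dict


def store(disk: Dict[int, bytes], metadata: Dict[int, int],
          addr: int, payload: bytes) -> None:
    # Write the data and record its reference checksum in the metadata.
    disk[addr] = payload
    metadata[addr] = zlib.crc32(payload)


def verify(disk: Dict[int, bytes], metadata: Dict[int, int], addr: int) -> bool:
    # Recompute the checksum with the same method and compare with the reference.
    actual = zlib.crc32(disk[addr])
    return actual == metadata[addr]
```

The reference checksum is kept in metadata separate from the data itself, so damage to the stored bytes does not also alter the value they are compared against.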
Step 205: and if the verification of the target data passes, sending the target data to the client.
As can be seen from the foregoing description, if the verification of the target data passes, it indicates that the target data is actually data that is not damaged in the target hard disk, in which case the processor may directly return the target data to the client.
Optionally, if the check on the target data fails, it indicates that the data currently requested by the client includes the data that is damaged in the target hard disk, in which case the target data cannot be returned to the client as the read result. At this time, the processor may restore the target data by reading data on a hard disk other than the target hard disk among the plurality of hard disks included in the RAID.
In the embodiment of the present application, when fault information that is sent by a target hard disk and indicates that the target hard disk is in a risk state is received, the hard disk state of the target hard disk may be set to the risk state. A read request that is sent by the client and used for reading the target hard disk is then received. Target data stored on the target hard disk is read according to the read request and verified, and the target data is returned to the client if it passes the verification. That is, for a hard disk in the risk state, when a read request for the hard disk is received, the corresponding data may be read according to the read request and verified, and the read data may be returned if it passes the verification. In the related art, by contrast, data is recovered directly by reading the other hard disks regardless of which failure has occurred on the hard disk. Compared with the related art, the embodiment of the present application therefore shortens the data reading delay and reduces the consumption of system resources.
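The read path for a hard disk in the risk state can be sketched as follows. The sketch assumes that the caller supplies two hypothetical callables, `read_target` (reads the target data from the target hard disk) and `recover_from_peers` (restores the data from the other hard disks of the RAID), and that CRC32 is the checksum method; none of these names or choices come from the application.

```python
import zlib
from typing import Callable, Tuple


def read_from_risk_disk(
    read_target: Callable[[], bytes],
    reference_checksum: int,
    recover_from_peers: Callable[[], bytes],
) -> Tuple[bytes, str]:
    """Return the data and the source it was served from."""
    # Try the target hard disk first, even though it is in the risk state.
    data = read_target()
    if zlib.crc32(data) == reference_checksum:
        # Verification passed: return the data directly, no peer reads needed.
        return data, "target"
    # Verification failed: fall back to recovery from the other hard disks.
    return recover_from_peers(), "peers"
```

Reads from the other hard disks happen only on the failure branch, which is where the reduction in delay and resource consumption comes from.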
It should be noted that, in this embodiment of the present application, considering that damaged data may exist on the target hard disk, the processor may further create a background task after receiving a read request for the target hard disk, so as to repair the data stored on the target hard disk.
As can be seen from the foregoing description, the space of the target hard disk may be divided into a plurality of storage blocks. Based on this, in the embodiment of the present application, the processor may sequentially check each storage block of the target hard disk from the first storage block of the target hard disk, and further repair the storage block that fails in the check.
For example, the processor may obtain a checksum of the data index area of each storage block in the target hard disk, where the data index area is the area of the corresponding storage block in which data index information is stored, and the data index information includes the checksum of each piece of data stored in the corresponding storage block. The processor may then verify the data index information on each storage block according to the checksum of the data index area of that storage block; for a first storage block whose data index information passes the check, obtain the checksum of each piece of data stored in the first storage block from the data index information of the first storage block; check each piece of data in the first storage block according to the obtained checksums; and repair the data in the first storage block that fails the check. The first storage block refers to any storage block whose data index information passes the check, that is, every such storage block may be referred to as a first storage block. Conversely, a storage block whose data index information fails the check may be referred to as a second storage block.
It should be noted that each storage block may include a user data area, a data index area, and a tail area. The user data area stores a plurality of pieces of user data. Generally, the data requested by the client is user data stored in the user data area, that is, the target data in the present application is user data. The data index area stores mapping information corresponding to each piece of user data. The mapping information includes a checksum of the corresponding user data, an offset of the corresponding user data in the storage block, and the like. The tail area stores the checksum of the data index area.
Fig. 3 shows a schematic diagram of a data layout in a hard disk. As shown in fig. 3, the hard disk may include a plurality of blocks, i.e., a plurality of storage blocks. Taking Block1 as an example, Block1 may store multiple pieces of user data such as data0 and data1, and the space occupied by these pieces of user data may be referred to as the user data area. Each piece of user data has corresponding mapping information: referring to fig. 3, the mapping information corresponding to data0 is data0 ref, and the mapping information corresponding to data1 is data1 ref. The mapping information may include an offset and a checksum of the user data. The space occupied by the mapping information corresponding to the plurality of pieces of user data may be referred to as the data index area (data reference region). The last sector of Block1, that is, the tail area, stores the checksum of the data index area.
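The layout of a storage block can be modeled roughly as below. This is a sketch under stated assumptions: the class and field names, the `length` field, the fixed 12-byte little-endian entry format, and CRC32 are all hypothetical choices for illustration; the application only states that the mapping information includes an offset and a checksum, and that the tail area holds the checksum of the data index area.

```python
import struct
import zlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataRef:
    """Mapping information for one piece of user data (e.g. 'data0 ref')."""
    offset: int    # offset of the piece within the user data area
    length: int    # assumed field, needed here to slice the piece back out
    checksum: int  # checksum of the piece


def index_checksum(index: List[DataRef]) -> int:
    # Serialize the index entries in an assumed fixed format and checksum them.
    raw = b"".join(struct.pack("<III", r.offset, r.length, r.checksum) for r in index)
    return zlib.crc32(raw)


@dataclass
class StorageBlock:
    user_data: bytearray = field(default_factory=bytearray)  # user data area
    index: List[DataRef] = field(default_factory=list)       # data index area
    tail_checksum: int = 0                                   # tail area

    def append(self, data: bytes) -> None:
        # Record the mapping information, store the data, refresh the tail.
        self.index.append(DataRef(len(self.user_data), len(data), zlib.crc32(data)))
        self.user_data += data
        self.tail_checksum = index_checksum(self.index)
```

For example, appending `b"data0"` and then `b"data1"` yields two index entries with offsets 0 and 5, and a tail checksum that covers both entries.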
Based on the above description, in this embodiment of the application, taking any storage block in the target hard disk as an example, the processor may obtain the checksum of the data index area, that is, the reference checksum of the data index area, from the tail area of the storage block. At the same time, the processor may read the data index information in the data index area and calculate the actual checksum of the data index area from the data index information that was read.
It should be noted that the actual checksum is calculated from the data index information in the data index area by the same method as the reference checksum stored in the tail area. Therefore, if the data index information stored in the data index area is not damaged, the actual checksum of the data index area is the same as the checksum of the data index area stored in the tail area; if the data index information is damaged, the actual checksum differs from the checksum stored in the tail area. Based on this, in the embodiment of the present application, after calculating the actual checksum of the data index area, the processor may compare it with the checksum of the data index area obtained from the tail area. If the two are the same, the processor may determine that the data index information in the data index area is not damaged, that is, the check on the data index area passes. In this case, the storage block is a first storage block whose data index information passes the check. Next, the processor may check each piece of data in the user data area using the checksum of each piece of data stored in the data index area.
The processor may first read the first piece of data in the user data area of the storage block and read the checksum of the first piece of data from the data index area. The processor may then calculate a checksum from the first piece of data and compare the calculated checksum with the checksum read from the data index area. If the two are the same, the first piece of data is not damaged, that is, the check on the piece of data passes, and the processor may continue to check the next piece of data in the user data area to determine whether it is damaged. If the two are different, the first piece of data is damaged, that is, the check on the piece of data fails; in this case, the processor may mark the first piece of data and then continue to check the next piece of data. In this way, the processor can repair the marked data after it has checked all the data in the user data area. Alternatively, in one possible implementation, the processor may repair a piece of data as soon as it determines that the check on that piece of data fails.
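The mark-then-repair pass over the user data area can be sketched as follows. The representation of the data index information as a list of `(offset, length, checksum)` tuples, the function name, and the use of CRC32 are assumptions made for the example.

```python
import zlib
from typing import List, Tuple


def find_damaged_pieces(user_data: bytes, index: List[Tuple[int, int, int]]) -> List[int]:
    """Return the positions of the pieces whose check fails.

    Each index entry is an assumed (offset, length, checksum) tuple.
    """
    damaged = []
    for position, (offset, length, reference) in enumerate(index):
        piece = user_data[offset:offset + length]
        if zlib.crc32(piece) != reference:
            # Mark the piece and keep going; repair happens after the full pass.
            damaged.append(position)
    return damaged
```

Repairing each piece immediately, as in the alternative implementation, would simply replace the `damaged.append(position)` line with a call to the repair routine.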
When the processor repairs a piece of data, the processor can read data information related to the piece of data from the hard disks other than the target hard disk, and calculate and recover the piece of data from the data information that was read. After the piece of data is recovered, the processor may store it on another hard disk or write it to another storage block of the target hard disk. At the same time, the processor may delete the piece of data stored in the user data area of the current storage block, that is, release the space occupied by the piece of data in the current storage block, so that new data can subsequently be written to that space.
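The application does not state how the piece of data is calculated from the data information on the other hard disks. As one common possibility, assuming a single-parity layout in which the parity chunk is the bytewise XOR of the data chunks in a stripe (as in RAID 5), the recovery could be sketched as follows; the function names are hypothetical.

```python
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    # Bytewise XOR of two equally sized chunks.
    return bytes(x ^ y for x, y in zip(a, b))


def recover_piece(peer_chunks: List[bytes]) -> bytes:
    """Rebuild the missing chunk from the surviving chunks and the parity.

    Under the single-parity assumption, XOR-ing every other chunk of the
    stripe (data and parity) yields the chunk that was lost.
    """
    return reduce(xor_bytes, peer_chunks)
```

Other redundancy schemes, such as mirroring or erasure codes with multiple parity chunks, would use a different calculation in place of `recover_piece`.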
Optionally, in a possible case, the processor may also directly read data information related to the data stored on the storage block in other hard disks except the target hard disk when determining that the first piece of data in the storage block fails to be checked, and then directly calculate and recover all data stored on the storage block according to the obtained data information. After all data is recovered, the processor may store the recovered data in another hard disk or another storage block of the target hard disk, and delete all data in the storage block to release the storage block.
Optionally, if the check on all the data in the storage block passes, there is no damaged or lost data in the storage block, and the processor may continue to check the next storage block.
Optionally, if the processor finds, when comparing the actual checksum of the data index area with the reference checksum of the data index area stored in the tail area of the storage block, that the two are different, the data index information stored in the data index area of the storage block is damaged. In this case, the checksums of the individual pieces of data can no longer be trusted, so the processor may read, from the hard disks other than the target hard disk, data information related to the data stored on the storage block, and recover all the data stored on the storage block from that data information. After all data is recovered, the processor may store the recovered data on another hard disk or in another storage block of the target hard disk, and delete all data in the storage block to release the storage block. Thereafter, the processor may proceed with checking the next storage block.
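The two-level check of a storage block, first the data index area against the tail area and then each piece of data against the data index information, can be sketched as follows. The 12-byte entry format, the returned action names, and CRC32 are assumptions for illustration only.

```python
import struct
import zlib
from typing import List, Tuple

# Hypothetical on-disk format of one index entry: offset, length, checksum.
ENTRY = struct.Struct("<III")


def scrub_block(user_data: bytes, index_bytes: bytes, tail_checksum: int) -> Tuple[str, List[int]]:
    """Decide what repair a storage block needs."""
    # Level 1: check the data index area against the checksum in the tail area.
    if zlib.crc32(index_bytes) != tail_checksum:
        # "Second storage block": the index is damaged, so all data on the
        # block has to be recovered from the other hard disks.
        return "rebuild_block", []
    # Level 2 ("first storage block"): check each piece against its entry.
    damaged = []
    for position, (offset, length, reference) in enumerate(ENTRY.iter_unpack(index_bytes)):
        if zlib.crc32(user_data[offset:offset + length]) != reference:
            damaged.append(position)
    return ("repair_pieces", damaged) if damaged else ("ok", [])
```

Checking the index first matters because a damaged index would make every per-piece comparison meaningless.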
In addition, as described above, in the embodiment of the present application, the processor may implement the above-mentioned data repair on the target hard disk by establishing a background task. The processor can divide the background task into a plurality of task segments and process the task segments in a concurrent manner, so that the data repair speed is increased.
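Dividing the background task into task segments that are processed concurrently can be sketched as follows. The segment size, the worker count, the thread pool, and the hypothetical callable `check_and_repair_block` (which checks one storage block and returns `True` if it had to repair anything) are assumptions for the example; the application does not specify how the segments are formed or scheduled.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_into_segments(num_blocks: int, segment_size: int) -> List[range]:
    # Each task segment covers a contiguous range of storage block numbers.
    return [
        range(start, min(start + segment_size, num_blocks))
        for start in range(0, num_blocks, segment_size)
    ]


def run_repair_task(
    num_blocks: int,
    segment_size: int,
    check_and_repair_block: Callable[[int], bool],
    workers: int = 4,
) -> List[int]:
    """Process all segments concurrently; return the repaired block numbers."""
    def process(segment: range) -> List[int]:
        return [block for block in segment if check_and_repair_block(block)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process, split_into_segments(num_blocks, segment_size)))
    return sorted(block for repaired in results for block in repaired)
```

Once `run_repair_task` has returned, every storage block has been checked, which is the condition under which the hard disk state can be changed back from the risk state.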
After all the storage blocks in the target hard disk are checked by the method and the damaged data in the target hard disk are repaired, the processor can modify the hard disk state of the target hard disk from the risk state to the safe state.
Therefore, in the embodiment of the application, for a target hard disk in the risk state, the processor can create a background task to check the storage blocks in the target hard disk one by one and repair the data of the storage blocks that fail the check. Because only the damaged or lost data in the target hard disk is repaired, the amount of data to be repaired is small, which reduces the consumption of the processing resources of the processor. Meanwhile, because less data needs to be recovered, less related data needs to be read from the other hard disks. When the total hard disk bandwidth is fixed, this effectively prevents the background task for repairing data from contending for bandwidth with other normal I/O and reduces the performance fluctuation of normal I/O.
In addition, it should be noted that, if the processor receives a write request for the target hard disk, the processor may write the data to be written to the target hard disk according to the write request. That is, in the embodiment of the present application, for a hard disk whose hard disk state is a risk state, the hard disk may be continuously used.
Next, a data reading apparatus provided in an embodiment of the present application will be described.
Referring to fig. 4, an embodiment of the present application provides a data reading apparatus 400, where the apparatus 400 includes:
a setting module 401, configured to perform step 201 in the foregoing embodiment; the setting module 401 can be executed by the processor 021 in the storage device shown in fig. 1, or by the processor 021 invoking program code in the memory 022.
A receiving module 402, configured to perform step 202 in the foregoing embodiment; the receiving module 402 can be executed by the processor 021 in the storage device shown in fig. 1 or by the processor 021 calling the program code in the memory 022.
A reading module 403, configured to perform step 203 in the foregoing embodiment; the reading module 403 can be executed by the processor 021 in the storage device shown in fig. 1 or by the processor 021 calling the program code in the memory 022.
A verification module 404, configured to perform step 204 in the foregoing embodiment; the verification module 404 can be executed by the processor 021 in the storage device shown in fig. 1, or by the processor 021 invoking program code in the memory 022.
A sending module 405, configured to execute step 205 in the foregoing embodiment; the sending module 405 can be executed by the processor 021 in the storage device shown in fig. 1 or by the processor 021 calling the program code in the memory 022.
Optionally, the failure information is a software failure type identifier, where the software failure type identifier is used to indicate that a failure occurring in the target hard disk is a software failure.
Optionally, the verification module is specifically configured to:
acquiring a reference checksum of the stored target data;
calculating an actual checksum of the target data according to the target data;
and if the reference checksum of the target data is the same as the actual checksum of the target data, determining that the check of the target data passes.
Optionally, referring to fig. 5, the apparatus 400 further comprises:
a repair module 406, configured to repair data stored in the target hard disk;
and a modification module 407, configured to modify a hard disk state of the target hard disk for which data repair is completed into a secure state.
Optionally, the repair module 406 is specifically configured to:
acquiring a checksum of a data index area of each storage block in a target hard disk, wherein the data index area refers to an area in the corresponding storage block for storing data index information, and the data index information comprises the checksum of each data stored in the corresponding storage block;
verifying the data index information on each storage block according to the checksum of the data index area of each storage block;
for a first storage block which passes the data index information verification, acquiring the checksum of each data stored in the first storage block from the data index information of the first storage block;
according to the obtained checksum of each piece of data, checking each piece of data in the first storage block;
and repairing the data in the first storage block that fails the check.
Optionally, the repair module 406 is specifically further configured to:
and for the second storage block which fails in the data index information verification, repairing all data stored on the second storage block.
The repair module 406 and the modification module 407 can be executed by the processor 021 in the storage device shown in fig. 1, or by the processor 021 invoking program code in the memory 022.
In summary, in the embodiment of the present application, when fault information that is sent by a target hard disk and indicates that the target hard disk is in a risk state is received, the hard disk state of the target hard disk may be set to the risk state. A read request that is sent by the client and used for reading the target hard disk is then received. Target data stored on the target hard disk is read according to the read request and verified, and the target data is returned to the client if it passes the verification. That is, for a hard disk in the risk state, when a read request for the hard disk is received, the corresponding data may be read according to the read request and verified, and the read data may be returned if it passes the verification. In the related art, by contrast, data is recovered directly by reading the other hard disks regardless of which failure has occurred on the hard disk. Compared with the related art, the embodiment of the present application therefore shortens the data reading delay and reduces the consumption of system resources.
It should be noted that, when the data reading apparatus provided in the above embodiment reads data, the division into the foregoing functional modules is merely used as an example. In practical applications, the functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the data reading apparatus and the data reading method provided by the above embodiments belong to the same concept; the specific implementation process of the apparatus is detailed in the method embodiments and is not described herein again.
The above embodiments may be implemented wholly or partly by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, or Digital Subscriber Line (DSL)) or a wireless manner (e.g., infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, or magnetic tape), an optical medium (e.g., Digital Versatile Disc (DVD)), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
It should be understood that reference herein to "a plurality" means two or more. In the description of the present application, "/" indicates an "or" meaning; for example, A/B may indicate A or B. "And/or" herein merely describes an association between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, to facilitate clear description of the technical solutions of the embodiments of the present application, terms such as "first" and "second" are used to distinguish between identical or similar items having substantially the same functions and actions. Those skilled in the art will appreciate that the terms "first," "second," etc. do not denote any order, quantity, or importance.
The above-mentioned embodiments are provided not to limit the present application, and any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (12)

1. A method of reading data, the method comprising:
when fault information which is sent by a target hard disk and used for indicating that the target hard disk is in a risk state is received, setting the hard disk state of the target hard disk to be in the risk state;
receiving a reading request sent by a client, wherein the reading request is used for reading data in a target hard disk in a risk state;
reading target data stored on the target hard disk according to the reading request;
verifying the target data;
and if the target data passes the verification, sending the target data to a client.
2. The method according to claim 1, wherein the failure information is a software failure type identifier, and the software failure type identifier is used to indicate that a failure occurring in the target hard disk is a software failure.
3. The method of claim 1, wherein the verifying the target data comprises:
acquiring a reference checksum of the stored target data;
calculating an actual checksum of the target data according to the target data;
and if the reference checksum of the target data is the same as the actual checksum of the target data, determining that the check of the target data passes.
4. The method of claim 1, wherein after receiving a read request sent by a client to read a target hard disk, the method further comprises:
repairing the data stored on the target hard disk;
and modifying the hard disk state of the target hard disk for completing data repair into a safe state.
5. The method of claim 4, wherein the repairing the data stored on the target hard disk comprises:
acquiring a checksum of a data index area of each storage block in the target hard disk, wherein the data index area refers to an area in the corresponding storage block for storing data index information, and the data index information comprises the checksum of each data stored in the corresponding storage block;
verifying the data index information on each storage block according to the checksum of the data index area of each storage block;
for a first storage block which passes data index information verification, acquiring a checksum of each datum stored in the first storage block from the data index information of the first storage block;
according to the obtained checksum of each piece of data, checking each piece of data in the first storage block;
and repairing the data in the first storage block that fails the check.
6. The method according to claim 5, wherein after checking the data index information on each storage block according to the checksum of the data index area of each storage block, further comprising:
and for a second storage block whose data index information fails the check, repairing all data stored on the second storage block.
7. A data reading apparatus, characterized in that the apparatus comprises:
a setting module, configured to set the hard disk state of a target hard disk to a risk state when fault information that is sent by the target hard disk and used for indicating that the target hard disk is in the risk state is received;
a receiving module, configured to receive a read request sent by a client, wherein the read request is used for reading data in the target hard disk in the risk state;
the reading module is used for reading the target data stored on the target hard disk according to the reading request;
the checking module is used for checking the target data;
and the sending module is used for sending the target data to the client if the target data passes the verification.
8. The apparatus according to claim 7, wherein the failure information is a software failure type identifier, and the software failure type identifier is used to indicate that the failure occurring in the target hard disk is a software failure.
9. The apparatus of claim 7, wherein the verification module is specifically configured to:
acquiring a reference checksum of the stored target data;
calculating an actual checksum of the target data according to the target data;
and if the reference checksum of the target data is the same as the actual checksum of the target data, determining that the check of the target data passes.
10. The apparatus of claim 7, wherein the apparatus further comprises:
the repair module is used for repairing the data stored on the target hard disk;
and the modification module is used for modifying the hard disk state of the target hard disk for completing data repair into a safe state.
11. The apparatus of claim 10, wherein the repair module is specifically configured to:
acquiring a checksum of a data index area of each storage block in the target hard disk, wherein the data index area refers to an area in the corresponding storage block for storing data index information, and the data index information comprises the checksum of each data stored in the corresponding storage block;
verifying the data index information on each storage block according to the checksum of the data index area of each storage block;
for a first storage block which passes data index information verification, acquiring a checksum of each datum stored in the first storage block from the data index information of the first storage block;
according to the obtained checksum of each piece of data, checking each piece of data in the first storage block;
and repairing the data in the first storage block that fails the check.
12. The apparatus according to claim 11, wherein the repair module is further specifically configured to:
and for a second storage block whose data index information fails the check, repairing all data stored on the second storage block.
CN201910841114.1A 2019-09-06 2019-09-06 Data reading method and device Pending CN112463019A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910841114.1A CN112463019A (en) 2019-09-06 2019-09-06 Data reading method and device
PCT/CN2020/113420 WO2021043246A1 (en) 2019-09-06 2020-09-04 Data reading method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910841114.1A CN112463019A (en) 2019-09-06 2019-09-06 Data reading method and device

Publications (1)

Publication Number Publication Date
CN112463019A true CN112463019A (en) 2021-03-09

Family

ID=74806893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910841114.1A Pending CN112463019A (en) 2019-09-06 2019-09-06 Data reading method and device

Country Status (2)

Country Link
CN (1) CN112463019A (en)
WO (1) WO2021043246A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115994236A (en) * 2023-03-23 2023-04-21 杭州派迩信息技术有限公司 Collaborative processing method and system for aviation data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103745751A (en) * 2013-12-23 2014-04-23 华为技术有限公司 Failure alarming method and device
CN105224891A (en) * 2015-09-22 2016-01-06 苏州互盟信息存储技术有限公司 Magnetic disc optic disc fused data method for secure storing, system and device
CN105808161A (en) * 2016-02-26 2016-07-27 四川效率源信息安全技术股份有限公司 Reading method of bad sector data of hard disk
CN108153618A (en) * 2017-12-22 2018-06-12 国网浙江杭州市萧山区供电有限公司 Hard disk data recovery, device and hard disc data restorer
CN108509156A (en) * 2018-04-04 2018-09-07 腾讯科技(深圳)有限公司 Method for reading data, device, equipment and system
CN109284207A (en) * 2018-08-30 2019-01-29 紫光华山信息技术有限公司 Hard disc failure processing method, device, server and computer-readable medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102147713B (en) * 2011-02-18 2013-06-12 杭州宏杉科技有限公司 Method and device for managing network storage system
CN103970481B (en) * 2013-01-29 2017-03-01 国际商业机器公司 The method and apparatus rebuilding memory array
US9891994B1 (en) * 2015-12-30 2018-02-13 EMC IP Holding Company LLC Updated raid 6 implementation
US10347331B2 (en) * 2016-06-13 2019-07-09 SK Hynix Inc. Read threshold optimization in flash memories
CN109582515A (en) * 2018-12-03 2019-04-05 郑州云海信息技术有限公司 A kind of hard disk detection method, system and electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115994236A (en) * 2023-03-23 2023-04-21 杭州派迩信息技术有限公司 Collaborative processing method and system for aviation data
CN115994236B (en) * 2023-03-23 2023-08-04 杭州派迩信息技术有限公司 Collaborative processing method and system for aviation data

Also Published As

Publication number Publication date
WO2021043246A1 (en) 2021-03-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination