CN116048880A - SMR hard disk data recovery method, node and storage medium - Google Patents

Info

Publication number
CN116048880A
Authority
CN
China
Prior art keywords
data
zone
recovered
recovery
fragments
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211743857.3A
Other languages
Chinese (zh)
Inventor
李文俊
王志豪
周明伟
江文龙
戴恩亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202211743857.3A priority Critical patent/CN116048880A/en
Publication of CN116048880A publication Critical patent/CN116048880A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1435 Saving, restoring, recovering or retrying at system level using file system or storage system metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device
    • G06F3/0676 Magnetic disk device
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses an SMR hard disk data recovery method, a node and a storage medium. The SMR hard disk data recovery method comprises: in response to a data recovery task issued by a metadata node in a storage system, parsing the data recovery task to obtain a target recovery mode, fragment information of a fragment to be recovered, and the zone to be recovered to which the fragment to be recovered belongs, where the target recovery mode is either zone incremental recovery or zone full recovery; and, based on the fragment information of the fragment to be recovered and the zone to be recovered to which it belongs, performing data recovery using a recovery strategy matched with the target recovery mode. In this scheme, the target recovery mode, the fragment information and the zone to be recovered are obtained by parsing the data recovery task issued by the metadata node, and zone incremental recovery or zone full recovery is then performed, so that damaged data on the SMR hard disk is recovered.

Description

SMR hard disk data recovery method, node and storage medium
Technical Field
The present invention relates to the field of data storage technologies, and in particular, to an SMR hard disk data recovery method, a node, and a storage medium.
Background
In recent years, a new type of magnetic recording medium, the shingled magnetic recording (Shingled Magnetic Recording, SMR) disk, has come into wide use; it offers high storage density and excellent sequential write performance.
However, because an SMR disk must be written sequentially, the minimum granularity of SMR disk management is the zone: only appending is allowed within a zone, and neither erasing part of the data nor random writing is allowed. As a result, a conventional file system cannot be applied to an SMR disk, data fragments cannot be stored in the form of files, and damaged data cannot be recovered directly with an erasure coding (EC) algorithm.
Therefore, how to recover data when data on an SMR disk is damaged has become an urgent problem.
Disclosure of Invention
The technical problem mainly solved by the application is to provide a method for recovering SMR hard disk data, a node and a storage medium, which can recover the stored data when the data on the SMR hard disk is damaged.
In order to solve the above problems, a first aspect of the present application provides an SMR hard disk data recovery method, including: responding to a data recovery task issued by a metadata node in a storage system, and analyzing the data recovery task to obtain a target recovery mode, fragment information of fragments to be recovered and a zone to be recovered to which the fragments to be recovered belong; the target recovery mode is any one of zone incremental recovery and zone full recovery; and based on the fragment information of the fragments to be recovered and the zone to be recovered to which the fragments to be recovered belong, adopting a recovery strategy matched with the target recovery mode to recover the data.
In order to solve the above problem, a second aspect of the present application provides an SMR hard disk data recovery method, including: determining a target recovery mode of a fragment to be recovered, and acquiring fragment information of the fragment to be recovered and the zone to be recovered to which the fragment to be recovered belongs; the target recovery mode is any one of zone incremental recovery and zone full recovery; generating a data recovery task based on the target recovery mode, the fragment information and the zone to be recovered; and issuing the data recovery task to a target storage node in a storage system, so that the target storage node, in response to the data recovery task, parses the data recovery task to obtain the target recovery mode, the fragment information of the fragment to be recovered and the zone to be recovered to which it belongs, and performs data recovery using a recovery strategy matched with the target recovery mode based on that fragment information and that zone.
In order to solve the above problem, a third aspect of the present application provides a storage node, including a communication circuit, a memory, and a processor, where the communication circuit and the memory are respectively coupled to the processor, the memory stores program instructions, and the processor is configured to execute the program instructions to implement the SMR hard disk data recovery method of the first aspect.
In order to solve the above problem, a fourth aspect of the present application provides a metadata node, including a communication circuit, a memory, and a processor, where the communication circuit and the memory are respectively coupled to the processor, the memory stores program instructions, and the processor is configured to execute the program instructions to implement the SMR hard disk data recovery method of the second aspect.
In order to solve the above-mentioned problems, a fifth aspect of the present application provides a computer-readable storage medium storing program instructions executable by a processor for implementing the SMR hard disk data recovery method of the first and second aspects.
According to the above scheme, the target recovery mode, the fragment information of the fragment to be recovered and the zone to be recovered to which the fragment belongs are obtained by parsing the data recovery task issued by the metadata node, and zone incremental recovery or zone full recovery is then performed, so that damaged data on the SMR hard disk is recovered.
Drawings
FIG. 1 is a flowchart of an embodiment of a method for recovering SMR hard disk data according to a first aspect of the present invention;
FIG. 2 is a schematic illustration of zone tail damage in the present application;
FIG. 3 is a schematic diagram of zone full recovery in the present application;
FIG. 4 is a flowchart of an embodiment of an SMR hard disk data recovery method according to the second aspect of the present application;
FIG. 5 is a schematic diagram of zone damage at other positions in the present application;
FIG. 6 is a schematic diagram of a frame of an embodiment of an SMR hard disk data recovery device of the present application;
FIG. 7 is a schematic diagram of a frame of another embodiment of an SMR hard disk data recovery device of the present application;
FIG. 8 is a schematic diagram of a framework structure of a storage node of the present application;
FIG. 9 is a schematic diagram of a framework structure of a metadata node of the present application;
FIG. 10 is a schematic diagram of a frame structure of a storage system of the present application;
FIG. 11 is a schematic diagram of a framework of an embodiment of a computer readable storage medium of the present application.
Detailed Description
The following describes the embodiments of the present application in detail with reference to the drawings.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, interfaces, techniques, etc., in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association relationship between associated objects and means that three relationships may exist; for example, "A and/or B" may represent: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship. Further, "a plurality" herein means two or more.
Before the present application is described further, its background is first elaborated. Currently, according to the track arrangement on the platters, magnetic recording disks can be classified into shingled magnetic recording (Shingled Magnetic Recording, SMR) disks and conventional magnetic recording (Conventional Magnetic Recording, CMR) disks. The greatest difference between them is that on an SMR disk, apart from about 1% of the space that consists of CMR zones, the remaining space is divided into fixed-size forced sequential write zones (SR-zones), abbreviated herein as zones. Data must be written sequentially within a zone, and previously written data cannot be erased independently. Consequently, a conventional file system cannot be applied to an SMR disk, data fragments cannot be stored in the form of files, and damaged data cannot be recovered directly with an EC algorithm.
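The append-only behaviour of a zone described above can be illustrated with a minimal sketch. The `Zone` class, its method names and the error messages are hypothetical and are not taken from the application; the sketch only models the rule that writes must land on the write pointer and that earlier data cannot be overwritten individually.

```python
class Zone:
    """Toy model of a forced sequential-write zone (assumed names, not from the source)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = bytearray()  # bytes written so far

    @property
    def write_pointer(self) -> int:
        # The write pointer always sits right after the last written byte.
        return len(self.data)

    def append(self, payload: bytes) -> int:
        """Append at the write pointer and return the offset that was written."""
        if self.write_pointer + len(payload) > self.capacity:
            raise IOError("zone is full")
        offset = self.write_pointer
        self.data += payload
        return offset

    def write_at(self, offset: int, payload: bytes) -> None:
        """Reject any write that does not start exactly at the write pointer."""
        if offset != self.write_pointer:
            raise IOError("non-sequential write rejected")
        self.append(payload)

    def reset(self) -> None:
        """Space is reclaimed only by discarding the whole zone."""
        self.data = bytearray()


zone = Zone(capacity=16)
zone.append(b"abcd")
try:
    zone.write_at(0, b"XX")  # attempt to overwrite earlier data
    rejected = ""
except IOError as exc:
    rejected = str(exc)
```

Because damaged bytes before the write pointer cannot be patched in place under this model, repairing them requires writing a fresh zone, which is the motivation for the two recovery modes discussed below.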
Common scenarios for data corruption generally include:
1. During writing, the tail data of a fragment is lost because of a network abnormality, a power failure or the like;
2. Part of the data is unreadable because of disk sector damage;
3. A storage node going offline causes the stored data to be lost.
The above three data damage scenarios may correspond to three zone damage scenarios, respectively:
Scenario one: the data at the tail of the zone's physical sectors is missing, and the data before the write pointer is not damaged;
Scenario two: data at physical sector positions within the zone is damaged;
Scenario three: the data at the front and middle physical sector positions of the zone is empty.
According to a first aspect of the present application, an SMR hard disk data recovery method is provided, please refer to fig. 1, which is a schematic flow chart of an embodiment of the SMR hard disk data recovery method according to the first aspect of the present application; specifically, the method may include the steps of:
Step S11: in response to a data recovery task issued by a metadata node in a storage system, parsing the data recovery task to obtain a target recovery mode, fragment information of a fragment to be recovered, and the zone to be recovered to which the fragment to be recovered belongs; the target recovery mode is any one of zone incremental recovery and zone full recovery.
The SMR hard disk data recovery method of the first aspect of the present application may be executed by a storage node. The storage node and the metadata node are both nodes in a distributed storage system, in which the metadata node is responsible for managing the cluster's file information and metadata information, task scheduling and the like, and the storage node is responsible for storing data, managing hard disks and the like.
It should be noted that, with the zone concept of the SMR disk, one or more fragments may be stored in a zone, and when any fragment stored in a zone is damaged, the zone is regarded as a damaged zone. For zone damage scenario one, please refer to fig. 2, which is a schematic diagram of zone tail damage in the present application. Assume that zone1 stores 3 fragments, and that part of the data of fragment 3 could not be written into zone1 because of network jitter, an abnormal network outage or the like. In this case the tail data of zone1 is empty, and the data before the position pointed to by the write pointer is not damaged. Because there is no damaged data before the write pointer and abnormal data exists only after the position pointed to by the write pointer, it is only necessary to obtain the normal data and append it to the tail of zone1, so the target recovery mode may be zone incremental recovery. Typically, the abnormal data handled by zone incremental recovery is missing data. For zone damage scenarios two and three, the write pointer may point to the tail of the zone to be recovered or to another position. Because abnormal data exists before the position pointed to by the write pointer and the write pointer cannot move backward, all data in the zone needs to be replaced, regardless of whether the data at the write pointer is normal or abnormal, or whether the zone is already finished (i.e. the write pointer points to the tail of the zone to be recovered). The target recovery mode may therefore be zone full recovery, and the abnormal data handled by zone full recovery may be missing data or damaged data.
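The mode decision and the parsing in step S11 can be sketched as follows. This is a hedged illustration only: the message layout (`mode`, `zone_id`, `fragment_ids`), the class names and the representation of abnormal data as byte ranges are assumptions, since the application does not specify a task format.

```python
from dataclasses import dataclass
from enum import Enum


class RecoveryMode(Enum):
    ZONE_INCREMENTAL = "zone_incremental"
    ZONE_FULL = "zone_full"


@dataclass
class RecoveryTask:
    mode: RecoveryMode
    zone_id: str
    fragment_ids: list


def choose_mode(write_pointer: int, abnormal_ranges: list) -> RecoveryMode:
    """Pick a mode from (start, end) byte ranges of abnormal data in a zone.

    Abnormal data before the write pointer cannot be patched in place,
    so it forces full recovery; tail-only damage allows appending.
    """
    for start, _end in abnormal_ranges:
        if start < write_pointer:
            return RecoveryMode.ZONE_FULL
    return RecoveryMode.ZONE_INCREMENTAL


def parse_task(message: dict) -> RecoveryTask:
    """Storage-node side: turn a received task message into a typed task."""
    return RecoveryTask(
        mode=RecoveryMode(message["mode"]),
        zone_id=message["zone_id"],
        fragment_ids=list(message["fragment_ids"]),
    )


# Scenario one: only the tail after the write pointer is missing.
tail_only = choose_mode(write_pointer=96, abnormal_ranges=[(96, 128)])
# Scenario two: a damaged range lies before the write pointer.
mid_damage = choose_mode(write_pointer=128, abnormal_ranges=[(32, 64)])
task = parse_task(
    {"mode": mid_damage.value, "zone_id": "zone2", "fragment_ids": ["2-1"]}
)
```

In the application the metadata node determines the mode and the storage node only parses it; `choose_mode` is shown here to make the rule in the paragraph above concrete.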
Step S12: performing data recovery using a recovery strategy matched with the target recovery mode, based on the fragment information of the fragment to be recovered and the zone to be recovered to which the fragment to be recovered belongs.
In one implementation scenario, in the case where the target recovery mode is zone incremental recovery, the zone to be recovered is missing data after the write pointer, the missing data is located at a first position of the zone to be recovered, and the recovery strategy matched with zone incremental recovery includes: recovering the missing data at the first position. It should be noted that, herein, the first position of the zone to be recovered includes the tail of the zone to be recovered. In this implementation scenario, performing data recovery using a recovery strategy matched with the target recovery mode, based on the fragment information of the fragment to be recovered and the zone to be recovered to which it belongs, includes the following steps:
S121: based on the fragment information of the fragment to be recovered, searching the storage system for a first target zone in which a normal fragment belonging to the same object as the fragment to be recovered is located;
S122: recovering the missing data at the first position of the zone to be recovered based on normal data read from the first position of the first target zone.
In this implementation scenario, when data is stored, the fragments of the same data to be stored (referred to herein as an object) may be stored in different zones. If one or more fragments in one of the zones are damaged (a damaged fragment is called a fragment to be recovered, and the damaged zone is called a zone to be recovered), the whole storage system is searched, based on the fragment information of the fragment to be recovered, for the zones holding undamaged fragments that belong to the same object as the fragment to be recovered, namely the first target zones. Once a first target zone is obtained, the normal data stored on it can be read to derive the data that should be stored at the first position of the zone to be recovered, so that the abnormal data is recovered. It should be noted that there may be more than one first target zone. Specifically, when data is damaged, the storage node holding the abnormal zone data (namely, the storage node of the zone to be recovered) selects zone incremental recovery according to the received target recovery mode and obtains the information of the zone to be recovered and of the fragment to be recovered. It locates the zone to be recovered within the storage node according to the zone information, finds the storage locations of the normal fragments belonging to the same object according to the fragment information (these locations may be on other storage nodes in the same storage system), and establishes a read-data network connection to each of the corresponding normal zones, so that the normal data in those zones can be read.
In one implementation scenario, in the case where the target recovery mode is zone incremental recovery, recovering the missing data at the first position of the zone to be recovered based on the normal data read from the first position of the first target zone includes the following steps:
S1221: selecting, in turn, each missing block in the missing data as the current block;
S1222: in response to data blocks located at the same position as the current block having been read from each of a target number of first target zones, reconstructing the current block based on the read data blocks, and writing the recovered current block into the zone to be recovered;
S1223: re-executing the step of selecting each missing block in the missing data as the current block and the subsequent steps, until every missing block has been recovered.
In this implementation scenario, zone incremental recovery does not read all the data on the first target zone at once; instead, the data may be divided into multiple data blocks and recovered in batches. Specifically, in this implementation scenario the size of a data block may be 32KB; in other implementation scenarios other values may be chosen, and the application does not limit the specific value. The missing block and the current block have the same size as the data block, namely 32KB. Each time, the data blocks located at the same position as the current block are read from the target number of first target zones, reconstruction is performed on the obtained data blocks, the reconstructed block is taken as the recovered current block, and recovery then proceeds to the next missing block. In this embodiment, if the missing data is 32MB and the missing block size is 32KB, steps S1221 to S1223 need to be executed in a loop 1024 times to complete the recovery of the missing data. In addition, in this embodiment the current data block may be reconstructed from the other normal data blocks using an erasure coding (EC) algorithm; in other embodiments other algorithms may also be used, which is not limited in this application.
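The block-by-block loop of steps S1221 to S1223 can be sketched as below. To keep the example self-contained it replaces the EC algorithm with single-parity XOR (N data fragments plus one parity fragment), which is a deliberate simplification and not the coding scheme of the application; the function names and the in-memory byte arrays standing in for zones are likewise assumptions.

```python
import os
from functools import reduce

BLOCK_SIZE = 32 * 1024  # 32KB blocks, as in the embodiment


def xor_blocks(blocks):
    """XOR equal-length blocks together; a stand-in for EC decoding."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_incremental(damaged: bytearray, expected_len: int, sources: list,
                        block_size: int = BLOCK_SIZE) -> int:
    """Append the missing tail to `damaged`; return the number of blocks rebuilt."""
    rebuilt = 0
    while len(damaged) < expected_len:
        offset = len(damaged)  # current write pointer of the damaged data
        size = min(block_size, expected_len - offset)
        # Read the block at the same position from every source.
        peers = [src[offset:offset + size] for src in sources]
        if any(len(peer) != size for peer in peers):
            raise IOError(f"short read at offset {offset}")
        damaged += xor_blocks(peers)  # reconstruct, then append
        rebuilt += 1
    return rebuilt


frag_a = os.urandom(4 * BLOCK_SIZE)
frag_b = os.urandom(4 * BLOCK_SIZE)
parity = xor_blocks([frag_a, frag_b])

damaged = bytearray(frag_a[:BLOCK_SIZE])  # only the first block survived
count = recover_incremental(damaged, len(frag_a), [frag_b, parity])

# Loop count for the 32MB example in the text.
loops_for_32mb = (32 * 1024 * 1024) // BLOCK_SIZE
```

Working one block at a time bounds the memory held per iteration to roughly N blocks, independent of how much data is missing.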
It should be noted that, in this implementation scenario, the number of first target zones, that is, the "target number", is derived from the redundancy rule. Specifically, an object (that is, the data to be stored mentioned above) is divided into N+M fragments: the original data of the object is divided into N data fragments, and M redundancy fragments are calculated from the data fragments using the EC erasure code. Recovery therefore requires at least N fragments that belong to the same object as the fragment to be recovered in the zone to be recovered; these N fragments correspond to N first target zones respectively, so the target number is N.
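As a small illustration of the target number, the sketch below selects N source zones for a hypothetical 4+2 layout. The function name, the dictionary mapping fragment indices to zone identifiers and the zone names are assumptions made for the example.

```python
def pick_source_zones(fragment_zones: dict, healthy: set, n: int) -> list:
    """Return N zone ids holding healthy fragments of one object.

    fragment_zones maps fragment index -> zone id; raises if fewer than
    N healthy fragments remain, because reconstruction is then impossible.
    """
    candidates = [
        zone for index, zone in sorted(fragment_zones.items()) if index in healthy
    ]
    if len(candidates) < n:
        raise RuntimeError(f"need {n} healthy fragments, found {len(candidates)}")
    return candidates[:n]


# 4+2 layout: fragments 0-3 are data, 4-5 are redundancy; fragment 1 is damaged.
layout = {0: "zone1", 1: "zone2", 2: "zone3", 3: "zone4", 4: "zone5", 5: "zone6"}
sources = pick_source_zones(layout, healthy={0, 2, 3, 4, 5}, n=4)
```

With N = 4, any four healthy fragments suffice, so up to M = 2 fragments of the same object can be lost before recovery fails.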
In this implementation scenario, the normal data may be obtained through asynchronous read callbacks. Specifically, when the normal data stored on the N first target zones is read, the storage node waits asynchronously for the N data blocks to be read successfully. If a read times out or fails, the recovery task is considered to have failed and the failure is reported; if the reads succeed, the current block to be recovered can be rebuilt with the erasure code algorithm, and the recovered current block is written into the zone to be recovered. In other implementations, the normal data may also be obtained through synchronous reads.
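One way to express "wait asynchronously for N blocks, fail on timeout" is with Python's `asyncio`, as in the sketch below. The in-memory `ZONES` table, the simulated delays and the function names are assumptions; a real storage node would issue network reads instead.

```python
import asyncio

# Simulated remote zones and their read latencies (illustrative values only).
ZONES = {"zone1": b"a" * 8, "zone3": b"b" * 8, "zone4": b"c" * 8}
DELAYS = {"zone1": 0.0, "zone3": 0.01, "zone4": 0.5}  # zone4 is slow


async def read_block(zone_id: str, offset: int, size: int) -> bytes:
    """Hypothetical remote read; the sleep stands in for network latency."""
    await asyncio.sleep(DELAYS[zone_id])
    return ZONES[zone_id][offset:offset + size]


async def read_all(zone_ids, offset: int, size: int, timeout: float):
    """Return the blocks from all zones, or None if any read fails or times out."""
    reads = [read_block(zone_id, offset, size) for zone_id in zone_ids]
    try:
        return await asyncio.wait_for(asyncio.gather(*reads), timeout)
    except (asyncio.TimeoutError, OSError):
        return None  # caller treats this as a failed recovery task


ok = asyncio.run(read_all(["zone1", "zone3"], 0, 4, timeout=0.2))
failed = asyncio.run(read_all(["zone1", "zone3", "zone4"], 0, 4, timeout=0.05))
```

The reads run concurrently, so the wait is bounded by the slowest source or by the timeout, whichever comes first, and a single unresponsive source cannot stall the task indefinitely.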
In one implementation scenario, in the case where the target recovery mode is zone full recovery, abnormal data of the fragment to be recovered exists before the write pointer in the zone to be recovered, the abnormal data is located at a second position of the zone to be recovered, and the recovery strategy matched with zone full recovery includes: copying the normal data in the zone to be recovered to a newly allocated zone at the same position, and recovering the abnormal data at the second position of the newly allocated zone. Herein, the second position of the zone to be recovered may be the tail of the zone to be recovered or another position. It should be noted that the second position of the newly allocated zone corresponds to the second position of the zone to be recovered; for example, if the abnormal data is located at the tail of the zone to be recovered, the recovered data should be written to the tail of the newly allocated zone.
In one implementation scenario, in the case where the target recovery mode is zone full recovery, performing data recovery using a recovery strategy matched with the target recovery mode, based on the fragment information of the fragment to be recovered and the zone to be recovered to which the fragment to be recovered belongs, includes:
S121: checking, from head to tail, whether the current data in the zone to be recovered is normal data; specifically, a data abnormality includes at least one of missing data, empty data, damaged data and the like; the specific detection method is described below and is not repeated here.
S122: in response to the current data being normal data, copying the normal data to the newly allocated zone at the same position;
S123: in response to the current data being abnormal data, searching the storage system, based on the fragment information of the fragment to be recovered, for a second target zone in which a normal fragment belonging to the same object as the fragment to be recovered is located, and recovering the abnormal data at the second position of the newly allocated zone based on normal data read from the second position of the second target zone;
The only difference from the foregoing embodiment of "recovering the missing data at the first position of the zone to be recovered based on normal data read from the first position of the first target zone" may be the following: in the foregoing embodiment, the recovered missing data is appended directly at the first position of the zone to be recovered, whereas in this embodiment the recovered normal data needs to be appended at the second position of the newly allocated zone.
S124: re-executing the step of checking, from head to tail, whether the current data in the zone to be recovered is normal data and the subsequent steps, until the zone to be recovered has been fully recovered into the newly allocated zone.
Specifically, please refer to fig. 3, which is a schematic diagram of zone full recovery in the present application. In the figure, zone2 is the zone to be recovered, zone1, zone3, zone4 and zone5 are all second target zones, and new zone is the newly allocated zone, which initially holds no data; fragment 2-1 in zone2 is damaged, that is, fragment 2-1 in zone2 is the fragment to be recovered. For zone full recovery, the storage node first allocates a brand-new zone to which no data has been written, for writing the recovered data. It finds the storage locations of the other fragments belonging to the same object according to the fragment information of the fragment to be recovered, that is, it finds the target number of second target zones, and establishes read-data network connections to them. It then checks, from head to tail, whether the data in zone2 is normal data, and copies the normal data to the new zone. Specifically, the check and copy may read all the normal data at once, up to the abnormal data of the fragment to be recovered, and copy it to the new zone in one pass, or may divide the normal data into multiple data blocks and process them in batches; the batch method has been described in the foregoing embodiments and is not repeated here. When full recovery reaches the damaged position, data is read from the target number of second target zones, the data is reconstructed from the read data blocks, and the recovered data is written at the corresponding position of the new zone; the reconstruction process has been described in the foregoing embodiments and is not repeated here. After the fragment to be recovered has been recovered, it is judged whether recovery of the zone to be recovered is complete; if not, steps S121 to S123 are re-executed until the zone to be recovered is completely recovered, after which the newly allocated zone is closed and the original zone to be recovered is deleted. Illustratively, after all the data in zone2 has been recovered and copied to the new zone, the new zone is closed and zone2 is deleted.
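The head-to-tail walk of zone full recovery can be sketched as follows. Several simplifications are assumed: abnormal blocks are detected with a per-block CRC recorded at write time, reconstruction uses single-parity XOR in place of the EC algorithm, the block size is tiny for readability, and zones are plain byte strings. None of these names or choices come from the application.

```python
import zlib
from functools import reduce

BLOCK = 4  # tiny block size for the demo; the embodiment mentions 32KB


def xor_blocks(blocks):
    """XOR equal-length blocks together; a stand-in for EC decoding."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_full(old_zone: bytes, block_crcs: list, sources: list,
                 block: int = BLOCK) -> bytes:
    """Build the content of a newly allocated zone from head to tail."""
    new_zone = bytearray()  # the new zone starts empty and is append-only
    for index, expected_crc in enumerate(block_crcs):
        offset = index * block
        current = old_zone[offset:offset + block]
        if len(current) == block and zlib.crc32(current) == expected_crc:
            new_zone += current  # normal data: copy to the same position
        else:
            # Abnormal data: rebuild from same-position blocks of the sources.
            peers = [src[offset:offset + block] for src in sources]
            new_zone += xor_blocks(peers)
    return bytes(new_zone)


good = b"AAAABBBBCCCC"
peer = b"123456789012"
parity = xor_blocks([good, peer])
crcs = [zlib.crc32(good[i:i + BLOCK]) for i in range(0, len(good), BLOCK)]

old_zone = good[:4] + b"\x00\x00\x00\x00" + good[8:]  # middle block corrupted
new_zone = recover_full(old_zone, crcs, [peer, parity])
```

After `recover_full` returns, the flow in the text would close the new zone and delete the old one; those steps are omitted here because they depend on the storage node's zone management.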
It should be noted that, in the case that the target recovery mode is zone full recovery and zone incremental recovery, asynchronous read-back may need to be performed to obtain normal data stored in the target number of first target zones or second target zones; if the first target zone or the second target zone
The target zone and the zone to be restored are stored on the same storage node, and then the normal data are directly read from the storage node 0 and stored in the memory; if the first target zone or the second target zone and the zone to be restored are stored on different storage nodes, the normal data are stored in the memory of the storage node after asynchronous read-back is needed.
In the above embodiment, asynchronous read-back prevents the whole task from being blocked when a certain data block cannot be read.
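A minimal sketch of such a non-blocking read-back is given below, using Python's asyncio. The per-read timeout, the node names, the in-memory store and the simulated network delay are all assumptions made for illustration; the application itself only states that the read-back is asynchronous.

```python
import asyncio
from typing import Dict, List, Optional


async def read_block(node: str, local_node: str,
                     store: Dict[str, bytes], delay: float) -> bytes:
    """Read one block; a remote node is simulated by sleeping for `delay` seconds."""
    if node != local_node:
        await asyncio.sleep(delay)  # stand-in for a network round trip
    return store[node]


async def gather_blocks(nodes: List[str], local_node: str,
                        store: Dict[str, bytes], delays: Dict[str, float],
                        timeout: float) -> Dict[str, Optional[bytes]]:
    """Read from all nodes concurrently; a read that times out yields None."""

    async def guarded(node: str) -> Optional[bytes]:
        try:
            return await asyncio.wait_for(
                read_block(node, local_node, store, delays.get(node, 0.0)),
                timeout,
            )
        except asyncio.TimeoutError:
            return None  # one slow or unreachable node does not stall the others

    results = await asyncio.gather(*(guarded(n) for n in nodes))
    return dict(zip(nodes, results))
```

With this shape, the recovery task can decide what to do with a missing block (retry, or pick another target zone) instead of waiting indefinitely on a single read.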
In one implementation scenario, before responding to the data recovery task issued by the metadata node in the storage system and parsing the data recovery task to obtain the target recovery mode, the fragment information of the fragment to be recovered and the zone to be recovered to which the fragment to be recovered belongs, the method further comprises the following steps:
S101: comparing a first check code calculated when a data fragment is written to the SMR hard disk with a second check code calculated when the SMR hard disk is scanned;
S102: in response to the first check code being different from the second check code, taking the data fragment as a fragment to be recovered, and reporting the fragment to be recovered to the metadata node; for each zone in the SMR hard disk, the metadata node generates and issues a data recovery task based on the distribution of the fragments to be recovered in the zone.
Data damage is commonly judged by a cyclic redundancy check (CRC) method or a theoretical storage length check method. The CRC method computes a short, fixed-length check code from the original data and stores it alongside the data; error detection relies on division and its remainder, so even a slight difference between the stored data and the original data yields a different CRC check code, which makes it easy to check whether the data has been tampered with or damaged. The storage system calculates and stores the CRC check code of the original data synchronously when writing the data. The storage node periodically scans the data on the disk, calculates its CRC check code and compares it with the check code stored at write time; when the check codes are inconsistent, the data is considered damaged and is reported, so that the fragment and the zone to which it belongs are marked as damaged. The theoretical storage length check method records the theoretical storage length of each fragment when data is written. After the write finishes, or during the storage node's periodic scan, the actual storage length of the fragment is reported and compared with the recorded theoretical length; if the two are inconsistent, the recorded theoretical length is corrected to the actual length, the fragment is marked as lost, and the zone to which the fragment belongs is marked as damaged.
Therefore, in this implementation scenario, the first check code may be the CRC check code obtained when writing data under the cyclic redundancy check method, or the theoretical storage length recorded at write time under the theoretical storage length check method. The second check code may correspondingly be the CRC check code obtained when periodically scanning the data, or the actual storage length obtained during the periodic scan. When the first check code and the second check code are different, the scanned data is considered abnormal, and the corresponding slices and zones are marked as slices to be recovered and zones to be recovered.
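The two checks can be illustrated with a short sketch. It assumes CRC-32 as provided by Python's standard zlib module; the application does not say which CRC polynomial is used, and the names SliceRecord, record_on_write and needs_recovery are hypothetical.

```python
import zlib
from dataclasses import dataclass


@dataclass
class SliceRecord:
    """Values stored when the slice is written (the 'first check code')."""
    crc_at_write: int
    length_at_write: int


def record_on_write(data: bytes) -> SliceRecord:
    """Compute the CRC and theoretical length at write time."""
    return SliceRecord(crc_at_write=zlib.crc32(data), length_at_write=len(data))


def needs_recovery(record: SliceRecord, scanned: bytes) -> bool:
    """Compare scan-time values (the 'second check code') with the stored ones."""
    if len(scanned) != record.length_at_write:
        return True  # length check: actual length differs from theoretical length
    return zlib.crc32(scanned) != record.crc_at_write  # CRC check
```

A storage node would call needs_recovery during its periodic scan and report every slice for which it returns True to the metadata node.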
According to a second aspect of the present application, an SMR hard disk data recovery method is provided. Please refer to fig. 4, which is a flow chart of an embodiment of the SMR hard disk data recovery method according to the second aspect of the present application. In the second aspect, the execution body of the SMR hard disk data recovery method may be a metadata node. Specifically, the method includes the following steps:
S41, determining a target recovery mode of the to-be-recovered fragments, and acquiring fragment information of the to-be-recovered fragments and the to-be-recovered zone to which the to-be-recovered fragments belong; the target recovery mode is either zone incremental recovery or zone full recovery;
reference may be made specifically to the foregoing embodiments, and details are not repeated herein.
S42, generating a data recovery task based on the target recovery mode of the to-be-recovered fragments, the fragment information and the affiliated to-be-recovered zone;
reference may be made specifically to the foregoing embodiments, and details are not repeated herein.
S43, issuing the data recovery task to a target storage node in the storage system; the target storage node responds to the data recovery task, parses it to obtain the target recovery mode, the fragment information of the fragments to be recovered and the zone to be recovered to which the fragments to be recovered belong, and performs data recovery by adopting a recovery strategy matched with the target recovery mode based on the fragment information and the zone to be recovered; reference may be made specifically to the foregoing embodiments, and details are not repeated herein.
In one implementation scenario, the target storage node is the storage node where the zone to be restored is located in case the target restoration mode is zone incremental restoration, and/or is any storage node in the storage system in case the target restoration mode is zone full restoration. Reference may be made specifically to the foregoing embodiments, and details are not repeated herein.
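Steps S42 and S43 can be sketched as a small task structure plus a node-selection rule. Everything here is an assumption for illustration: the field names, the JSON encoding and the choice of the first node for full recovery are not taken from the application, which only requires that incremental recovery run on the node holding the zone and that full recovery may run on any node.

```python
import json
from dataclasses import dataclass, asdict
from typing import List

INCREMENTAL = "zone_incremental"
FULL = "zone_full"


@dataclass
class RecoveryTask:
    mode: str             # INCREMENTAL or FULL
    zone_id: str          # zone to be recovered
    slice_ids: List[str]  # fragments to be recovered inside that zone


def pick_target_node(task: RecoveryTask, zone_owner: str, all_nodes: List[str]) -> str:
    """Choose the storage node that will execute the task."""
    if task.mode == INCREMENTAL:
        return zone_owner  # must append into the existing zone, so stay on its node
    return all_nodes[0]    # full recovery writes a new zone; any node is acceptable


def encode(task: RecoveryTask) -> str:
    """Serialize the task as issued by the metadata node."""
    return json.dumps(asdict(task))


def decode(payload: str) -> RecoveryTask:
    """Parse the task on the storage node."""
    return RecoveryTask(**json.loads(payload))
```

In practice the node for full recovery would more likely be chosen by load or free capacity; taking the first entry simply keeps the sketch deterministic.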
In one implementation scenario, determining a target recovery pattern for a tile to be recovered includes:
S411, acquiring a first fragment length stored when the data fragments are written into a storage node, and receiving a second fragment length of the data fragments reported by the storage node when the storage node scans an SMR hard disk;
S412, determining whether to take the data fragments as fragments to be recovered based on whether the first fragment length and the second fragment length of the same data fragments are consistent;
S413, for each zone in the SMR hard disk, determining a target recovery mode of the to-be-recovered fragments based on the distribution condition of the to-be-recovered fragments in the zone.
In a specific implementation scenario, please refer to fig. 3 and fig. 5; fig. 5 is a schematic diagram illustrating damage at other locations of the zone in the present application. The process of determining the to-be-recovered slice in steps S411 and S412 has been described in the foregoing embodiments and is not repeated here. In fig. 3, the zone tail is damaged and the part of the zone before the position pointed to by the write pointer is not damaged, so the target recovery mode can be zone incremental recovery. In fig. 5, assume that zone2 includes three slices, where slice 1 and slice 2 have data anomalies and slice 3 has normal data; because there is a damaged slice before the write pointer, the target recovery mode of all slices to be recovered in the zone is zone full recovery. In addition, it should be noted that the data anomalies in fig. 5 may include both data corruption and empty data, and may include other anomalies, which are not limited herein.
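The decision in step S413 can be sketched as a comparison between the offsets of the damaged slices and the write pointer. The representation below (a byte offset per slice and a single write pointer per zone) and the names SliceState and choose_mode are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SliceState:
    offset: int    # start offset of the slice inside the zone
    damaged: bool  # True if the slice was reported as to be recovered


def choose_mode(slices: List[SliceState], write_pointer: int) -> Optional[str]:
    """Return the recovery mode for a zone, or None if nothing is damaged."""
    damaged = [s for s in slices if s.damaged]
    if not damaged:
        return None
    if any(s.offset < write_pointer for s in damaged):
        # Damage before the write pointer cannot be overwritten in place on SMR,
        # so the whole zone is rebuilt into a newly allocated zone.
        return "zone_full"
    # Only data at or after the write pointer is missing: append it.
    return "zone_incremental"
```

For example, a zone whose only damaged slice starts exactly at the write pointer gets incremental recovery, while a zone with a damaged first slice gets full recovery.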
According to the scheme, the target recovery mode, the fragmentation information of the fragments to be recovered and the zone to be recovered, to which the fragments to be recovered belong, are obtained by analyzing the data recovery task issued by the metadata node, and the zone incremental recovery or the zone total recovery is carried out, so that damaged data in the SMR hard disk are recovered.
Referring to fig. 6, fig. 6 is a schematic diagram of a frame of an embodiment of an SMR hard disk data recovery device 60 of the present application. Specifically, the SMR hard disk data recovery device includes a parsing module 61 configured to parse a data recovery task in response to the data recovery task issued by a metadata node in a storage system, to obtain a target recovery mode, fragment information of a fragment to be recovered, and a zone to be recovered to which the fragment to be recovered belongs; the target recovery mode is either zone incremental recovery or zone full recovery; and a recovery module 62 configured to perform data recovery by adopting a recovery policy matched with the target recovery mode based on the fragment information of the fragment to be recovered and the zone to be recovered to which the fragment to be recovered belongs.
In some disclosed embodiments, the recovery module 62 further includes a searching sub-module, configured to search, based on the shard information of the shard to be recovered, a first target zone where a normal shard belonging to the same object as the shard to be recovered is located in the storage system; and the recovery sub-module is used for recovering the missing data at the first position of the zone to be recovered based on the normal data read from the first position of the first target zone.
In some disclosed embodiments, the recovery submodule is further configured to select each missing block in the missing data as a current block; responding to the fact that data blocks which are in the same position as the current blocks are read from the first target zones of the target number respectively, reconstructing based on the read data blocks, and writing the recovered current blocks into the zones to be recovered; and re-executing the step of respectively selecting each missing block in the missing data as the current block and the subsequent step until each missing block is recovered.
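The block-by-block loop of zone incremental recovery can be sketched as follows. As before, reconstruction is assumed to be a single-parity XOR over blocks read from the same position of the first target zones; the actual redundancy scheme, the block size and the function names are assumptions rather than details from the application.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Rebuild one block as the byte-wise XOR of its peers (assumed single-parity code)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def incremental_recovery(zone: bytearray, write_pointer: int,
                         peer_zones: List[bytes], block_size: int, end: int) -> int:
    """Append the missing range [write_pointer, end) to the zone, one block at a time.

    zone       -- existing contents of the zone to be recovered (len == write_pointer)
    peer_zones -- first target zones holding the other slices of the same object
    Returns the new write pointer.
    """
    while write_pointer < end:
        stop = min(write_pointer + block_size, end)
        # Read the blocks at the same position from every first target zone.
        peers = [peer[write_pointer:stop] for peer in peer_zones]
        # Reconstruct the current block and append it at the write pointer.
        zone.extend(xor_blocks(peers))
        write_pointer = stop
    return write_pointer
```

Since every write lands at the current write pointer, this mode needs no new zone, which is why it is only applicable when all damage lies after the write pointer.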
In some disclosed embodiments, restoration module 62 also includes a second restoration sub-module for copying normal data in the zone to be restored to the newly allocated zone in-situ and restoring the abnormal data at a second location of the newly allocated zone.
In some disclosed embodiments, the second recovery submodule is further configured to detect, from beginning to end, whether current data in the zone to be recovered is normal data; in response to the current data being normal data, copying the normal data to a new allocation zone in situ; responding to the current data as abnormal data, searching a second target zone where a normal fragment belonging to the same object as the fragment to be restored is located in a storage system based on fragment information of the fragment to be restored, and restoring the abnormal data at a second position of the newly allocated zone based on the normal data read from the second position of the second target zone; and re-executing the step of detecting whether the current data in the zone to be restored is normal data from beginning to end until the zone to be restored is fully restored to the newly allocated zone.
In some disclosed embodiments, the SMR hard disk data recovery device 60 further includes a comparing module, a reporting module, and a task generating module, where the comparing module is configured to compare a first check code calculated when writing a data slice in the SMR hard disk with a second check code calculated when scanning the SMR hard disk; the reporting module responds to the fact that the first check code is different from the second check code, the data fragments are used as fragments to be recovered, and the fragments to be recovered are reported to the metadata node; the task generating module is used for generating and transmitting data recovery tasks for each zone in the SMR hard disk based on the distribution condition of the to-be-recovered fragments in the zone by the metadata node.
Referring to fig. 7, fig. 7 is a schematic diagram of a frame of another embodiment of an SMR hard disk data recovery device 70 according to the present application. Specifically, the device comprises: a determining module 71, configured to determine a target recovery mode of the to-be-recovered fragments, the target recovery mode being either zone incremental recovery or zone full recovery; an obtaining module 72, configured to obtain fragment information of the fragments to be recovered and the zone to be recovered to which the fragments to be recovered belong; a generating module 73, configured to generate a data recovery task based on the target recovery mode of the to-be-recovered fragments, the fragment information, and the to-be-recovered zone to which the to-be-recovered fragments belong; and an issuing module 74, configured to issue the data recovery task to a target storage node in the storage system. The target storage node responds to the data recovery task, parses it to obtain the target recovery mode, the fragment information of the fragments to be recovered and the zone to be recovered to which the fragments to be recovered belong, and performs data recovery by adopting a recovery strategy matched with the target recovery mode based on the fragment information and the zone to be recovered.
In one implementation scenario, the target storage node is the storage node where the zone to be restored is located in case the target restoration mode is zone incremental restoration, and/or is any storage node in the storage system in case the target restoration mode is zone full restoration.
In one implementation scenario, the determining module 71 includes an analyzing sub-module, configured to obtain a first segment length stored when the data segment is written into the storage node, and receive a second segment length of the data segment reported by the storage node when the storage node scans the SMR hard disk; the resolution sub-module is used for determining whether to take the data fragments as fragments to be recovered or not based on whether the first fragment length and the second fragment length of the same data fragments are consistent or not; the mode determining submodule is used for determining a target recovery mode of the to-be-recovered fragments for each zone in the SMR hard disk based on the distribution condition of the to-be-recovered fragments in the zone.
According to the scheme, the target recovery mode, the fragmentation information of the fragments to be recovered and the zone to be recovered, to which the fragments to be recovered belong, are obtained by analyzing the data recovery task issued by the metadata node, and the zone incremental recovery or the zone total recovery is carried out, so that damaged data in the SMR hard disk are recovered.
Referring to fig. 8, fig. 8 is a schematic diagram of a frame structure of a storage node 80 according to the present application, including a communication circuit 81, a memory 82 and a processor 83, where the communication circuit 81 and the memory 82 are respectively coupled to the processor 83, the memory 82 stores program instructions, and the processor 83 is configured to execute the program instructions to implement the SMR hard disk data recovery method of the first aspect.
According to the scheme, the target recovery mode, the fragmentation information of the fragments to be recovered and the zone to be recovered, to which the fragments to be recovered belong, are obtained by analyzing the data recovery task issued by the metadata node, and the zone incremental recovery or the zone total recovery is carried out, so that damaged data in the SMR hard disk are recovered.
Referring to fig. 9, fig. 9 is a schematic diagram of a frame structure of a metadata node 90 according to the present application, which includes a communication circuit 91, a memory 92 and a processor 93, wherein the communication circuit 91 and the memory 92 are respectively coupled to the processor 93, the memory 92 stores program instructions, and the processor 93 is configured to execute the program instructions to implement the SMR hard disk data recovery method of the second aspect.
According to the scheme, the target recovery mode, the fragmentation information of the fragments to be recovered and the zone to be recovered, to which the fragments to be recovered belong, are obtained by analyzing the data recovery task issued by the metadata node, and the zone incremental recovery or the zone total recovery is carried out, so that damaged data in the SMR hard disk are recovered.
Referring to fig. 10, fig. 10 is a schematic diagram of a frame structure of a storage system 100 according to the present application, including a user device 101, a metadata node 102 and a storage node 103 connected to a network, where the user device 101 sends data to be stored to the metadata node 102, the metadata node 102 slices the data to be stored to obtain data slices, the data slices are distributed to the storage node 103 for storage, and the storage node 103 uses an SMR hard disk for data storage;
the storage node is the storage node in the foregoing embodiment, and the metadata node is the metadata node in the foregoing embodiment.
According to the scheme, the target recovery mode, the fragmentation information of the fragments to be recovered and the zone to be recovered, to which the fragments to be recovered belong, are obtained by analyzing the data recovery task issued by the metadata node, and the zone incremental recovery or the zone total recovery is carried out, so that damaged data in the SMR hard disk are recovered.
Referring to FIG. 11, FIG. 11 is a schematic diagram illustrating an embodiment of a computer readable storage medium 110 of the present application. The computer readable storage medium 110 stores program instructions 111 that can be executed by a processor, where the program instructions 111 are configured to implement the steps in any of the foregoing embodiments of the SMR hard disk data recovery method.
According to the scheme, the target recovery mode, the fragmentation information of the fragments to be recovered and the zone to be recovered, to which the fragments to be recovered belong, are obtained by analyzing the data recovery task issued by the metadata node, and the zone incremental recovery or the zone total recovery is carried out, so that damaged data in the SMR hard disk are recovered.
In the several embodiments provided in the present application, it should be understood that the disclosed methods and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical, or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
If the technical scheme of the application relates to personal information, the product applying the technical scheme of the application clearly informs the personal information processing rule before processing the personal information, and obtains independent consent of the individual. If the technical scheme of the application relates to sensitive personal information, the product applying the technical scheme of the application obtains individual consent before processing the sensitive personal information, and simultaneously meets the requirement of 'explicit consent'. For example, a clear and remarkable mark is set at a personal information acquisition device such as a camera to inform that the personal information acquisition range is entered, personal information is acquired, and if the personal voluntarily enters the acquisition range, the personal information is considered as consent to be acquired; or on the device for processing the personal information, under the condition that obvious identification/information is utilized to inform the personal information processing rule, personal authorization is obtained by popup information or a person is requested to upload personal information and the like; the personal information processing rule may include information such as a personal information processor, a personal information processing purpose, a processing mode, and a type of personal information to be processed.

Claims (12)

1. An SMR hard disk data recovery method, comprising:
responding to a data recovery task issued by a metadata node in a storage system, and analyzing the data recovery task to obtain a target recovery mode, fragment information of fragments to be recovered and a zone to be recovered to which the fragments to be recovered belong; the target recovery mode is any one of zone incremental recovery and zone full recovery;
and based on the fragment information of the fragments to be recovered and the zone to be recovered to which the fragments to be recovered belong, adopting a recovery strategy matched with the target recovery mode to recover the data.
2. The method of claim 1, wherein the zone to be restored lacks data after a write pointer in a case where the target restoration pattern is the zone delta restoration, the missing data being located at a first location of the zone to be restored, a restoration policy matching the zone delta restoration comprising: restoring the missing data at the first location.
3. The method according to claim 2, wherein the performing data recovery by using a recovery policy matched with the target recovery pattern based on the fragmentation information of the to-be-recovered fragments and the to-be-recovered zone to which the to-be-recovered fragments belong includes:
searching a first target zone where a normal fragment belonging to the same object as the fragment to be restored is located in the storage system based on the fragment information of the fragment to be restored;
and recovering the missing data at the first position of the zone to be recovered based on the normal data read from the first position in the first target zone.
4. The method of claim 3, wherein the recovering the missing data at the first location of the zone to be recovered based on normal data read from the first location in the first target zone comprises:
respectively selecting each missing block in the missing data as a current block;
responding to the fact that data blocks which are in the same position with the current blocks are read from the first target zones of the target number respectively, reconstructing based on the read data blocks, and writing the recovered current blocks into the zones to be recovered;
and re-executing the step and the subsequent steps of respectively selecting each missing block in the missing data as the current block until each missing block is recovered.
5. The method of claim 1, wherein in the case where the target recovery pattern is the zone full recovery, there is abnormal data of the to-be-recovered slice in the to-be-recovered zone before a write pointer, the abnormal data being located at a second location of the to-be-recovered zone, a recovery policy matching the zone full recovery comprises: and copying the normal data in the zone to be recovered to a new distribution zone in situ, and recovering the abnormal data at a second position of the new distribution zone.
6. The method of claim 5, wherein the performing data recovery with a recovery policy matched with the target recovery pattern based on the fragmentation information of the to-be-recovered fragments and a to-be-recovered zone to which the to-be-recovered fragments belong comprises:
detecting whether the current data in the zone to be restored is the normal data from beginning to end;
in response to the current data being the normal data, copying the normal data to the new allocation zone in situ;
responding to the current data as the abnormal data, searching a second target zone where a normal fragment belonging to the same object as the fragment to be restored is located in the storage system based on the fragment information of the fragment to be restored, and restoring the abnormal data at a second position of the newly allocated zone based on the normal data read from the second position in the second target zone;
and re-executing the step of detecting whether the current data in the zone to be restored is the normal data from beginning to end until the zone to be restored is fully restored to the new distribution zone.
7. The method of claim 1, wherein before the responding to the data recovery task issued by the metadata node in the storage system, parsing the data recovery task to obtain the target recovery mode, the shard information of the shard to be recovered, and the zone to be recovered to which the shard to be recovered belongs, the method further comprises:
comparing a first check code calculated when the data fragments in the SMR hard disk are written with a second check code calculated when the SMR hard disk is scanned;
responding to the fact that the first check code is different from the second check code, taking the data fragments as the fragments to be recovered, and reporting the fragments to be recovered to the metadata node;
and for each zone in the SMR hard disk, the metadata node generates and transmits the data recovery task based on the distribution condition of the to-be-recovered fragments in the zone.
8. An SMR hard disk data recovery method, comprising:
determining a target recovery mode of a to-be-recovered slice, and acquiring slice information of the to-be-recovered slice and a to-be-recovered zone to which the to-be-recovered slice belongs; the target recovery mode is any one of zone incremental recovery and zone full recovery;
generating a data recovery task based on the target recovery mode of the to-be-recovered fragments, the fragment information and the affiliated to-be-recovered zone;
issuing the data recovery task to a target storage node in a storage system; the target storage node responds to the data recovery task, analyzes the data recovery task, obtains the target recovery mode, the fragment information of the fragments to be recovered and the zone to be recovered to which the fragments to be recovered belong, and performs data recovery by adopting a recovery strategy matched with the target recovery mode based on the fragment information of the fragments to be recovered and the zone to be recovered to which the fragments to be recovered belong.
9. The method of claim 8, wherein determining the target recovery pattern for the shard to be recovered comprises:
acquiring a first fragment length stored when the data fragments are written into the storage node, and receiving a second fragment length of the data fragments reported by the storage node when the storage node scans an SMR hard disk;
determining whether to take the data fragments as the fragments to be recovered or not based on whether the first fragment length and the second fragment length of the same data fragments are consistent or not;
and for each zone in the SMR hard disk, determining a target recovery mode of the to-be-recovered fragments based on the distribution condition of the to-be-recovered fragments in the zone.
10. A storage node comprising communication circuitry, a memory, and a processor, the communication circuitry, the memory being respectively coupled to the processor, the memory storing program instructions, the processor being configured to execute the program instructions to implement the SMR hard disk data recovery method of any of claims 1 to 7.
11. A metadata node comprising communication circuitry, a memory, and a processor, the communication circuitry, the memory being respectively coupled to the processor, the memory storing program instructions, the processor being configured to execute the program instructions to implement the SMR hard disk data recovery method of any of claims 8 to 9.
12. A computer-readable storage medium storing program instructions executable by a processor for implementing the SMR hard disk data recovery method of any one of claims 1 to 9.
CN202211743857.3A 2022-12-29 2022-12-29 SMR hard disk data recovery method, node and storage medium Pending CN116048880A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211743857.3A CN116048880A (en) 2022-12-29 2022-12-29 SMR hard disk data recovery method, node and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211743857.3A CN116048880A (en) 2022-12-29 2022-12-29 SMR hard disk data recovery method, node and storage medium

Publications (1)

Publication Number Publication Date
CN116048880A true CN116048880A (en) 2023-05-02

Family

ID=86123220

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211743857.3A Pending CN116048880A (en) 2022-12-29 2022-12-29 SMR hard disk data recovery method, node and storage medium

Country Status (1)

Country Link
CN (1) CN116048880A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116737451A (en) * 2023-05-26 2023-09-12 珠海妙存科技有限公司 Data recovery method and device of flash memory, solid state disk and storage medium

Similar Documents

Publication Publication Date Title
US6665815B1 (en) Physical incremental backup using snapshots
EP3109757B1 (en) Data storage method, data recovery method, related apparatus, and system
US8145603B2 (en) Method and apparatus for data recovery using storage based journaling
CN106776130B (en) Log recovery method, storage device and storage node
US20070208918A1 (en) Method and apparatus for providing virtual machine backup
US7802134B1 (en) Restoration of backed up data by restoring incremental backup(s) in reverse chronological order
US20080162599A1 (en) Optimizing backup and recovery utilizing change tracking
JP2001508894A (en) System and method for backing up computer files in a wide area computer network
CN111506251A (en) Data processing method, data processing device, SMR storage system and storage medium
US20210181992A1 (en) Data storage method and apparatus, and storage system
US10572335B2 (en) Metadata recovery method and apparatus
CA2825885C (en) Storage system and information processing method
CN116048880A (en) SMR hard disk data recovery method, node and storage medium
WO2022105442A1 (en) Erasure code-based data reconstruction method and appratus, device, and storage medium
US7913109B2 (en) Storage control apparatus and storage control method
CN110309012B (en) Data processing method and device
CN114442944B (en) Data replication method, system and equipment
CN114491145B (en) Metadata design method based on stream storage
CN115328696A (en) Data backup method in database
CN111399774B (en) Data processing method and device based on snapshot under distributed storage system
US7734573B2 (en) Efficient recovery of replicated data items
US9195546B1 (en) Rotating incremental data backup
CN114461455A (en) Method and device for repairing bad blocks of disk of stream replication cluster
CN114217741A (en) Storage method of storage device and storage device
CN111124740A (en) Data reading method and device, storage equipment and machine-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination