CN112506710A - Distributed file system data repair method, device, equipment and storage medium - Google Patents

Distributed file system data repair method, device, equipment and storage medium

Info

Publication number
CN112506710A
CN112506710A
Authority
CN
China
Prior art keywords
repair
shared memory
version number
node
information
Prior art date
Legal status
Granted
Application number
CN202011487179.XA
Other languages
Chinese (zh)
Other versions
CN112506710B (en)
Inventor
刘日新
陈紫卿
Current Assignee
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Sangfor Technologies Co Ltd
Priority to CN202011487179.XA
Publication of CN112506710A
Application granted
Publication of CN112506710B
Legal status: Active (granted)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/14 Error detection or correction of the data by redundancy in operation
    • G06F 11/1402 Saving, restoring, recovering or retrying
    • G06F 11/1446 Point-in-time backing up or restoration of persistent data
    • G06F 11/1448 Management of the data involved in backup or backup restore
    • G06F 11/1458 Management of the backup or restore process
    • G06F 11/1464 Management of the backup or restore process for networked environments
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10 File systems; File servers
    • G06F 16/18 File system types
    • G06F 16/182 Distributed file systems

Abstract

The application discloses a distributed file system data repair method, which comprises the following steps: in the normal working process of the distributed file system, if a redundant node fails, recording repair information by using a shared memory, wherein the repair granularity corresponding to the repair information is smaller than the repair granularity corresponding to the global bitmap; and after the failed redundant node is recovered, repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory. By applying the technical scheme provided by the application, the data copy of the failed redundant node can be repaired at a smaller repair granularity, which avoids the excessive repair amplification caused by an overly large repair granularity, prevents the data window of the single-point copy from being enlarged, saves resource overhead, and improves the performance stability of the whole system. The application also discloses a distributed file system data repair device, equipment and a storage medium, which have corresponding technical effects.

Description

Distributed file system data repair method, device, equipment and storage medium
Technical Field
The present application relates to the field of computer application technologies, and in particular, to a method, an apparatus, a device, and a storage medium for data repair in a distributed file system.
Background
With the rapid development of computer technology, the Distributed File System (DFS) is used more and more widely. A distributed file system is a file system in which the physical storage resources managed by the file system are not necessarily attached directly to a local node; instead, they are connected to the node through a computer network, or a plurality of different logical disk partitions or volumes are combined to form a complete hierarchical file system. A distributed file system presents a logical tree-structured file system for resources located anywhere on the network, making it easier for users to access shared files distributed across the network.
A distributed file system may comprise a plurality of redundant nodes, with different redundant nodes holding data copies. When one redundant node fails, the distributed file system can keep working normally without interrupting the upper-layer application as long as at least one redundant node still works normally. After the failed redundant node recovers, its data copy needs to be repaired to keep the data copies on the redundant nodes consistent and to guarantee data safety.
At present, data copies are mostly repaired by means of a global bitmap. That is, when a redundant node fails and data is updated, a corresponding mark is made in the global bitmap, and after the failed redundant node recovers, the data copy is repaired according to the global bitmap. This repair is maintained at a granularity of 128M. In actual use, if the interruption is only short-lived, such as a service restart, system maintenance (upgrade), or network jitter, only part of the data becomes inconsistent. In such scenarios the 128M repair granularity causes excessive repair amplification: the range that must be checked for inconsistency is large, even a short interruption triggers a large number of data repair tasks, and the data window of the single-point copy is enlarged, which easily leads to multi-point failure. At the same time, large-range data verification incurs excessive resource overhead and affects the performance stability of the whole system.
Disclosure of Invention
The application aims to provide a distributed file system data repair method, device, equipment and storage medium, so as to reduce the data repair granularity, save resource overhead, and improve the performance stability of the system.
In order to solve the technical problem, the application provides the following technical scheme:
a distributed file system data repair method comprises the following steps:
in the normal working process of the distributed file system, if a redundant node fails, recording repair information by using a shared memory, wherein the number of failed redundant nodes is less than the total number of redundant nodes, and the repair granularity corresponding to the repair information is smaller than the repair granularity corresponding to the global bitmap;
and after the failed redundant node is recovered, repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory.
In an embodiment of the present application, in the case that a redundant node fails, the method further includes:
and under the condition that the proportion of the repair information in the shared memory is larger than a set first proportion threshold, moving the repair information in the shared memory into a log file in batches.
In a specific embodiment of the present application, the performing a repair process on the data copy of the failed redundant node based on the repair information recorded in the shared memory includes:
and based on the shared memory and the repair information recorded in the log file, repairing the data copy of the failed redundant node.
In an embodiment of the present application, in the case that a redundant node fails, the method further includes:
and in the process of recording the repair information, marking a to-be-repaired flag in the global bitmap.
In a specific embodiment of the present application, before performing the repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file, the method further includes:
determining whether the proportion of the repair information in the log file is smaller than a set second proportion threshold value;
and if the proportion of the repair information in the log file is smaller than the second proportion threshold, executing the step of performing repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file.
In a specific embodiment of the present application, after determining whether the percentage of the repair information in the log file is smaller than a set second percentage threshold, the method further includes:
and if the proportion of the repair information in the log file is not less than the second proportion threshold, performing repair processing on the data copy of the failed redundant node based on the mark to be repaired of the global bitmap.
In one embodiment of the present application, the method further includes:
before performing atomic updating operation on the shared memory, recording corresponding atomic updating operation in a repair log;
in the process of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory, if the repair process crashes, determining atomic update operation to be performed before the crash based on the repair log after the repair process is restarted;
and performing the atomicity updating operation on the shared memory.
In a specific embodiment of the present application, the repair log includes a description header and an operation record buffer, where the description header is used to describe a state of a current log record, and the operation record buffer is used to record contents of each atomic update operation in sequence.
In a specific embodiment of the present application, before performing the repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file, the method further includes:
marking an unrepaired state in the context of the index nodes of the shared memory;
recording a head version number and a tail version number which are all preset initial values in the index nodes of the shared memory;
and recording a version number as a value of a tail version number in the index node in each interval node of the shared memory and each batch information of the log file.
In a specific embodiment of the present application, in a process of performing repair processing on a data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file, the method further includes:
marking a repair status in the inode context;
increasing the tail version number recorded in the index node by a set step value;
and recording a version number as a value of a tail version number in the index node in the interval node newly inserted in the shared memory and the batch information newly added in the log file.
In a specific embodiment of the present application, after completing the repair process on the data copy of the failed redundant node, the method further includes:
marking an unrepaired state in the inode context;
updating the head version number recorded in the index node to a value same as the tail version number;
and determining and deleting failure information in the shared memory and the log file according to the head version number and the tail version number recorded in the index node.
In a specific embodiment of the present application, in a process of performing repair processing on a data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file, the method further includes:
if the repair process crashes, under the condition that the repair process is restarted and the repair is not triggered, marking an unrepaired state in the context of the index node;
under the condition that the repair process is restarted and repair is triggered, repeatedly executing the mark-under-repair state in the context of the index node; increasing the tail version number recorded in the index node by a set step value; and recording a version number as a value of a tail version number in the index node in the interval node newly inserted in the shared memory and in batch information newly added in the log file.
In one embodiment of the present application, the method further includes:
under the condition that the repair process is restarted and the repair is not triggered, if the tail version number recorded in the index node is equal to the set maximum version number, determining the interval nodes in the shared memory and the batch information in the log file as failure information and deleting the failure information;
and repeatedly executing the step of recording the head version number and the tail version number which are all preset initial values in the index nodes of the shared memory.
A distributed file system data repair apparatus comprising:
the shared memory information recording unit is used for recording repair information by using a shared memory if a redundant node fails in the normal working process of the distributed file system, wherein the number of the failed redundant node is less than the total number of the redundant nodes, and the repair granularity corresponding to the repair information is less than that corresponding to the global bitmap;
and the data copy repairing unit is used for repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory after the failed redundant node is recovered.
A distributed file system data repair device, comprising:
a memory for storing a computer program;
a processor, configured to implement the steps of any one of the above-mentioned distributed file system data repair methods when the computer program is executed.
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of any of the above-described distributed file system data repair methods.
By applying the technical scheme provided by the embodiment of the application, in the normal working process of the distributed file system, if a redundant node fails and the number of the failed redundant nodes is less than the total number of the redundant nodes, the shared memory can be used for recording repair information, and after the failed redundant node is recovered, the data copy of the failed redundant node is repaired based on the repair information recorded in the shared memory. The repair granularity corresponding to the repair information is smaller than that corresponding to the global bitmap, and based on the repair information recorded in the shared memory, the data copy of the failed redundant node can be repaired by using smaller repair granularity, so that excessive repair amplification caused by excessive repair granularity is avoided, the data window of the single-point copy is prevented from being enlarged, the resource overhead is saved, and the performance stability of the whole system is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart illustrating an implementation of a distributed file system data repair method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a basic structure of a difference repairing module in an embodiment of the present application;
FIG. 3 is a diagram illustrating a format of a repair log according to an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating an implementation of a version number mechanism under a normal condition in an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating an implementation of a version number mechanism under an abnormal condition in an embodiment of the present application;
fig. 6 is a schematic structural diagram of a distributed file system data repair apparatus in an embodiment of the present application;
fig. 7 is a schematic structural diagram of a distributed file system data repair device in an embodiment of the present application.
Detailed Description
In order that those skilled in the art will better understand the disclosure, the following detailed description will be given with reference to the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, an implementation flowchart of a distributed file system data repair method provided in an embodiment of the present application is shown, where the method may include the following steps:
s110: in the normal working process of the distributed file system, if a redundant node fails, the shared memory is used for recording repair information, the number of the failed redundant node is less than the total number of the redundant node, and the repair granularity corresponding to the repair information is less than the repair granularity corresponding to the global bitmap.
The distributed file system may comprise a plurality of redundant nodes that form a redundancy relationship with each other; the distributed file system can keep working normally even if only one redundant node works normally. In the normal working process of the distributed file system, a redundant node may fail due to hardware problems, network problems, and the like, and the failed redundant node can no longer work normally. As long as the number of failed redundant nodes is less than the total number of redundant nodes, the distributed file system can still work normally.
When a redundant node fails and the number of failed redundant nodes is smaller than the total number of redundant nodes, the non-failed redundant nodes remain in a normal working state and their data copies are updated in real time, while the data copy of the failed redundant node is no longer updated, so the data copies of the redundant nodes become inconsistent. After the failed redundant node recovers, its data copy needs to be repaired so that it is consistent with the data copies of the normally working redundant nodes, thereby guaranteeing the safety of the distributed file system.
When a redundant node fails, the repair information can be determined based on the data copies of the redundant nodes that are currently working normally, and the repair information is recorded in the shared memory. The repair information may specifically describe the data range that needs to be synchronized, such as the offset, length, and copy sequence number of the repair. The repair granularity corresponding to the repair information is smaller than the repair granularity corresponding to the global bitmap; for example, the repair granularity of the repair information may be 4KB.
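By way of illustration only (the patent publication contains no source code), a repair record of the kind described above could be laid out roughly as in the following C sketch; the structure name, field names and the 4KB granularity constant are assumptions drawn from this description, not the actual data layout.

/* Illustrative sketch of a fine-grained repair record kept in shared memory.
 * All names and the 4KB granularity value are assumptions, not the patent's
 * real layout. */
#include <stdint.h>

#define DR_GRANULARITY (4 * 1024)     /* assumed repair granularity: 4KB */

struct dr_repair_record {
    uint64_t offset;       /* start offset of the data range to synchronize */
    uint64_t length;       /* length of the range, a multiple of DR_GRANULARITY */
    uint32_t copy_seq;     /* sequence number of the data copy to repair */
    uint32_t generation;   /* version number used by the mechanism described later */
};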
S120: and after the failed redundant node is recovered, performing repair processing on the data copy of the failed redundant node based on the repair information recorded in the shared memory.
After the failed redundant node recovers, its data copy is inconsistent with the data copies of the redundant nodes that kept working normally, so the data copy of the failed redundant node needs to be repaired. The repair can be performed based on the repair information recorded in the shared memory. Specifically, information such as the data range to be synchronized is determined from the repair information recorded in the shared memory, the corresponding data range is located in the data copy of the failed redundant node, and data synchronization is performed, thereby repairing the data copy of the failed redundant node.
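Continuing the sketch, repairing one record amounts to copying the recorded range from a healthy copy to the recovered copy. The helper below reuses the dr_repair_record layout sketched above; the function name and the use of plain pread/pwrite are illustrative assumptions about how the synchronization could be performed.

/* Sketch: apply one repair record by copying the range from a healthy copy
 * to the recovered copy. Assumes the dr_repair_record sketched above. */
#include <stdint.h>
#include <unistd.h>

static int dr_apply_record(const struct dr_repair_record *rec,
                           int healthy_fd, int recovered_fd)
{
    char buf[64 * 1024];
    uint64_t done = 0;

    while (done < rec->length) {
        size_t chunk = (rec->length - done > sizeof(buf))
                           ? sizeof(buf) : (size_t)(rec->length - done);
        ssize_t n = pread(healthy_fd, buf, chunk, (off_t)(rec->offset + done));
        if (n <= 0)
            return -1;                              /* read error */
        if (pwrite(recovered_fd, buf, (size_t)n, (off_t)(rec->offset + done)) != n)
            return -1;                              /* write error */
        done += (uint64_t)n;
    }
    return 0;                                       /* range synchronized */
}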
By applying the method provided by the embodiment of the application, in the normal working process of the distributed file system, if a redundant node fails and the number of the failed redundant nodes is less than the total number of the redundant nodes, the shared memory can be used for recording the repair information, and after the failed redundant node is recovered, the data copy of the failed redundant node is repaired based on the repair information recorded in the shared memory. The repair granularity corresponding to the repair information is smaller than that corresponding to the global bitmap, and based on the repair information recorded in the shared memory, the data copy of the failed redundant node can be repaired by using smaller repair granularity, so that excessive repair amplification caused by excessive repair granularity is avoided, the data window of the single-point copy is prevented from being enlarged, the resource overhead is saved, and the performance stability of the whole system is improved.
In an embodiment of the present application, in case of a failure of a redundant node, the method may further include the steps of:
under the condition that the proportion of the repair information in the shared memory is larger than a set first proportion threshold, the repair information in the shared memory is moved into a log file in batches;
step S120 performs repair processing on the data copy of the failed redundant node based on the repair information recorded in the shared memory, and may include the following steps:
and based on the shared memory and the repair information recorded in the log file, repairing the data copy of the failed redundant node.
In the embodiment of the present application, the storage space of the shared memory is limited. Therefore, when a redundant node fails and the shared memory is used to record the repair information, the proportion of the repair information in the shared memory can be monitored. When the proportion of the repair information in the shared memory is greater than the set first proportion threshold, the repair information in the shared memory can be moved into the log file in batches. Specifically, the repair information in the shared memory may be serialized and written into the log file. The log file may be located on a cache disk, which has more storage space. The first proportion threshold may be set and adjusted according to actual conditions, for example to 80%.
That is, when the amount of repair information recorded in the shared memory is small, it may be recorded only in the shared memory. When the amount becomes large, the repair information is serialized and moved into the log file in batches, thereby releasing shared memory space; newly generated repair information continues to be recorded in the shared memory, and whenever the proportion of repair information in the shared memory exceeds the first proportion threshold, the repair information in the shared memory is again moved into the log file in batches.
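A minimal sketch of this overflow handling follows; the 80% threshold, the shared-memory descriptor and all helper names are assumptions introduced only for illustration.

/* Sketch: when shared-memory occupancy exceeds the first proportion threshold,
 * serialize a batch of repair records into the log file to free shared memory.
 * The descriptor, helpers and threshold value are illustrative assumptions. */
#include <stdio.h>
#include <stddef.h>

struct dr_shm { size_t used_bytes; size_t total_bytes; };
size_t dr_shm_collect_batch(struct dr_shm *shm, void *buf, size_t cap);
void   dr_shm_release_batch(struct dr_shm *shm);

#define DR_SHM_FLUSH_RATIO 0.80       /* first proportion threshold (example: 80%) */

static void dr_maybe_flush_to_log(struct dr_shm *shm, FILE *logfp)
{
    static char batch_buf[64 * 1024];

    if ((double)shm->used_bytes / (double)shm->total_bytes <= DR_SHM_FLUSH_RATIO)
        return;                                   /* shared memory still has room */

    size_t n = dr_shm_collect_batch(shm, batch_buf, sizeof(batch_buf));
    if (n > 0) {
        fwrite(batch_buf, 1, n, logfp);           /* append the serialized batch */
        fflush(logfp);
        dr_shm_release_batch(shm);                /* free the flushed records */
    }
}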
After the failed redundant node is recovered, repair information may be recorded in both the shared memory and the log file, and the data copy of the failed redundant node may be repaired based on the repair information recorded in the shared memory and the log file.
The shared memory guarantees the fastest access, and the recorded information is not lost when the process restarts; when the shared memory occupies too much space, records are automatically evicted into the log file stored on disk. The capacity of the log file is larger than that of the shared memory, which guarantees a longer effective repair-optimization window.
The repair granularity of the repair information recorded in the shared memory and the log file is small, so that the problems of excessive repair amplification and the like can be avoided, a large number of data repair tasks cannot be caused in the scenes of short interruption and the like, the data window of a single-point copy is prevented from being enlarged, and the performance stability of the whole system is improved.
In an embodiment of the present application, in case of a failure of a redundant node, the method may further include the steps of:
in the process of recording the repair information, marking to-be-repaired in the global bitmap;
before the data copy of the failed redundant node is repaired based on the shared memory and the repair information recorded in the log file, the method further includes:
determining whether the proportion of the repair information in the log file is smaller than a set second proportion threshold value;
if the proportion of the repair information in the log file is smaller than a second proportion threshold value, executing a step of performing repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file;
and if the proportion of the repair information in the log file is not less than a second proportion threshold value, performing repair processing on the data copy of the failed redundant node based on the to-be-repaired mark of the global bitmap.
In the embodiment of the present application, when a redundant node fails, the repair information is recorded in the shared memory and the log file, and during the recording of the repair information a to-be-repaired mark can be made in the global bitmap, for example with a "1" or "0" flag.
Before the data copy of the failed redundant node is repaired based on the shared memory and the repair information recorded in the log file, it may be determined whether the percentage of the repair information in the log file is smaller than a set second percentage threshold. The second proportional threshold may be set and adjusted according to actual conditions, such as 85%.
If the proportion is smaller than the second proportion threshold, the amount of repair information that needs to be synchronized is relatively small, and the data copy can be repaired at the smaller repair granularity; that is, the data copy of the failed redundant node can be repaired based on the repair information recorded in the shared memory and the log file.
If the proportion is not smaller than the second proportion threshold, the amount of repair information that needs to be synchronized is relatively large, and the data copy can be repaired at the larger repair granularity; that is, the data copy of the failed redundant node can be repaired based on the to-be-repaired marks in the global bitmap.
In the embodiment of the present application, the data copy can be repaired based on the global bitmap by the traditional repair module Diff-record, and a difference repair module DR-Cache is added on top of the traditional repair module Diff-record to repair the data copy based on the repair information recorded in the shared memory and the log file. As shown in fig. 2, the difference repair module DR-Cache may be implemented under the xlator framework and may include a difference repair core sub-module, a shared memory management sub-module, and a log file management sub-module, where the difference repair core sub-module performs operations such as adding, deleting, and merging repair information. When the amount of batch repair information recorded in the log file is too large, the difference repair module DR-Cache is invalidated and the traditional repair module Diff-record is used to repair the data copy. Adding the difference repair module DR-Cache on top of the traditional repair module Diff-record reduces the repair granularity, reduces the amount of repair data, and shortens the repair completion time.
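The choice between the two repair paths can be summarized by the sketch below; the 85% threshold and every identifier are assumptions and do not describe the real interface of the Diff-record or DR-Cache modules.

/* Sketch: pick the repair path after the failed node recovers. If the share of
 * repair information in the log file is below the second proportion threshold,
 * repair from the DR-Cache records (shared memory plus log file); otherwise
 * fall back to the traditional global-bitmap repair. Names are assumptions. */
struct dr_ctx;
double dr_log_usage_ratio(struct dr_ctx *ctx);        /* repair info share in log file */
int    dr_cache_repair(struct dr_ctx *ctx);           /* fine-grained (e.g. 4KB) repair */
int    diff_record_bitmap_repair(struct dr_ctx *ctx); /* coarse bitmap repair */

#define DR_LOG_FALLBACK_RATIO 0.85    /* second proportion threshold (example: 85%) */

static int dr_repair_replica(struct dr_ctx *ctx)
{
    if (dr_log_usage_ratio(ctx) < DR_LOG_FALLBACK_RATIO)
        return dr_cache_repair(ctx);              /* small difference: repair finely */

    return diff_record_bitmap_repair(ctx);        /* large difference: use the bitmap */
}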
Taking actual failure scenarios as examples, the repair approach of the present application is compared with the traditional repair approach as follows:
(1) the type of failure: service process restart
Common scenarios: hot upgrade
Table 1 (comparison data provided as an image in the original publication and not reproduced here)
(2) The type of failure: storage network offline
Common scenarios: network flash and short-time network failure
Table 2 (comparison data provided as an image in the original publication and not reproduced here)
(3) The type of failure: disk offline
Common scenarios: replacement data disk, replacement cache disk
Table 3 (comparison data provided as an image in the original publication and not reproduced here)
As can be seen from the data in the tables, in scenarios dominated by short interruptions, compared with the traditional way of repairing data copies, the repair approach of the present application requires a smaller amount of data repair and a shorter repair completion time, improves repair efficiency, reduces repair overhead, and brings the amount of repaired data close to the real data difference.
In an embodiment of the present application, the step S120 of performing repair processing on the data copy of the failed redundant node based on the repair information recorded in the shared memory may include the following steps:
the first step is as follows: before atomicity updating operation is carried out on the shared memory, recording corresponding atomicity updating operation in a repair log;
the second step is that: in the process of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory, if the repair process crashes, determining atomic update operation to be performed before the crash based on the repair log after the repair process is restarted;
the third step: and performing the atomicity updating operation on the shared memory.
For convenience of description, the above three steps are combined for illustration.
In the normal working process of the distributed file system, if a redundant node fails and the number of the failed redundant nodes is less than the total number of the redundant nodes, the shared memory can be used for recording repair information, and the repair granularity corresponding to the repair information is less than the repair granularity corresponding to the global bitmap.
After the failed redundant node is recovered, the data copy of the failed redundant node can be repaired based on the repair information recorded in the shared memory. In the repair process, for each piece of repair information, after the repair process is completed on the data copy based on the repair information, the repair information may be cleared in the shared memory. Operations such as recording and deleting repair information in the shared memory can be regarded as atomic update operations on the shared memory.
In order to ensure the consistency of the data update of the shared memory, before performing an atomic update operation on the shared memory, a corresponding atomic update operation may be recorded in the repair log.
The repair log may include a description header used to describe the state of the current log record and an operation record buffer used to record the contents of each atomic update operation in sequence. Specifically, the repair log may be a journal. As shown in fig. 3, the description header shm_journal_head may include a recording field for marking whether the journal is being updated, a count field for indicating the number of records currently in the journal, a tail field for indicating the tail pointer of the journal operation record buffer, and the like. Each shm_journal_record in the operation record buffer represents the content of one atomic update operation and may include an addr field indicating the relative destination address in the shared memory, a len field indicating the length of the data to be updated in the shared memory, a value field indicating the content of the data to be updated in the shared memory, and the like.
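From the fields named above (recording, count, tail; addr, len, value), the journal layout could look roughly like the following C sketch; the field widths and the use of flexible array members are assumptions, not the actual on-disk or in-memory format.

/* Sketch of the repair-log (journal) layout inferred from the description.
 * Field widths and ordering are assumptions. */
#include <stdint.h>

struct shm_journal_record {
    uint64_t addr;        /* relative destination address in the shared memory */
    uint32_t len;         /* length of the data to be updated */
    uint8_t  value[];     /* content to write at addr (len bytes) */
};

struct shm_journal_head {
    uint32_t recording;   /* non-zero while the journal is being updated */
    uint32_t count;       /* number of records currently in the journal */
    uint32_t tail;        /* tail pointer of the operation record buffer */
    uint8_t  records[];   /* operation record buffer: shm_journal_record entries */
};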
While the data copy of the failed redundant node is being repaired based on the repair information recorded in the shared memory, the repair process may crash. In this case only part of a data structure may have been updated, so the data in the shared memory can easily become inconsistent. In the present application, however, the corresponding atomic update operation is recorded in the repair log before the atomic update operation is performed on the shared memory; therefore, after the repair process crashes and restarts, the operations can be replayed from the records in the repair log to determine the atomic update operations that were to be performed before the crash.
After determining the atomic update operation to be performed before the crash, the atomic update operation may be performed on the shared memory, so as to ensure the consistency of each data structure in the shared memory.
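A sketch of that replay step is given below, reusing the journal layout sketched above: records still present in the journal are re-applied to the shared memory before normal repair resumes. The function name and the way records are walked are assumptions.

/* Sketch: replay journaled atomic updates after the repair process restarts.
 * shm_base points at the mapped shared memory; reuses the structures above. */
#include <string.h>

static void dr_journal_replay(struct shm_journal_head *jh, uint8_t *shm_base)
{
    if (!jh->recording)
        return;                          /* no update was in flight: nothing to do */

    uint32_t off = 0;
    for (uint32_t i = 0; i < jh->count; i++) {
        struct shm_journal_record *rec =
            (struct shm_journal_record *)(jh->records + off);
        memcpy(shm_base + rec->addr, rec->value, rec->len);  /* re-apply the update */
        off += (uint32_t)sizeof(*rec) + rec->len;
    }
    jh->recording = 0;                   /* journal fully applied: clear the state */
    jh->count = 0;
    jh->tail = 0;
}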
In an embodiment of the present application, before performing repair processing on a data copy of a failed redundant node based on a shared memory and repair information recorded in a log file, the method may further include the following steps:
marking an unrepaired state in the context of the index nodes of the shared memory;
recording a head version number and a tail version number which are all preset initial values in index nodes of a shared memory;
recording a version number as a value of a tail version number in an index node in each interval node of the shared memory and each batch information of the log file;
correspondingly, in the process of repairing the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file, the method may further include the following steps:
marking the repairing state in the context of the index node;
increasing the tail version number recorded in the index node by a set step value;
recording a version number as a value of a tail version number in an index node in interval nodes newly inserted into a shared memory and batch information newly added in a log file;
correspondingly, after the repair processing is completed on the data copy of the failed redundant node, the method may further include the following steps:
marking an unrepaired state in an inode context;
updating the head version number recorded in the index node to a value same as the tail version number;
and determining and deleting failure information in the shared memory and the log file according to the head version number and the tail version number recorded in the index node.
For convenience of description, the above steps are combined for illustration.
In the embodiment of the application, in the normal working process of the distributed file system, if a redundant node fails and the number of the failed redundant nodes is less than the total number of the redundant nodes, the repair information can be recorded by using the shared memory, and the repair information in the shared memory is moved into the log file in batch under the condition that the proportion of the repair information in the shared memory is greater than the set first proportional threshold. Namely, the repair information is recorded through the shared memory and the log file.
After the failed redundant node recovers, its data copy can be repaired based on the repair information recorded in the shared memory and the log file. Before that repair is performed, an unrepaired state may be marked in the context of the index node of the shared memory, indicating that no repair operation on the data copy is currently in progress. In addition, the head and tail version numbers recorded in the index node of the shared memory may be set to a preset initial value, such as 0, and the version number recorded in each interval node of the shared memory and in each batch of information in the log file is set to the value of the tail version number in the index node.
In the process of performing repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file, a repairing state may be marked in the context of the index node, that is, an unrepaired state marked in the context of the index node is changed to a repairing state. And the tail version number recorded in the index node is increased by a set step value, such as 1. In this process, there may still be repair information to be inserted into the shared memory or batch information written into the log file, and the like, and the value of the version number recorded in the newly inserted interval node in the shared memory and the newly added batch information in the log file may be the tail version number in the index node.
After the repair process is completed on the data copy of the failed redundant node, an unrepaired state may be marked in the context of the index node, that is, the repair-in-progress state marked in the repair process is modified to an unrepaired state. And updating the head version number recorded in the index node to a value same as the tail version number, and determining and deleting failure information in the shared memory and the log file according to the head version number and the tail version number recorded in the index node. Specifically, the interval nodes or batch information outside the head-to-tail version number interval in the index node in the shared memory and the log file may be determined as invalid information, and deleted.
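The version-number bookkeeping on the index node described above can be expressed roughly as follows; the structure, names and the step value of 1 are assumptions based on this description.

/* Sketch of the version-number (generation) bookkeeping on the shared-memory
 * index node. Names, the step value and the cleanup rule are assumptions. */
#include <stdbool.h>
#include <stdint.h>

struct dr_inode {
    bool     repairing;   /* inode-context repair flag: false = unrepaired */
    uint32_t front;       /* head version number */
    uint32_t rear;        /* tail version number */
};

static void dr_repair_begin(struct dr_inode *ino)
{
    ino->repairing = true;
    ino->rear += 1;       /* new interval nodes / log batches get this version */
}

static void dr_repair_done(struct dr_inode *ino)
{
    ino->repairing = false;
    ino->front = ino->rear;   /* everything up to the tail version is repaired */
    /* interval nodes and log batches whose version lies outside [front, rear]
     * are invalid and can be deleted from the shared memory and the log file */
}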
In a specific embodiment of the present application, in a process of performing repair processing on a data copy of a failed redundant node based on a shared memory and repair information recorded in a log file, the method may further include the following steps:
if the repair process crashes, under the condition that the repair process is restarted and the repair is not triggered, marking an unrepaired state in the context of the index node;
under the condition that the repair process is restarted and repair is triggered, repeatedly executing the index node context to mark the repair state; increasing the tail version number recorded in the index node by a set step value; and recording the version number as the value of the tail version number in the index node in the newly inserted interval node in the shared memory and the newly added batch information in the log file.
In the embodiment of the present application, while the data copy of the failed redundant node is being repaired based on the repair information recorded in the shared memory and the log file, the repair process may crash and the repair may be interrupted abnormally. If the repair simply started over, data that was already repaired before the interruption would be repaired again, adding extra repair data volume and repair time. Therefore, during the repair process, if the repair process crashes, an unrepaired state can be marked in the context of the index node once the repair process has restarted but repair has not yet been triggered; that is, the repairing state in the index node context is modified to the unrepaired state.
When the repair process is restarted and repair is triggered, the following steps are repeated: marking the repairing state in the context of the index node, increasing the tail version number recorded in the index node by the set step value, and recording the version number as the value of the tail version number in the index node in interval nodes newly inserted into the shared memory and in batch information newly added to the log file. This makes it easy to determine which records in the shared memory and the log file are invalid, allows the repair of the copy data to resume where it left off, and avoids adding extra repair data volume and repair time.
The embodiment of the present application implements the clearing of records in the shared memory and the log file through a version number mechanism. As shown in fig. 4 and fig. 5, a version number (generation) field is added to the header of the INODE node, the RANGE node, and the log file batch information in the shared memory, and a repair flag (repairing) is added to the inode context INODE ctx, where FALSE indicates the unrepaired state and TRUE indicates the repairing state. In the figures, "T" represents the type of the shared memory node, "G" represents the value of the version number generation, "r (rear)" represents the tail pointer of the version number, and "f (front)" represents the head pointer of the version number.
Fig. 4 is a schematic diagram of a version number mechanism under a normal condition. When the repair of the copy data is not triggered, head and tail pointers of the version numbers recorded in the INODE nodes of the shared memory are all 0, that is, the head and tail version numbers are all initial values of 0. At this time, the version number of the newly inserted RANGE node in the shared memory and the version number of the batch information header subsequently flushed to the log file are set as the values of the tail pointers.
When repair of the replica data starts, the repairing state is marked in the INODE ctx and the tail pointer value is increased by 1. From this point on, the version number of a RANGE node newly inserted into the shared memory, and the version number in the header of batch information subsequently flushed to the log file, are also set to the rear value, that is, 1.
After the copy data is repaired, the flag of the repair state in the INODE ctx may be modified to an unrepaired state, and then the version number of the head pointer recorded by the INODE node in the shared memory is set to the value of the tail pointer, which is 1. And finally deleting all RANGE failure nodes (the version numbers are outside the front and rear) in the shared memory, and similarly, deleting failure batch information in the log file.
Fig. 5 is a schematic diagram illustrating the implementation of the version number mechanism in an abnormal situation. If, during the repair of the replica data, the repair process (brick process) crashes and restarts, then when repair is triggered again the rear value in the INODE node is increased by 1 again, so that RANGE nodes newly inserted and batch information newly added to the log file during the resumed repair can be distinguished.
In one embodiment of the present application, the method may further comprise the steps of:
under the condition that the repair process is restarted and the repair is not triggered, if the tail version number recorded in the index node is equal to the set maximum version number, determining batch information in interval nodes and log files in the shared memory as failure information and deleting the failure information;
and repeatedly executing the step of recording the head version number and the tail version number which are all preset initial values in the index nodes of the shared memory.
In the embodiment of the present application, in order to prevent the version number value from accumulating and overflowing after the repair has been interrupted many times, a ring queue is used to record the version number. A maximum version number may be set in the index node context. When the repair process is restarted and repair is not triggered, if the ring queue is full, that is, the tail version number has reached the maximum version number, the interval nodes in the shared memory and the batch information in the log file can all be determined to be invalid information and deleted, and then the head and tail version numbers recorded in the index node of the shared memory are reset to the preset initial value; that is, the version number is recorded again from the initial value. This prevents the shared memory and the log file from being occupied excessively.
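A minimal sketch of this wrap-around protection follows, reusing the dr_inode sketch above; the maximum value and the helper for dropping stale records are assumptions.

/* Sketch: prevent version-number overflow after many interrupted repairs.
 * DR_GEN_MAX and the helper name are illustrative assumptions. */
void dr_drop_all_records(struct dr_inode *ino);   /* delete interval nodes and log batches */

#define DR_GEN_MAX 255u   /* example capacity of the version-number ring queue */

static void dr_gen_maybe_reset(struct dr_inode *ino)
{
    if (ino->repairing)
        return;                       /* only reset while no repair is in progress */

    if (ino->rear == DR_GEN_MAX) {
        dr_drop_all_records(ino);     /* all current records become invalid */
        ino->front = 0;               /* restart head and tail from the initial value */
        ino->rear  = 0;
    }
}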
The difficulty of troubleshooting when problems occur in the repairing process can be effectively reduced through a version number mechanism.
Corresponding to the above method embodiment, the present application further provides a distributed file system data repair apparatus; the distributed file system data repair apparatus described below and the distributed file system data repair method described above may be referred to in correspondence with each other.
Referring to fig. 6, the apparatus may include the following units:
a shared memory information recording unit 610, configured to record, in a normal working process of the distributed file system, repair information using a shared memory if a redundant node fails, where the number of the failed redundant node is less than the total number of the redundant nodes, and a repair granularity corresponding to the repair information is less than a repair granularity corresponding to the global bitmap;
and the data copy repairing unit 620 is configured to, after the failed redundant node is recovered, repair the data copy of the failed redundant node based on the repair information recorded in the shared memory.
By applying the device provided by the embodiment of the application, in the normal working process of the distributed file system, if a redundant node fails and the number of the failed redundant nodes is less than the total number of the redundant nodes, the shared memory can be used for recording repair information, and after the failed redundant node is recovered, the data copy of the failed redundant node is repaired based on the repair information recorded in the shared memory. The repair granularity corresponding to the repair information is smaller than that corresponding to the global bitmap, and based on the repair information recorded in the shared memory, the data copy of the failed redundant node can be repaired by using smaller repair granularity, so that excessive repair amplification caused by excessive repair granularity is avoided, the data window of the single-point copy is prevented from being enlarged, the resource overhead is saved, and the performance stability of the whole system is improved.
In a specific embodiment of the present application, the system further includes a log file information recording unit, configured to:
and under the condition that a redundant node fails, under the condition that the proportion of the repair information in the shared memory is greater than a set first proportional threshold, moving the repair information in the shared memory into the log file in batches.
In an embodiment of the present application, the data copy repairing unit 620 is configured to:
and based on the shared memory and the repair information recorded in the log file, repairing the data copy of the failed redundant node.
In one embodiment of the present application, the method further includes:
and the bitmap marking unit is used for marking to-be-repaired information in the global bitmap in the process of recording the repair information under the condition that the redundant node fails.
In an embodiment of the present application, the data copy repairing unit 620 is further configured to:
determining whether the ratio of the repair information in the log file is smaller than a set second ratio threshold value or not before repairing the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file;
and if the proportion of the repair information in the log file is smaller than a second proportion threshold value, executing a step of performing repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file.
In an embodiment of the present application, the data copy repairing unit 620 is further configured to:
and after determining whether the proportion of the repair information in the log file is smaller than a set second proportion threshold value, if the proportion of the repair information in the log file is not smaller than the second proportion threshold value, performing repair processing on the data copy of the failed redundant node based on the to-be-repaired mark of the global bitmap.
In an embodiment of the present application, the system further includes a shared memory update unit, configured to:
before atomicity updating operation is carried out on the shared memory, recording corresponding atomicity updating operation in a repair log;
in the process of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory, if the repair process crashes, determining atomic update operation to be performed before the crash based on the repair log after the repair process is restarted;
and performing the atomicity updating operation on the shared memory.
In a specific embodiment of the present application, the repair log includes a description header and an operation record buffer, where the description header is used to describe a state of a current log record, and the operation record buffer is used to record contents of each atomic update operation in sequence.
In a specific embodiment of the present application, the system further includes a version number mechanism implementing unit, configured to:
marking an unrepaired state in the context of the index node of the shared memory before performing repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file;
recording a head version number and a tail version number which are all preset initial values in index nodes of a shared memory;
and recording the version number as the value of the tail version number in the index node in each interval node of the shared memory and each batch information of the log file.
In a specific embodiment of the present application, the version number mechanism implementing unit is further configured to:
marking a repairing state in the context of the index node in the process of repairing the data copy of the failed redundant node based on the shared memory and the repairing information recorded in the log file;
increasing the tail version number recorded in the index node by a set step value;
and recording a version number as a value of a tail version number in an index node in newly inserted interval nodes in a shared memory and newly added batch information in a log file.
In a specific embodiment of the present application, the version number mechanism implementing unit is further configured to:
after the data copy of the redundant node with the fault is repaired, marking an unrepaired state in the context of the index node;
updating the head version number recorded in the index node to a value same as the tail version number;
and determining and deleting failure information in the shared memory and the log file according to the head version number and the tail version number recorded in the index node.
In a specific embodiment of the present application, the version number mechanism implementing unit is further configured to:
in the process of repairing the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file, if the repair process crashes, under the condition that the repair process is restarted and the repair is not triggered, marking an unrepaired state in the context of the index node;
under the condition that the repair process is restarted and repair is triggered, repeatedly executing the index node context to mark the repair state; increasing the tail version number recorded in the index node by a set step value; and recording the version number as the value of the tail version number in the index node in the newly inserted interval node in the shared memory and the newly added batch information in the log file.
In a specific embodiment of the present application, the version number mechanism implementing unit is further configured to:
under the condition that the repair process is restarted and the repair is not triggered, if the tail version number recorded in the index node is equal to the set maximum version number, determining batch information in interval nodes and log files in the shared memory as failure information and deleting the failure information;
and repeatedly executing the step of recording the head version number and the tail version number which are all preset initial values in the index nodes of the shared memory.
Corresponding to the above method embodiment, an embodiment of the present application further provides a distributed file system data repair device, including:
a memory for storing a computer program;
and the processor is used for realizing the steps of the distributed file system data repair method when executing the computer program.
As shown in fig. 7, which is a schematic diagram of a composition structure of a distributed file system data repair device, the distributed file system data repair device may include: a processor 10, a memory 11, a communication interface 12 and a communication bus 13. The processor 10, the memory 11 and the communication interface 12 all communicate with each other through a communication bus 13.
In the embodiment of the present application, the processor 10 may be a Central Processing Unit (CPU), an application specific integrated circuit, a digital signal processor, a field programmable gate array or other programmable logic device, etc.
The processor 10 may call a program stored in the memory 11, and in particular, the processor 10 may perform operations in an embodiment of the distributed file system data repair method.
The memory 11 is used for storing one or more programs, the program may include program codes, the program codes include computer operation instructions, in this embodiment, the memory 11 stores at least the program for implementing the following functions:
in the normal working process of the distributed file system, if a redundant node fails, recording repair information by using a shared memory, wherein the number of the failed redundant node is less than the total number of the redundant nodes, and the repair granularity corresponding to the repair information is less than that corresponding to the global bitmap;
and after the failed redundant node is recovered, performing repair processing on the data copy of the failed redundant node based on the repair information recorded in the shared memory.
In one possible implementation, the memory 11 may include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a node monitoring function, an information recording function), and the like; the storage data area may store data created during use, such as repair information data, data copy data, and the like.
Further, the memory 11 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device or another non-volatile solid-state storage device.
The communication interface 12 may be an interface of a communication module for connecting with other devices or systems.
Of course, it should be noted that the structure shown in fig. 7 does not constitute a limitation on the distributed file system data repair device in the embodiment of the present application. In practical applications, the distributed file system data repair device may include more or fewer components than those shown in fig. 7, or a combination of certain components.
Corresponding to the above method embodiments, an embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the above distributed file system data repair method.
The embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts among the embodiments, reference may be made to one another.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in a random access memory (RAM), a read-only memory (ROM), an electrically programmable ROM, an electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The principles and implementations of the present application are described herein by using specific examples; the above description of the embodiments is only intended to help understand the technical solutions and the core idea of the present application. It should be noted that those skilled in the art may make several improvements and modifications to the present application without departing from the principles of the present application, and such improvements and modifications also fall within the protection scope of the claims of the present application.

Claims (16)

1. A distributed file system data repair method is characterized by comprising the following steps:
in the normal working process of the distributed file system, if a redundant node fails, recording repair information by using a shared memory, wherein the number of failed redundant nodes is less than the total number of redundant nodes, and the repair granularity corresponding to the repair information is smaller than that corresponding to the global bitmap;
and after the failed redundant node is recovered, repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory.
2. The method of claim 1, wherein in the event of a failure of a redundant node, further comprising:
and under the condition that the proportion of the repair information in the shared memory is larger than a set first proportion threshold, moving the repair information in the shared memory into a log file in batches.
3. The method according to claim 2, wherein the performing repair processing on the data copy of the failed redundant node based on the repair information recorded in the shared memory comprises:
and based on the shared memory and the repair information recorded in the log file, repairing the data copy of the failed redundant node.
4. The method of claim 3, wherein in the event of a failure of a redundant node, further comprising:
and in the process of recording the repair information, marking a to-be-repaired state in the global bitmap.
5. The method according to claim 4, further comprising, before performing repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file:
determining whether the proportion of the repair information in the log file is smaller than a set second proportion threshold value;
and if the proportion of the repair information in the log file is smaller than the second proportion threshold, executing the step of performing repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file.
6. The method of claim 5, after the determining whether the proportion of the repair information in the log file is smaller than a set second proportion threshold, further comprising:
and if the proportion of the repair information in the log file is not smaller than the second proportion threshold, performing repair processing on the data copy of the failed redundant node based on the to-be-repaired marks in the global bitmap.
7. The method of claim 1, further comprising:
before performing an atomic update operation on the shared memory, recording the corresponding atomic update operation in a repair log;
in the process of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory, if the repair process crashes, after the repair process is restarted, determining, based on the repair log, the atomic update operation that was to be performed before the crash;
and performing the atomic update operation on the shared memory.
8. The method of claim 7, wherein the repair log comprises a description header and an operation record buffer, wherein the description header is used to describe the status of the current log record, and the operation record buffer is used to record the content of each atomic update operation in turn.
9. The method according to any of claims 2 to 8, further comprising, before performing repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file:
marking an unrepaired state in the context of the index nodes of the shared memory;
recording, in the index nodes of the shared memory, a head version number and a tail version number that are both preset initial values;
and recording, in each interval node of the shared memory and each batch of information in the log file, a version number equal to the tail version number in the index node.
10. The method according to claim 9, wherein, in the process of performing repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file, the method further comprises:
marking an under-repair state in the context of the index node;
increasing the tail version number recorded in the index node by a set step value;
and recording, in each interval node newly inserted into the shared memory and each batch of information newly added to the log file, a version number equal to the tail version number in the index node.
11. The method of claim 10, further comprising, after completing the repair process on the data copy of the failed redundant node:
marking an unrepaired state in the context of the index node;
updating the head version number recorded in the index node to the same value as the tail version number;
and determining and deleting invalid information in the shared memory and the log file according to the head version number and the tail version number recorded in the index node.
12. The method according to claim 9, wherein, in the process of performing repair processing on the data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file, the method further comprises:
if the repair process crashes, marking an unrepaired state in the context of the index node under the condition that the repair process is restarted and repair is not triggered;
under the condition that the repair process is restarted and repair is triggered, repeatedly executing the steps of marking the under-repair state in the context of the index node, increasing the tail version number recorded in the index node by a set step value, and recording, in each interval node newly inserted into the shared memory and each batch of information newly added to the log file, a version number equal to the tail version number in the index node.
13. The method of claim 12, further comprising:
under the condition that the repair process is restarted and repair is not triggered, if the tail version number recorded in the index node is equal to the set maximum version number, determining the interval nodes in the shared memory and the batch information in the log file as invalid information and deleting the invalid information;
and repeatedly executing the step of recording, in the index nodes of the shared memory, the head version number and the tail version number that are both preset initial values.
14. A distributed file system data repair apparatus, comprising:
the shared memory information recording unit is used for recording repair information by using a shared memory if a redundant node fails in the normal working process of the distributed file system, wherein the number of failed redundant nodes is less than the total number of redundant nodes, and the repair granularity corresponding to the repair information is smaller than that corresponding to the global bitmap;
and the data copy repairing unit is used for repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory after the failed redundant node is recovered.
15. A distributed file system data repair device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the distributed file system data repair method according to any one of claims 1 to 13 when executing said computer program.
16. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the distributed file system data repair method according to any one of claims 1 to 13.
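As an informal aid to reading claims 7 and 8 above, the write-ahead repair log (a description header plus an operation record buffer) could be modelled as follows; the field names, the JSON encoding and the dictionary standing in for the shared memory are assumptions of this sketch, since the claims fix only the two-part structure and the record-before-update ordering.

import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RepairLog:
    """Hypothetical repair log: a description header plus an operation record buffer."""
    header: Dict[str, int] = field(default_factory=lambda: {"committed": 0, "written": 0})
    records: List[str] = field(default_factory=list)   # operation record buffer

    def journal(self, op: Dict) -> None:
        """Record the atomic update operation before it touches the shared memory."""
        self.records.append(json.dumps(op))
        self.header["written"] = len(self.records)

    def commit(self) -> None:
        """Mark every journalled operation as applied."""
        self.header["committed"] = self.header["written"]

    def pending(self) -> List[Dict]:
        """Operations journalled but not yet marked applied (replayed after a crash)."""
        return [json.loads(r) for r in self.records[self.header["committed"]:]]


def apply_update(shared_memory: Dict, log: RepairLog, op: Dict) -> None:
    log.journal(op)                          # 1. write-ahead record in the repair log
    shared_memory[op["key"]] = op["value"]   # 2. the atomic update on the shared memory
    log.commit()                             # 3. mark the operation as applied


def recover(shared_memory: Dict, log: RepairLog) -> None:
    """On repair-process restart, re-apply any update whose commit never landed."""
    for op in log.pending():
        shared_memory[op["key"]] = op["value"]
    log.commit()

In this reading, a crash between steps 1 and 3 leaves the operation in the pending slice of the buffer, so the restarted repair process can re-apply it to the shared memory before continuing.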
CN202011487179.XA 2020-12-16 2020-12-16 Distributed file system data restoration method, device, equipment and storage medium Active CN112506710B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011487179.XA CN112506710B (en) 2020-12-16 2020-12-16 Distributed file system data restoration method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011487179.XA CN112506710B (en) 2020-12-16 2020-12-16 Distributed file system data restoration method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112506710A true CN112506710A (en) 2021-03-16
CN112506710B CN112506710B (en) 2024-02-23

Family

ID=74972711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011487179.XA Active CN112506710B (en) 2020-12-16 2020-12-16 Distributed file system data restoration method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112506710B (en)

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5727206A (en) * 1996-07-31 1998-03-10 Ncr Corporation On-line file system correction within a clustered processing system
WO2003009139A1 (en) * 2001-07-16 2003-01-30 Transact In Memory, Inc. Parallelized redo-only logging and recovery for highly available main memory database systems
US20050147132A1 (en) * 2004-01-07 2005-07-07 Takashi Asako System and method for data multiplexing
CN102368222A (en) * 2011-10-25 2012-03-07 曙光信息产业(北京)有限公司 Online repairing method of multiple-copy storage system
CN103716182A (en) * 2013-12-12 2014-04-09 中国科学院信息工程研究所 Failure detection and fault tolerance method and failure detection and fault tolerance system for real-time cloud platform
CN105302768A (en) * 2015-10-16 2016-02-03 浙江宇视科技有限公司 Slave CPU exception processing method and apparatus
CN105302498A (en) * 2015-11-24 2016-02-03 浪潮(北京)电子信息产业有限公司 Storage redundant system and method
CN106789180A (en) * 2016-11-30 2017-05-31 郑州云海信息技术有限公司 The service control method and device of a kind of meta data server
CN108804523A (en) * 2018-04-27 2018-11-13 腾讯科技(深圳)有限公司 Method of data synchronization, system and computer readable storage medium
CN108959390A (en) * 2018-06-01 2018-12-07 新华三云计算技术有限公司 Resource-area synchronous method and device after shared-file system node failure
CN108920637A (en) * 2018-07-02 2018-11-30 北京科东电力控制系统有限责任公司 Method for synchronizing data of database and device applied to synchronization subsystem
GB201811795D0 (en) * 2018-07-19 2018-09-05 Advanced Risc Mach Ltd Memory scanning operation in reponse to common mode fault signal
CN111488238A (en) * 2020-06-24 2020-08-04 南京鹏云网络科技有限公司 Block storage node data restoration method and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈东江: "基于数据同步的集群系统不间断服务的设计与实现", 《中国优秀硕士论文 信息科技》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112882858A (en) * 2021-03-29 2021-06-01 读书郎教育科技有限公司 Intelligent classroom resource single-point restoration method
CN112882858B (en) * 2021-03-29 2023-06-27 读书郎教育科技有限公司 Smart classroom resource single-point repairing method
CN113407790A (en) * 2021-08-19 2021-09-17 成都冰鉴信息科技有限公司 Data restoration method and device and data processing equipment
CN115168087A (en) * 2022-07-08 2022-10-11 超聚变数字技术有限公司 Method and device for determining granularity of repair resources of memory failure
CN115168087B (en) * 2022-07-08 2024-03-19 超聚变数字技术有限公司 Method and device for determining repair resource granularity of memory failure

Also Published As

Publication number Publication date
CN112506710B (en) 2024-02-23

Similar Documents

Publication Publication Date Title
CN112506710B (en) Distributed file system data restoration method, device, equipment and storage medium
US10817386B2 (en) Virtual machine recovery method and virtual machine management device
US11232073B2 (en) Method and apparatus for file compaction in key-value store system
CN109491609B (en) Cache data processing method, device and equipment and readable storage medium
US10379977B2 (en) Data management method, node, and system for database cluster
EP3474143B1 (en) Method and apparatus for incremental recovery of data
CN104735107A (en) Recovery method and device for data copies in distributed storage system
CN113626256A (en) Virtual machine disk data backup method, device, terminal and storage medium
CN110597779A (en) Data reading and writing method in distributed file system and related device
CN110825546A (en) Recovery method, system and equipment terminal for high-availability database cluster
CN113885809B (en) Data management system and method
US11663165B2 (en) Method, electronic device, and computer program product for managing file system
CN108599982B (en) Data recovery method and related equipment
CN113485872A (en) Fault processing method and device and distributed storage system
JP2017208113A (en) Data storage method, data storage apparatus, and storage device
CN113127266A (en) Distributed database-based multi-copy disaster recovery method and device
CN114791901A (en) Data processing method, device, equipment and storage medium
CN113535470A (en) Configuration backup method and device, electronic equipment and storage medium
CN110941591A (en) File deletion method, device and equipment and readable storage medium
CN113268395B (en) Service data processing method, processing device and terminal
CN107707402B (en) Management system and management method for service arbitration in distributed system
CN110351386B (en) Increment synchronization method and device between different copies
CN110658989B (en) System and method for backup storage garbage collection
CN116974808A (en) Fault recovery method and device during copy synchronization, electronic equipment and medium
CN116860521A (en) Data recovery method and device for fault node, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant